Saturday, December 14, 2013

Ada 2005 access types, part I

If you've read the previous series of articles, then you are up to date with how Ada95 deals with so-called "general" access types.  You've got your accessibility checks that fence a local access value inside its type's scope, and you've got your anonymous access values that let the access value out on a leash.  Inside the fence, you can run free and do whatever you want, except you can't get out.  If you want to go out, you can only do so on a leash, in which case you are extremely restricted in your movement.  Put together, you've got a nice neat "either/or" system that ensures your pointers never get lost or hit by a car.

Fast forward a decade to Ada 2005.  One of the major updates in the language, warranting an entire chapter in the rationale, was the expansion of the anonymous access type.  Unfortunately, history has not exactly been kind to this 'feature', and many (most?) programmers treat it with disdain.

But the Ada05 AAT's suffer much the same fate as the Ada95 AAT's: poor documentation and worse understanding.  Just like in Ada95, Ada05 anonymous access types are special-purpose features for special-purpose jobs, and yet they are not described this way.  Ada05 actually suffers worse, since there are no fewer than three new special kinds of access values, all serving different purposes, all of which are different than the Ada95 versions, and all of which are lumped together as a single topic.  Mix in the fact that they all have identical syntax, and you've got a lot of confused programmers.

And that really is the downfall of Ada's AAT's: they all have the same syntax, but all serve completely orthogonal purposes.  And while this might be elegant and simple from a language design standpoint, having the same thing mean different things in different contexts is a recipe for disaster.  But I digress. 

Not counting the "null exclusions", which are straightforward and won't be discussed, Ada05 introduced three new kinds of anonymous access types.  Note that these are the names that I've given them, and not necessarily what the ARG says about them:
  1. The anonymous access component.
  2. The anonymous function return.
  3. The anonymous access to subprogram parameter.
Again, to the layperson, all three seem identical to both each other and the Ada95 access parameter and access discriminant.  Each involves using an "in line" access type declaration instead of a named access type.  But don't be fooled: each one is there for a specific reason, each of which will be discussed in upcoming posts.

First up: the anonymous access component.  The brochure version of this feature is that now you can use anonymous access types in record components, array elements, and stand-alone objects.  But why?  To understand why this feature was added (and when, exactly, you should consider using it), like the cast-able access parameters before them, we have to once again talk about object oriented programming.

For those who deal in the domain of OOP, a fundamental tenet is that child types can be freely substituted for parent types.  This is implemented in Ada via the 'classwide' type, which represents not only the type in question, but any of the types that derive from it.  For example, if C extends P, then any code that requires a P'Class can freely use a C or a P, instead of just a P.  This is the quintessential "is-a" relationship.

This setup works well, except for one issue: the same cannot be said of pointers.  For example, let's suppose a serial killer with a penchant for software engineering is interested in modeling his victims.  This is neatly decomposed (ha!) into a class hierarchy:

type Victim is interface;
type Victim_Ptr is access all Victim'Class;
procedure Dismember (This : Victim) is abstract;  -- interface primitives must be abstract or null

type Hooker is new Victim with (...);
type Hooker_Ptr is access all Hooker;

type Vagrant is new Victim with (...);
type Vagrant_Ptr is access all Vagrant;

And, as any good classwide programmer knows, we can freely use Vagrants or Hookers as a parameter or object of Victim'Class.  Bully abstraction!

But the type conversion rules for pointers are not so easily duped; there are no "classwide pointers" (though perhaps life would be different if there were!).  Consider the following code:

Dead_Body : aliased Hooker := ...;
p1 : Hooker_Ptr := Dead_Body'Access;
p2 : Victim_Ptr := Dead_Body'Access;
p3 : Victim_Ptr := p1;
p4 : Victim_Ptr := p2;
p5 : Victim_Ptr := Victim_Ptr(p1);

Quiz time: which of these declarations are legal, and which aren't?  Anyone who is well-versed in the Ada95 access type conversion rules (<cricket, cricket>) will be able to identify that, oddly, it is p3 that the compiler will reject.  This is anathema to an OOP programmer, since quite obviously they all point to the same damn object.

Note that both p1 and p2 are initialized via the access attribute, which is legal since the compiler knows that our Dead_Body is both a Hooker and a Victim'Class.  p4 is legal, since both sides are of type Victim_Ptr, but p3 fails since, technically speaking, Hooker_Ptr and Victim_Ptr are different types, and the rules require the types to match.  We can always force the issue with an explicit conversion, as in p5, but people generally agree that casting is bad.

This is unfortunate, because using a "pointer to a hooker" as a "pointer to a Victim'Class" is no more unsafe or illegal than using a "hooker" as a "Victim'Class", which is the underpinning of all OOP.  Sure we can cast it, but casting is a red flag for a conversion that might fail, and in cases like this, it is perfectly safe.

The pessimists in the crowd will say that proper typing in the first place will solve this conundrum.  That is, figure out whether we need a 'victim' or a 'hooker', and use the appropriate type.  But where this might work in simple situations, this isn't possible in general.  Let's suppose our serial killer expands his system like so:

type Burial_Record is
   record
      Remains : Victim_Ptr;  -- "Body" is a reserved word in Ada
      Location : Lat_Long;
      Date_Interred : Date;
   end record;

procedure Bury_Body (Burial : Burial_Record);

type Hooker_Class_Ptr is access all Hooker'Class;
type Hooker_Array is array (Positive range <>) of Hooker_Class_Ptr;

procedure Violate_Bodies (x : Hooker_Array);

We want to track all our victims, presumably in some sort of set or container, so that we might disinter them later as needed.  Similarly, we might also want to do strange, awful things with the dead bodies of the hookers. 

But now we are in real trouble.  Because we want them nested inside records or arrays but the items are indefinite, we need to use pointers.  And in Ada95, that means named pointers.  But now we've painted ourselves into a corner, since either way we choose, we will have an unnecessary cast forced upon us.  If we allocate a dead hooker as a Hooker_Class_Ptr, we will be able to violate it, but not bury it.  Similarly, if we allocate it as a Victim_Ptr, we can bury it but not violate it (and rightly so; a classwide victim might be a vagrant, and only a twisted psychopath would violate a dead vagrant).

So then, why not just change the rules to say implicit conversion is okay?  Alas, such changes, after the fact, are not quite so simple.  The hiccup here is that incompatible types are what make overloading work.  For instance, suppose we have something like this in Ada95:

function F return Victim_Ptr;
function F return Hooker_Ptr;
p : Victim_Ptr := F;

Since these types are incompatible, the compiler knows that we really want to call the first version of F.  But if we suddenly make Hooker_Ptr implicitly convertible to Victim_Ptr, this code is now ambiguous because either call is valid, and backwards compatibility is broken.  So another way must be found.

If you remember the discussion of access parameters and access discriminants, the trick behind them was that we could substitute any access type of varying accessibility level for the actual value, and the lack of a name meant that we couldn't copy it.  The expectation was that all these different types would be necessary in order to abide by the accessibility rules; otherwise, you could just use named subtypes to achieve the same effect.

But looking at things in a more general sense, nothing says the accessibility levels have to be different.  We could have two (incompatible) integer pointer types, both declared at the library level (with identical accessibility), and an access parameter would work equally well with either.  Again, there wouldn't be much point, since we could just as easily (and perhaps more readably) define them as subtypes and use named values, since accessibility is not coming into play.
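To make that concrete, here is a minimal sketch of two incompatible integer pointer types feeding one access parameter.  The names (Int_Ptr_A, Int_Ptr_B, Show) are made up for illustration, and to keep the sketch a single compilation unit the types are declared side by side inside the main procedure rather than at the library level; what matters is only that the two types share the same accessibility level.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Access_Param_Demo is
   --  Two distinct named access types.  They are incompatible with each
   --  other, but they are declared at the same accessibility level.
   type Int_Ptr_A is access all Integer;
   type Int_Ptr_B is access all Integer;

   X : aliased Integer := 1;
   Y : aliased Integer := 2;

   A : constant Int_Ptr_A := X'Access;
   B : constant Int_Ptr_B := Y'Access;

   --  An Ada95-style access parameter: no named type is involved, so a
   --  value of either pointer type is implicitly converted on the way in.
   procedure Show (P : access Integer) is
   begin
      Put_Line (Integer'Image (P.all));
   end Show;
begin
   Show (A);  --  prints " 1"
   Show (B);  --  prints " 2"
end Access_Param_Demo;
```

Neither call needs a cast, even though assigning A to B directly would be rejected.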

And this is exactly the ability we are looking for.  For instance, some hypothetical function that, instead of taking a named Victim_Ptr, accepts an access parameter to a Victim'Class does almost exactly what we want.  We can pass in Hooker_Ptrs, Vagrant_Ptrs, Victim_Ptrs, and any other type we can come up with.  The downside is that we can't copy it, which is of course exactly what the Ada95 rules were set up to prevent.
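That hypothetical subprogram might look something like the sketch below.  The Kind function and the Describe procedure are inventions for the sake of the example (the original hierarchy only has Dismember), and everything is wrapped in a nested package so that it compiles as one unit.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Classwide_Access_Param is

   package Victims is
      type Victim is interface;
      --  Hypothetical primitive, added only so there is something to call.
      function Kind (This : Victim) return String is abstract;

      type Hooker is new Victim with null record;
      overriding function Kind (This : Hooker) return String;

      type Vagrant is new Victim with null record;
      overriding function Kind (This : Vagrant) return String;

      type Victim_Ptr  is access all Victim'Class;
      type Hooker_Ptr  is access all Hooker;
      type Vagrant_Ptr is access all Vagrant;
   end Victims;

   package body Victims is
      function Kind (This : Hooker) return String is
      begin
         return "hooker";
      end Kind;

      function Kind (This : Vagrant) return String is
      begin
         return "vagrant";
      end Kind;
   end Victims;

   use Victims;

   --  The access parameter: any pointer whose designated type is covered
   --  by Victim'Class is accepted without an explicit conversion.
   procedure Describe (Who : access Victim'Class) is
   begin
      Put_Line (Kind (Who.all));  --  dispatching call
   end Describe;

   H : aliased Hooker;
   V : aliased Vagrant;

   P_H : constant Hooker_Ptr  := H'Access;
   P_V : constant Vagrant_Ptr := V'Access;
   P_C : constant Victim_Ptr  := V'Access;
begin
   Describe (P_H);  --  prints "hooker"
   Describe (P_V);  --  prints "vagrant"
   Describe (P_C);  --  prints "vagrant"
end Classwide_Access_Param;
```

Three different pointer types, one subprogram, zero casts.  What Describe cannot do is stash Who away in a named pointer without an explicit conversion.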

And so in Ada05, the ability to use anonymous access values was added to record components, array elements, and stand-alone objects, but with one crucial difference: nothing is stopping you from copying them.  If you recall the Ada95 anonymous access types, the rules were carefully set up so that once you converted a named type to an anonymous type, you were barred from ever copying it (which was, after all, the point).  Ada05 anonymous access types, on the other hand, are designed to let you get the implicit conversion, but continue to copy them.

Let's revisit the example from above and see how AAT's help the problem.  Instead of using named pointers in our record and array, let's use anonymous types:

type Burial_Record is
   record
      Remains : access Victim'Class;  -- AAT ("Body" is a reserved word in Ada)
      Location : Lat_Long;
      Date_Interred : Date;
   end record;

procedure Bury_Body (Burial : Burial_Record);

type Hooker_Array is array (Positive range <>) of access Hooker'Class; -- AAT

procedure Violate_Bodies (x : Hooker_Array);

It's mostly the same, except we've removed the named types and replaced them with 'inline' declarations, similar to access parameters or discriminants.  But this changes everything!  If we allocate a Hooker_Ptr, we can put it into either the array or the structure, without having to cast or convert anything.  However, all engineering is a trade-off, and we still lose the ability to convert it back to its original type after making it anonymous (which means that within the two procedures, the values are only compatible with other AAT's).
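Here is a stripped-down sketch of that in action.  To keep it short and self-contained, Victim is a plain tagged null record rather than an interface, the record carries only the pointer, and the component is called Remains (an assumed name, since Body is a reserved word); none of that changes the point being made.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Anonymous_Component is
   --  Simplified stand-ins for the hierarchy discussed above.
   type Victim is tagged null record;
   type Hooker is new Victim with null record;

   type Hooker_Ptr is access all Hooker;

   type Burial_Record is
      record
         Remains : access Victim'Class;  --  anonymous access component
      end record;

   type Hooker_Array is array (Positive range <>) of access Hooker'Class;

   P : constant Hooker_Ptr := new Hooker;

   Grave : Burial_Record;
   Copy  : Burial_Record;
   Row   : Hooker_Array (1 .. 1);
begin
   --  The same named pointer goes into both containers, with no cast.
   Grave.Remains := P;
   Row (1)       := P;

   --  Unlike an Ada95 access parameter, the anonymous value can be copied.
   Copy.Remains := Grave.Remains;

   Put_Line (Boolean'Image (Copy.Remains.all in Hooker'Class));
   Put_Line (Boolean'Image (Row (1).all in Hooker'Class));
end Anonymous_Component;
```

Both membership tests succeed, since every one of these values designates the same allocated Hooker.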

But there is a key distinction between Ada 95 and Ada 2005.  In Ada95, the AAT rules were designed to allow you to work with shorter-lived objects, i.e. those where the accessibility level was deeper.  But the rules for Ada05 AAT's are not only different, they are practically the reverse: the accessibility level of the anonymous access type is defined to be the same as the level where it's declared.

This means, just like named types, the object has to live at least as long as the type, or else you will fail the accessibility check.  Presuming our types and procedures from above are at the library level, you will never be able to pass a local object to either, only other library level objects (or allocations).  But within those procedures, you can copy them, put them into sets, what have you.  This is in stark contrast to access discriminants and parameters, which allow local objects, but prevent copying.
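A small sketch of the accessibility rule at work, using made-up names (Cell, Box, Inner) and an integer pointer to keep things short.  The offending line is left commented out because, as far as I can tell, a conforming compiler rejects it at compile time; the rest compiles and runs as written.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Component_Level is
   --  The anonymous type of Ref takes its accessibility level from the
   --  declaration of Cell, i.e. the level of Component_Level itself.
   type Cell is
      record
         Ref : access Integer;
      end record;

   Long_Lived : aliased Integer := 1;
   Box        : Cell;

   procedure Inner is
      Short_Lived : aliased Integer := 2;
   begin
      --  Fine: Long_Lived lives at the same level as Cell.
      Box.Ref := Long_Lived'Access;

      --  Rejected if uncommented: Short_Lived is declared at a deeper
      --  level than Cell, so it could vanish while Box still points at it.
      --  Box.Ref := Short_Lived'Access;

      Put_Line (Integer'Image (Short_Lived));
   end Inner;
begin
   Inner;
   Put_Line (Integer'Image (Box.Ref.all));
end Component_Level;
```

Swap the component for an access parameter and the situation flips: the local object becomes acceptable, but the value can no longer be squirreled away in Box.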

The great irony in this is that whereas access parameters and discriminants are there to help you work with shorter-lived objects, access components and elements have rules to prevent you from using them with local objects.  To a layperson, an access discriminant and an access component look exactly alike, yet they are fundamentally different, and there for different purposes.  All this adds to the 'expert-friendliness' of the language, to both the expert's and layperson's disdain.

So before you blindly start using anonymous access types everywhere to save you the trouble of using a named type, ask yourself: do I really need one?  If you are not trying to resolve the conundrum of having to needlessly convert access values between their concrete type and their classwide parent's type, then the answer is probably no, and you probably need to be using a proper named type.

But at the same time, don't be one of those programmers who summarily decides that anonymous access types are evil.  Like everything in Ada, they are there to solve a problem; just because they don't solve your problem doesn't make them bad.  Properly used, there are situations where only an anonymous access type will do.  But like any other tool, using them for jobs they are not designed for is likely to break the tool and the project.  Except for stabbing hookers with screwdrivers: that works every time.
