Red Meat, Gin and Ada: December 2013

Saturday, December 14, 2013

Ada 2005 access types, part I

If you've read the previous series of articles, then you are up to date with how Ada95 deals with so-called "general" access types. You've got your accessibility checks that fence a local access value inside its type's scope, and you've got your anonymous access values that let the access value out on a leash. Inside the fence, you can run free and do whatever you want, except you can't get out. If you want to go out, you can only do so on a leash, in which case you are extremely restricted in your movement. Put together, you've got a nice neat "either/or" system that ensures your pointers never get lost or hit by a car.

Fast forward a decade to Ada 2005. One of the major updates in the language, warranting an entire chapter in the rationale, was the expansion of the anonymous access type. Unfortunately, history has not exactly been kind to this 'feature', and many (most?) programmers treat them with disdain.

But the Ada05 AAT's suffer much the same fate as the Ada95 AAT's: poor documentation and worse understanding. Just like Ada95, Ada05 anonymous access types are special-purpose features for special-purpose jobs, and yet they are not described this way. Ada05 actually suffers worse, since there are no fewer than three new special kinds of access values, all serving different purposes, all of which are different from the Ada95 versions, and all of which are lumped together as a single topic. Mix in the fact that they all have identical syntax, and you've got a lot of confused programmers.

And that really is the downfall of Ada's AAT's: they all have the same syntax, but all serve completely orthogonal purposes. And while this might be elegant and simple from a language design standpoint, having the same thing mean different things in different contexts is a recipe for disaster. But I digress.

Not counting the "null exclusions", which are straightforward and won't be discussed, Ada05 introduced three new kinds of anonymous access types. Note that these are the names that I've given them, and not necessarily what the ARG says about them:

The anonymous access component.
The anonymous function return.
The anonymous access to subprogram parameter.

Again, to the layperson, all three seem identical to both each other and the Ada95 access parameter and access discriminant. Each involves using an "in line" access type declaration instead of a named access type. But don't be fooled: each one is there for a specific reason, each of which will be discussed in upcoming posts.

First up: the anonymous access component. The brochure version of this feature is that now you can use anonymous access types in record components, array elements, and stand-alone objects. But why? To understand why this feature was added (and when, exactly, you should consider using it), like the cast-able access parameters before them, we have to once again talk about object oriented programming.

For those who deal in the domain of OOP, a fundamental tenet is that child types can be freely substituted for parent types. This is implemented in Ada via the 'classwide' type, which represents not only the type in question, but any of the types that derive from it. For example, if C extends P, then any code that requires a P'Class can freely use a C or a P, instead of just a P. This is the quintessential "is-a" relationship.

This setup works well, except for one issue: the same cannot be said of pointers. For example, let's suppose a serial killer with a penchant for software engineering is interested in modeling his victims. This is neatly decomposed (ha!) into a class hierarchy:

type Victim is interface;
type Victim_Ptr is access all Victim'Class;
procedure Dismember (This : Victim) is abstract;  -- an interface's primitives must be abstract or null

type Hooker is new Victim with (...);
type Hooker_Ptr is access all Hooker;

type Vagrant is new Victim with (...);
type Vagrant_Ptr is access all Vagrant;

And, as any good classwide programmer knows, we can freely use Vagrants or Hookers as a parameter or object of Victim'Class. Bully abstraction!

But the type conversion rules for pointers are not so easily duped; there are no "classwide pointers" (though perhaps life would be different if there were!). Consider the following code:

Dead_Body : aliased Hooker := ...;
p1 : Hooker_Ptr := Dead_Body'Access;
p2 : Victim_Ptr := Dead_Body'Access;
p3 : Victim_Ptr := p1;
p4 : Victim_Ptr := p2;
p5 : Victim_Ptr := Victim_Ptr(p1);

Quiz time: which of these statements are legal, and which aren't? Anyone who is well-versed in the Ada95 access type conversion rules (<cricket, cricket>) will be able to identify that, oddly, it is p3 that will fail the conversion. This is anathema to an OOP programmer, since quite obviously they all point to the same damn object.

Note that both p1 and p2 are initialized via the access attribute, which is legal since the compiler knows that our Dead_Body is both a Hooker and a Victim'Class. p4 is legal, since both types are Victim_Ptrs, but p3 fails since, technically speaking, the types don't match, which the rules require. We can always force the issue, as in p5, but people generally agree that casting is bad.
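For anyone who wants to feed the quiz to a compiler, here is a minimal sketch of the declarations above as a single compilable unit. The procedure name Conversion_Demo and the "null record" extension are my own assumptions standing in for the elided parts, and the illegal p3 line is left commented out so the rest compiles.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Conversion_Demo is
   type Victim is interface;
   type Victim_Ptr is access all Victim'Class;

   --  "null record" is a placeholder for the components elided in the post.
   type Hooker is new Victim with null record;
   type Hooker_Ptr is access all Hooker;

   Dead_Body : aliased Hooker;

   p1 : Hooker_Ptr := Dead_Body'Access;   --  legal: Dead_Body is a Hooker
   p2 : Victim_Ptr := Dead_Body'Access;   --  legal: Victim'Class covers Hooker
   --  p3 : Victim_Ptr := p1;             --  illegal: Hooker_Ptr is not Victim_Ptr
   p4 : Victim_Ptr := p2;                 --  legal: same named type
   p5 : Victim_Ptr := Victim_Ptr (p1);    --  legal: explicit conversion
begin
   --  All three classwide pointers designate the same object.
   Put_Line (Boolean'Image (p2 = p4 and p4 = p5));
end Conversion_Demo;
```

Uncommenting the p3 line should make the compiler reject the unit with a type mismatch, which is the whole complaint.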

This is unfortunate, because using a "pointer to a Hooker" as a "pointer to a Victim'Class" is no more unsafe or illegal than using a "Hooker" as a "Victim'Class", which is the underpinning of all OOP. Sure, we can cast it, but casting is a red flag for a conversion that might fail, and in cases like this, it is perfectly safe.

The pessimists in the crowd will say that proper typing in the first place will solve this conundrum. That is, figure out whether we need a 'victim' or a 'hooker', and use the appropriate type. But while this might work in simple situations, it isn't possible in general. Let's suppose our serial killer expands his system like so:

type Burial_Record is
   record
      The_Body : Victim_Ptr;  -- "Body" itself is a reserved word in Ada
      Location : Lat_Long;
      Date_Interred : Date;
   end record;

procedure Bury_Body (The_Body : Burial_Record);

type Hooker_Class_Ptr is access all Hooker'Class;
type Hooker_Array is array (Positive range <>) of Hooker_Class_Ptr;

procedure Violate_Bodies (x : Hooker_Array);

We want to track all our victims, presumably in some sort of set or container, so that we might disinter them later as needed. Similarly, we might also want to do strange, awful things with the dead bodies of the hookers.

But now we are in real trouble. Because we want them nested inside records or arrays but the items are indefinite, we need to use pointers. And in Ada95, that means named pointers. But now we've painted ourselves into a corner, since either way we choose, we will have an unnecessary cast forced upon us. If we allocate a dead hooker as a Hooker_Class_Ptr, we will be able to violate it, but not bury it. Similarly, if we allocate it as a Victim_Ptr, we can bury it but not violate it (and rightly so; a classwide victim might be a vagrant, and only a twisted psychopath would violate a dead vagrant).

So then, why not just change the rules to say implicit conversion is okay? Alas, such changes, after the fact, are not quite so simple. The hiccup here is that incompatible types are what make overloading work. For instance, suppose we have something like this in Ada95:

function F return Victim_Ptr;
function F return Hooker_Ptr;
p : Victim_Ptr := F;

Since these types are incompatible, the compiler knows that we really want to call the first version of F. But if we suddenly make Hookers implicitly convertible to classwide Victims, this code is now ambiguous because either call is valid, and backwards compatibility is broken. So another way must be found.
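To see the overload resolution in action, here is a small sketch built around the two F functions. Everything beyond the two function profiles (the procedure name Overload_Demo, the "null record" extension, the bodies that print and return null) is assumed purely for illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Overload_Demo is
   type Victim is interface;
   type Victim_Ptr is access all Victim'Class;

   --  "null record" is a placeholder for the elided components.
   type Hooker is new Victim with null record;
   type Hooker_Ptr is access all Hooker;

   --  Two functions that differ only in their result type.
   function F return Victim_Ptr is
   begin
      Put_Line ("F returning Victim_Ptr");
      return null;
   end F;

   function F return Hooker_Ptr is
   begin
      Put_Line ("F returning Hooker_Ptr");
      return null;
   end F;

   p : Victim_Ptr;
begin
   --  Only the first F has a result type matching p, so the call is
   --  unambiguous precisely because the two named types are incompatible.
   p := F;
end Overload_Demo;
```

If Hooker_Ptr converted implicitly to Victim_Ptr, both functions would be candidates for that call and the compiler would have no way to pick one.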

If you remember the discussion of access parameters and access discriminants, the trick behind them was that we could substitute any access type of varying accessibility level for the actual value, and the lack of a name meant that we couldn't copy it. The expectation was that all these different types would be necessary in order to abide by the accessibility rules; otherwise, you could just use named subtypes to achieve the same effect.

But looking at things in a more general sense, nothing says the accessibility levels have to be different. We could have two (incompatible) integer pointer types, both declared at the library level (with identical accessibility), and an access parameter would work equally well with either. Again, there wouldn't be much point, since we could just as easily (and perhaps more readably) define them as subtypes and use named values, since accessibility is not coming into play.

And this is exactly the ability we are looking for. For instance, some hypothetical function that, instead of taking a named Victim_Ptr, accepts an access parameter to a Victim'Class does almost exactly what we want. We can pass in Hooker_Ptrs, Vagrant_Ptrs, Victim_Ptrs, and any other type we can come up with. The downside is that we can't copy it, which is of course what the whole system was set up to do.
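A rough sketch of that hypothetical function follows. Inspect is a name I made up, as are the procedure name Access_Param_Demo and the "null record" extensions; the only point is that one access parameter accepts all three named pointer types without a conversion.

```ada
procedure Access_Param_Demo is
   type Victim is interface;
   type Victim_Ptr is access all Victim'Class;

   --  "null record" is a placeholder for the elided components.
   type Hooker is new Victim with null record;
   type Hooker_Ptr is access all Hooker;

   type Vagrant is new Victim with null record;
   type Vagrant_Ptr is access all Vagrant;

   --  Hypothetical subprogram taking an access parameter (an Ada95-style AAT).
   procedure Inspect (Who : access Victim'Class) is null;

   H : constant Hooker_Ptr  := new Hooker;
   V : constant Vagrant_Ptr := new Vagrant;
   C : constant Victim_Ptr  := new Hooker;
begin
   --  Three incompatible named types, one parameter, no casts.
   Inspect (H);
   Inspect (V);
   Inspect (C);
end Access_Param_Demo;
```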

And so in Ada05, the ability to use anonymous access values was added to record components, array elements, and stand-alone objects, but with one crucial difference: nothing is stopping you from copying them. If you recall the Ada95 anonymous access types, the rules were carefully set up so that once you converted a named type to an anonymous type, you were barred from ever copying them (which was, after all, the point). Ada05 anonymous access types, on the other hand, are designed to let you get the implicit conversion, but continue to copy them.

Let's revisit the example from above and see how AAT's help the problem. Instead of using named pointers in our record and array, let's use anonymous types:

type Burial_Record is
   record
      The_Body : access Victim'Class; -- AAT ("Body" itself is a reserved word)
      Location : Lat_Long;
      Date_Interred : Date;
   end record;

procedure Bury_Body (The_Body : Burial_Record);

type Hooker_Array is array (Positive range <>) of access Hooker'Class; -- AAT

procedure Violate_Bodies (x : Hooker_Array);

It's mostly the same, except we've removed the named types and replaced them with 'inline' declarations, similar to access parameters or discriminants. But this changes everything! If we allocate a Hooker_Ptr, we can put it into either the array or the structure, without having to cast or convert anything. However, all engineering is a trade-off, and we still lose the ability to convert it back to its original type after making it anonymous (which means that within the two procedures, the values are only compatible with other AAT's).
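Here is a cut-down sketch of the idea, assuming a trimmed Burial_Record with a single component (named Remains here, since I am dropping the other fields) and a placeholder "null record" extension. The procedure name Component_Demo and the object names are likewise made up.

```ada
procedure Component_Demo is
   type Victim is interface;

   --  "null record" is a placeholder for the elided components.
   type Hooker is new Victim with null record;
   type Hooker_Ptr is access all Hooker;

   --  Trimmed version of the record: only the anonymous access component.
   type Burial_Record is
      record
         Remains : access Victim'Class;   --  AAT component
      end record;

   type Hooker_Array is
     array (Positive range <>) of access Hooker'Class;   --  AAT elements

   H : constant Hooker_Ptr := new Hooker;

   --  The same named pointer goes into both without any conversion.
   Grave : Burial_Record          := (Remains => H);
   Pile  : Hooker_Array (1 .. 1)  := (1 => H);
   Copy  : Burial_Record;
begin
   --  Unlike an access parameter, the anonymous value can be copied around.
   Copy := Grave;
   Pile (1) := H;
end Component_Demo;
```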

But there is a key distinction between Ada 95 and Ada 2005. In Ada95, the AAT rules were designed to allow you to work with shorter-lived objects, i.e. those where the accessibility level was deeper. But the rules for Ada05 AAT's are not only different, they're practically the reverse: the accessibility level of an anonymous access component or element is defined to be the same level as where its enclosing type is declared.

This means, just like named types, the object has to live at least as long as the type, or else you will fail the accessibility check. Presuming our types and functions from above are at the library level, you will never be able to pass a local object to either, only other library level objects (or allocations). But within those functions, you can copy them, put them into sets, what have you. This is in stark contrast to access discriminants and parameters, which allow local objects, but prevent copying.
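The contrast can be sketched in one unit. To keep it self-contained I am assuming a nested block in place of the "library level versus local" split: the types live in the outer procedure and the short-lived object lives in an inner block, one level deeper. Inspect, Lifetime_Demo, Remains and the object names are all invented for the example.

```ada
procedure Lifetime_Demo is
   type Victim is interface;

   --  "null record" is a placeholder for the elided components.
   type Hooker is new Victim with null record;

   type Burial_Record is
      record
         Remains : access Victim'Class;   --  AAT component
      end record;

   --  Hypothetical subprogram with an access parameter.
   procedure Inspect (Who : access Victim'Class) is null;

   Long_Lived : aliased Hooker;

   --  Fine: Long_Lived is declared at the same level as Burial_Record.
   Grave : Burial_Record := (Remains => Long_Lived'Access);
begin
   declare
      Short_Lived : aliased Hooker;   --  one level deeper than the types
   begin
      --  Legal: an access parameter may designate a shorter-lived object.
      Inspect (Short_Lived'Access);

      --  Illegal if uncommented: the component's accessibility level is
      --  that of Burial_Record, and Short_Lived is statically deeper.
      --  Grave.Remains := Short_Lived'Access;

      --  Copying the component value around is unrestricted.
      Grave.Remains := Long_Lived'Access;
   end;
end Lifetime_Demo;
```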

The great irony in this is that whereas access parameters and discriminants are there to help you work with shorter-lived objects, access components and elements have rules to prevent you from using them with local types. To a layperson, an access discriminant and access component look exactly alike, yet are fundamentally different, and there for different purposes. All this adds to the 'expert-friendliness' of the language, to both the expert's and layperson's disdain.

So before you blindly start using anonymous access types everywhere to save you the trouble of using a named type, ask yourself: do I really need one? If you are not trying to resolve the conundrum of having to needlessly convert access values between their concrete types and their classwide parent's type, then the answer is probably no, and you probably need to be using a proper named type.

But at the same time, don't be one of those programmers who summarily decides that anonymous access types are evil. Like everything in Ada, they are there to solve a problem; just because they don't solve your problem doesn't make them bad. Properly used, there are situations where only an anonymous access type will do. But like any other tool, using them for jobs they are not designed for is likely to break the tool and the project. Except for stabbing hookers with screwdrivers: that works every time.

Sunday, December 1, 2013

License to Program

First, the disclaimer: I am not a lawyer, nor am I giving legal advice. Consult an actual lawyer before distributing your program. Secondly, all this is about life in these United States, so international visitors should refer to their own laws and customs.

Software is a funny thing, in that you almost never actually buy it. Sure, you might go to Best Buy and plunk down some of your hard-earned cash for a disc full of code (or, if you are a hip cool kid, download it from Steam), but ironically, exchanging money for goods doesn't necessarily mean you are buying anything.

Most of the software in the world today is licensed. Some of it comes with expensive, restrictive licenses, and some of it comes with inexpensive, permissive licenses. More to the point, almost all of these licenses are different, and create fun and exciting conundrums when you try to mix and match them. The bottom line is that almost nobody has any idea what they can or can't do with a piece of software, and programmers have it the worst.

So to clear all this up, we have to fire up the wayback machine to 1787, when the U.S. Constitution gave Congress the power "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries", the clause that underpins what we know today as copyright. Informally, if you create something new and original, the government says that you get exclusive rights to decide who gets a copy of it and when.

But one of the things people don't often realize is that copyright gives just as much protection to the person receiving the copy as it does to the author himself. It's not just that the author gets to decide who does and doesn't receive a copy; after someone does get a copy, what the receiver does with that copy is his business.

Let's say I write a book. The law says nobody can copy that book except me, so assuming I want to make some money, I make five copies of it. This is legal, since I own the copyright. Now, I sell (in a legal sense) one of those copies to you. You are of course prohibited from making copies of it, which is the whole point, but there is a whole world of possibilities of non-copy things you can do, and I as the author can't stop you. You can cross out every word containing a 'y', you can use it to prop up a wobbly table leg, you can rip out every page and sew a dress from them, you can give it to your mother for a Christmas gift, you can whack your mouthy wife over the back of the head with it, or you can burn it in the center of town to express your outrage over my egregious use of four letter words. Just so long as you don't make a copy, well, that book is yours to do with as you please. I, as the author, can't step in and stop you.

And this is how copyright worked for a few hundred years. Admittedly though, a lot of the copyright enforcement took place simply because, well, things were just inherently difficult to copy. Copying a book in the 1700's involved copying each word with a quill pen, a task so arduous that it was easier to just buy the book yourself. You could always go to the trouble of buying a printing press, but the investment and typesetting costs were so high that you would have to start selling your bootleg copies on a large scale to recoup your expenses. And once you start your high-profile book bootlegging operation, it's a lot easier for The Man to hunt you down and throw your ass in jail. And so copyright naturally just enforced itself, whether it was books, or records, or paintings.

And then one word changed everything: tape. In the late 1970's and early 80's, the cassette tape market began to take shape, and suddenly the copyright landscape became very murky. Remember how once you buy a (legal) copy of something, that copy was yours to do with what you please? That includes me buying a book, and then lending it to you to read, and then you returning it to me when you're done. After all, no copy has been made, and so no crime has been committed. By extension, I am also allowed to lend you my copy for a fee, a process we all know as renting. Renting was perfectly legal under the copyright law, since when you have my copy, I don't, and once you give it back, you don't have it anymore. Nice and neat.

But this simple idea became a much scarier idea for the music industry once tape was introduced. Now, I could legally purchase an album, and then legally lend it to you for a small fee; you could illegally make a copy of it simply by pressing a button and return it, leaving me free to legally rent it out to the next customer. The cops can't hassle me, because I haven't broken any laws, and the cops can't hassle you, because they have no reason to suspect you so long as you stay in the privacy of your home, don't gloat about your crimes all around town, and don't start selling large quantities of bootlegs. I make a bunch of money, you save a bunch of money, and the music industry takes it on the chin. There were, in fact, actual businesses that would "rent" you a cassette album, a dual-deck tape player, and a blank tape all at the same time (with, of course, a wink and nudge), and under the copyright law, everything was above board.

Obviously, music industry fat-cats were not going to stand for this, and so Congress passed the Record Rental Amendment of 1984. In essence, this simply changed the copyright law to prohibit outright the renting of sound recordings. You could still sell them, you could still give them away, and you could still lend them to friends for free, but you couldn't charge money for it. And this worked well enough, since the aforementioned "music rental stores" became illegal, and the fat-cat lawyers could sue them into oblivion.

Of course, 1984 was right around the same time a brand new, copyrightable medium was about to explode on the world. And this type of work was not just easy to copy; it was required to be copied to be used. After all, I have to copy those five floppy disks of data to my hard drive in order to play DOOM, which is technically copyright infringement. But in classic congressional lack-of-any-forward-thinking, this newfangled "software" was completely ignored, and all the same problems that cassettes had were going to be repeated for software.

However, software lawyers were certainly well aware of the trials and tribulations of the music industry lawyers, and knew that without some sort of protective measure, nothing was going to stop people from renting out software (or even just lending it for free) without ensuring it was actually uninstalled. After all, once DOOM is installed on my hard drive, the disks just sit in the box, and it's much more fun to lend them to a friend so we can play over the network. So to avoid the same problem of "software rental stores" lending out the same legally purchased copy of a program over and over, software lawyers resorted to the one thing that trumps copyright law: contract law.

Suppose, like before, that I write a book. But now suppose I don't just want to sell five copies, I want to sell five million copies. This means a printing press, operators to run the printing press, entire forests worth of paper, 55-gallon drums of ink, expensive typesetting, book binding, covers, boxes to put them in, trucks to ship the boxes, drivers to run the trucks, and even after all that, I still have to find people and stores to buy the damn things. This is an expensive proposition, especially when all I really want to do is write books and cash the checks.

So most industries have a sort of middle-man known as a publisher or distributor. I do the creative work of writing the book, and they do the heavy lifting of printing them, moving them, and selling them. But right off the bat, we see that this setup is technically illegal. As the author, I have exclusive right to make copies of that book, which means that the publisher can't, even if I want to let him.

Obviously, this is not the intent. I want to be able to give the publisher conditional permission to make copies, so long as certain conditions are met. For instance, I let them make as many copies as they want with the condition that they give me half the money they make from selling them, and they agree to that so long as I promise to write them three more books in the next five years, or what-have-you. Here, the publisher and I enter into a contract that we both agree to, and so long as the contract is above-board and negotiated in good faith, I can "give up" my copyright protection to allow someone else to make copies. The key is that I haven't sold them a copy of my book, I have conditionally allowed them to use the original work. Other people still can't make copies of it, and the publisher can make copies only so long as they honor the original contract.

In any case, some software lawyer who I'm now sure lives in a much larger house than I, had a wonderful, awful idea: let's use the same loophole between the publisher and the end user! Now, instead of the consumer buying a copy of the software, they are simply conditionally allowed to use the software. Now, all bets are off: any "condition" to which the user agrees is nice and legal, no matter what the copyright laws say. After all, if they don't like the conditions, well, they don't have to buy the software. And so begat the nefarious end-user license agreement.

And for a while, this was a mostly benign way to prevent legalized software piracy. Most EULA's basically said nothing more than "you can use this software however you want, except rent it out or give copies away". If you decided to start a software rental shop, you were breaking the "contract" that you implicitly signed by ripping open the package, and the lawyers had legal standing to sue your ass. But as should have been expected, things quickly spiraled way out of control. Lawyers realized that nobody was actually even reading the damn things, let alone weighing whether or not the license was too restrictive, and bit by bit, these licenses began to strip away more and more of the users' rights. Most users, of course, didn't care.

But at the same time copyright wasn't protecting programmers wishing to make money, it was also not protecting programmers who had no desire to profit at all. Suppose for a moment that I write a neat little simulation library for some academic paper I am writing, and I want to share it with all my other scientist buddies to verify my results and spur future work. Copyright gets right in the way, because while I as the author can give copies away (for free) to whomever I want, those on the receiving end can't distribute it further, despite me being alright with it.

My only option in this case is to release my code in the public domain, which means I give up all claims to ownership in any way. This serves my purpose, except not everyone is as honorable as my scientist buddies; once in the public domain, nothing is stopping evil software corporation X from getting hold of a copy, and then turning around and selling it as part of their project (under a restrictive license, no less).

So like fighting fire with fire, people began to use the same EULA trick as the evil software corporations, but with a twist: instead of restricting your rights, it actually broadened them. You could make as many copies as you wanted, with the condition that you distribute them with the same license. The recursive nature of this style of license, in essence, ensured that your free software stayed free software.

And this is pretty much where we are today. Software you "buy" comes with a longwinded EULA that is normally chock-full of fun surprises, like only being able to legally use your OEM copy of Windows on the computer it came with, not being able to do performance tests and publish the results, promising the company your first born son, and so on. On the other hand, "free" software licenses let you use the software however you want, with the condition that your redistributions remain free, even if part (especially if part) of a larger software program.

What's confusing about both types of licenses is that your rights are a quality of the license, and not inherent to the software itself. Suppose I write some software, and I license it to both Joe and Bob. In the license Joe and I agree to, I charge him $100 and make him promise not to give it to anyone else, but in the license for Bob I charge him nothing and let him distribute as many copies as he pleases. Now imagine Steve comes along; it's illegal for Joe to give him a copy, and I can sue Joe into next week if he tries. Bob, however, can give him a copy for free, and yet it's the same damn software. Just because person 'A' got it for free doesn't mean he can give it to you for free. Moreover, just because person 'A' paid a high price for the software doesn't necessarily mean he is restricted from distributing it to a friend. You can do anything you want as long as the license says it's okay.

All this is fine for a stand-alone program, but things start to get incredibly hazy when you start talking about software components. Remember, software is normally built out of other software, which means an inherent part of purchasing it is to make copies and sell it. This is, of course, exactly what copyright is trying to prevent.

Let's say I write a library of functions that can search strings in constant time, and I want to sell it and make a shitzillion dollars. On the one hand, I obviously want copyright to apply, since I don't want customers making bootleg copies and selling them to friends. But on the other hand, the idea is that the customer will use that code within the context of a larger program, and sell it to friends. Copying and selling my work is both prohibited and required at the same time!

Now the license needs some sort of complex language like "It's okay to copy this code and distribute it so long as it's in binary form and within some other program in binary form, but not otherwise". Unless, of course, we are using dynamic linking, which only makes matters worse. The programmer I sell my DLL to has to be free to distribute it as part of his program (or else it won't work), but just by being present on the customer's system means it's there for other programs to use. So now, someone can purchase the software that comes with my DLL, and write software that uses it without paying me! He can legally sell his part of the code that uses my DLL, but not the DLL itself, and just hope the other user has my DLL installed from something else.

So just use a free license, right? Well, not so fast. If using freely-licensed software means you can't sell it, all the billion dollar companies who are actually trying to make money will probably not use your library. And programmers, being creatures of habit, will stick to the same tools they are familiar with, and your free software will just live forever in the niche underworld. So now you create two licenses: you can freely use libraries so long as you don't sell the end product, or you can pay (usually through the nose) for a license that does allow you to sell your work. Same library, two different sets of rules!

Perhaps the most interesting part about EULA's, free or restrictive, is that nobody is really quite sure if they are even legal. People have the tendency to assume that contracts are enforced like an episode of Law & Order, where things are very black and white, the special guest star is always guilty, and the dad from Dirty Dancing always has a smart quip. In reality, there are no cops, no juries, and for that matter not even really any laws. The judge decides what's right and wrong in a totally subjective way, and most of that is based on which lawyer has the nicer tie. Even worse, for every court case where a judge has upheld the verdict you want, there is another court case somewhere else where another judge has struck it down. Moreover, just because the company doesn't feel like paying the lawyer with the nice tie $1000 an hour to sue the 30-year-old pirating software in his parents' basement, doesn't make it any more or less legal.

Somewhat ironically, the whole licensing point is now moot, because way back in 1990, Congress passed the Computer Software Rental Amendments Act of 1990, which did for software what the 1984 amendment did for cassettes: you can no longer rent software (except for cartridge-style video games). So the whole idea of finding a loophole to prevent people from renting software is now antiquated, and frankly so is the idea of a loophole to combat the loophole.

So the point of all this is that you better read your license, and more importantly make sure the vendor you buy from reads the licenses too. You will be surprised by what it says, and there's a pretty good chance you're breaking the law. This, in turn, leads to all sorts of interesting mind-blowing legal conundrums, where free code under a proprietary license (i.e. example code distributed with an SDK), has been erroneously included in a public domain library, that was legally included in a GPL library, that has been erroneously included in a proprietary library, that was sold to someone to use in a proprietary program! That's a true story, with an unhappy ending.

Or, on the other hand, if you are not going to read your license, make damn sure your lawyer has the nicest tie in the courtroom.