Tuesday, March 4, 2008

Where java stopped

Yesterday I explained my problems with garbage collection. I don't think garbage collection is bad or anything; I just think it isn't being used properly. GC was invented in the late 1950s for Lisp, one of the first high-level programming languages. Lisp's roots in lambda calculus required that memory be managed by the system. Having freed the programmer from memory management, Lisp and its brethren enabled the development of true high-level features such as dynamic typing, higher-order functions, closures, macros and continuations. These are exactly the features that give those languages their incredible power.
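To make the link between closures and memory management concrete, here is a small sketch in C++ (my choice for illustration; the name make_counter is made up). The closure's captured state has to outlive the function that created it. In Lisp the collector takes care of that silently; here I have to pick a lifetime scheme myself, and I assume reference counting via std::shared_ptr is acceptable.

```cpp
#include <functional>
#include <memory>

// Hypothetical helper: returns a closure that keeps counting after
// make_counter itself has returned.
std::function<int()> make_counter() {
    // The captured state lives on the heap and is reference counted,
    // because no stack frame will be around to hold it.
    auto count = std::make_shared<int>(0);
    return [count]() { return ++*count; };
}
```

Every copy of the returned closure shares the same counter, and the counter is freed when the last copy goes away. Capturing a local variable by reference instead would leave a dangling reference as soon as make_counter returned.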

That is what doesn't make sense in Java and its derivatives: those really powerful features are missing. Java mostly lacks the very features that absolutely require a GC. Guy Steele once said "We were after the C++ programmers. We managed to drag a lot of them about halfway to Lisp". Considering the conservatism of the industry, it's understandable they stopped halfway (the step from C to C++ was even smaller). That kind of makes Java a middle-level language: a watery compromise that fails to offer the best of both worlds.

Both high- and low-level languages have their niches, but what about the middle-level languages? My intuition tells me their proper niche should be much smaller. It's hard to say what Java's successor will be, but I'm pretty certain of two things: it will not be a Java derivative, and it will take the second half step, if not more.

Monday, March 3, 2008

Garbage collection revisited

Garbage collection must be one of the most misunderstood features of programming languages. GC has existed for 50 years, yet a lot of languages have not adopted it. One of the most commonly cited advantages of Java and similar languages over languages such as C++ is garbage collection. Having used a number of Java applications, though, I'd dare to say most have memory problems. They feel extremely bloated. I've seen a few programs improve drastically when some expert started to optimize the memory usage. Most Java programs can be made lean enough to be usable, but it takes a lot of effort. I imagine it is a major disillusionment for a lot of Java programmers to find out they haven't found a memory panacea after all.

One could see all variables and their resources as a directed graph. In the most elementary programs this graph will be a tree, and in such cases manual memory management is trivial. Real programs aren't this simple. Having said that, most programs are not random or unpredictable. Usually substantial pieces of the graph are trees in their own right. Acyclic graphs and even most cyclic ones can be handled using reference counting and similar techniques. If you understand the resource usage pattern of your program, you can usually manage it without resorting to garbage collection. However, this often requires planning and thinking ahead of time. A GC, on the other hand, is able to manage any graph, mostly without help from the programmer, but does not guarantee to do so efficiently. So the question "when do you need a GC?" reduces to "when can't I know the pattern?" To date, I've only come across one concrete and common example that really cannot be solved semi-manually: programming languages themselves. The reason for this is obvious: it is inherent to them that you don't know ahead of time how they will be used.
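The three cases above (a tree, a cycle handled naively, a cycle handled with a known pattern) can be sketched in C++. This is only an illustration under my own assumptions: the types TreeNode and GraphNode, the counter live_nodes and the three functions are all made up, and I use the standard smart pointers as the "reference counting and similar techniques".

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical bookkeeping: counts nodes that are currently alive.
static int live_nodes = 0;

// Tree-shaped ownership: every child has exactly one owner, its parent.
struct TreeNode {
    std::vector<std::unique_ptr<TreeNode>> children;
    TreeNode() { ++live_nodes; }
    ~TreeNode() { --live_nodes; }
};

// General graph: owning edges are reference counted, back edges can be weak.
struct GraphNode {
    std::shared_ptr<GraphNode> next;  // owning edge
    std::weak_ptr<GraphNode> back;    // non-owning edge, used to break cycles
    GraphNode() { ++live_nodes; }
    ~GraphNode() { --live_nodes; }
};

// Each function returns how many nodes were still alive after its scope ended.
int leaked_after_tree() {
    const int before = live_nodes;
    {
        auto root = std::make_unique<TreeNode>();
        root->children.push_back(std::make_unique<TreeNode>());
        root->children.push_back(std::make_unique<TreeNode>());
        assert(live_nodes == before + 3);
    }  // destroying the root destroys the whole tree
    return live_nodes - before;
}

int leaked_after_strong_cycle() {
    const int before = live_nodes;
    std::weak_ptr<GraphNode> observer;  // lets us reach the cycle afterwards
    {
        auto a = std::make_shared<GraphNode>();
        auto b = std::make_shared<GraphNode>();
        a->next = b;
        b->next = a;  // a and b now keep each other alive
        observer = a;
    }
    const int leaked = live_nodes - before;
    // Break the cycle by hand so the demonstration itself does not leak.
    if (auto a = observer.lock()) a->next.reset();
    return leaked;
}

int leaked_after_weak_cycle() {
    const int before = live_nodes;
    {
        auto a = std::make_shared<GraphNode>();
        auto b = std::make_shared<GraphNode>();
        a->next = b;  // a owns b
        b->back = a;  // b only observes a, so no ownership cycle exists
    }
    return live_nodes - before;
}
```

The tree and the weak-edge cycle leave nothing behind, while the naive cycle keeps both of its nodes alive until someone breaks it. The point is that the weak edge only works because I knew in advance which edge was the back edge; a GC would not need to be told.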

It is undisputed that GCs make it easier to write programs, but this issue makes me wonder whether they also make it easier to write good programs. Java programmers are happy to think they don't have to think about memory, but it turns out they will have to think ahead (though less than C or C++ programmers) if they want decent performance. On the server side that doesn't matter that much, but in client-side programs it does.

In the end, it is a matter of trade-offs. Both approaches can solve most problems. I don't believe a GC by itself gives us better programs than RAII-style management. Having said that, garbage collection makes business sense in a LOT of situations. Companies don't earn money by building the best program they could; they earn money by selling a finished program, good or not.