When deciding on the best programming language, what you look for is not technical differences but essential differences: differences in programming paradigms, for instance. Now here’s a tech post for you. I assume you understand the technical terms involved, so I will only talk about why they are bad or good, without giving any definitions.
So far, every language I have come across absolutely sucks. (Oh well, OK, I don’t know Python yet.) Here is my perspective; most of the ideas are my own and may be incorrect.
There are mainly two aspects that determine the goodness of a language: the “local” aspects, and a “global” picture (a.k.a design aspects).
Of C++, and the problem of bloat
Let me start with C++: traditionally my language of choice. It is great for contest coding, and contrary to popular belief, it is actually the local aspects that make C++ so great. If you haven’t programmed a large scale application, then you might disagree with me and might tell me that C++ is great for its design aspects.
When writing a small function, I can use STL and templates to reduce the number of lines of code drastically, and write very neat elegant code. Yes, it’s brilliant.
But let’s look at the design aspect. Sure, object-oriented programming is a fantastic concept. But the way C++ does it is painful. Firstly, C++ is very bloated. (I am talking of the language itself, not the runtime. For example, I would say that C++ is more bloated than Java.) The additional bloat can confuse a programmer about how he (oh, or she) should design his application, and in the confusion he ends up using operator overloading to create supposedly neat modules.
Operator overloading is sometimes very helpful (for operators like =, +, - etc., on objects for which it makes sense). However, library designers sometimes like using fancy operators for routine functions, just because they can. ostream’s overloading of the << operator is one of the most conspicuous examples. I would definitely prefer a .put() method instead.
(Another argument against C++, which is not related to the language as such, is the difficulty of writing shared libraries in it. For example, it is next to impossible to export member functions in a way that C can understand. Of course, you can create C bindings: for every member function, create an exported global function. Hmm, that kinda defeats the purpose of using C++ in the first place. But I have definitely done that in the past when writing rediffbol-prpl. I don’t know how easy or hard it is to export C++ classes to, say, Python classes.)
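The "one exported global function per member function" approach can be sketched roughly as follows. This is only an illustration: the Counter class, the CounterHandle type and the counter_* function names are all hypothetical, and are not taken from rediffbol-prpl.

```cpp
#include <cassert>

// Hypothetical C++ class that we would like C callers to use.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    void add(int n) { value_ += n; }
    int value() const { return value_; }
private:
    int value_;
};

// C-compatible surface: an opaque handle plus one free function per member
// function. In a real library the declarations would live in a header that
// a C compiler can also read.
extern "C" {
    typedef struct CounterHandle CounterHandle;  // never defined; opaque to C

    CounterHandle* counter_new(int start) {
        return reinterpret_cast<CounterHandle*>(new Counter(start));
    }
    void counter_add(CounterHandle* h, int n) {
        assert(h != nullptr);
        reinterpret_cast<Counter*>(h)->add(n);
    }
    int counter_value(const CounterHandle* h) {
        assert(h != nullptr);
        return reinterpret_cast<const Counter*>(h)->value();
    }
    void counter_free(CounterHandle* h) {
        delete reinterpret_cast<Counter*>(h);
    }
}
```

A C caller only ever sees the handle and the four functions, which is exactly why the object-oriented structure gets lost at the library boundary.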
Of Java
Java, in terms of the language structure, is excellent. However, the over-reliance on garbage collection creates bloated executables. Oh wait, I should say bytecode. So, a small digression about bytecode. As a FOSS enthusiast, I find the concept of precompiled bytecode daunting. See, I usually never run precompiled code from anywhere unless I am cent percent sure of the source. Precompiled bytecode advocates a concept of code hiding: if you want to share the code, you might as well write portable code and compile it for each of the environments you are going to run it on. If you are talking of web applets, oh well, bytecode does a great job.
So we are back to talking of Java The Language. A big flaw in Java’s design, in my view, is the way every object has to be explicitly initialized with new, while assignment merely copies pointers.
ClassName a = new ClassName();
ClassName b;
b = a;
Here, b and a both point to the same object. This is also different from how basic datatypes like int or char work. It feels even nastier when creating an array of elements: you first create an array of “pointers” and then fill each element with the object of your wish.
Perhaps this is OK if you have trained yourself to see it this way, but I have not.
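I prefer the PHP 4 way of doing it, where assignment by default creates a new copy of the object (with a different syntax, =&, for actually creating references). Note that PHP 5 changed this: object variables now hold handles, so plain assignment behaves much like Java, and you need clone to get a copy.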
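The two assignment behaviours discussed above (copying a pointer versus copying the object) can be contrasted in C++, which lets you pick either one explicitly. This is just a sketch; Point, alias_demo and copy_demo are made-up names for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical value type used only for this illustration.
struct Point {
    int x = 0;
};

// Java-style assignment: b = a copies the pointer, so both names
// refer to the same object and a write through b is visible through a.
int alias_demo() {
    std::shared_ptr<Point> a = std::make_shared<Point>();
    std::shared_ptr<Point> b;
    b = a;
    b->x = 7;
    return a->x;  // 7: a and b alias one object
}

// Value-style assignment: b = a copies the object itself, so b is
// independent of a from then on.
int copy_demo() {
    Point a;
    Point b;
    b = a;
    b.x = 7;
    return a.x;  // 0: a is untouched
}
```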
Another issue I have with Java is that I don’t like the behavior of its generics. I can’t, for example, do List<int>; I have to do the more painful List<Integer>. There were some other issues with generics that I have come across in the past, but my memory fails me right now (garbage collection at work).
In any case, Java is a much better designed language than C++.
Of PHP and Strict Variables Declaration
Coming to PHP, a major design flaw is that variables need not be declared. Sure, it’s a scripting language, but I certainly don’t like my typing errors being translated into an all-new, uninitialized, hard-to-trace variable.
That, and the absence of block-level variable scopes. So,
if ( ... ) {
$a = "sfsdfs";
// do some work with $a
}
echo $a ;
will output “sfsdfs”. Right, this is directly related to needing a variable to be declared before use. So what we would like to write is,
if ( ... ) {
var $a = "sfsdfs";
// do some work with $a
}
echo $a ;
And this should ideally throw a compile error. Again, you can argue that this is not very practical in PHP given the level of reflection it provides (and for webapps, I’ve seen reflection being used quite effectively), but it can’t be a good language until this is fixed. In fact, it would be all the more interesting to enforce the variable type, like string $a, or int $a, or MyClass $a, instead of just var $a. Because in most cases when you write something like $a = … , you had better know what that right-hand side is going to evaluate to. Otherwise, you are going to find yourself debugging nasty Bohrbugs which could have been caught at compile time in a well-typed language. But in general, for a scripting language, PHP does a decent job.
Another issue (among lots) with PHP is the standard library that goes with it. It neither follows object-oriented conventions nor is completely procedural. So it is not too hard to find code which mixes both coding styles.
C, I told you!
What about C? Ah, now here is a classic language!
Contrary to whatever I had believed previously (you might have heard me arguing against the use of C in the past), C can produce very neat, extremely elegant code. Add GLib to it, and you’ve got some amazing power!
Sadly, with great power comes great responsibility. So now you have to take care of freeing memory, manually calling destructors, and so on and so forth. Although it is not as bad as it sounds if you already have a good design in place. For example, make sure all your functions are very small, and jump to the cleanup code at the end of the function with a small goto. Or always have a concept of who owns a malloc-ed piece of memory, and don’t try silly tricks to bypass that; don’t get tempted by efficiency.
Of function pointers, virtual functions and callbacks
Well written code in C is almost the easiest to understand. A small digression: I was tracking some C-code recently to see how it worked, walked down the code and eventually reached a line like this:
(*variable->type->function) (... );
These are the C equivalent of virtual functions. Virtual functions are painful: in this piece of code, for example, I have no clue where this is jumping to next unless I know what type is. But now I have to locate where type got set.
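To make the pattern concrete, here is a minimal sketch of such a hand-written dispatch table, written in the C-compatible subset of C++. The Shape and ShapeType structs and the area functions are invented for illustration and have nothing to do with the code I was actually reading.

```cpp
#include <cassert>

struct Shape;

// Hypothetical "type" record: a hand-written table of function pointers.
struct ShapeType {
    const char* name;
    int (*area)(const Shape* self);  // plays the role of a virtual function
};

struct Shape {
    const ShapeType* type;  // set once, when the object is created
    int w;
    int h;
};

static int rect_area(const Shape* s) { return s->w * s->h; }
static int tri_area(const Shape* s) { return s->w * s->h / 2; }

static const ShapeType RECT_TYPE = {"rect", rect_area};
static const ShapeType TRI_TYPE = {"triangle", tri_area};

// Same call shape as the line quoted above: where this jumps to depends
// entirely on what `type` was set to, which is not visible at the call site.
int shape_area(const Shape* variable) {
    assert(variable != nullptr && variable->type != nullptr);
    return (*variable->type->area)(variable);
}
```

Reading shape_area alone tells you nothing about which function runs; you have to find where RECT_TYPE or TRI_TYPE was assigned, which is the tracing pain described above.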
Sadly, virtual functions can’t be avoided. Function pointers are something that should be used carefully, and only in a very well structured design. With the exception of function pointers, it is much easier to understand the flow of a C program.
So then, on C++’s handling of virtual functions: though it might seem the correct thing to do, it can very often be a PITA. It becomes even more painful if you accidentally mistype a function name in the derived class (the virtual function then doesn’t get overridden, and the new function is never called through the base class). Java’s interfaces seem a more correct alternative to using derived classes with virtual functions. But IIRC, Java assumes all functions are virtual by default. PITA.
I have read about a few proposals to fix the mistyped virtual function definition problem that could probably make it into C++0x.
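For what it is worth, the fix that ended up in the C++11 standard is the override specifier. The sketch below shows both the silent failure and the checked version; the Handler classes and the dispatch function are hypothetical names for illustration.

```cpp
#include <cassert>
#include <string>

struct Handler {
    virtual ~Handler() = default;
    virtual std::string on_message() const { return "base"; }
};

// Typo in the name (on_mesage): this silently declares a brand-new function
// instead of overriding Handler::on_message, so calls made through the base
// class never reach it.
struct TypoHandler : Handler {
    std::string on_mesage() const { return "derived"; }
};

// With `override`, the compiler checks that a matching virtual function
// exists in a base class; the typo above would then be a compile error.
struct CheckedHandler : Handler {
    std::string on_message() const override { return "derived"; }
};

// Calls through the base class, as a framework would.
std::string dispatch(const Handler& h) { return h.on_message(); }
```

Here dispatch on a TypoHandler still runs the base version, while a CheckedHandler behaves as intended.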
Another thing that sets C apart from C++ is that most modern applications are event driven. So you write callbacks and callback handlers, you signal events, etc. The way I see it, callbacks in C++... umm, just do not fit the flow. They fit elegantly in C though. (I’ve seen C++ code registering functions as callbacks. Hmm, in my personal opinion, if callbacks are to be used in an object-oriented language, you should be registering an object, and not a function.)
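What I mean by registering an object rather than a function is roughly the following. This is only a sketch of the idea; EventListener, EventSource and Logger are made-up names, not an existing API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical listener interface: the thing you register is an object.
class EventListener {
public:
    virtual ~EventListener() = default;
    virtual void on_event(const std::string& name) = 0;
};

// Hypothetical event source that notifies every registered listener.
class EventSource {
public:
    void subscribe(EventListener* listener) {
        assert(listener != nullptr);
        listeners_.push_back(listener);
    }
    void emit(const std::string& name) {
        for (EventListener* l : listeners_) {
            l->on_event(name);
        }
    }
private:
    std::vector<EventListener*> listeners_;  // non-owning pointers
};

// A listener keeps its own state, so no extra "user data" pointer is needed.
class Logger : public EventListener {
public:
    void on_event(const std::string& name) override { seen.push_back(name); }
    std::vector<std::string> seen;
};
```

Note that the source holds non-owning pointers, so a listener must outlive the source (or unsubscribe first); that lifetime question is exactly the problem the memory handling section deals with.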
Of Memory Handling
So what is the best way of memory handling? I argued against Java’s garbage collector. So do I like manually deleting all objects in C?
As of right now, a garbage collector seems like the way to go. Manual cleanup is messy, and often not possible. Need an example? Let’s go back to callbacks. Suppose I set a network callback on an object, and in the meantime the object is destroyed. Now the callback is called with an invalid object. This would not have been the case if a garbage collector were doing the cleanup job.
Yet, you can argue that in this particular case I can use reference counting to get the job done. As soon as the reference count gets to zero, delete the object. It works in this case, but is flawed if used as a general strategy. (Consider three objects with links A->B, B->C, and C->B. Now delete the A->B link. Both B and C are now unreachable, but both still have non-zero reference counts.)
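The three-object example can be reproduced with C++ reference-counted pointers. This is a sketch using std::shared_ptr as a stand-in for any reference counting scheme; Node and cycle_leaks are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node with a single reference-counted outgoing link.
struct Node {
    std::shared_ptr<Node> next;
};

// Builds A->B, B->C, C->B, then deletes the A->B link.
// Returns true if B and C are still alive afterwards, i.e. they leaked.
bool cycle_leaks() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    auto c = std::make_shared<Node>();
    a->next = b;
    b->next = c;
    c->next = b;

    // Observers that do not contribute to the reference counts.
    std::weak_ptr<Node> watch_b = b;
    std::weak_ptr<Node> watch_c = c;

    b.reset();        // drop our local handles; only the links remain
    c.reset();
    a->next.reset();  // delete the A->B link

    // Nothing reachable refers to B or C any more, yet each still holds
    // a count of one from the other, so neither gets destroyed.
    bool leaked = !watch_b.expired() && !watch_c.expired();

    // Break the cycle by hand so that this demo itself does not leak.
    if (auto still_b = watch_b.lock()) {
        still_b->next.reset();
    }
    return leaked;
}
```

A tracing garbage collector would reclaim B and C here, because it works from reachability rather than from counts.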
Still, we see reference counting being used for its performance benefits.
(In fact, sometimes you cannot be sure of the number of references if the callback is being called from another library, in which you cannot do manual reference incrementing. This was a major issue I had with rediffbol-prpl.)
In the particular case of callbacks, there is another solution. As far as I can see (and I may be shortsighted), the only situations where you can “lose track” of your variables are in callbacks or in multithreaded applications. For callbacks, we need some test to see if the object is valid or not. Libpurple, for instance, does the following nasty check to see if a PurpleConnection is still valid:
#define PURPLE_CONNECTION_IS_VALID(gc) (g_list_find(purple_connections_get_all(), (gc)) != NULL)
Aha, so we store a global list of all the connections, and to test whether a given connection is valid, we just check whether it is in the list. Hmm, so what if, after the connection got destroyed, you did a malloc to create another connection, and this time the new connection got placed at exactly the same address as the first one? If a callback is called with the pointer to the older connection, then heck, it’s still “valid”, but it’s not the same connection.
A more correct solution would be to depend on a unique identifier for each connection object. So we would give each connection a connection ID. We keep a global list of all the valid connections as before (or, if there are very many of them, a map from IDs to connections), and this time, instead of passing a pointer to the connection, we pass around the unique ID. The callback can now correctly verify the validity of an ID and retrieve the object if required. Because IDs are never reused, a stale ID can never be mistaken for a live connection.
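A minimal sketch of the ID scheme might look like this. It is not libpurple or GLib code; Connection, ConnectionRegistry and its methods are hypothetical, and the sketch assumes a 64-bit counter that never wraps around in practice.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical connection type; only the registry logic matters here.
struct Connection {
    std::string account;
};

class ConnectionRegistry {
public:
    using Id = std::uint64_t;

    // Registers a connection and returns a fresh ID. IDs come from a
    // monotonically increasing counter and are never handed out twice.
    Id add(const std::string& account) {
        Id id = next_id_++;
        live_.emplace(id, Connection{account});
        return id;
    }

    // Forgets the connection; its ID becomes stale forever.
    void remove(Id id) { live_.erase(id); }

    // Returns the connection for a live ID, or nullptr for a stale one.
    Connection* find(Id id) {
        auto it = live_.find(id);
        return it == live_.end() ? nullptr : &it->second;
    }

private:
    Id next_id_ = 1;
    std::unordered_map<Id, Connection> live_;
};
```

A callback would carry the Id instead of a pointer and call find first; even if a new connection lands at the old address, the old ID still resolves to nothing.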
Only, it might have been useful to have such a scheme integrated into GLib. (Or is it already?)
And as for threads, you need reference counting. Each thread keeps exactly one reference. It should not keep reference counters for each sub-reference it makes.
I would guess that, with this, it is not really necessary to have the garbage collection overhead. I would certainly love to hear a counterpoint.
Of GObjects
Maybe it is time we looked at GObjects as the future solution. It is not neat and elegant (in terms of number of lines of code!), but as far as my current reading of the documentation goes, it does everything correctly. It uses reference counting as a memory management aid, and doesn’t rely completely on it. Also, in some sense, you manually write down the virtual table, so there is no scope for spelling typos like you can get in Java or C++. It is (supposedly) easy to create bindings for other languages from GObjectified code. Another beauty is signal handling, a discussion of which is out of the scope of this post.
And so I also took a look at Vala. So far, it definitely looks good, and I am keen on using it for some future application once it has stabilized enough. From what I see, it tries to give all the power of GObjectified code (in fact, the Vala compiler just converts Vala code into GObjectified C code) in a syntactically object-oriented language. Again, the language itself isn’t perfect, and has many of the shortcomings described above.
Of C++0x
Lots of nice additions, but seriously, it’s increasing the bloat. I particularly like the concept of “Concepts” (one of the more fundamental additions in the global design aspects) and lambda functions (from the local aspects).
(P.S. While I am stopping here for today, this can at best be considered a draft. I am certain there are a lot more rants that I am forgetting to jot down, but I will update this as and when I remember them.)





I keep coming back to C.
But… where is the CPAN for C? That is the real pity. There’s no reason it couldn’t exist. But it doesn’t.
It turns out that the CPAN for C is here, though it is as yet an infant:
http://ccan.ozlabs.org/