Saturday, October 17, 2009

Five Things I Hate About C++

A few years ago, the "five things I hate about my favorite programming language" went around. I think it originated with Brian D. Foy's post on Perl. I like his reasoning: if you can't think of five things you don't like about it, you probably don't know enough about it to advocate for it.

Peter Siebel's recent post about the opinions of the folks he interviewed for Coders at Work made me remember it again. While I often reach for Python as the top tool in my toolbox these days, I've been writing C++ for most of my career, so I thought I'd take a crack at C++ first.


5. No consistent ABI

C++ doesn't define a standard application binary interface (a standard for how the binaries produced from source code are laid out or linked together). If you're writing code to link against a pre-built library, then unless you're using the same version of the same compiler, you can't guarantee that your code will work correctly. (Technically, C doesn't either, but for practical purposes, though, C is in much better shape, mostly because C++'s features provide far more opportunities for implementations to disagree.)

The practical result is often that C linkage is considered "safe" and C++ linkage is considered "unsafe", which means that C linkage is the lingua franca for object-level interoperability, and no one really pushes for compatible C++ linkage--which in turn means that it doesn't happen. (While my experience is mostly in Windows programming, the situation seems to be better in the g++ and GNU/Linux world--as is often the case.)

4. Sorta-kinda safety

The first benefit of C++ over C for me wasn't object-orientation. It was that C++ seemed to be much better at catching the kinds of low-level programmer errors I tended to make back then. Class member protection, type-checked function parameters, exceptions that (unlike return codes) can't accidentally be ignored, constructors and destructors that are guaranteed to be called at the right time, improved casting operations--what's not to like?

The problem is that most of the safety features aren't really safe, they're just a little safer, often due to the desire for source code compatibility with C or concerns about run-time performance. You can probably argue that C++ is safer than C, but I believe it's "just safer enough" that C++ programmers get complacent.

Plus the interaction language of features makes it much easier to commit horrible, higher-level design mistakes that are harder to see when reading the code, particularly with things like non-trivial constructors and destructors, misused (and overused!) inheritance, and non-obvious method overrides.

3. Textual macros

The LISP world has had the "hygienic vs unhygienic macros" argument for a long time. In a (grossly oversimplified) nutshell, hygienic macros allow you to define new, reusable bits of language without worrying about the context in which they'll be evaluated. This makes for safer macro definitions, but precludes some very useful techniques that unhygienic macros allow--for example, enabling the code in the expanded macro use and affect variables in the context in which it's expanded.

But the C/C++ macro implementation makes LISP's unhygienic macros look like an Intel cleanroom. That's because they're not even really part of the C language syntax: they're just a simple, dumb textual replacement done in a preprocessing step, before compilation even occurs.

This feature inherited from C is so error prone that C++ added features like "inline" and namespacing to try to approximate the most common use cases for C macros, so that we wouldn't have to deal with them. It still didn't want to touch preprocessor macros for fear of breaking backwards compatibility, though, so now we have the worst of both worlds: a dangerous feature implemented outside the language syntax, with some of its bits duplicated in the language syntax, and guidance that says "sorry about the mess--here's some partial replacements that don't quite cover the gamut, but that's all you get. Have a nice day."

2. Worst-of-both-worlds standardization

C was born as an in-house development language in an AT&T lab in the late 1960s, and was used in anger almost from day one (for reimplementing the UNIX operating system). By the time standardization started, the language feature set was fairly solid and well-proven, and implementators already had real-world knowledge of the features.

C++, on the other hand, didn't go through this process. While C was designed as a language for implementing operating systems (and applications), C++ was designed as a language for implementing language features. It wasn't used (as far as I know) as the backbone of a single, well-known system in the way C was, so the language was free to evolve more divergently and more slowly.

Worse, the development of the language seems to have been driven by the the design and evolution of the specification, rather than by things tried and lessons learned in implementation. In some cases, features were added to the language specification before they were even implemented, in the hopes that smart compiler vendors would figure something out.

As a result, we have features that don't work like you'd expect (like std::vector or auto_ptr<>), features that don't interact well (like templates and class inheritance), and even features that, well, just don't work (like export, which was in the standard speculatively for years before its first attempted implementation, and which as far as I know has never been fully and correctly implemented by anyone).

On the other hand, while C was standardized after it had mostly stabilized, the C++ standardization process started while the language was still very much in flux. As a result, the core language is full of weirdnesses that are explainable only when you know the political situation at the time.

For example, the construction "virtual void foo() = 0;" is a pretty weird way to spell "pure virtual". In The Design and Evolution of C++, Bjarne Stroustrup reveals that the "=0" construction is there because he wanted to get pure virtual functions into the language specification, but a committee meeting was coming up soon, and he didn't think he could convince enough people to get behind adding a new "pure" keyword.

This leads right into...

1. C++ tries to be all things for all people

I think this one is the root of most of C++'s problems. C++ is and has always been a "more-is-better" language. If you like C, we'll make sure you like C++ by bending over backwards to make C code still work (except when it doesn't) and by making efficiency our top, err, one of our top-ten priorities. If you like object oriented programming, we've added classes and inheritance. Oh, multiple inheritance? Yep, we heard that works well, so we'll add it in there too. Parametric polymorphism? Multiple dispatch? Currying? Oh, hrm, we seem to have painted ourselves into a corner... but we can bodge most of that in with templates and partial template specialization. Oh, and guess what? We just figured out that you can use templates to do metaprogramming, so you get that feature for free! Free is good, right?

This results in two, mostly-correct perceptions:

1) C++ is a big grab-bag of language features, some of which are razor-sharp and don't really hang together coherently, but work great so long as you're really, really careful.
2) C++ is more-or-less better than C, so long as you stay with a "sane subset" of its features.

But what is that sane subset? That depends entirely on who you talk to, and the subset that they choose usually reveals more about their own priorities and experience than anything about the language itself.

All that being said, I still choose C++ (or my own trusted subset of it, at least) over C because of the convenience of constructors and destructors, the expressiveness of templates, and the confidence I get from RAII. I still choose it over Java because I don't need to worry about a runtime VM, because I can access platform-specific APIs and native libraries at will, and no checked-exception silliness.

But that doesn't mean I don't sigh a little every time I burn multiple days chasing down an intermittent memory leak, or that I don't steal a surreptitious glance at younger, better-looking languages with less emotional baggage from time to time.

1 comment:

Kevin H said...

I was wondering if you'd mention D. While I haven't really dealt with it to any great degree, it looks like a huge step up from C/C++ to me.