Monday, December 07, 2009

Five Pycon 2010 Talks I Need to See

Following the example of Catherine Devlin and Carl Trachte, I thought I'd put together a list of the five Pycon talks I need to see in 2010. But I couldn't--I struggled to get below a dozen. So here are the top five I need to see, plus the ones I'll probably kick myself for not seeing because they're undoubtedly going to be scheduled in the same slots as the top five:

1. Import this, that, and the other thing: custom importers (Brett Cannon)
This is an easy choice, because I'm about to be implementing one of these for work. Would have been nicer if Pycon 2010 had been scheduled for September 2009, but I'll take what I can get.

2. Understanding the Python GIL (David Beazley)
Another easy choice. After reading lots of code and debugging thread issues in our embedded Python interpreter at work, I think I have a decent grasp of the GIL implementation. Given David's mindbending generators tutorial last year and his GIL presentation from ChiPy, I expect this talk to be rich in things I will be disturbed to have learned.

3. Powerful Pythonic Patterns (Alex Martelli)
Alex's talk last year, Abstractions as Leverage, was curiously satisfying. He didn't present any facts I hadn't already heard or read, but his presentation made some new connections for me (in a "My God, it's full of stars!" way).

4. Threading is Not a Model (Joe Gregorio)
In the last few years, I've begun to see pervasive threading as a placebo more than a solution. To paraphrase JWZ, some people, when confronted with a problem, think, "I know, I'll spin up a new thread." Now they have two problems. In reality, they've usually created an unknown number of problems, bounded only at the lower end by the number two. I'm really interested in seeing what Joe brings to the discussion beyond the usual "threads, select(), or fork()" question.

5. Turtles All The Way Down: Demystifying Deferreds, Decorators, and Declarations (Glyph Lefkowitz)
I have a long history of utter contempt for the practice of using syntactic sugar to "re-define the language in order to provide a more concise, natural style" for a given purpose. Glyph says he "will try to convince you that all of this wonderful magic isn't all that weird". Sounds like a challenge. If you're not continually questioning your own biases, you're heading for a mental rut, so I'm going to try to attend this with an open mind (and probably leave with a thoroughly-bitten tongue).

These are the ones I will move heaven, earth, and lunch plans to see. The others I really want to attend are:

  • How Are Large Applications Embedding Python? (Peter Shinners). Totally relevant for work, but probably more elementary than I'd want.
  • What Every Developer Should Know About Database Scalability (Jonathan Ellis). Totally irrelevant for my current work, but I've had to work in this area in the past, so it's somewhat interesting, and I'm curious about what's changed lately.
  • Optimizations and Micro-Optimizations in CPython (Larry Hastings). Pure geeky personal interest.
  • New *and* Improved: Coming changes to unittest, the standard library test framework (Michael Foord). I'm not quite a test-driven development zealot, but I'm about as close as you can get without applying for membership.
  • Python Metaprogramming (Nicolas Lara). More pure geeky goodness.
  • Eventlet: Asynchronous I/O with a Synchronous Interface (Donovan Preston). I can't quite decide whether this is applicable to work or not, and there's only one way to find out.
  • Seattle: A Python-based Platform for Easy Development and Deployment of Networked Systems and Applications (Ivan Beschastnikh). I was quite disappointed by last year's sandboxing talk (the description didn't really let on that it was all about PyPy), so I'm hoping I can pick up more from this one.
  • Tests and Testability (Ned Batchelder). Probably more elementary-level than I'd like, but might have some good discussion.
  • On the Subject of Source Code (Ian Bicking). Another blue-sky talk by Ian? Yes, please.
  • Python's Dusty Corners (Jack Diederich). I have a feeling this will be like Doug Hellmann's Python Module of the Week: 80% of it is "yeah, yeah, I knew that," and 20% is "oh, wow, how did I not know that?"
I wasn't terribly impressed by the tutorial list (other than the compiled Python one), so I'll probably pass on them, but the talks look even better than last year. See you in Atlanta!

Saturday, October 17, 2009

Five Things I Hate About C++

A few years ago, the "five things I hate about my favorite programming language" meme went around. I think it originated with brian d foy's post on Perl. I like his reasoning: if you can't think of five things you don't like about it, you probably don't know enough about it to advocate for it.

Peter Seibel's recent post about the opinions of the folks he interviewed for Coders at Work made me remember it again. While I often reach for Python as the top tool in my toolbox these days, I've been writing C++ for most of my career, so I thought I'd take a crack at C++ first.

So:

5. No consistent ABI

C++ doesn't define a standard application binary interface (a standard for how the binaries produced from source code are laid out or linked together). If you're writing code to link against a pre-built library, then unless you're using the same version of the same compiler, you can't guarantee that your code will work correctly. (Technically, C doesn't either. For practical purposes, though, C is in much better shape, mostly because C++'s features provide far more opportunities for implementations to disagree.)

The practical result is often that C linkage is considered "safe" and C++ linkage is considered "unsafe", which means that C linkage is the lingua franca for object-level interoperability, and no one really pushes for compatible C++ linkage--which in turn means that it doesn't happen. (While my experience is mostly in Windows programming, the situation seems to be better in the g++ and GNU/Linux world--as is often the case.)
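
Here's a minimal sketch of what that lingua franca tends to look like in practice: a C++ class hidden behind an opaque handle and a few extern "C" functions. The Greeter class and the greeter_* function names are made up for illustration; the point is only that nothing C++-specific (name mangling, class layout, exceptions) crosses the boundary.

```cpp
#include <cstddef>
#include <string>

// Hypothetical C++ class that we don't want to expose across a binary boundary.
class Greeter {
public:
    explicit Greeter(const char* name) : name_(name) {}
    std::size_t greeting_length() const {
        return std::string("Hello, ").size() + name_.size();
    }
private:
    std::string name_;
};

// Callers only ever see a pointer to this type, never its layout.
struct greeter_handle {
    Greeter impl;
};

// C-linkage facade: unmangled names, plain C types, no exceptions escaping.
extern "C" greeter_handle* greeter_create(const char* name) {
    if (name == nullptr) return nullptr;
    try {
        return new greeter_handle{Greeter(name)};
    } catch (...) {
        return nullptr;  // report failure with a null handle instead of throwing
    }
}

extern "C" std::size_t greeter_greeting_length(const greeter_handle* handle) {
    return handle ? handle->impl.greeting_length() : 0;
}

extern "C" void greeter_destroy(greeter_handle* handle) {
    delete handle;  // deleting a null pointer is a no-op
}
```

A client built with a different compiler would include a header that only forward-declares greeter_handle and the three functions, then call greeter_create, greeter_greeting_length, and greeter_destroy in that order. You pay for the portability by flattening your nice class interface into free functions.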

4. Sorta-kinda safety

The first benefit of C++ over C for me wasn't object-orientation. It was that C++ seemed to be much better at catching the kinds of low-level programmer errors I tended to make back then. Class member protection, type-checked function parameters, exceptions that (unlike return codes) can't accidentally be ignored, constructors and destructors that are guaranteed to be called at the right time, improved casting operations--what's not to like?

The problem is that most of the safety features aren't really safe, they're just a little safer, often due to the desire for source code compatibility with C or concerns about run-time performance. You can probably argue that C++ is safer than C, but I believe it's "just safer enough" that C++ programmers get complacent.

Plus the interaction of language features makes it much easier to commit horrible, higher-level design mistakes that are harder to see when reading the code, particularly with things like non-trivial constructors and destructors, misused (and overused!) inheritance, and non-obvious method overrides.
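
As one small illustration of a non-obvious override (the Shape, Circle, and Square types are invented for this sketch): dropping a const from a method signature silently turns an intended override into an unrelated function that hides the base version. The override keyword shown on Square arrived with C++11, after this post was written, and would turn the same mistake into a compile error.

```cpp
#include <string>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string describe() const { return "shape"; }
};

// Intended as an override, but the missing const makes this a brand-new
// function that merely hides Shape::describe(). It compiles without error.
struct Circle : Shape {
    std::string describe() { return "circle"; }
};

// With override, a signature mismatch like Circle's would be rejected
// by the compiler. Here the signature matches, so this really overrides.
struct Square : Shape {
    std::string describe() const override { return "square"; }
};

// Calls through the base interface, the way most polymorphic code does.
std::string describe_via_base(const Shape& shape) {
    return shape.describe();
}
```

Calling describe() directly on a Circle gives "circle", so a quick test looks fine, but describe_via_base on the same object gives "shape". Square behaves as intended through both paths.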

3. Textual macros

The LISP world has had the "hygienic vs unhygienic macros" argument for a long time. In a (grossly oversimplified) nutshell, hygienic macros allow you to define new, reusable bits of language without worrying about the context in which they'll be evaluated. This makes for safer macro definitions, but precludes some very useful techniques that unhygienic macros allow--for example, enabling the code in the expanded macro to use and affect variables in the context in which it's expanded.

But the C/C++ macro implementation makes LISP's unhygienic macros look like an Intel cleanroom. That's because they're not even really part of the C language syntax: they're just a simple, dumb textual replacement done in a preprocessing step, before compilation even occurs.

This feature inherited from C is so error prone that C++ added features like "inline" and namespacing to try to approximate the most common use cases for C macros, so that we wouldn't have to deal with them. It still didn't want to touch preprocessor macros for fear of breaking backwards compatibility, though, so now we have the worst of both worlds: a dangerous feature implemented outside the language syntax, with some of its bits duplicated in the language syntax, and guidance that says "sorry about the mess--here's some partial replacements that don't quite cover the gamut, but that's all you get. Have a nice day."
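
A quick sketch of the two classic textual-replacement traps, next to the inline functions that were meant to replace them. The macro and function names here are mine, chosen for illustration.

```cpp
#include <utility>

// Unparenthesized body: the argument text is pasted in verbatim.
#define SQUARE_MACRO(x) x * x
// Fully parenthesized, but each argument still appears (and runs) twice.
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

inline int square_fn(int x) { return x * x; }
inline int max_fn(int a, int b) { return a > b ? a : b; }

// Expands to 1 + 2 * 1 + 2, which is 5 rather than 9.
int macro_square_of_sum() { return SQUARE_MACRO(1 + 2); }

// The argument is evaluated once, as an int, so this is 9.
int inline_square_of_sum() { return square_fn(1 + 2); }

// Expands to ((i++) > (3) ? (i++) : (3)): i is incremented twice.
// Returns (result, final i) == (6, 7).
std::pair<int, int> macro_max_with_increment() {
    int i = 5;
    int m = MAX_MACRO(i++, 3);
    return std::make_pair(m, i);
}

// i++ is evaluated exactly once. Returns (result, final i) == (5, 6).
std::pair<int, int> inline_max_with_increment() {
    int i = 5;
    int m = max_fn(i++, 3);
    return std::make_pair(m, i);
}
```

The inline versions fix both problems, but only for one type at a time, and they can't do the things macros get used for precisely because they are textual (stringizing, token pasting, conditional compilation). That's the "partial replacement" problem in miniature.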

2. Worst-of-both-worlds standardization

C was born as an in-house development language in an AT&T lab in the early 1970s, and was used in anger almost from day one (for reimplementing the UNIX operating system). By the time standardization started, the language feature set was fairly solid and well-proven, and implementers already had real-world knowledge of the features.

C++, on the other hand, didn't go through this process. While C was designed as a language for implementing operating systems (and applications), C++ was designed as a language for implementing language features. It wasn't used (as far as I know) as the backbone of a single, well-known system in the way C was, so the language was free to evolve more divergently and more slowly.

Worse, the development of the language seems to have been driven by the design and evolution of the specification, rather than by things tried and lessons learned in implementation. In some cases, features were added to the language specification before they were even implemented, in the hopes that smart compiler vendors would figure something out.

As a result, we have features that don't work like you'd expect (like std::vector or auto_ptr<>), features that don't interact well (like templates and class inheritance), and even features that, well, just don't work (like export, which was in the standard speculatively for years before its first attempted implementation, and which as far as I know has never been fully and correctly implemented by anyone).

On the other hand, while C was standardized after it had mostly stabilized, the C++ standardization process started while the language was still very much in flux. As a result, the core language is full of weirdnesses that are explainable only when you know the political situation at the time.

For example, the construction "virtual void foo() = 0;" is a pretty weird way to spell "pure virtual". In The Design and Evolution of C++, Bjarne Stroustrup reveals that the "=0" construction is there because he wanted to get pure virtual functions into the language specification, but a committee meeting was coming up soon, and he didn't think he could convince enough people to get behind adding a new "pure" keyword.
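
For anyone who hasn't run into it, here's roughly what the "= 0" spelling looks like in use. Logger and PrefixLogger are hypothetical names, and the static_assert checks use C++11's type traits, which postdate this post.

```cpp
#include <string>
#include <type_traits>

struct Logger {
    virtual ~Logger() = default;
    // "= 0" is the whole syntax for "pure virtual": no keyword involved.
    virtual std::string format(const std::string& message) const = 0;
};

struct PrefixLogger : Logger {
    std::string format(const std::string& message) const override {
        return "[log] " + message;
    }
};

// A class with at least one pure virtual function can't be instantiated.
static_assert(std::is_abstract<Logger>::value, "Logger should be abstract");
static_assert(!std::is_abstract<PrefixLogger>::value,
              "PrefixLogger implements every pure virtual function");

// Works against the abstract interface only.
std::string format_with(const Logger& logger, const std::string& message) {
    return logger.format(message);
}
```

Passing a PrefixLogger and the message "hi" to format_with yields "[log] hi", while trying to declare a plain Logger variable won't compile.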

This leads right into...

1. C++ tries to be all things for all people

I think this one is the root of most of C++'s problems. C++ is and has always been a "more-is-better" language. If you like C, we'll make sure you like C++ by bending over backwards to make C code still work (except when it doesn't) and by making efficiency our top, err, one of our top-ten priorities. If you like object oriented programming, we've added classes and inheritance. Oh, multiple inheritance? Yep, we heard that works well, so we'll add it in there too. Parametric polymorphism? Multiple dispatch? Currying? Oh, hrm, we seem to have painted ourselves into a corner... but we can bodge most of that in with templates and partial template specialization. Oh, and guess what? We just figured out that you can use templates to do metaprogramming, so you get that feature for free! Free is good, right?

This results in two mostly correct perceptions:

1) C++ is a big grab-bag of language features, some of which are razor-sharp and don't really hang together coherently, but work great so long as you're really, really careful.
2) C++ is more-or-less better than C, so long as you stay with a "sane subset" of its features.

But what is that sane subset? That depends entirely on who you talk to, and the subset that they choose usually reveals more about their own priorities and experience than anything about the language itself.

All that being said, I still choose C++ (or my own trusted subset of it, at least) over C because of the convenience of constructors and destructors, the expressiveness of templates, and the confidence I get from RAII. I still choose it over Java because I don't need to worry about a runtime VM, because I can access platform-specific APIs and native libraries at will, and because there's no checked-exception silliness.

But that doesn't mean I don't sigh a little every time I burn multiple days chasing down an intermittent memory leak, or that I don't steal a surreptitious glance at younger, better-looking languages with less emotional baggage from time to time.
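
For the record, the RAII confidence I mentioned comes from something like the following sketch. ScopedResource and use_resource are stand-ins I made up; a counter plays the part of a real handle such as a file or a lock.

```cpp
#include <stdexcept>

// Hypothetical resource wrapper: "acquires" in the constructor and
// "releases" in the destructor by adjusting a caller-owned counter.
class ScopedResource {
public:
    explicit ScopedResource(int& open_count) : open_count_(open_count) {
        ++open_count_;
    }
    ~ScopedResource() { --open_count_; }

    // Copying would release the same resource twice, so forbid it.
    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;

private:
    int& open_count_;
};

// Returns how many resources are open while the guard is alive.
// Throws when fail is true; the guard's destructor still runs
// during stack unwinding, so the count is restored either way.
int use_resource(int& open_count, bool fail) {
    ScopedResource guard(open_count);
    if (fail) {
        throw std::runtime_error("simulated failure");
    }
    return open_count;
}
```

Whether use_resource returns normally or throws, the counter is back where it started by the time control reaches the caller. There's no cleanup code to forget on the error path, which is exactly what I miss when I'm working in C.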

Thursday, October 15, 2009

Buying or Building... Furniture

Most software developers are familiar with the "buy-or-build" question: is it more effective to find existing software and try to make it work in your situation, or to build it to your exact specification and take on the burden of maintaining it? But sometimes it comes up in other contexts.

Like office furniture.

My current project at work is winding down, and I'm rolling over to a new one. As part of the transition, I'm moving from my old, two-person office into the new team's bullpen environment.

It might seem like a poor trade, but this team chose to trade in their fairly nice offices because they valued the higher conversational bandwidth they got in a bullpen. Yes, it's a bit noisier, but most of the noise is project-related, and results in quicker and more complete information dispersal both among developers and between developers and SQA engineers (who also share the space).

The big win for me is that it reduces the barrier to pair-programming to the cost of mumbling, "Uh... can anyone take a look at this with me?" And we still have the offices for when we need to make a phone call or do an interview.

One of the stipulations on building out the bullpen was that we had to use existing furniture. Unfortunately, while our current furniture is nice (and somewhat pricey, from what I'm told), it's optimized for a one-person or two-person office. We each get a curvy desk, a table with attached bookshelf that fits the curvy desk as an extension, and a funky rolling file cabinet. But the curvaceousness of the furniture means that it only fits well in a few prescribed configurations--none of which match a bullpen where you want to pair-program!

So the current bullpen, built from curvy bits loosely jammed together, isn't big enough to hold more people. And naturally, the people who handle furniture and facilities wouldn't be terribly happy with us saying, "Oh, this expensive furniture is nice. Now would you mind finding some place in our already-filled building to store it, and buy us some additional expensive furniture just like it, but without curvy bits?"

So our manager/Scrum Master, being the pragmatist that he is, decided we should build our own. From scratch.

Actually, "scratch" in this case really means heavy, solid-core interior doors for tabletops, and prebuilt folding-table legs to hold them up. Assembly is trivial, the surfaces are generous, prefinished, and attractive, and the cost was just a fraction of what we'd have paid for non-curvy versions of our standard furniture (which keeps the facilities folks happy... or at least happier).

There are, of course, some drawbacks. Making single large pairing stations means that you have to choose a single table height. In our case, it was chosen for us by the height of the prefab table legs.

However, my current programming partner suffers from an unfortunate and tragic genetic defect that caused his growth to continue far beyond normal human levels (the medical term is, I believe, "freakishly tall"). I, on the other hand, boast a full 5'3" of height, which seems far more normal to me, all things being relative.

So our alternative solution was to just tear the bookshelves off two small tables (again, that storage problem!), and then use one for each person, moving the tables around when we need to. This works great if the tables have cool adjustable legs like ours do.

(Mine, of course, is the station on the right.)