Tuesday, October 16, 2018

RIP Paul Allen.

"So, what did I just buy?"

That was my first impression of Paul Allen. I was a 20-year-old who had just gone from intern to tech lead when Paul bought the startup I was working at. The other programmers had departed, the founder cashed out to become an artist, and I was left to run the project--the first software video editor on Windows (before Premiere).

He had called me into his office for a demo. I fumbled through, avoiding the parts that I knew tended to crash. He asked a lot of deep, technical questions, most of which I didn't have a great answer for (intern here!), but he did it with a knowing kindness, a kind of "older brother who knows a lot".

Working for Asymetrix, I got to do a lot of things that were above my pay grade (and honestly above my ability). "I'm investing in this company--go and see if their tech is real." (Ok.) "Go to this CD pressing plant--see if they're actually capable of shipping ToolBook." (Umm. wha?)

Given that he was running half a dozen companies and a basketball team at the time, my face time with him was limited. It was usually "Paul wants to look into... can you do that?" But when I ran into him casually at work, he always had real questions about the projects that revealed how closely he tracked things behind the scenes.

When I told my boss I was resigning to move back to Pennsylvania (my fiancée didn't like Seattle), I received an interesting "counter": Paul had a stake in Cardinal, a modem manufacturer in Lancaster. I could work for them (on paper) while still working on the video editor for Asymetrix. I took them up on it immediately. (I only ever set foot in the place to sign the paperwork--we jokingly called it "employee laundering.") I don't know how much direct influence he had on that arrangement, but it seems like the kind of thing he'd have done.

I never saw him after that, but I think we could do with more people like Paul in tech. I'll miss him.


Wednesday, October 11, 2017

This is why we don't put changelogs in source files any more.


This is a piano key. It belongs to a piano that we inherited from my sister-in-law, who inherited it from the school where she worked, who inherited it from a church sometime in the 1940s. We think it's about 90 years old.

The previous owners had the tuner sign and date this key when they maintained the piano. This is convenient, because you already have the piano disassembled, and you're looking right at the key anyway, so might as well. And then when you've got the piano disassembled because you're maintaining it, and you're looking at the key, you can see when you had it maintained last.

However, the time when you really need to know "when did I get the piano maintained?" isn't when you have the piano disassembled: it's when you're thinking about getting the piano maintained again. The real place you want that record is in a calendar or a journal, outside the piano.

Likewise, you generally care about the change history of your source when you're looking at the source control tool, trying to understand how it changed over time--not when you're down in the bowels of the code.

I mean, it's neat to have something to read while you have the piano disassembled, but that's not really the best place to store the data.

Wednesday, May 16, 2012

It's probably been done.

I started programming in junior high, writing BASIC code on a Commodore 128. At the time the only learning resource I had was the owner's manual and some issues of Compute!'s Gazette, so I was almost completely self-taught. I don't think I had actually heard the words "computer science".

If you've never programmed in BASIC of that era, understand that:

  1. There were no "functions"; a program consisted of a list of steps, executed linearly, with some control structures sprinkled in and plenty of GOTO jumps
  2. Variables were all global
  3. Variable names were limited to two letters
There was a limited bit of "subroutine" support: GOSUB was like a GOTO that remembered where it came from, and a RETURN would start executing from the next line after the GOSUB.  But since all variables were global, subroutines couldn't be reentrant.

So I got the clever idea that instead of using variables in a GOSUB routine, you could use arrays, and then have a variable that kept track of how many times you'd entered the subroutine. Sure, you had to define the maximum number of times you expected to GOSUB before you started returning, but at least you didn't have to remember whether or not the variable you were about to mutate was going to munge a previous call.
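Translated out of BASIC, the trick looks something like this hypothetical Python reconstruction (the factorial example and all the names are mine, not the original program's):

```python
# Hypothetical Python reconstruction of the BASIC trick: no locals,
# just global arrays indexed by a "how deep are we" counter.
MAX_DEPTH = 16              # like DIMensioning the arrays up front
n = [0] * MAX_DEPTH         # one slot per nested GOSUB
result = [0] * MAX_DEPTH
depth = -1

def factorial(value):
    global depth
    depth += 1              # "GOSUB": claim the next slot
    n[depth] = value
    if n[depth] <= 1:
        result[depth] = 1
    else:
        factorial(n[depth] - 1)
        # the callee's slot is one deeper than ours
        result[depth] = n[depth] * result[depth + 1]
    depth -= 1              # "RETURN": release the slot

factorial(5)
print(result[0])            # 120
```

The MAX_DEPTH cap plays the role of pre-DIMensioning the arrays in BASIC: nest deeper than that and the "subroutine" scribbles out of bounds.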

It was brilliant.

And when I got to college I learned that I'd "invented" the call stack and recursive functions. Rather a let-down, that.

So, grab your 8502 and load up a JMP 2011, when I wrote a blog post on "Stages of Competency":
After doing "this programmer thing" for a few years now, I've noticed a pattern in how I acquire skills and techniques. It's surprisingly consistent, and consists of these stages:
I thought it was pretty remarkable how consistently I saw the progression from "awareness" to "familiarity" to "functional understanding" to "understanding" to "competence" in myself and in others.

Pretty insightful, huh?

Turns out that pattern is a simplified subset of an educational classification system called Bloom's Taxonomy.

And it was first proposed in 1956.

So, yeah, that insight you thought you just had? It's probably been done.
 


Friday, November 11, 2011

3 Simple Rules That Will Make You a REAL Superstar Developer


In my experience there are two kinds of "rock star" software developers.  There's the Neil Peart rock star developer, who combines a natural blessing of talent and intelligence with a relentless work ethic and humble attitude, and over time becomes the developer that people not only want to hire, but want to be.  And then there's the "prima donna" rock star developer, who combines a modicum of raw talent with sheer attitude and self-promotion into the programming equivalent of a hotel-room-trashing, sex-and-drugs-and-rock-and-roll tabloid icon (I won't name an equivalent musician--use your favorite example).

Last year a tweet by Zed Shaw pointed me to a brilliant piece of satire called 3 Simple Rules That Will Make You a 'Superstar' Developer that gave three simple rules and two deeper principles for becoming a hard-living, Type 2 rock star programmer.  It's remarkably concise and accurate.  But what struck me is how close the three rules and two principles are to rules and principles for becoming a real Professor on the Programming Drums.

"Prima Donna" Rule 1: Write lots of code.

Have to fix a small bug in an area someone else has written? Don't waste time trying to understand it or their motivations for writing it that way. Just rewrite the lot as you think it ought to work. Call it refactoring if anyone asks.

"Neil Peart" Rule 1: Read lots of code.

You will spend more of your career reading code than writing code. Learn how to do it well. That means doing it a lot.  Read code even when you don't absolutely have to, and understand it deeply even when you think a shallow once-over will tell you all you need to know. Have a spare half hour?  Read the last couple checkins from other people on the team, even if you don't need to. You will learn more about the system faster, you might find issues earlier, and you will probably learn something or see a technique you didn't know about.  Have a spare afternoon?  Find an open source project and start reading code. Copiously reading both awful code and good code will help hone your internal sense of the difference.

Prima Donna Rule 2: Write your code quickly.  

Touch lots of files, and include every one of them in the ChangeLog. Don't worry about accidentally introducing hard-to-find bugs; they'll actually help you later on, as long as they're genuinely hard to find. Avoid introducing trivial bugs.

Neil Peart Rule 2: Finish your code quickly.  

"Done" is a Boolean state, and work isn't done until you would be surprised to have to revisit it again in a few weeks.  Minimize your personal work-in-progress.

Don't let 90%-finished tasks rot outside source control in a local directory, because you will forget the details. Don't check something in, thinking "this will do for now; we'll get to hardening it later", because you *will* forget the details.  And it's not finished until it's tested to your team's standards, documented to your team's standards, and understood well enough that if you get hit by a bus on your way home tonight, someone else can take your place.

Prima Donna Rule 3: Don't take time to document your code.

And don't add little comments explaining potential pitfalls in modifying some of the less clear statements you've introduced. You don't need them--you wrote the code.

Neil Peart Rule 3: Document your code with a single-minded purpose.

Obvious code with only as much documentation as is needed is the Holy Grail. Undocumented, unclear code is bad, but over-documented code can be worse, because it becomes a crutch ("so what if the code is ugly: that's why I commented it!").

You already know that when you write code, you put your reputation on the line that it is correct.  But when you document code, you put your reputation on the line that not only is it correct and sufficient now, but that it will be correct and sufficient when someone looks at it down the road. So minimize your risk of shame. Boilerplate comments, rambling exposition, comments that duplicate the code, and commented-out code left in "just in case" are signs of laziness hiding behind the mantra of "comments are good; more must be better!"

Behold, the Underlying Principles

Strikingly, each set of rules emerges from one technical principle and one social principle, and each set of principles is a mirror image of the other:

The Prima Donna Technical Principle: You're 10x as productive when you're working on code you wrote as on code you didn't write.  

So, maximize your opportunity to work in code that you wrote, no matter the consequences.

The Neil Peart Technical Principle: You're 10x as productive when you have full awareness and mastery of your environment.  

Yes, you can achieve that by always working on your own code. That will mean being so prolific that all the problems you're fixing are your own creations.

Or you can achieve it by having a deep and total understanding of as much of your team's project and tool set as possible.  And the cleaner a design or a development process is, the more of it you can fit in your head at once.


The Prima Donna Social Principle: You win The Game by improving your reputation to superstar guru levels.

Your programming ability is judged by how much code you write, how quickly you finish features and fix critical bugs, and how often your insights are necessary to solve problems.

The Neil Peart Social Principle: Optimize your life for value, not perceived ability.

Your value to your project and your team is only partially related to your programming ability (perceived or real).  It's directly proportional to your ability to add value to your project and your team.  The more deeply you understand your project, your team, your code, and your tools, the more value you can add.  Conversely, any technical debt you create will be repaid either by you or by those who follow you, and servicing that debt reduces the mental "capital" you have available to add value.

In many ways, the "prima donna" and "Neil Peart" principles differ only subtly.  Maybe that's why it's so easy to find yourself on one path, when you really think you're on the other.

Tuesday, September 06, 2011

Case Study: Python as Secret Weapon for C++ Windows Programming

One of my favorite features of Python is its interactive shell. If you want to try something, you type in the code and try it immediately. For someone whose first coding environment was the equally-immediate Applesoft BASIC, this is just as natural. But if your introduction to programming was C, C++, or Java, the benefits might not be apparent, especially if you're trying to do exploratory coding in one of those languages.

So I'm going to walk through a recent experience as a case study.

The Problem

At work we develop a Windows program that talks to certain devices via serial cables. Those devices also come in wireless Bluetooth flavors, and we connect to them via a "virtual serial port". To the program running, it looks as if the Bluetooth device is plugged into a real serial port, because all of the wireless connectivity is abstracted away by Windows. These devices are unidirectional--they transmit data to the Windows program, which passively reads it.

If you power off and restart one of these wired devices, it will start chattering away at the Windows program with hardly a hiccup--our program never even sees a disconnect. However, we noticed that this didn't happen with the Bluetooth devices: powering down one of those requires the Windows app to reconnect. But the Windows app didn't even seem to get any notification that the device had disconnected. So how do you solve this chicken-and-egg problem?

Research

It had been a while since I'd done any actual hardware serial programming, so I started with some documentation, and remembered that the RS-232 serial spec included a line called DCD, or Data Carrier Detect (also called RLSD, for Receive Line Signal Detect). Back in the dinosaur days, this signal meant that your modem was connected to the remote modem, and was able to start communicating back and forth.

Sure enough, a search brought up the right bit of Win32 API documentation, which told me how to detect an RLSD change on a physical serial port using the SetCommMask and WaitCommEvent calls. The question now became "does the Microsoft virtual serial port for Bluetooth support RLSD?"

Exploration

At this point I could have started up Visual Studio, created a scratch project, written a couple dozen lines of C++ code, compiled and linked, fixed the compile errors, compiled and linked again, run the program, fixed the inevitable errors that the compiler didn't catch, and then had my answer.

But I'm too impatient to wait for Visual Studio to start up, too lazy to write C++ when I don't have to, and I have the hubris to think I can come up with something better than the obvious solution. Programmers are funny like that.

So instead, I cranked up DreamPie.

Secret Weapon #1: DreamPie

DreamPie is, very simply, my favorite cross-platform interactive Python interpreter. It began life as a fork of Python's built-in IDLE command shell, and from there it's never looked back. It has excellent interactive completion for packages (so you can type "from sys import s" and get a list of "stdin, stdout, stderr").

Even better, it does completion when you're typing file paths in arbitrary strings. I use this a lot to get to modules I'm trying to test: "import os,sys; sys.path.append('c:/src/'" gives me a list of all the directories in "c:/src".

It also has a slick separation of (typed) input and (generated) output, and a neat "copy only code" feature that makes it perfect for "try this code interactively, and when it works the way I want it, yank it into the actual source file" exploration.

DreamPie works pretty much the same on both Linux and Windows systems. It's reputed to work well on Mac systems, too, but I don't use them for day-to-day development.

So where I'd normally crank up the Python command interpreter for interactive exploratory coding, I usually reach for DreamPie instead.

But what I needed to explore now was the Windows API as called from C++, not Python.

Secret Weapon #2: ctypes

ctypes is a "foreign function interface" (FFI) that's been part of Python since version 2.5. An FFI is just a way to call code that isn't written in your current programming language. In our case, the functions I wanted to call in order to test out serial port notification are in the kernel32.dll library, which is part of Windows. ctypes makes this really easy. Well, easy if you happen to have the Windows API documentation and all of the correct C header files handy, and if you know exactly what you're looking for:

>>> import ctypes
... file_mode = 0x80000000 # GENERIC_READ from <winnt.h>
... open_existing = 3 # from <winbase.h>
... buffer = ctypes.create_string_buffer(100)
... bytes_read = ctypes.c_ulong(0)
... hfile = ctypes.windll.kernel32.CreateFileW(r'\\.\COM17', file_mode, 0, None, open_existing, 0, None)
... ctypes.windll.kernel32.ReadFile(hfile, buffer, 100, ctypes.byref(bytes_read), None)
... buffer.value
0: b'\r\n052100746029\r\n'
>>>

Hooray. We can call the Win32 API functions to open the serial port and read from it, just like we would from C code.

But... that's an awful lot of crap to remember and type. I had to know exactly the C code I wanted to write. I had to know the Windows API well enough to find the constants and the functions to call. I had to know the ctypes API well enough to wire up Python to the C return values via ctypes buffers.

What a chore. Did I mention I'm lazy?
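ctypes isn't Windows-only, either. The same adapter works against any shared library; here's a minimal sketch of my own (assuming nothing beyond strlen's standard C signature) calling the C runtime on a Unix-ish system:

```python
import ctypes
import ctypes.util

# Load the C library--the POSIX cousin of kernel32.dll.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature so ctypes marshals arguments correctly:
#     size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"\r\n052100746029\r\n"))  # 16
```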

ctypes is the universal adapter--it can connect Python code to anything. But if you're specifically looking to call the Windows API, there's an even better tool:

Secret Weapon #3: PyWin32

PyWin32 predates ctypes, but it has a similar goal: gluing Python to something else. In this case, the "something else" is specifically the entire Win32 API. PyWin32 consists of about two dozen modules--for example, "win32print" for printing, or "win32gui" for window handling--which wrap a good portion of the Win32 API.

The documentation is rather Spartan, but if you know the Win32 API side, you can map those calls to the PyWin32 modules without too much pain. The 8-line, hard-to-remember ctypes example above turns into just four lines of simpler code using PyWin32:

>>> import win32file # for CreateFile
... import win32con # for constants
... hfile = win32file.CreateFileW(r'\\.\COM17',
... win32con.GENERIC_READ,
... 0,
... None,
... win32con.OPEN_EXISTING,
... 0,
... None)
... win32file.ReadFile(hfile, 50, None)
0: (0, b'\r\n052100746029\r\n')

The Final Secret Weapon

My actual exploratory DreamPie session to see if Windows' virtual Bluetooth serial port supported RLSD looked like this:

>>> import win32api, win32file, win32con
>>> hfile = win32file.CreateFileW(r'\\.\COM17', win32con.GENERIC_READ | win32con.GENERIC_WRITE, 0, None, win32con.OPEN_EXISTING, 0, None)
>>> win32file.GetCommMask(hfile)
0: 0
>>> win32file.SetCommMask(hfile, win32con.EV_RLSD)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    win32file.SetCommMask(hfile, win32con.EV_RLSD)
AttributeError: 'module' object has no attribute 'EV_RLSD'
>>> win32file.SetCommMask(hfile, win32file.EV_RLSD)
>>> win32file.GetCommMask(hfile)
1: 32
>>> win32file.EV_RLSD
2: 32
>>> win32file.WaitCommEvent(hfile)
3: (0, 32)
>>>

This is an actual copy of the DreamPie buffer from my test session, mistakes and all. This is what really happened when I tried to figure out if RLSD would work:
  1. I typed up the code to open the serial port, which I knew should succeed, and it did.
  2. I looked up the Win32 API call to get the "event mask", or the set of events that were being watched on the serial port handle, and saw that it was "GetCommMask". I blindly typed "win32file.GetCo", and lo and behold, DreamPie brought up a list of completions, which assured me that GetCommMask was there.
  3. The Win32 API said that GetCommMask returned its result in a buffer passed into the call. Knowing that PyWin32 usually does a pretty good job of hiding return buffers, I decided to just try calling it with the input parameter, and got back zero. That made sense, if the serial port wasn't being monitored for events.
  4. So I decided to push my luck: if GetCommMask worked, SetCommMask should work, too. A quick peek at the documentation, and... hrm. win32con didn't contain the "EV_RLSD" constant I was looking for to monitor the RLSD signal.
  5. Well, I could have just typed the exact value (0x020) from the Windows docs... or I could just retype the line and use PyWin32's autocompletion to see if win32file has the constant. I typed "win32file.EV_", and I had my answer. Then a quick re-test of GetCommMask() showed that the value was set.
  6. The API docs claimed that WaitCommEvent should wait for one of the masked events to occur, and then return which one occurred. But the documentation showed that it took another of those return buffers. Thinking that PyWin32 might help me here, too: I typed "win32file.WaitCommEvent(hfile)", and the call appeared to block.
  7. So I powered down the device, and within a few seconds, I was rewarded with the return value from WaitCommEvent: (0, 32). Aha. This meant that the Windows API version of WaitCommEvent returned 0 (for success), and that the return buffer contained 32, or EV_RLSD.

I included all the steps, including the mistakes, to show the last secret weapon: flexibility. Be willing to bounce back and forth between the documentation, the code you think should work, and the feedback you get both from the code under test and the tools you're using--and be willing to change your mental model based on that feedback.

In reality, this whole test took under five minutes from "Hmm... I wonder if I can use the DCD signal" to "Aha, looks like it works! Time to test it in C++." To be honest, I didn't even type out the whole ctypes version while testing--I started on it, realized that I'd have to look up and type all the constants by hand, then restarted DreamPie to jump over to PyWin32. Remembering that you can switch tools on the fly keeps you from getting stuck in ratholes that aren't directly related to the task at hand.

Flexibility is the key to fast and efficient exploratory coding. Using an interactive language like Python with a good set of support tools and libraries can be a secret weapon for speeding up exploratory coding--even when your target language is C++.


Wednesday, August 10, 2011

Stages of Competency

After doing "this programmer thing" for a few years now, I've noticed a pattern in how I acquire skills and techniques. It's surprisingly consistent, and consists of these stages:

0: Awareness

I've heard of the technique and can regurgitate a definition and a couple of use cases. I can probably pass a really bad phone screen (and in my experience, most of them are).

1: Familiarity

It's intrigued me enough that I've read up on it. I've probably looked at some code that uses it, and I can pick it out of a crowd, but I still mentally skip over it when reading its code (a bad habit that makes it harder to get past this phase).

At this point, if I were asked "what is X" or "how does X work" in an interview, I could probably pass the question--as long as there isn't a followup involving coding, or something like "what are the pitfalls of using X over the long term". (That's why I don't use questions like that in interviews anymore!)

2: Functional Understanding

At this point I've either had to work with someone else's code that uses it, or else I've gone through an article that shows how to use it. I don't mentally skip over it anymore, and I can debug and modify it with some difficulty. Importantly, I can tell someone else what it's doing, but I will probably get embarrassed if I try to get into the details or (worse) debug it with them.

But I can't usefully synthesize anything with it. I get to a point in code and think, "Ah, this is a good place to use X!" Two hours (or more) later, I have bruises on my forehead from bashing it into the desk, I'm thinking "THIS CAN NOT BE THAT HARD", and I start wondering why I don't stay with the subset of techniques I know like the back of my hand. That's really tempting.

I get stuck in stage 2 a lot. I was there with C++ template metaprogramming for about five years, and I'm still there right now with Python metaclasses.

3: Understanding

After several frustrating episodes in stage 2, I do exactly the same thing in another context and... it makes sense. It works. I don't believe it, so I tweak things that should make it break, and it breaks in predictable ways. And I can reverse the tweaks and have it work again, predictably.

At this point, I always have the same three internal questions: a) do I really understand this? b) how did I not really understand this before? and c) what am I missing? I get uncomfortable not knowing how I know something.

Then all is well until I try to teach it to someone else, and we end up in another multi-hour WTF session.

What I've really learned at this stage is a single "groove" that works. As long as I don't deviate too much from the way I've used the technique, everything is fine. I think that subconsciously I know the limitations of that "groove", so I don't tend to make the little changes that expose the rough corners of my understanding. When I'm working with someone else, they have different edges to their own understanding. That's when I get this "uh-oh" feeling that tells me I really don't know what's going to happen when we do this.

Absent working with other people, I still think I understand it, which is a dangerous bit of self-delusion, and the biggest reason I'd rather work with a team than solo.

4: Competence

I don't know how I get here either, except maybe via repetitions of stage 3. In fact I don't usually notice even getting to this stage. The sign is usually that I'm having to do something outside the "groove" of my usual use of a technique, and that little "uh-oh" goes off, and then... it still works. Or else someone asks me about what would happen in a nasty corner case, and what comes out of my mouth is a better explanation of the details than I thought I could come up with.

This is also the point at which I finally feel comfortable writing about the technique, showing someone else how to use it, or trying to extend or modify it. The irony of it is that unless I do those things earlier, when I don't feel competent to do so, I tend not to get to this stage.



The funniest thing about this model is that if I look at code I've written in the past, I can usually pick out where I was on the scale when I wrote it. Again, I can't say exactly what the "tells" are, but when I get to stage 4 on something and look back at earlier code, I can think "ahh, ok, I was stuck in stage 2 at the time, and the places this code will break are probably X, Y, and Z."... and they usually are.

Forget owner's manuals--I wish brains came with source code. This progression would make a lot more sense then.

Wednesday, October 20, 2010

Switchpy

One of the consequences of the 2.x-to-3.x Python changeover is that I need to keep both versions around for a while on my Windows dev workstation.

Actually, strike that: I need to keep many versions around:
  • 2.5.4, because that's the earliest version we support at work for some internal tools
  • 2.6.6, because one particular internal tool jumped the gun and started using the "with" statement before we migrated to...
  • 2.7, because that's what we're migrating those internal tools to (slowly)
  • 3.1.2, because that's what we're targeting for new development
  • A "special" 3.1.2, which mimics the version we've modified for use in our embedded devices
  • The most recent 3.2 alpha, for testing
  • A 3.2 trunk install, for testing patches
Virtualenv doesn't exactly do what I want: you have to install it from within an already-installed version of Python, and it doesn't support Python 3 yet (although there is a fork that does). Plus it doesn't handle anything other than environment variables--it doesn't understand Windows' defaults.

Ned Batchelder wrote a neat script that does some of that, but again, it doesn't handle everything.

So starting from Ned's script, I came up with switchpy:
  • Supports Windows Python versions from 2.5 up to 3.2
  • Changes the local PATH environment in the current shell (via the same batchfile trick as mpath)
  • Updates the Registry-based associations (via code from Ned's script)
  • Pings Explorer so that if you run "python.exe" from the Start | Run command, it notices the update
  • Automatically reads installed official versions from the Registry, so you can say "switchpy 31" instead of "switchpy c:\python31"
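A note on that batchfile trick, since it's not obvious: a child process can't change its parent cmd.exe's environment, so the usual workaround is for the Python side to write out a batch file of SET commands that a thin wrapper .bat then CALLs in the current shell. A rough, hypothetical sketch (not switchpy's actual code):

```python
# Hypothetical sketch of the batchfile trick (not switchpy's real code):
# emit SET commands into a batch file for a wrapper .bat to CALL, since
# a child process can't modify its parent shell's environment directly.
import os
import tempfile

def write_env_batch(python_dir, path):
    with open(path, "w") as f:
        f.write("@echo off\n")
        # Prepend the chosen Python directory to the calling shell's PATH.
        f.write('set "PATH={};%PATH%"\n'.format(python_dir))

batch = os.path.join(tempfile.gettempdir(), "switchpy_env.bat")
write_env_batch(r"C:\Python31", batch)
print(open(batch).read())
```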
So now, testing scripts in multiple versions of Python is as easy as:


C:\src\myscript>switchpy 25
Switching to Python at C:\Python25\...
Python is now C:\Python25\

C:\src\myscript>py.test
============================= test session starts =============================
python: platform win32 -- Python 2.5.4 -- pytest-1.3.0
test object 1: C:\src\myscript

myscript\tests\test_script.py ...

========================== 3 passed in 0.03 seconds ===========================

C:\src\myscript>switchpy 31
Switching to Python at C:\Python31\...
Python is now C:\Python31\

C:\src\myscript>py.test
============================= test session starts =============================
platform win32 -- Python 3.1.2 -- pytest-1.3.1
test object 1: C:\src\myscript

myscript\tests\test_script.py ...

========================== 3 passed in 0.03 seconds ===========================




For now, you can find switchpy in the same bitbucket repo as mpath; if I add any more scripts, I'll probably end up making it a more general repo.