null device

GitHub Pages

GitHub Pages offer a lot more freedom than

All Programmers Are Self-Taught

When I was a teenager I played high-caliber baseball. I’m competitive to a fault, and when I decide I want to be good at something, results usually follow. Now I’m a third-year undergrad studying computer science. There’s something critically different between programming and sports, though: a pitching coach teaches you how to pitch, but a CS professor doesn’t teach you how to code.

I was surprised that neither my TAs nor professors critiqued my code during my first year, but grew concerned after my second year. The assignments were larger and the problems tougher, but even after submitting some 2,000 lines of code in my data structures class, I never once had a comment on my code — if my program compiled and my unit tests demonstrated correctness, that was enough. It wasn’t until a small group project that I realized how ugly some code could really be, and I started asking questions about what good code really was.

But what is good code? I take a lot of time to make my code readable and self-documenting. I try to follow the UNIX philosophy on simplicity, that I should make my program work above all and only optimize if there’s a need. I keep in mind asymptotic complexity. I avoid threads unless I really need them. But I honestly don’t know if that’s good code or not (honestly, I think I’m a bad programmer).

I’m lucky enough to have worked with some students I think are great programmers, people who have interned at Microsoft, Google, Amazon and the like. Their opinion is generally the same: most of what they learn comes from self-reflection or from picking up other programmers’ habits. Even the ever-practical engineers share my sentiments. So here’s my claim:

All programmers are self-taught.

My education is giving me awesome tools: data structures, algorithms, database design, concurrent programming, network programming, agile development and different programming paradigms. But these are all tools, and even though they translate to much more efficient and smarter programming, you can still use them wrong — I’ve seen horrible code come from students who do exceptionally well in these classes.

I don’t think it matters whether you study computer science, study software engineering or get a college diploma. If you’re going to write code for a living, you’d better be ready to teach yourself.

Concurrent Programming is Scary

Read Hacker News, Stack Overflow and various programming blogs long enough and you’ll run up against the ever-popular opinion that concurrent programming is hard, scary and dangerous. Most articles suggest avoiding it whenever possible. I’ve messed around with pthreads and Java Runnable classes but never really experienced the horror that can arise from concurrent programming. Well, I now have my own little story that will make me appear wise and sage-like in a programming interview.

As part of my introductory systems class I had to implement parts of a user-level threading package. Most of the work had to do with writing reader/writer blocking locks using spinlocks. There was a small test program to check if the implementation was working and after 20 minutes or so I had things working. “No problem,” I thought, “I’m cruising.”

Condition variables came next, and the code was similar to blocking locks. CVs are associated with a monitor (blocking lock) but have some interesting functions associated with them: wait, notify and notify all. Calling wait on a condition variable will block the running thread, pushing it onto a waiting queue associated with that CV/monitor. Notify blocks the current thread and attempts to run a thread in the CV’s queue. At this point I slowed down and read through lecture notes and my monitor code very carefully. This is the “cerebral” part of the assignment.

My small test program ran as expected after a few tweaks, so I moved on to the last part of the assignment: using monitors and CVs to implement a synchronized stack. Again, the test to make sure everything was working was provided, and it consisted of a couple thousand push and pop operations spread across multiple threads. I had to write the push and pop functions so that they wait and notify at the appropriate times; for example, pop has to wait while the stack is empty.
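To make the wait/notify pattern concrete, here is a minimal sketch of a synchronized stack. It is not the assignment code: it swaps in POSIX pthreads (a mutex standing in for the monitor, plus two condition variables) for the course's user-level threading package, and the names `sync_stack`, `sync_stack_push`, `sync_stack_pop` and `STACK_CAPACITY` are made up for illustration. One difference worth noting: `pthread_cond_signal` does not block the caller, so each waiter re-checks its condition in a `while` loop after waking.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define STACK_CAPACITY 16	/* hypothetical fixed size for this sketch */

struct sync_stack {
	pthread_mutex_t lock;		/* plays the role of the monitor */
	pthread_cond_t not_empty;	/* poppers wait here */
	pthread_cond_t not_full;	/* pushers wait here */
	int items[STACK_CAPACITY];
	size_t count;
};

void sync_stack_init(struct sync_stack *s)
{
	assert(s != NULL);
	pthread_mutex_init(&s->lock, NULL);
	pthread_cond_init(&s->not_empty, NULL);
	pthread_cond_init(&s->not_full, NULL);
	s->count = 0;
}

void sync_stack_push(struct sync_stack *s, int value)
{
	pthread_mutex_lock(&s->lock);
	/* Re-check after every wakeup: another thread may have refilled it. */
	while (s->count == STACK_CAPACITY)
		pthread_cond_wait(&s->not_full, &s->lock);
	s->items[s->count++] = value;
	pthread_cond_signal(&s->not_empty);	/* wake one waiting popper */
	pthread_mutex_unlock(&s->lock);
}

int sync_stack_pop(struct sync_stack *s)
{
	int value;

	pthread_mutex_lock(&s->lock);
	/* Popping an empty stack is not allowed, so sleep until a push. */
	while (s->count == 0)
		pthread_cond_wait(&s->not_empty, &s->lock);
	value = s->items[--s->count];
	pthread_cond_signal(&s->not_full);	/* wake one waiting pusher */
	pthread_mutex_unlock(&s->lock);
	return value;
}
```

In this sketch, a stray extra signal or a missing unlock in either function is exactly the kind of one-line mistake that shows up only as a hang under load.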

Deadlock! Deadlock everywhere. I spent three hours reviewing all my code, even backtracking into monitors and spinlocks trying to track down bugs. A few hundred printfs later I received an email from my professor, an update on the lab. There was an error in his code for push. I removed one line (which I thought looked odd but never looked at again) and my code ran just fine. All tests passed.

What did I get out of this assignment? Concurrent programming is scary. Debugging regular code is hard enough. It’s not a horror story worthy of HN but it gave me some insight. The assignment and fix were simple, and it was done in a lab environment. But if my professor hadn’t informed the class about the bug, how long would I have spent debugging my own code? Probably all night.

Someone Used My Software

As far as I know, no one has ever used my software except the markers in my CS classes and me. That’s because I spend almost all my time coding for assignments or hacking together small utility programs. Larger projects get lost after a week or two; they were either too ambitious or I got lazy. A web crawler I was writing last year got lost in GUI code and my chess engine ended up at the bottom of the heap after the first week of classes this term.

A couple of weeks ago I happily wrote some C code after smashing my head against a type-checking assignment for a Lisp-like language. I wrote a program I was sure I would use, something that wasn’t necessarily elegant or unique, but served a purpose. The result was clines, a little command-line program that provides data on C/C++ source files. Give it a source file and it reports how many lines are strictly comment lines, code lines or blank (whitespace) lines.
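The classification described above can be sketched in a few dozen lines. This is not the clines source, just one plausible way to do it; the names `line_counts` and `count_lines` are hypothetical. It assumes a line with any code on it counts as a code line even if it also has a comment, that whitespace-only lines inside a block comment count as comment lines, and it does not try to recognize comment markers inside string literals.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

struct line_counts {
	size_t code;
	size_t comment;
	size_t blank;
};

/* Classify every line of a NUL-terminated C/C++ source buffer. */
struct line_counts count_lines(const char *src)
{
	struct line_counts counts = {0, 0, 0};
	bool in_block = false;	/* inside a block comment across lines */
	const char *p = src;

	assert(src != NULL);

	while (*p != '\0') {
		bool has_code = false;
		bool has_comment = in_block;

		while (*p != '\0' && *p != '\n') {
			if (in_block) {
				if (p[0] == '*' && p[1] == '/') {
					in_block = false;
					p += 2;
				} else {
					p++;
				}
			} else if (p[0] == '/' && p[1] == '/') {
				/* Line comment: the rest of the line is comment. */
				has_comment = true;
				while (*p != '\0' && *p != '\n')
					p++;
			} else if (p[0] == '/' && p[1] == '*') {
				in_block = true;
				has_comment = true;
				p += 2;
			} else {
				if (!isspace((unsigned char)*p))
					has_code = true;
				p++;
			}
		}

		if (has_code)
			counts.code++;
		else if (has_comment)
			counts.comment++;
		else
			counts.blank++;

		if (*p == '\n')
			p++;	/* step over the newline to the next line */
	}
	return counts;
}
```

A real tool would read the file named on the command line into a buffer and print the three totals; that part is left out here to keep the sketch short.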

My roommate recently finished a term project in C++ and wondered to himself how many lines of code he had actually written (I think we were discussing comment verbosity). He used my program and suggested another feature that would be nice. I added it. It’s cool to make a simple, working program that’s useful, but it’s rewarding in a unique way when someone else finds your code useful. At the very least, it’s an interesting experience — even if it’s just a line counting program.

Kernel Coding Style

I came across the kernel coding style page while browsing open source security utilities. I’ve bounced between styles quite a bit this year, even from assignment to assignment, and it doesn’t help that I’ve used C, Java, Haskell, Racket, Prolog and PHP just this term. Fiddling with my vimrc file makes things worse, as I constantly find myself frustrated with alignment in other text editors and on GitHub. But I’d like to use a consistent coding style, even if it’s not the kernel’s.

Brace placement varies more than anything amongst people I’ve coded with. Lately I’ve been putting every single brace on its own line, thinking this was the true way of C. Linus has shown me otherwise:

“as shown to us by the prophets Kernighan and Ritchie, […] put the opening brace last on the line, and put the closing brace first”

K&R are indeed prophets, so I’m sold on this. I’m not fond of using 8-space tabs, though. But there’s some interesting rationale behind that choice:

“if you need more than 3 levels of indentation, you’re screwed anyway, and should fix your program.”

I’m tempted to use 8 spaces just to remind myself of that. Favor simplicity, the core of UNIX programming. I guess WordPress can’t handle gists. Actually, there’s not much it can do. I guess I’d need my own blog for any real support. Anyway, take a look at the gist to see what I’m talking about.
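For reference, here is a small made-up example of the brace placement from the quote above. It is not the gist itself, and `sign_counts` and `count_signs` are hypothetical names. Control statements put the opening brace last on the line and the closing brace first; function definitions are the one exception in the kernel style, with the opening brace on its own line. The braces around the single-statement branches are kept only to show where they go (the kernel style would normally drop them), and the nesting stops at three levels.

```c
#include <assert.h>
#include <stddef.h>

struct sign_counts {
	size_t positive;
	size_t negative;
};

/* Functions are the exception: the opening brace starts the next line. */
struct sign_counts count_signs(const int *values, size_t len)
{
	struct sign_counts counts = {0, 0};
	size_t i;

	assert(values != NULL || len == 0);

	for (i = 0; i < len; i++) {
		if (values[i] > 0) {
			counts.positive++;
		} else if (values[i] < 0) {
			counts.negative++;
		}
	}
	return counts;
}
```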