Dan's Movements: April 2013

First off, I've taken a job, starting immediately after my finals, at the beginning of May. Although I am thankful for the education I've received at Roosevelt (and especially that at UIC), it's time to move on. Partly this is due to not having sufficient VA funding left for another full year, and largely it's due to needing an income. Student loans seem very manageable while they're deferred, but who can say how fun they'll be when it comes time to start paying them down. I'm eager to eliminate them in as short a time as possible. My hesitation fifteen years ago to incur debt (when university education was perhaps a third of its current price) is being rewarded today by a higher sticker price, and perhaps a less rosy outlook.

Secondly, since the job I've taken involves Linux hosting, I have been trying to brush up on the Gentoo universe. I've used Gentoo in the past. It's a wonderful system, with package management rather like FreeBSD's ports system, or the wonderful Homebrew suite for OS X. The installation is a little bit tedious, since you need to do many of the parts by hand, but it's less troublesome than, say, Linux From Scratch, since the guide is directed at getting you finished quickly, rather than exploring the intricacies of building binutils and libc in several stages. Instead, Gentoo provides a base image (the stage3 tarball) and allows you to build up the minimum bootable system from that, without rebuilding the entire toolchain. It's certainly not a point-and-click install, and you need to know how GRUB and fdisk work, but the tradeoff is that nothing you don't need is installed, which is excellent for servers. If you know the ten packages you want, and which features you'll use, the small number of dependencies is automatically fetched and built to suit your needs. Very clean, but not a good choice if this is your first (or second) Linux system.

The interview went very well: either I know what I'm talking about, or they aren't particularly selective in entry-level staffing. Optimistically, I'll assume some middle ground between the two: that I'm sufficiently qualified for what they need, and apparently educable enough to move forward.

Code-wise, I've been returning to Python after a year or two. This is stimulated partly by the cool things people continue to do in Python, and partly by its ubiquity in the world of automated system administration. There are still so many things which bother me about it on a linguistic level, but I think it's a good place to play. I'm trying to reinforce some of the Smalltalk experience I had, where objects and their classes are nice friends to have, and high level is an understatement. It's a very different world from the code I did this spring for my algorithms class, which was a SAT solver in C++, though I basically wrote a C program with iostreams for input/output. That was fun: I used a fairly clever shallow copy and copy-on-write idea to speed up the program about five times, and to reduce memory use to a constant level determined by the problem size. My initial, naive approach crashed on large problems after allocating blocks like a fool.

I checked out Code-Dojo, which allows for rapid test iteration. After last summer's experience with Pharo Smalltalk, I have to say I am a confirmed unit test enthusiast. For Common Lisp I prefer fiveam, which has a pretty gentle learning curve. For Smalltalk, the support for SUnit is first class, and really part of any package: the packaging tools in Pharo automatically bundle tests with the code, and the Nautilus Browser has a handy jump-to-test feature that will, if needed, create a test case class, populate it with a testTheMethodIWasJustLookingAt method, and change focus to the new test. Tests can be run from the browser or the TestRunner tool, and it's a great experience. In Python, I've been leaning toward unittest, after a brief run-in with py.test. Py.test was easier to get started with (just write a function beginning with the word test, and it's automatically discovered), but unittest seems more likely to be separable from the source code. My current setup is a vertically split pane in Emacs, with code on the left and tests on the right in separate files, and sometimes I'll leave the Python process in a horizontal pane across the bottom (though this often annoys me too much when toggling buffers, so usually it's buried, and I just bring it into focus when running tests).
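For a sense of what the unittest style looks like, here is a minimal sketch. The is_anagram function and the AnagramTest class are hypothetical names made up for illustration, not code from my projects, and in practice the function and its tests would live in separate files.

```python
import unittest


def is_anagram(first, second):
    """Hypothetical function under test: True if the words share the same letters."""
    return sorted(first.lower()) == sorted(second.lower())


class AnagramTest(unittest.TestCase):
    """unittest collects every method whose name starts with 'test'."""

    def test_reordered_letters_match(self):
        self.assertTrue(is_anagram("listen", "silent"))

    def test_different_letters_do_not_match(self):
        self.assertFalse(is_anagram("listen", "listed"))

    def test_case_is_ignored(self):
        self.assertTrue(is_anagram("Dusty", "study"))


def run_tests():
    """Load the test case and run it, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AnagramTest)
    return unittest.TextTestRunner(verbosity=2).run(suite)
```

Calling run_tests() from the Python process prints one line per test. With the tests in their own file, running python -m unittest from the shell would discover and run them as well.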

So, in the last day, I rigged up a (probably inefficient) Sudoku problem solver, which is available on github, and built it bottom up, either test first or test immediately. I probably could have taken a more efficient approach (I can't help thinking I check the whole puzzle way too many times), but it solves an empty puzzle in around 10s, and a puzzle from the paper rather quickly. Plus, it was challenging enough to force me to rediscover some of the warts in Python. For example, [0]*10 makes a list with 10 copies of 0 in it, but [[0]*10]*10 makes a list with ten references to the same list of zeros. Had a fun time failing tests with that bit of unexpected behavior. Also, I worked through some of the simple projects on code-dojo, the one hundred doors and anagram exercises. That was useful, and I hope to repeat it. I once saw two people from Eighth Light doing live coding for the change counter exercise, one in Ruby, the other in Python, side by side, just making failing tests, and extending the code. I came away from that first look at the method in action wondering how I could use it in my own programs, and have tried to approach projects with TDD in mind ever since.
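The list multiplication wart is easy to reproduce. This is a small sketch of the behavior, not code from the solver, and the variable names are only for illustration. The usual workaround is a list comprehension, which builds a fresh inner list on every pass.

```python
# Multiplying a list of immutable values is harmless.
flat = [0] * 10

# Multiplying a list that contains a list copies the reference ten times,
# so all ten "rows" are the same object.
aliased = [[0] * 10] * 10
aliased[0][0] = 5  # the change shows up in every row

# A list comprehension evaluates [0] * 10 once per row,
# so each row is an independent list.
independent = [[0] * 10 for _ in range(10)]
independent[0][0] = 5  # only the first row changes

print(aliased[9][0])      # the last row saw the write to the first row
print(independent[9][0])  # the last row is untouched
```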

Besides catching mistakes early, I enjoy the freedom it gives you to feel comfortable making substantial changes in the implementation while ensuring that the exposed interface is maintained. Verbose tests are much more friendly to me than a debugger, especially when the runtime and compiler aren't present in the debugger. GDB is a powerful tool, but looking at source and touching it are separate affairs. In C, I find myself writing functions which are only useful inside the debugger, say, for formatting an object for inspection given a pointer to it (maybe there's a great way to automate this in gdb, but I haven't discovered it yet).

I've looked at trying this in C++, but the Boost test system was overwhelming, and not exactly trivial to implement. In fact, until some well-supported community best practice emerges in C++, I don't know that I'll use test suites there. The code overhead of implementing another class, populating it with tests, directing these to run in some manner, and instantiating it (does this go in a separate debug executable?) seems a little heavy in a language environment where you are already drowning in details of the system, and where the compiler backtraces are sometimes helpful, sometimes downright silly. Coming from Smalltalk and Lisp, the experience of highly syntaxy languages is a little jarring. I'm sure it's handy to be able to instantiate a dictionary literal, or access a slice of an array or list, but all these shortcuts glue together a little unpredictably (like the above list multiplication operator using shallow copies, rather than deep ones). Also, the major gain of Lisp and Smalltalk for me has been the environments. I can ask Pharo or Squeak where some bit of code is, and it will bring it right up for me. In Emacs, with slime and sbcl (with source locations turned on), I can find every definition of any symbol with M-., debug macroexpansions with C-c RET, and inspect and change objects in the debugger. It's not as pretty to look at as Smalltalk, but the idea that the objects are alive, responsive, and ready to build your program is definitely invigorating. With Python, I think I'll need to see what pdb or some other debugger offers me. It's definitely not the case that I can ask for the implementation details of list or dictionary from the system, and the python.org docs seem a little cursory when discussing the implementation details (linking module docs to source trees in git would be wonderful, and I'm not the first to ask for that). Although Python is open source, and much of its standard library is written in Python itself, finding the source of some method is not automatic.
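There is a partial remedy in the standard library: the inspect module can return the source of anything that was defined in a Python file. The sketch below wraps it in a hypothetical helper called find_source (my name, not part of any library). It assumes the standard library's .py files are installed alongside the interpreter, and it shows the limit I'm complaining about: built-ins such as list are implemented in C in CPython, so there is no Python source to return.

```python
import inspect
import json


def find_source(obj):
    """Hypothetical helper: return the Python source for obj, or None.

    inspect.getsource raises TypeError for built-in objects and
    OSError when the source file cannot be located.
    """
    try:
        return inspect.getsource(obj)
    except (OSError, TypeError):
        return None


# json.dumps is written in Python, so its source is usually available.
source = find_source(json.dumps)
if source is not None:
    print(source.splitlines()[0])

# list is a built-in type, so there is nothing to show.
print(find_source(list))
```

It's a long way from M-. in slime, but it does work from inside a running Python process.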

One last gripe, and it's a small glitch for me: when I try to execute a Python script from a buffer in Emacs, it defaults to python -i. It would be wonderful if I learned how to override this to read the shebang line and act appropriately (i.e. call python or python3 as specified at the top). That probably just involves a little digging in the configuration (is that rope or pymacs handling this behavior?). The Emacs integration is a little less tight than slime, which really tended to make Emacs a superpowered IDE for Lisp, rather than an editor with some shell-out capacities.


Wednesday, April 24, 2013
