At the interface of science and computing


One of the more frustrating parts of the last few years has been the lack of time to learn new programming languages and to re-learn what I had forgotten during my years on the road, away from analytics and programming. I picked up some Ruby along the way, partly because I liked the elegance of the language, and partly because it is really good at things I still do from time to time: launching and managing instances, and automating infrastructure. I still suck at it, but I can launch an EC2 instance or two and use Ruby-based static website generators. That works for me for the most part, until I get frustrated at not being able to do things I could do in my sleep 6-7 years ago.

A language I have resisted over the years is Python. I didn’t love the syntax, hated the whitespace, and given that I had no time to properly learn the language I was more interested in, there was no room for Python. But there was always one reason I kept an eye on Python: scientific computing and analytics. While Ruby seemed to rule the roost for the devops crowd, Python has always been a darling of the science types. I watched SciPy and NumPy with more than a tinge of jealousy, and I’ve long been an admirer of IPython. Then a colleague told me about Pandas.

Pandas is like R, but it is native Python, so it lacks all the ugliness of R. It’s not as powerful as R today, but it was the final straw. I am going to teach myself Python, even if it means I never really become the Ruby guru I’ve always wanted to be. In my day-to-day life, there is a lot of opportunity for number crunching, data structures and analysis, and the more numerically oriented Python tools provide a powerful toolkit. I’ll still use Ruby for all the infrastructure management I do, and hopefully some day I’ll find time to get really good with both languages. Given recent developments, I’m not sure when that might be (maybe in another 17-18 years).