Posts Tagged ‘programming’

Java maps and sorting

Posted: 1 August 2009 in Uncategorized

I’m always a little annoyed that I have to implement sorting Map keys by their values myself in Java.  It seems like that should be part of the standard Collections library or something.  Maybe it is and I just haven’t seen it?  My solution (gist) is based on feedback from Josh in the comments to a previous post.  How does that look to you?


Fun with trees in Ruby

Posted: 20 November 2008 in Uncategorized

Like Java and unlike Python, Ruby does not support multiple inheritance.  There is also no explicit way to create an interface.  One way Ruby lets you get around both problems is by allowing you to include a module in a class.  It’s not quite the same, but with the proper planning you can duplicate the functionality.  Of course, one question you should always ask yourself when trying to shoehorn something from one language into another is whether you’re really going about it the right way.

One way of implementing a Java-like interface in Ruby is by creating a module containing the skeleton functions you want the implementing class to implement.

module A
  def method1() raise "not supported"; end
end

class B
  include A
  def method1
    puts "now implemented"
  end
end

Presto, module A is basically a Java interface.  Of course, whether a method has been implemented is not checked until runtime when the method is actually called.  Also if you mix in implemented methods alongside the interface methods, you have something very like an abstract class (minus the compile-time checking).
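To make the abstract-class idea concrete, here is a minimal sketch that mixes one implemented method in with one "interface" method.  All the names (Shape, Square, Blob, area, describe) are made up for illustration, and I'm assuming NotImplementedError is a reasonable error to raise here; a plain string raise like the one above would work just as well.

```ruby
# Hypothetical example: Shape acts like an abstract class.
module Shape
  # "Abstract" method: including classes are expected to override it.
  def area
    raise NotImplementedError, "#{self.class} must implement area"
  end

  # Concrete method shared by every class that includes Shape.
  def describe
    "#{self.class.name} with area #{area}"
  end
end

class Square
  include Shape

  def initialize(side)
    @side = side
  end

  # Overrides the placeholder from Shape.
  def area
    @side * @side
  end
end

class Blob
  include Shape # forgets to implement area
end

puts Square.new(3).describe # prints "Square with area 9"

begin
  Blob.new.describe
rescue NotImplementedError => e
  # Nothing complains until the method is actually called.
  puts "Runtime failure: #{e.message}"
end
```

Note that Blob loads without any complaint; the missing method only surfaces when describe ends up calling area.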

This came up when I was implementing a bunch of simple tree functions like finding the siblings of a node, finding the grandparent, the descendants, the leaves of a subtree, etc.  It seemed like these were things that should be implemented already.  And why not?  So I threw all of those methods into a module and made it like a Java abstract class.  All you have to implement is a method to return the parent of the current node (or nil if there is none) and a method to get an Array of the children of the current node.  Your class can pull children from a database, a file, or something more complex; it doesn’t matter.  Just implement those two methods, drop in the SimpleTree module, and problem solved.
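Here is a rough sketch of how such a module can be built on just those two methods.  This is not the gem's actual code or API: the module name TreeHelpers, the Node class, and the exact method names are assumptions I'm using for illustration.

```ruby
# Hypothetical sketch; not the simple-tree gem's actual API.
module TreeHelpers
  # The two methods an including class must provide.
  def parent
    raise NotImplementedError, "#{self.class} must implement parent"
  end

  def children
    raise NotImplementedError, "#{self.class} must implement children"
  end

  # Everything below is derived from parent and children alone.
  def grandparent
    parent && parent.parent
  end

  def siblings
    return [] if parent.nil?
    parent.children.reject { |node| node.equal?(self) }
  end

  # All nodes below this one, depth first.
  def descendants
    children.flat_map { |child| [child] + child.descendants }
  end

  # Leaves of the subtree rooted at this node.
  def leaves
    return [self] if children.empty?
    children.flat_map(&:leaves)
  end
end

# A simple in-memory node; a real class could hit a database instead.
class Node
  include TreeHelpers
  attr_reader :name, :parent, :children

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
    @children = []
    parent.children << self if parent
  end
end

root = Node.new("root")
a = Node.new("a", root)
b = Node.new("b", root)
c = Node.new("c", a)

puts c.grandparent.name                 # prints "root"
puts a.siblings.map(&:name).inspect     # prints ["b"]
puts root.descendants.map(&:name).inspect # prints ["a", "c", "b"]
puts root.leaves.map(&:name).inspect    # prints ["c", "b"]
```

The attr_reader in Node supplies parent and children, which take precedence over the placeholders in the module, so the derived methods work without Node knowing anything about them.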

Since I’ve been having fun with gems, I made one for this and slapped it up on GitHub.  To get it, just type:

sudo gem install ealdent-simple-tree

Assuming that you have already done this at some point in the past:

gem sources -a http://gems.github.com

Feel free to extend it, modify it, contribute to it, etc. I’m using the BSD license, which is my current favorite.

Since Ruby is my new favorite toy, I thought it would be fun to try my hand at C extensions.  I came across David Blei’s C code for Latent Dirichlet Allocation and it looked simple enough to convert into a Ruby module.  Ruby makes it very easy to wrap some C functions (which is good to know if you need a really fast implementation of something that gets called a lot).  Wrapping a C library is slightly harder, but not horribly so.  Most of the challenge was probably the fact that it’s been so long since I wrote anything in C.

Since the code is open source, I decided to release the Ruby wrapper as a gem on GitHub.  I chose GitHub over RubyForge, because it uses Git and freakin’ rocks.  But GitHub is a story for another day.  Feel free to contribute and extend the project if you’re so inclined.

A basic usage example:

require 'lda'
# create an Lda object for training
lda = Lda::Lda.new
corpus = Lda::Corpus.new("data/data_file.dat")
lda.corpus = corpus
# run EM algorithm using random starting points
lda.em("random")
lda.load_vocabulary("data/vocab.txt")
# print the top 20 words per topic
lda.print_topics(20)

You can also download the gem from GitHub directly:

gem sources -a http://gems.github.com
sudo gem install ealdent-lda-ruby

You only need the first line if you haven’t added GitHub to your sources before.

I attended a Matlab training seminar yesterday with the dual topics of “Advanced Matlab Programming” and “Distributed and Parallel Computing.” Of the two, the Advanced section was more interesting, though my original motivation for going was the parallel computing part. In the morning, I felt like it was going to be a waste because my Matlab programming skills are weak, and if my advisor had not strongly suggested I attend, I might’ve skipped it. I’m glad he did, because it was surprisingly enjoyable and I felt like it was right on my level. This might be because programming in Matlab isn’t especially hard or different from other programming languages and I know enough to get by already. Or it might be because Matlab is becoming a little more like Python.

Just what value is there in getting a degree in Computer Science (CS)? Are new graduates competent programmers? Is that the purpose of a CS degree? Should companies be spending money to train new hires out of college in the programming languages and practices that they use?

Robert Dewar is a professor emeritus of computer science at NYU, and he believes that the status of software engineers in America is in danger due to the general incompetence of new graduates. The long and the short of it is that after the dot-com bubble burst and computer science enrollment at universities plummeted, schools restructured their programs to be more fun. Essentially, they were dumbed down. Specifically, the focus has shifted away from math and the theory of computation. Students are not taught a wide range of programming practices, but instead are trained to rely on large software libraries in a sort of “cookbook” approach. That is, students can assemble a solution to a known problem (in Java), but they are woefully undertrained for solving actual problems in the wild with “more practical” programming skills.


Today is Donald Knuth’s 70th birthday. If you haven’t at least heard of him, then you probably are not a programmer. I’ve heard several bloggers refer to him as a modern-day Alan Turing (who is widely considered the father of computer science). Knuth is sometimes referred to as the father of algorithmic analysis, so at the very least, his contributions to the field should definitely earn him a place of high regard.

While I’ve never read any of his books, I have used one of the tools he created quite extensively in the past two years: TeX. For those who’ve never had the pleasure of using TeX and seeing documents come out beautifully and professionally formatted with relatively little effort, you’re missing out. Some might argue that you’re missing out on hours of headaches for something you could do in Microsoft Word in 1 minute. I would argue back that while getting TeX to do exactly what you want can sometimes be hard, there are things you can do in TeX very easily that you will never, ever be able to do in Word. Try producing a lower case delta with a hat in Word. Unless you are lucky enough to have a font on your computer with it (and please send me a copy of that font if you do), you will be searching a long time.

There are many Knuth tributes out there from people with far more interesting stories than mine. There was even a call to post, issued by Jeff Shallit. Here are a few:

  • Recursivity – biographical notes and discussion of Knuth’s impact on his life (Jeffrey Shallit)
  • Computational Complexity – some observations about his achievements, his books, and TeX
  • Good Math, Bad Math – a lot about TeX if you’re interested
  • Geomblog – a discussion of something from the second volume of his book The Art of Computer Programming
  • Shtetl-Optimized – more in-depth observations of Knuth’s many contributions
  • in theory – more biographical info and background
  • 0xDE – a pretty remarkable Knuth tribute with some very interesting CS stuff, complete with exercises!

So today presents a great opportunity to learn more about a guy to whom all programmers owe a debt of gratitude.

Recommended Reading

Posted: 23 December 2007 in Uncategorized

I think this should be required reading for any novice programmer and probably even more so for established programmers. Agree with him or not, I think you’ll agree that Steve Yegge has some interesting things to say. My favorite quote:

“Bigger is just something you have to live with in Java. Growth is a fact of life. Java is like a variant of the game of Tetris in which none of the pieces can fill gaps created by the other pieces, so all you can do is pile them up endlessly.”

This is especially interesting to me as I just jumped on the IDE bandwagon.  I received a few interesting comments on that post that are worth reading.  A minor theme was the claim that you just can’t handle a massive code base without some kind of IDE (Integrated Development Environment).  I have worked with a code base of about 20,000 lines of Java with no IDE and there were certainly challenges.  I have also worked with a code base of over 100k lines of C (not ++) and that was a pain in the butt.  Massive changes took me days to complete and then weeks to debug.  Having an IDE would have made it easier, but it also would have made the code base much larger.  It is so easy to bloat up code with every kind of get/set method and constructor there is, but many of them are never used.  Is that a bad thing or just good future planning?  There is definitely a trade-off, and one that probably comes down on the side of bad thing more often than not.

In any case, it’s something I have to keep in mind as I go forward with my new project.