The Mendicant Bug


Mulled Beer

Posted: 17 September 2011 in Uncategorized
Tags: beer

I just tried my first mulled beer: BFM La Dragonne. The weather is not quite appropriate yet, but it is just beginning to get cool here in Atlanta. I heated it to about 55 degrees C and enjoyed it in a wine glass, though perhaps a tumbler would have been better. Also, while the beer is labeled at 7%, someone scratched that out and wrote 4% by hand. Not sure what that’s about. I’m going to save a bottle for Thanksgiving and enjoy it again with family, when the weather is more appropriate. Christmas might be a better choice, but I don’t have that kind of patience.

Mulled Beer: BFM La Dragonne

Vultures love startups

Posted: 16 September 2011 in Uncategorized
Tags: startups, vultures

Great unexpectations

Posted: 13 September 2011 in Uncategorized
Tags: nokia, technology, windows 8

This week I came across two very appealing technologies from completely unexpected sources. Granted, these aren’t new, but I think they pose interesting challenges to the reigning awesome sauce (Mac OS X, iOS, Android).

Actual innovation in the operating system space from Microsoft? Windows 8 looks like it could be a real challenger.

And the Nokia N9. Just when I had written Nokia off for good, they produce a phone that looks like it’s actually worth looking twice at.

Simple Random Number Generator Gem

Posted: 24 July 2010 in Uncategorized
Tags: code, math, random number generation, ruby, rubygems, statistics

I just published the simple-random ruby gem, which is ported from C# code by John D. Cook. You can view the source on github or install the gem via rubygems:

gem install simple-random

The gem allows you to sample from the following distributions:

Beta
Cauchy
Chi Square
Exponential
Gamma
Inverse Gamma
Laplace (double exponential)
Normal
Student t
Uniform
Weibull

Simple examples:

require 'rubygems'
require 'simple-random'

r = SimpleRandom.new
r.uniform # => 0.127064087195322
r.normal(5, 1) # => 5.71972152940515
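
For anyone curious how a normal sample can be built from uniform draws, here is a minimal sketch using the Box-Muller transform on top of Ruby's standard Random class. It is an illustration only, not the gem's source, and the method name sample_normal is made up for this example.

```ruby
# Sketch only: Box-Muller transform built on Ruby's standard Random class.
# This is not the simple-random gem's code; sample_normal is a hypothetical name.
def sample_normal(mean, std_dev, rng = Random.new)
  u1 = 1.0 - rng.rand # shift [0, 1) to (0, 1] so Math.log never receives 0
  u2 = rng.rand
  # Turn two independent uniforms into one standard normal draw.
  z = Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math::PI * u2)
  mean + std_dev * z
end

rng = Random.new(42) # fixed seed so repeated runs give the same draws
samples = Array.new(10_000) { sample_normal(5, 1, rng) }
sample_mean = samples.sum / samples.size
puts sample_mean.round(2) # should land close to the requested mean of 5
```

Passing the generator in as an argument keeps the sketch reproducible under a seed, which is handy when checking a sampler against a known mean.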

War on Attention Poverty

Posted: 14 July 2010 in Uncategorized
Tags: attention scarcity, daniel tunkelang, tunkrank

Daniel Tunkelang has posted his slides from his talk at AT&T Labs on TunkRank over at the Noisy Channel. Embedded below for your viewing pleasure:

TunkRank, Meet Tickery

Posted: 5 May 2010 in Uncategorized
Tags: fluiddb, tickery, tunkrank, twitter

Tickery is a rather awesome application of FluidDB that lets you explore Twitter in a number of ways. I mentioned previously in a post on recent TunkRank improvements that TunkRank scores would soon be integrated with Tickery, and thanks to Terry Jones and his crew, the time is now!

Full disclosure: I’m a fan of FluidDB. I think it’s an awesomely useful technology and concept and I’m happy that TunkRank scores can be a part of it. One cool thing is that FluidDB’s permission system is designed so that even though Tickery is using TunkRank’s data, TunkRank still owns it. Access can be revoked at any time if there were ever a reason to do so (not that I can imagine that ever being the case). Also, the data in FluidDB for Tickery and TunkRank are completely independent. Anyone else can come along and add a new set of data for mash-ups that would then use all three, without TunkRank or Tickery having to do a thing.

Playing around with Tickery

Now when you use the advanced search on Tickery, you can filter your results by TunkRank score, letting you do some interesting combinations on the data. For example, if I want to see the people I’m following who have TunkRank scores greater than 50:

has twitter.com/friends/ealdent and tunkrank.com/score > 50

There’s lots to play around with there, especially when you start comparing the friends of various users. For example, if you wanted to know who Daniel Tunkelang (@dtunkelang) and I both follow who have TunkRank scores less than 20:

has twitter.com/friends/ealdent and has twitter.com/friends/dtunkelang and tunkrank.com/score < 20

Those people have something clearly in common, and it tells you something about the interests that Daniel and I share. I hope you check it out and let me know what you think.

Wordnik Gem

Posted: 12 March 2010 in Uncategorized
Tags: api, dictionaries, erin mckean, ruby, rubygems, wordnik

Erin McKean

I’ve had my eye on Wordnik for a while, since finding out the excellent lexicographer Erin McKean co-founded it. Wordnik is the most comprehensive dictionary in the known universe. Srsly!

They released an API a few months ago and I quickly threw together a gem wrapping it, based on HTTParty. Tonight I updated the gem for version 3 of the API and simplified it to just a single class with the bare essentials. You can perform pretty much all of the API calls and get a hash of the results. It’s nothing major, but will give you a chance to play around with the Wordnik API with almost no work on your part (aside from getting yourself a key). This change breaks backwards compatibility completely, sorry.

Example usage:

w = Wordnik.new("YOUR_API_KEY")
w.define('gem') # => big hash with all the definitions
w.examples('gem') # => example sentences using "gem"

You can grab the gem off of RubyGems or you can take a look at the source on github. As always, please let me know if you encounter any problems.

TunkRank Improvements

Posted: 17 February 2010 in Uncategorized
Tags: influence ranking, merb, mysql, postgresql, rails, redis, resque, ruby, tunkrank, twitter

Over the past few weeks, I’ve been working on a number of improvements to TunkRank that I will be rolling out tonight. First, I’ve secured a server to host it on, rather than my old Dell laptop, so reliability should improve and TunkRank is no longer a slave to dynamic DNS problems. Also, my cable company is less likely to hunt me down. TunkRank has gotten some increased attention over the past few weeks, including from Chris Dixon, CEO of the wonderful website hunch:

Twitter could fix the whole follower obsession by highlighting a more meaningful metric like TunkRank.

Awesome! So with this new version, there are a few changes that will immediately impact you, the end-user. I’ll go into the ones that affect you the most first, followed by some technical points of interest for those who care. Then I’ll conclude with a couple of hints at the future.

Changes to TunkRank

First and foremost, I have changed the main score that is reported. Previously I was using a percentile in the range 1-100. This drew a lot of objections and created confusion, partly because I consider the 100th percentile to be the “top tier” of users, while standardized testing often reports the 99th percentile to mean you performed better than 99% of the population. Also, most people who actually care about their scores enough to use TunkRank are in the 95-100 percentile range, making more fine-grained comparisons difficult. Neal Richter even posted on his blog some suggestions for improving it (quite a while ago, now).

I took a page out of Neal’s book with the log scores, but I also put it in a range where the most influential Twitter user (let’s call her MAX) will always have a score of 100. Your TunkRank Score™ is the ratio of the log of your raw score to the log of MAX’s raw score, scaled so that MAX comes out at 100. Formulas aside, this means your TunkRank score is directly comparable to other users’ scores and is always put in perspective by the maximum influence exerted by any user in the Twitterverse. Incidentally, a difference of seven TunkRank score points between two users means the one with the higher score is about twice as influential.
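
To make the arithmetic concrete, here is a small sketch of the scaled log score. The raw scores are made-up numbers, and the maximum raw score of 20,000 is purely an assumption, picked because it makes a doubling of raw influence work out to roughly seven points under this formula.

```ruby
# Sketch of the scaled log score described above. The raw scores used here
# are invented for illustration; they are not real TunkRank data.
def tunkrank_score(raw, max_raw)
  # Assumes raw > 1 and max_raw > 1 so that both logarithms are positive.
  100.0 * (Math.log(raw) / Math.log(max_raw))
end

max_raw = 20_000.0 # hypothetical raw score of MAX, the top user

a = tunkrank_score(500.0, max_raw)
b = tunkrank_score(1_000.0, max_raw) # twice the raw influence of a
gap = b - a

puts tunkrank_score(max_raw, max_raw) # MAX sits at the top of the scale
puts gap.round(2)                     # point gap produced by a 2x raw difference
```

Because the score is a ratio of logs, a fixed point gap always corresponds to a fixed ratio of raw scores, wherever the two users sit on the scale.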

Accessing the API has also changed slightly, and I apologize to anyone actually using it at the moment. Basically, I am changing the API calls to conform more closely to the URLs used on the web side, and I’m returning more information with each call. TunkRank also now supports XML responses in addition to JSON. You can find all of the documentation here.

Some Technical Notes

As part of the move, I’ve decided to transition from using Merb to Rails. My original decision to use Merb was partially as a learning exercise, but also because Merb appealed to me with its being lightweight. However, I often ran into roadblocks because some useful plugin wasn’t supported (or I couldn’t figure out how to make it work in the limited time I had). Sometimes the documentation for Merb was very good and sometimes it was absent altogether. Rails, on the other hand, has a substantial amount of documentation and people are always blogging about the best way to do things — which makes life as a developer much easier. Rails is my day job, so I knew I could transition quickly and easily.

I also migrated from MySQL to PostgreSQL. The main reason is that I love PostgreSQL — plain and simple. They both have their advantages, but MySQL gives me a sense of uneasiness I don’t have with PostgreSQL. I’ve managed to achieve some nice speed improvements as part of the redesign, though that is not to say that the same speed improvements wouldn’t have been possible with MySQL.

I’ve also adopted Resque as my background job-processing library. It is backed by Redis, an advanced key-value store that you can think of as a “data structures server.” The important thing for me is that Resque is fast, has a kick-ass web interface, and integrating with Rails is brain-dead easy.
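
For a sense of how little code a Resque job needs: a job is a class that names its queue in a class-level @queue variable and exposes a self.perform method. The sketch below uses a hypothetical job name and payload (this is not TunkRank's actual code) and leaves the Resque call in a comment so it runs without the gem or a Redis server.

```ruby
# Hypothetical job class following Resque's convention. The class name,
# queue name, and return value are illustrative, not TunkRank's real code.
class ScoreRefreshJob
  @queue = :scores # the queue that workers would pull this job from

  # Resque workers call this class method with the enqueued arguments.
  def self.perform(screen_name)
    # Real work (fetching followers, recomputing the score) would go here.
    "refreshed #{screen_name}"
  end
end

# With the resque gem loaded and Redis running, the job would be queued as:
#   Resque.enqueue(ScoreRefreshJob, 'ealdent')
# Here perform is called directly just to show what a worker would run.
puts ScoreRefreshJob.perform('ealdent')
```

Arguments passed to enqueue are serialized to JSON, so simple values such as strings and integers are the safest things to hand to a job.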

The Road Ahead

I wrote before about the road ahead for TunkRank, and I have mostly held to it. I have many more ideas I want to expand on, including topic-sensitive influence rankings. I like the ideas in the recent WSDM paper (pdf) by Weng et al, but I have a few new ideas I’m eager to try out. TunkRank scores may also be integrated into Tickery in the near future, thanks to some discussions with Terry Jones of FluidDB. I’m excited!

Semantifi and the Deep Web

Posted: 6 February 2010 in Uncategorized
Tags: computational linguistics, natural language processing, search engines, search interfaces, semantic search engine, semantic web, wolfram alpha

At the Atlanta Semantic Web Meetup tonight, Vishy Dasari gave us a quick description and demo of a new search engine called Semantifi. They purportedly are a search engine for the deep web, meaning the web that is not indexed by traditional search engines because the content is dynamic. They are just in the very early stages, but have opened the site for people to play with and add data to via “Apps.” These apps are sort of like agents that respond to queries, returning results to some marshal process that decides which App will get the right to answer. Results are ranked by some method I wasn’t able to ascertain, but it reminded me of how Amy Iris works. These apps form the backbone of the Semantifi system, it seems, and they are crowdsourcing their creation. You can create a very simple app to return answers on your own data set in a few short minutes.

Perhaps more interesting is that they use a natural language interface in addition to the standard query sort of interface we’re all used to. Given the small amount of data currently available, I couldn’t really determine just how well this interface performs. It is based on a cognitive theory by John Hawks (sp?) that apparently states we think in terms of patterns. That’s very general and I haven’t been able to chase down that reference — and I forgot to ask Vishy for more info at the meetup. If someone can clear that up for me, I’d be grateful. The only seemingly relevant John Hawks I could find is a paleoanthropologist, so not sure. Anyhow, these patterns are what Vishy says the system uses to interpret natural language input. That may be a grandiose way of saying n-gram matching.

While Wolfram|Alpha is a computational knowledge engine™, Semantifi does not make that claim. Apps may compute certain things like mortgage values, but it’s not a general-purpose calculator. However, Semantifi is looking at bringing in unstructured data from blogs and the like, which W|A ignores. It remains to be seen what that will look like, though. Also, users can contribute to Semantifi, while W|A is a black box. In any case, they are making interesting claims and I look forward to seeing how they play out with more data.

Note: All of my observations are based on notes and memories of tonight’s presentation, so if I made any mistakes please post corrections in the comments or email me.

Unintentional HCIR commercial

Posted: 7 November 2009 in Uncategorized
Tags: advertising, commercials, faceted search, hcir

This commercial just caught my eye and made me think about faceted search.
