Scatterplots, jQuery, and Class Size

A couple of days ago, Nathan at FlowingData posted a parallel coordinates plot showing a relationship between student-to-teacher ratio and mean SAT scores. In the comments, a lot of people said that they would rather have seen scatterplots, and pointed out that the percent of students taking the test could also have been a factor (since the students who take the test may be selected from the higher performers). As with the weather, a lot of people complained about it, but no one did anything about it.

Since the data was readily available from the National Center for Education Statistics, I thought I’d download it and just make some quick scatterplots. Of course, this also seemed like a good opportunity to play around with flot, a really neat JavaScript library for generating plots and graphs.

The end result is here. I think it’s quite decent.

Before you say anything, I agree: it would be better if it had reference lines for the US average on the X and Y axes, a color scale to show what the colors mean, and something other than HTML tables for layout. If you cry out for them, I will add them.

Update 11/17/2009: I added a color scale and reference lines for the US average, in order to make a slightly better showing in the FlowingData competition. I’m still showing my poor layout skills, though.


Yahoo Pipes

Today I discovered Yahoo Pipes, a way to mix and filter input from the web. This made me pretty excited: for a while, I’ve had a project in the middle of my TODO list to find a way to get an RSS feed of posts by only a specific author. In particular, I wanted an RSS feed that looked at Tor’s RSS feed but only kept posts by Liz Gorinsky. I assumed that this would involve writing a little script to fetch that feed, parse out the entries, and then display just the ones which had Liz as the author: a good hour or so of work to do the parsing, RSS comprehension, and debugging to get it all to work properly.

However, in under ten minutes at Yahoo Pipes, I’d made this, which not only picks out posts by Liz but also lets anyone create a new RSS feed that filters Tor’s feed for any author they input. It was all done with a very slick visual interface where I literally dragged and linked together the components without writing a single line of code.

This seems to be just scratching the surface of Yahoo Pipes’ capabilities, too: besides filtering, they’ve got widgets for location extraction, regular expressions, combining multiple feeds…I think this is going to be a pretty powerful tool.
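
For comparison, here is a minimal sketch of the kind of hand-written script described above: parse a feed, keep only the entries by one author, and emit the result as RSS again. It is not what Yahoo Pipes does internally. It assumes an RSS 2.0 feed that names the author in a `dc:creator` element (falling back to `author`); the function name and the sample feed are made up for illustration, and fetching the real feed over HTTP is left out.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace; many blog feeds put the author in <dc:creator>.
DC_NS = "http://purl.org/dc/elements/1.1/"
DC_CREATOR = "{%s}creator" % DC_NS

# Keep the familiar "dc" prefix when the feed is written back out.
ET.register_namespace("dc", DC_NS)


def filter_feed_by_author(rss_xml, author):
    """Return the RSS document with only the items written by `author`."""
    root = ET.fromstring(rss_xml)
    channel = root.find("channel")
    wanted = author.strip().lower()
    for item in channel.findall("item"):
        # Assumption: author is in <dc:creator>, otherwise in <author>.
        name = item.findtext(DC_CREATOR) or item.findtext("author") or ""
        if name.strip().lower() != wanted:
            channel.remove(item)
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-item feed standing in for the real one.
SAMPLE_FEED = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example feed</title>
    <item><title>First post</title><dc:creator>Liz Gorinsky</dc:creator></item>
    <item><title>Second post</title><dc:creator>Someone Else</dc:creator></item>
  </channel>
</rss>"""

if __name__ == "__main__":
    print(filter_feed_by_author(SAMPLE_FEED, "Liz Gorinsky"))
```

Taking the author as a parameter mirrors the pipe's "any author they input" behavior; the matching here is a case-insensitive exact comparison, which is a simplification.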




Perhaps in response to passing my dissertation defense (so that I can now look at numbers for fun rather than because it’s actually useful), I have finished modeling “How many a’s do people use in ‘Khaaaaaan!’?” Yep. After the idea was suggested by Kevin Martin, I just couldn’t help myself. If there was any doubt about my being a geek, I think those doubts can be pretty well laid to rest now.


Iranian Election Updates!

I’ve finally had some time to finish updating my 2009 Iranian Election Analysis, to the point where I’ve changed the page on to point here. I’m kind of disappointed to find no real statistical evidence of fraud, but I do like the fact that by looking intensely at the data, you can find explanations–in particular, finding weaknesses in the 1st, 2nd, and final digit frequency analyses.
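
As a rough illustration of what a digit frequency check involves, here is a minimal sketch. It is not the code behind the analysis itself, and it assumes the usual setup for such tests: first and second digits are compared against Benford's law, and final digits against a uniform distribution. The function names and the vote counts are made up for the example.

```python
import math
from collections import Counter


def benford_first(d):
    """Expected share of leading digit d (1-9) under Benford's law."""
    return math.log10(1 + 1 / d)


def benford_second(d):
    """Expected share of second digit d (0-9) under Benford's law."""
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))


def digit_shares(counts, position):
    """Observed share of each digit at `position` (0 first, 1 second, -1 last).

    Assumes positive integer counts; counts too short to have a digit
    at `position` are skipped.
    """
    digits = [str(n)[position] for n in counts if len(str(n)) > max(position, 0)]
    tally = Counter(digits)
    return {str(d): tally[str(d)] / len(digits) for d in range(10)}


def chi_square(observed, expected, n):
    """Pearson chi-square statistic for observed vs. expected shares."""
    return sum(n * (observed[d] - expected[d]) ** 2 / expected[d] for d in expected)


# Made-up vote counts, purely for illustration.
VOTES = [1234, 187, 2950, 1102, 345, 1780, 96, 1415]

if __name__ == "__main__":
    first_expected = {str(d): benford_first(d) for d in range(1, 10)}
    last_expected = {str(d): 0.1 for d in range(10)}
    print(chi_square(digit_shares(VOTES, 0), first_expected, len(VOTES)))
    print(chi_square(digit_shares(VOTES, -1), last_expected, len(VOTES)))
```

A large statistic only says that the digits differ from the reference distribution, not why they differ.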



Starting Up

Hello, all–this website (and the blog in particular) is going to be my way to bring together and present some of the projects and research I’m doing. When I put a new project up, I’ll post about it here as well. With luck (and time), I’ll also be able to use the blog to talk about interesting ideas which haven’t yet turned into projects, rather than having them consigned to the oblivion of my notepad.

I still have some work to do to get the CSS working the way I want it; in the meantime, please excuse the clutter.
