Author Archives: bpodgursky

Simple Boolean Expression Manipulation in Java

I’ve worked on a couple of projects recently where I needed to do some lightweight propositional expression manipulation in Java. Specifically, I wanted to be able to: let a user input simple logical expressions, and parse them into … Continue reading

Posted in Algorithms, Github, Open Source | Leave a comment

Taxi Loading at SFO

I usually avoid catching taxis whenever possible, but when I arrived at SFO last week the trains were no longer running and I hadn’t arranged for a shuttle, so I ended up waiting in line to catch a taxi. The … Continue reading

Posted in Algorithms | 1 Comment

Updates to language vs income breakdown post

Thanks to everyone who read and commented on my post last night. The post got a lot more attention than I expected (on Hacker News and Reddit at least). Many comments both here and on those threads quite reasonably pointed out … Continue reading

Posted in Uncategorized | 18 Comments

Average Income per Programming Language

Update 8/21: I’ve gotten a lot of feedback about issues with these rankings from comments, and have tried to address some of them here. The data there has been updated to include confidence intervals.

A few weeks ago I described … Continue reading

Posted in Github, Open Source, Uncategorized, Visualization | 197 Comments

Using CoreNLP, d3.js, and dagre.js to visualize sentence parse trees

I’ve always been casually interested in Natural Language Processing (NLP), a field of computer science concerned with extracting information from natural human language. I have no training or education whatsoever in the field so I’m not in a … Continue reading

Posted in Uncategorized | 10 Comments

Github Demographics

For the past couple of weeks I’ve been working on a project to visualize and compare the demographics of popular GitHub organizations by combining data from the RapLeaf and GitHub APIs. By pulling emails from Git commit data and … Continue reading

Posted in Uncategorized | 8 Comments

Fast asymmetric Hadoop joins using Bloom Filters and Cascading

In a recent post for the LiveRamp blog I describe how we use Bloom filters to optimize our Hadoop jobs: We recently open-sourced a number of internal tools we’ve built to help our engineers write high-performance Cascading code as the … Continue reading

Posted in Open Source | Leave a comment