Tag Archives: hadoop

Fast asymmetric Hadoop joins using Bloom Filters and Cascading

In a recent post for the LiveRamp blog, I describe how we use Bloom filters to optimize our Hadoop jobs: We recently open-sourced a number of internal tools we’ve built to help our engineers write high-performance Cascading code as the … Continue reading

Posted in Open Source