Friday, July 13, 2007

YouTube Scalability

A very interesting discussion by Cuong Do,
a member of YouTube's original nine-man team
and one of its two scalability software architects.

Saturday, June 2, 2007

MapReduce

Slide: MapReduce: Simplified Data Processing on Large Clusters
Wikipedia: MapReduce

Original Paper

Sample Uses of MapReduce in Google Source Tree

distributed grep,
term-vector per host,
document clustering,
...

distributed sort,
web access log stats,
machine learning,
...

web link-graph reversal,
inverted index construction,
statistical machine translation,
...
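
To make one of the uses above concrete, here is a rough sketch of inverted index construction written in the map/reduce style. It is only a single-process toy, not Google's implementation: the names map_fn, reduce_fn, and run_mapreduce and the sample documents are my own assumptions for illustration, and the "shuffle" step is just a dictionary grouping values by key.

```python
from collections import defaultdict


def map_fn(doc_id, text):
    """Map phase: emit one (word, doc_id) pair per word in the document."""
    for word in text.lower().split():
        yield word, doc_id


def reduce_fn(word, doc_ids):
    """Reduce phase: collapse all doc ids seen for a word into a sorted list."""
    return word, sorted(set(doc_ids))


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical single-process stand-in for map, shuffle, and reduce."""
    groups = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)  # shuffle: group values by key
    return dict(reduce_fn(k, vs) for k, vs in sorted(groups.items()))


# Made-up sample documents, keyed by a document id.
docs = [
    ("d1", "the quick fox"),
    ("d2", "the lazy dog"),
    ("d3", "quick quick dog"),
]
index = run_mapreduce(docs, map_fn, reduce_fn)
```

In a real cluster the map calls would run on many machines and the grouping would happen over the network, but the two user-supplied functions would look much the same.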

Functions as First-Class Values

Can Your Programming Language Do This? - Joel Spolsky
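
A small sketch of what "first class" means in practice: functions can be passed as arguments, returned from other functions, and stored in variables. The article uses JavaScript; the Python below is my own illustration, and the helper names apply_to_each and make_adder are made up for this example.

```python
from functools import reduce


def apply_to_each(fn, items):
    """Take a function as an argument and apply it to every item."""
    return [fn(x) for x in items]


def make_adder(n):
    """Return a new function that adds n to its argument."""
    return lambda x: x + n


# Pass an anonymous function as an argument.
doubled = apply_to_each(lambda x: x * 2, [1, 2, 3])

# Store a returned function in a variable, then hand it to the built-in map.
add_ten = make_adder(10)
shifted = list(map(add_ten, [1, 2, 3]))

# reduce folds a list into a single value using the function it is given.
total = reduce(lambda acc, x: acc + x, [1, 2, 3, 4], 0)
```

Once functions can be handed around like this, map and reduce become ordinary library calls, which is the idea MapReduce builds on.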