joshuago’s data_mining Bookmarks
07 JUL 2009
02 JUL 2009
The RDBMS as we know it is simply not a scalable solution – its jack-of-all-trades nature leads to a fundamental weakness with scalability. Over the next few years, a number of solutions to our problems will emerge. A large portion of data analytics can be performed with SUM, COUNT, GROUP, and FILTER on denormalized data.
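To illustrate the last point of the excerpt above, here is a minimal sketch of FILTER, GROUP, SUM, and COUNT applied to denormalized rows in plain Python. The `rows` data, the field names, and the `aggregate` helper are hypothetical, invented for this example; they are not taken from the bookmarked article.

```python
from collections import defaultdict

# Hypothetical denormalized rows: each record carries every field it needs,
# so no joins are required.
rows = [
    {"country": "US", "product": "book", "amount": 12.0},
    {"country": "US", "product": "pen", "amount": 3.0},
    {"country": "DE", "product": "book", "amount": 10.0},
    {"country": "US", "product": "book", "amount": 8.0},
]


def aggregate(rows, group_key, value_key, predicate=lambda row: True):
    """FILTER rows with predicate, GROUP by group_key, then SUM and COUNT."""
    totals = defaultdict(lambda: {"sum": 0.0, "count": 0})
    for row in rows:
        if not predicate(row):  # FILTER
            continue
        bucket = totals[row[group_key]]  # GROUP
        bucket["sum"] += row[value_key]  # SUM
        bucket["count"] += 1  # COUNT
    return dict(totals)


# Book sales per country.
book_sales = aggregate(
    rows, "country", "amount", predicate=lambda row: row["product"] == "book"
)
print(book_sales)
```

Because each row is self-contained, the same single pass can be run independently over separate partitions of the data and the partial sums and counts merged afterwards.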
02 JUL 2009
If we examine the DBMS markets of nontrivial size, it turns out that current relational DBMSs can be beaten by approximately a factor of 50 in almost any of them. Also a good overview of NoSQL alternatives for particular uses.
28 MAY 2009
The three sexy skills of data geeks are statistics (which requires study), data munging (which demands suffering), and visualization (which favors those with a knack for storytelling).
19 MAY 2009
The only metrics that entrepreneurs should invest energy in collecting are those that help them make decisions. Unfortunately, the majority of data available in off-the-shelf analytics packages are Vanity Metrics: they might make you feel good, but they don’t offer clear guidance for what to do.
30 APR 2009
08 APR 2009
Expect failures and embrace them. Fully automate your infrastructure deployments. Design your infrastructure so that it scales horizontally. Establish clear measurable goals -- for example, response time. Be prepared to quickly identify and eliminate bottlenecks. Then play whack-a-mole for a while, until things get stable.