joshuago’s data-mining Bookmarks

27 JAN 2012
solidDB and the secrets of speed

The most common in-memory database index strategy is the T-tree. IBM solidDB instead uses a trie (or prefix tree), an index structure that was originally created for text searching but turns out to be perfect for in-memory indexing.
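
For readers who have not met a trie before, here is a minimal sketch in Python of the general idea: keys are split into characters, and keys that share a prefix share a path from the root. This is only an illustration of the data structure, not solidDB's actual implementation; the class names, the methods, and the sample keys such as "cust:1001" are all made up for the example.

```python
class TrieNode:
    """One node per key character; children are looked up by character."""

    def __init__(self):
        self.children = {}     # maps a character to the next TrieNode
        self.terminal = False  # True if a stored key ends at this node
        self.value = None      # payload for the key ending here


class Trie:
    """Hypothetical minimal prefix-tree index (illustration only)."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, value):
        # Walk down the tree, creating nodes for characters not seen yet.
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.terminal = True
        node.value = value

    def get(self, key):
        # Lookup cost depends on the key length, not on how many keys exist.
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.value if node.terminal else None

    def keys_with_prefix(self, prefix):
        # Descend to the node for the prefix, then collect every key below it.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        found = []
        stack = [(node, prefix)]
        while stack:
            current, path = stack.pop()
            if current.terminal:
                found.append(path)
            for ch, child in current.children.items():
                stack.append((child, path + ch))
        return sorted(found)


index = Trie()
index.insert("cust:1001", "row-a")
index.insert("cust:1002", "row-b")
index.insert("order:7", "row-c")
```

Calling `index.get("cust:1002")` walks one node per character of the key, and `index.keys_with_prefix("cust:")` returns both customer keys because they share the same path down the tree.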

20 JUN 2011
[Quora] Is Google's software infrastructure obsolete?

A stimulating discussion on Google's infrastructure for handling big data. One guy says it's hopelessly out of date. Others chime in and say it's easily 5-8 years ahead of open source alternatives.

29 JAN 2011
[Andrew Bosworth] Data Downfalls

Common pitfalls and temptations to avoid when designing a web user product.

06 JAN 2011
[52 Weeks of UX] The Local Maximum

You can use analytics and data-driven design to climb to the top of the current mountain, but it takes a creative leap and good judgment to spot a bigger mountain to climb.

23 FEB 2010
[Steve Blank] No Accounting For Startups

Startups need different metrics than large companies do. They need metrics that tell them how well the search for a business model is going and whether, at the end of that search, the business model they picked is worth scaling into a company. Or is it time to pivot and look for a different business model?

21 DEC 2009
[Smart Bear Software] Customer Feedback: 11 ways to get it

These techniques could be the difference between molding a product people will actually pay money for and going out of business with an idea you thought was perfect.

09 DEC 2009
[assertTrue] NoSQL Required Reading

If you're new to NoSQL, you'll want to do a bit of background reading.

05 AUG 2009
[VoxPopuLII] Tidying Up the Law

Imagine the economic value of knowing, with mathematical certainty, exactly what the law is. If organizations could calculate legal risk as efficiently as they can now calculate financial risk (recession notwithstanding), millions of dollars in legal fees could be rerouted toward economic growth.

02 JUL 2009
[The Road to Failure] Social Media Kills the Database

The RDBMS as we know it is simply not a scalable solution – its jack-of-all-trades nature leads to a fundamental weakness with scalability. Over the next few years, a number of solutions to our problems will emerge. A large portion of data analytics can be performed with SUM, COUNT, GROUP, and FILTER on denormalized data.
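
To make that last claim concrete, here is a rough sketch in plain Python of a FILTER, GROUP, COUNT, and SUM pass over denormalized rows, where each record already carries every field it needs and no joins are required. The `events` data, the field names, and the `aggregate` helper are invented for the illustration and do not come from the linked post.

```python
from collections import defaultdict

# Hypothetical denormalized event log: every row is self-contained.
events = [
    {"user": "ann", "country": "US", "action": "purchase", "amount": 10.0},
    {"user": "bob", "country": "US", "action": "view", "amount": 0.0},
    {"user": "cat", "country": "DE", "action": "purchase", "amount": 20.0},
    {"user": "ann", "country": "US", "action": "purchase", "amount": 5.5},
]


def aggregate(rows, group_key, value_key, predicate=lambda row: True):
    """FILTER rows with predicate, GROUP by group_key, then COUNT and SUM."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for row in rows:
        if not predicate(row):  # FILTER
            continue
        bucket = totals[row[group_key]]  # GROUP
        bucket["count"] += 1  # COUNT
        bucket["sum"] += row[value_key]  # SUM
    return dict(totals)


purchases_by_country = aggregate(
    events,
    group_key="country",
    value_key="amount",
    predicate=lambda row: row["action"] == "purchase",
)
```

Because each row is independent, a single pass like this can be split across machines and the per-group counts and sums merged afterwards, which is presumably the kind of scalability the post has in mind.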