joshuago’s data-mining Bookmarks
The most common in-memory database index strategy is the T-tree. IBM solidDB instead uses an index called a trie (or prefix tree), which was originally created for text searching but turns out to be well suited to in-memory indexing.
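For readers unfamiliar with the structure, here is a minimal sketch of a trie used as a key-to-value index. It is a toy with one character per node, not solidDB's actual implementation; the names `Trie`, `insert`, `get`, and `keys_with_prefix` are made up for illustration.

```python
class TrieNode:
    """One node per character; children are keyed by the next character."""

    def __init__(self):
        self.children = {}
        self.terminal = False  # True if a stored key ends at this node
        self.value = None


class Trie:
    """Toy prefix-tree index (hypothetical API, for illustration only)."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, value):
        node = self.root
        for ch in key:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.terminal = True
        node.value = value

    def _find(self, prefix):
        # Walk down the tree one character at a time.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def get(self, key):
        node = self._find(key)
        if node is None or not node.terminal:
            return None
        return node.value

    def keys_with_prefix(self, prefix):
        # Collect every stored key that starts with `prefix`.
        start = self._find(prefix)
        if start is None:
            return []
        found = []
        stack = [(prefix, start)]
        while stack:
            key, node = stack.pop()
            if node.terminal:
                found.append(key)
            for ch, child in node.children.items():
                stack.append((key + ch, child))
        return sorted(found)
```

Lookup cost depends on the length of the key rather than on the number of keys stored, and keys sharing a prefix share a path from the root.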
A stimulating discussion on Google's infrastructure for handling big data. One guy says it's hopelessly out of date. Others chime in and say it's easily 5-8 years ahead of open source alternatives.
Common pitfalls and temptations to avoid when designing a web user product.
You can use analytics and data-driven design to climb to the top of the current mountain, but it takes a creative leap and good judgment to spot a bigger mountain to climb.
Startups need different metrics than large companies do. They need metrics that tell them how well the search for a business model is going, and whether the business model found at the end of that search is worth scaling into a company. Or is it time to pivot and look for a different business model?
These techniques could be the difference between molding a product people will actually pay money for and going out of business with an idea you thought was perfect.
If you're new to NoSQL, you'll want to do a bit of background reading.
Imagine the economic value of knowing, with mathematical certainty, exactly what the law is. If organizations could calculate legal risk as efficiently as they can now calculate financial risk (recession notwithstanding), millions of dollars in legal fees could be rerouted toward economic growth.
The RDBMS as we know it is simply not a scalable solution – its jack-of-all-trades nature leads to a fundamental weakness with scalability. Over the next few years, a number of solutions to our problems will emerge. A large portion of data analytics can be performed with SUM, COUNT, GROUP, and FILTER on denormalized data.
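To make the last point concrete, here is a rough sketch of FILTER, GROUP, COUNT, and SUM over denormalized rows, with no joins involved. The `SAMPLE_ROWS` data and the `aggregate` helper are hypothetical, invented for this illustration.

```python
from collections import defaultdict

# Hypothetical denormalized records: each row carries every field it needs.
SAMPLE_ROWS = [
    {"country": "US", "status": "paid", "amount": 30},
    {"country": "US", "status": "paid", "amount": 20},
    {"country": "DE", "status": "paid", "amount": 15},
    {"country": "DE", "status": "refunded", "amount": 15},
]


def aggregate(rows, group_key, value_key, predicate=lambda row: True):
    """FILTER rows with `predicate`, GROUP by `group_key`, then COUNT and SUM."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0})
    for row in rows:
        if not predicate(row):
            continue  # FILTER
        bucket = totals[row[group_key]]  # GROUP
        bucket["count"] += 1  # COUNT
        bucket["sum"] += row[value_key]  # SUM
    return dict(totals)


paid_by_country = aggregate(
    SAMPLE_ROWS, "country", "amount", lambda row: row["status"] == "paid"
)
```

Because each row is self-contained, a single pass like this can be split across machines and the per-group counts and sums merged afterwards.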