Who is this guy dude?

Who is the first author krub? 😉

Ratanachai Sombatsrisomboon, Yutaka Matsuo, and Mitsuru Ishizuka (2003). Acquisition of Hypernyms and Hyponyms from the WWW, in Proceedings of 2nd Int’l Workshop on Active Mining (AM2003), pp.7-13, Maebashi, Japan (in conjunction with Int’l Sympo. on Methodologies for Intelligent Systems), October, 2003.

What is it about?

Managing Gigabytes (Book)

Sometimes it’s more than just ‘search’. We may want it ‘faster’, and many times we want it ‘smaller’.

(And in the case of database/index size, the smaller one is probably the faster one: fewer things to look through.)

Managing Gigabytes: Compressing and Indexing Documents and Images by Ian H. Witten, Alistair Moffat, and Timothy C. Bell. (read reviews)

The authors of the book also made MG, an open-source indexing and retrieval system for text, images, and textual images.
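
To make the ‘smaller’ point a bit more concrete, here is a minimal sketch of one common way to shrink an index: store the gaps between sorted document IDs instead of the IDs themselves, then write each gap in as few bytes as it needs (variable-byte coding). This is only an illustration. The function names and the sample posting list are made up, and it is not necessarily how MG does it.

```python
def to_gaps(doc_ids):
    """Turn a sorted list of document IDs into gaps between neighbours."""
    gaps = []
    prev = 0
    for doc_id in doc_ids:
        gaps.append(doc_id - prev)
        prev = doc_id
    return gaps


def from_gaps(gaps):
    """Rebuild the original document IDs from the gaps."""
    doc_ids = []
    total = 0
    for gap in gaps:
        total += gap
        doc_ids.append(total)
    return doc_ids


def vbyte_encode(numbers):
    """Encode non-negative integers, 7 bits per byte, low bits first.

    The high bit of a byte is set when more bytes of the same number follow.
    """
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append((n & 0x7F) | 0x80)  # low 7 bits + "more to come" flag
            n >>= 7
        out.append(n)  # last byte of this number, high bit clear
    return bytes(out)


def vbyte_decode(data):
    """Decode bytes produced by vbyte_encode back into integers."""
    numbers = []
    n = 0
    shift = 0
    for b in data:
        if b & 0x80:
            n |= (b & 0x7F) << shift
            shift += 7
        else:
            n |= b << shift
            numbers.append(n)
            n = 0
            shift = 0
    return numbers


# Hypothetical posting list: IDs of documents that contain some word.
postings = [824, 829, 215406, 215410, 215500]

compressed = vbyte_encode(to_gaps(postings))
restored = from_gaps(vbyte_decode(compressed))

print(len(postings) * 4, "bytes as fixed 32-bit integers")
print(len(compressed), "bytes after gap + variable-byte coding")
```

With this sample list, the five IDs take 20 bytes as fixed 32-bit integers and 8 bytes after coding, and the original list comes back unchanged. Small gaps are what make this work, so it helps most for words that appear in many documents.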

Google File System

How to search for things in a collection is one problem.

How to keep things (in a collection) so that they can be searched is another problem.

And the latter could be a really big problem if you have to keep “3,307,998,701 web pages” like Google does.

Google File System: Technical paper, by Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. This is a technical paper that explains Google’s custom scalable cluster filesystem for storing their gigantic database of the entire Web across thousands of low-cost PCs.

From Google Weblog.