It searches not only the web but your books too!
Amazon.com knows what it has: books.
Not fancy enough? A9 also keeps a history of your search results and site visits.
Try it: a9.com
imports from http://whatwewant.blogspot.com/
“whatwewant.www: people don’t want to search it. they just want to get it.”
A “new” web search service from Yahoo! with a minimalist look.
And when I look at its functionality, sometimes it just makes me think of Google 🙂
(it also comes with “cache” functionality)
Anyway, try it yourself; you may find that it suits you better than Google.
Who is the first author krub? 😉
Ratanachai Sombatsrisomboon, Yutaka Matsuo, and Mitsuru Ishizuka (2003). Acquisition of Hypernyms and Hyponyms from the WWW, in Proceedings of 2nd Int’l Workshop on Active Mining (AM2003), pp.7-13, Maebashi, Japan (in conjunction with Int’l Sympo. on Methodologies for Intelligent Systems), October, 2003.
What is it about?
by Mirella Lapata
(slides for COM3110 Text Processing class, Department of Computer Science, University of Sheffield)
Briefly explains Google search, IR, issues in IR, indexing, inverted files, the boolean model, the vector space model, TF/IDF, term weighting, evaluation, precision, recall, and F-measure.
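To make a few of those terms concrete, here is a minimal sketch (not taken from the slides) of an inverted file, a boolean AND query over it, and set-based precision, recall, and F-measure. The toy documents, the query, the relevance judgement, and all function names are made up for illustration.

```python
from collections import defaultdict

# Toy collection; documents and ids are invented for this example.
DOCS = {
    1: "google search engine indexes web pages",
    2: "amazon sells books and searches inside books",
    3: "web search engine ranking with term weighting",
}

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def boolean_and(index, terms):
    """Boolean model: ids of documents that contain every query term."""
    postings = [set(index.get(term, [])) for term in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))

def precision_recall_f(retrieved, relevant):
    """Set-based precision, recall, and balanced F-measure."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f

index = build_inverted_index(DOCS)
result = boolean_and(index, ["web", "search"])

# Assumption: a human judged only document 3 relevant to this query.
precision, recall, f_measure = precision_recall_f(result, [3])
```

Note that no stemming is done here, so "searches" in document 2 does not match the query term "search"; a real system would normally normalize terms first.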
Sometimes it’s more than just ‘search’. We may want it ‘faster’, and many times we want it ‘smaller’.
(And in the case of database/index size, the smaller one is probably the faster one: there are fewer things to look through.)
Managing Gigabytes: Compressing and Indexing Documents and Images by Ian H. Witten, Alistair Moffat, and Timothy C. Bell. (read reviews)
From the authors of the book comes MG, an open-source indexing and retrieval system for text, images, and textual images.
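As a rough illustration of how an index can get ‘smaller’, here is a sketch of one common textbook idea: store the gaps between sorted document ids instead of the ids themselves, then pack each gap into as few bytes as possible with a variable-byte code. This is only an assumption-laden example, not necessarily the scheme MG itself uses, and the function names and posting list are made up.

```python
from itertools import accumulate

def to_gaps(postings):
    """Turn a sorted list of doc ids into the first id followed by differences."""
    if not postings:
        return []
    return [postings[0]] + [b - a for a, b in zip(postings, postings[1:])]

def from_gaps(gaps):
    """Inverse of to_gaps: running sums recover the original doc ids."""
    return list(accumulate(gaps))

def vbyte_encode(numbers):
    """Variable-byte code: 7 payload bits per byte, high bit set on the final byte."""
    out = bytearray()
    for n in numbers:
        chunk = [n & 0x7F]           # lowest 7 bits, emitted last
        n >>= 7
        while n:
            chunk.append(n & 0x7F)
            n >>= 7
        chunk[0] |= 0x80             # mark the final byte of this number
        out.extend(reversed(chunk))  # most significant group first
    return bytes(out)

def vbyte_decode(data):
    """Decode a byte string produced by vbyte_encode."""
    numbers, n = [], 0
    for byte in data:
        if byte & 0x80:              # final byte: finish the current number
            numbers.append((n << 7) | (byte & 0x7F))
            n = 0
        else:
            n = (n << 7) | byte
    return numbers

postings = [824, 829, 215406]        # hypothetical doc ids for one term
gaps = to_gaps(postings)             # small gaps need fewer bytes
encoded = vbyte_encode(gaps)
decoded = from_gaps(vbyte_decode(encoded))
```

Here the three ids fit in 6 bytes, compared with 12 bytes if each were stored as a fixed 32-bit integer; frequent terms have many small gaps, which is where this kind of coding pays off.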
How to search things from a collection is one problem.
How to store things (in a collection) so they can be searched is another problem.
And the latter one could be a really big problem, if you have to keep “3,307,998,701 web pages” like Google does.
Google File System: Technical paper, by Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. This is a technical paper that explains Google’s custom scalable cluster filesystem for storing their gigantic database of the entire Web across thousands of low-cost PCs.
This book [R. Belew. Finding Out About: A Cognitive Perspective on Search Engines and the WWW. Cambridge University Press, Cambridge, 2000.] investigates and tries to describe IR from a cognitive perspective (how humans/users think, perceive, behave, ..).
“Everything you want and don’t want to know about Google” 🙂
Text Summarization, a home of MEAD (a public domain portable multi-document summarization system).
A summary of a collection of documents (which may come from automatic clustering) will help users decide whether they want to investigate that collection further or not. A time-saving feature 🙂
Thumbshots.org. It features a small picture of each webpage, so users have more of a clue whether it is the site they are looking for or not.