Managing Gigabytes (Book)

Sometimes it’s more than just ‘search’. We may want it ‘faster’, and many times we want it ‘smaller’.

(And when it comes to database or index size, the smaller one is probably also the faster one: there is less to look through.)

Managing Gigabytes: Compressing and Indexing Documents and Images by Ian H. Witten, Alistair Moffat, and Timothy C. Bell. (read reviews)

The authors of the book also wrote MG, an open-source indexing and retrieval system for text, images, and textual images. read more

Summarization for Search Engine

Speaking of document clustering, categorization, and classification, another approach to help users get through mountains of pages may be summarization.

That is, showing a summary instead of only the page title, URL, and the first few (often meaningless) paragraphs of the page.

Short summaries may help users decide which pages are what they want and which are not.

Besides grouping the retrieved documents so that users can easily go on to decide which ones to keep and which to skip, read more
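
To make the summary idea a bit more concrete, here is a minimal sketch of one simple way to do it: pick the sentences of a page that share the most words with the user's query and show those instead of the first paragraphs. This is only an illustration, not the method from any particular search engine; the names `split_sentences` and `summarize`, the naive sentence splitter, and the word-overlap score are all assumptions made for this example.

```python
import re


def split_sentences(text):
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, query, max_sentences=2):
    # Hypothetical helper: score each sentence by how many distinct
    # query words it contains, then keep the best few.
    terms = set(re.findall(r"\w+", query.lower()))
    scored = []
    for pos, sent in enumerate(split_sentences(text)):
        words = set(re.findall(r"\w+", sent.lower()))
        scored.append((len(terms & words), pos, sent))
    # Highest overlap first; ties go to the earlier sentence.
    top = sorted(scored, key=lambda t: (-t[0], t[1]))[:max_sentences]
    # Put the chosen sentences back in document order.
    return " ".join(sent for _, _, sent in sorted(top, key=lambda t: t[1]))


page = (
    "Welcome to my home page. This site has many links. "
    "Index compression makes a search engine smaller. "
    "A smaller index is often faster to search. Contact me by email."
)
print(summarize(page, "index compression search"))
```

With this sample page, the greeting and contact lines are skipped and the two sentences about the index are shown, which is the kind of snippet that could help a user decide whether the page is worth opening.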

Information Retrieval (and related) research groups in Thailand


WhatWeWant is now online, thanks to keng. It is a blog about search engines, information retrieval, and that kind of stuff.

My clothes are currently running around, playing catch-up with each other in the washing machine.
It’s almost 6am here in Edinburgh. Still very dark. At this time of year, the sun rises at around 9am … and says bye-bye at around 4pm <gosh!> -_-"

Actually, I'm just very lazy about waking up. But tomorrow at 6pm I have to work at a Thai restaurant, and my ‘uniform’ has not been washed yet. (Today I guess I will get back to my flat late; I have to finish the practical part of my assignment, due tomorrow at 5pm.) read more