NLTK-Lite is a substantially simplified and streamlined version of NLTK (Natural Language Toolkit). NLTK is no longer supported.
NLTK-Lite is a new collection of lightweight NLP modules designed for maximum simplicity and efficiency. It covers only the simple variants of standard data structures and tasks; simplicity and efficiency are valued over generality and extensibility.
Key differences from NLTK:
- requires Python 2.4
- tokens are represented as strings, tuples, or trees
- all tokenizers are iterators; large tasks produce output as early as possible
- more emphasis on Python constructs instead of NLTK constructs
- default pipeline processing paradigm leads to more transparent code
- taggers incorporate backoff for smaller models and faster operation
- shorter names (e.g. tokenizer.RegexpTokenizer() becomes tokenize.regexp())
- tutorials are more easily maintained now with docutils and doctest
- contributed software is more easily incorporated
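To make a few of these points concrete (tokens as plain strings and tuples, tokenizers as iterators, pipeline-style processing, taggers with backoff), here is a rough sketch in plain Python. It does not use the NLTK-Lite API: the names `regexp_tokenize`, `lookup_tag`, `default_tag` and the tiny `model` dictionary are all made up for illustration, and the real library may well look different.

```python
import re


def regexp_tokenize(text, pattern=r"\w+|[^\w\s]+"):
    """Hypothetical tokenizer: lazily yield each token as a plain string."""
    for match in re.finditer(pattern, text):
        yield match.group()


def default_tag(word):
    """Hypothetical backoff tagger: tag every word as a noun."""
    return "NN"


def lookup_tag(tokens, model, backoff=None):
    """Hypothetical tagger: yield (word, tag) tuples one at a time.

    Words missing from the small lookup model are handed to the
    backoff function, so the model itself can stay small.
    """
    for word in tokens:
        tag = model.get(word.lower())
        if tag is None and backoff is not None:
            tag = backoff(word)
        yield (word, tag)


# Toy model invented for this example; not real training data.
model = {"the": "DT", "sat": "VBD", "on": "IN"}

# Pipeline: each stage consumes the iterator produced by the previous one,
# so the first tagged token is available before the whole text is tokenized.
tokens = regexp_tokenize("The cat sat on the mat.")
tagged = lookup_tag(tokens, model, backoff=default_tag)
result = list(tagged)
```

Because both stages are generators, nothing is computed until `list(tagged)` pulls items through the pipeline, which is the "produce output as early as possible" idea from the list above.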
unrelated: Better, Faster, Lighter Java
2 responses to “(Better, Faster,) Lighter NLTK”
I used to think NLTK-Lite would be something like TLE-Lite :-P In that case I'll have to give it a try :-) bact's blog really is packed with useful content 😉
NLTK-Lite is much easier to learn and use. Now I'll finally get to play with NLP. (Why isn't there anything this good freely available for Thai?)