RubyGems.org

tokenizer

0.1.1

A simple multilingual tokenizer for NLP tasks. This tool provides a CLI and a library for linguistic tokenization, which is an unavoidable preprocessing step for many HLT (human language technology) tasks before further syntactic, semantic, and other higher-level processing. Use it to tokenize German, English, and French texts.
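
To make concrete what linguistic tokenization means here, the following is a minimal sketch in plain Ruby. It is not the gem's implementation and does not use the gem's API: `simple_tokenize` is a hypothetical helper introduced only for illustration, and it assumes that splitting on whitespace and separating each punctuation mark into its own token is an acceptable approximation.

```ruby
# Hypothetical helper for illustration only; NOT part of the tokenizer gem.
# Returns runs of letters/digits as tokens and every other
# non-whitespace character (e.g. punctuation) as its own token.
def simple_tokenize(text)
  text.scan(/[[:alpha:][:digit:]]+|[^[:space:][:alpha:][:digit:]]/)
end

tokens = simple_tokenize("Ich gehe in die Schule!")
puts tokens.inspect
# prints ["Ich", "gehe", "in", "die", "Schule", "!"]
```

A real tokenizer for German, English, and French would also have to handle abbreviations, clitics, and hyphenated words, which this sketch deliberately ignores.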

Install
gem install tokenizer
Authors

Andrei Beliankou

4,294 total downloads
3,136 for this version
Gemfile
gem 'tokenizer', '~> 0.1.1'
Versions
  1. 0.1.1 August 24, 2011 (8 KB)
  2. 0.1.0 May 18, 2011 (5.5 KB)
  3. 0.0.1.prealpha May 5, 2011 (4 KB)