toiro 0.0.2

@taishi-i taishi-i released this 13 Aug 14:22

This is the first release of this library.

Toiro is a tool for comparing Japanese tokenizers.

  • Compare the processing speed of tokenizers
  • Compare how each tokenizer segments text into words
  • Compare the performance of tokenizers by benchmarking application tasks (e.g., text classification)
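To make the first two comparisons concrete, here is a minimal sketch in plain Python. It does not use toiro's API: `char_tokenizer`, `script_tokenizer`, and `compare` are hypothetical stand-ins invented for illustration, and a real comparison would wrap actual tokenizer libraries instead. The sketch only shows the general idea of running several tokenizers over the same sentences while recording elapsed time and the resulting segmentation.

```python
import re
import time
from typing import Callable, Dict, List


# Hypothetical stand-in tokenizers (NOT part of toiro). A real comparison
# would wrap actual Japanese tokenizer libraries here.
def char_tokenizer(text: str) -> List[str]:
    """Split text into individual non-whitespace characters."""
    return [ch for ch in text if not ch.isspace()]


# Runs of kanji, hiragana, katakana, or ASCII alphanumerics form one token;
# any other non-whitespace character becomes a token on its own.
_SCRIPT_RUNS = re.compile(
    r"[\u4e00-\u9fff]+|[\u3040-\u309f]+|[\u30a0-\u30ff]+|[A-Za-z0-9]+|\S"
)


def script_tokenizer(text: str) -> List[str]:
    """Split text wherever the script (kanji, kana, ASCII) changes."""
    return _SCRIPT_RUNS.findall(text)


def compare(
    tokenizers: Dict[str, Callable[[str], List[str]]],
    corpus: List[str],
) -> Dict[str, dict]:
    """Time each tokenizer over the corpus and keep its segmentation."""
    report: Dict[str, dict] = {}
    for name, tokenize in tokenizers.items():
        start = time.perf_counter()
        segmented = [tokenize(sentence) for sentence in corpus]
        elapsed = time.perf_counter() - start
        report[name] = {"seconds": elapsed, "segmented": segmented}
    return report


corpus = ["すもももももももものうち", "東京都に住んでいます"]
report = compare({"char": char_tokenizer, "script": script_tokenizer}, corpus)
for name, result in report.items():
    print(f"{name}: {result['seconds']:.6f}s {result['segmented'][1]}")
```

Timings from two short sentences are dominated by noise, so a meaningful speed comparison needs a much larger corpus and repeated runs.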

It also provides utilities for Japanese natural language processing.

  • Data downloader for Japanese text corpora
  • Preprocessors for these corpora
  • Text classifiers for Japanese text (e.g., SVM, BERT)