Unsupervised text tokenizer for Neural Network-based text generation.
Updated Mar 1, 2026 - C++
Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, word importance
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Thai natural language processing in Python
Unsupervised text tokenizer focused on computational efficiency
Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
CKIP Transformers
Kiwi (intelligent Korean morphological analyzer)
Ekphrasis is a text processing tool geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from two large corpora (English Wikipedia and Twitter, 330 million English tweets).
A Vietnamese natural language processing toolkit (NAACL 2018)
BERT for Multitask Learning
AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence Understanding Models
A Japanese tokenizer based on recurrent neural networks
Juman++ (a Morphological Analyzer Toolkit)
Cantonese Linguistics and NLP
Python API for Kiwi
Chinese text classification and sequence labeling toolkit (PyTorch). Supports multi-class and multi-label classification of long and short Chinese texts, text similarity, and sequence labeling tasks such as Chinese named entity recognition, part-of-speech tagging, word segmentation, and extractive text summarization.
This repository is archived! The maintained MeCab can be found at https://github.com/shogo82148/mecab
A PyTorch implementation of the BI-LSTM-CRF model.
MONPA is a multi-task model that provides Traditional Chinese word segmentation, part-of-speech tagging, and named entity recognition