Words from a user-defined dictionary are split apart during segmentation #457
Comments
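This problem will be fixed in a new release. For now, you can modify the code yourself.
Understood. How should I modify it? Do you mean I should just update the corresponding code on my side to match this new commit (12c6361)? Thanks.
A follow-up issue.
I am new to LTP, but after tracing through the code, I suggest making the following change to the maximum_forward_matching function in maximum_forward_matching.py. With this change, matching follows the custom dictionary. Whether it has any other side effects is not yet known.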
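Hello, I have a dictionary word list and want text to be segmented strictly according to it, but I found that words in the dictionary are being split into single characters:
For example, my dictionary contains the single characters "计", "算", and "机", as well as the word "计算机". After adding this custom word list, every occurrence of "计算机" in my text is split into three single characters, which is clearly not the expected result. From a quick read of your code, adding a user-defined dictionary does invoke trie-based forward maximum matching, so I am not sure where the problem lies. Can this be improved through some setting? Version 4.0.9. Example: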
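Independent of the reporter's own example, the expected behavior can be illustrated with a minimal sketch of trie-based forward maximum matching in Python. This is not LTP's implementation: the names `Trie`, `longest_match`, and `forward_max_match` are hypothetical and chosen only for illustration, and the sketch assumes that characters not covered by any dictionary word are emitted one at a time.

```python
class Trie:
    """Hypothetical character trie; not LTP's actual data structure."""

    def __init__(self):
        self.children = {}
        self.is_word = False

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def longest_match(self, text, start):
        """Return the end index of the longest dictionary word that
        begins at `start`, or `start` itself if no word matches."""
        node = self
        end = start
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.is_word:
                # Keep scanning: a longer word may still follow.
                end = i + 1
        return end


def forward_max_match(text, words):
    """Segment `text` greedily, always taking the longest dictionary
    word available at the current position."""
    trie = Trie()
    for word in words:
        trie.insert(word)

    tokens = []
    pos = 0
    while pos < len(text):
        end = trie.longest_match(text, pos)
        if end == pos:
            # Assumption: fall back to a single character when
            # no dictionary word starts here.
            end = pos + 1
        tokens.append(text[pos:end])
        pos = end
    return tokens


print(forward_max_match("计算机科学", ["计", "算", "机", "计算机"]))
# → ['计算机', '科', '学']
```

The key point is that a match on the single character "计" does not end the scan: the search continues down the trie and only the longest complete word is kept, so "计算机" survives as one token even though each of its characters is also a dictionary entry.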