GitHub - tnantoka/TinySegmenter.m: Super compact Japanese tokenizer in Objective-C

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
forRegexKitLite		forRegexKitLite
LICENCE.txt		LICENCE.txt
README		README
TinySegmenter.h		TinySegmenter.h
TinySegmenter.m		TinySegmenter.m

Repository files navigation

TinySegmenter.m -- Super compact Japanese tokenizer in Objective-C

HOW TO USE

	1. CocoaOniguruma
		Add all .h, .c and .m files of under "Classes".
		( For details, see http://github.com/psychs/cocoaoniguruma )

	2. TinySegmenter
		Add TinySegmenter.h and TinySegmenter.m files  under "Classes".
		Import the header file.
		    #import "TinySegmenter.h"
		
		# If use CocoaOniguruma as a Framework
			
			TinySegmenter.h
				// #import "OnigRegexp.h" <- comment out
				#import "CocoaOniguruma/OnigRegexp.h" <- uncomment

	3. Test
		TinySegmenter* segmenter = [ [ TinySegmenter alloc ] init ];
		NSArray* segs = [ segmenter segment: @"これはテストですよ" ];
		NSLog(@"%@", [ segs componentsJoinedByString: @"|" ]);
		// これ|は|テスト|です|よ


* for RegexKitLite

	1. RegexKitLite
		Add RegexKitLite.h and RegexKitLite.m under "Classes".
		Add the linker option "-licucore".
		( For details, see http://regexkit.sourceforge.net/RegexKitLite/ )

	2. TinySegmenter
		Add forRegexKitLite/TinySegmenter.h and forRegexKitLite/TinySegmenter.m under "Classes".
			：
			：
			：