izenecloud/ilplib


Language processing libraries

A wrapper library for multi-language text processing.

Features

  • Language identifier. The language identifier library recognizes the language of a text. The corresponding technical report can be accessed here.

  • Multi-language analyzer. This wraps icma and ijma; the language identifier decides which analyzer to use. The corresponding technical report can be accessed here.

  • Tokenizers. We also provide several utility tokenizers for Chinese verticals, because vertical search portals often have different tokenization requirements.
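
To illustrate how a language identifier can pick between icma and ijma, here is a minimal, self-contained sketch. It is not ilplib's API: `Language`, `decode_utf8`, `detect_language`, and `select_analyzer` are hypothetical names, and the script-range heuristic (any kana means Japanese, otherwise any Han ideograph means Chinese) is an assumption chosen for brevity. The actual identifier may work quite differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical language tags; not part of ilplib.
enum class Language { Chinese, Japanese, Other };

// Decode a UTF-8 string into code points; malformed bytes are skipped.
std::vector<char32_t> decode_utf8(const std::string& s) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (c < 0x80) { cp = c; }
        else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; extra = 1; }
        else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; extra = 2; }
        else if ((c & 0xF8) == 0xF0) { cp = c & 0x07; extra = 3; }
        else { ++i; continue; }            // invalid lead byte
        if (i + extra >= s.size()) break;  // truncated sequence at end
        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (b & 0x3F);
        }
        if (ok) { out.push_back(cp); i += extra + 1; }
        else { ++i; }
    }
    return out;
}

// Assumed heuristic: Hiragana/Katakana (U+3040..U+30FF) implies Japanese;
// otherwise CJK Unified Ideographs (U+4E00..U+9FFF) imply Chinese.
Language detect_language(const std::string& utf8) {
    bool has_han = false;
    for (char32_t cp : decode_utf8(utf8)) {
        if (cp >= 0x3040 && cp <= 0x30FF) return Language::Japanese;
        if (cp >= 0x4E00 && cp <= 0x9FFF) has_han = true;
    }
    return has_han ? Language::Chinese : Language::Other;
}

// Map the detected language to the name of the analyzer that would handle it.
std::string select_analyzer(Language lang) {
    switch (lang) {
        case Language::Chinese:  return "icma";
        case Language::Japanese: return "ijma";
        default:                 return "none";
    }
}
```

A caller would run `select_analyzer(detect_language(text))` on UTF-8 input and hand the text to the chosen analyzer. Checking kana before Han matters because Japanese text usually mixes both scripts.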

Dependencies

We recently switched to C++11 for SF1R, so GCC 4.8 is required to build SF1R. We do not recommend building the project on Ubuntu because of nested references among many of the libraries; CentOS, Red Hat, Gentoo, and CoreOS are the preferred platforms. You also need CMake and Boost 1.56 to build the repository. The dependent repositories are:

  • cmake: The cmake modules required to build all iZENECloud C++ projects.

  • icma: The Chinese morphological analyzer.

  • ijma: The Japanese morphological analyzer.

License

The project is published under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
