pavpen/gogo_lemmatizer

Lemmatizer from AOT group (http://lemmatizer.org/) with *just* autotools added.
... and patched to compile under CygWin. [--Pav]


INSTALLATION:
=============

# If you do not have a ./configure file (e.g. in a fresh git checkout), generate it with the autotools first:
$ aclocal
$ automake --add-missing --force
$ autoconf -f -i -v
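
The bootstrap step above is only needed when ./configure is missing, so it can
be wrapped in a guard.  This is a minimal sketch; the function names
`needs_bootstrap` and `bootstrap_if_needed` are made up for illustration and
are not part of the repository.

```shell
#!/bin/sh
# needs_bootstrap and bootstrap_if_needed are hypothetical helper names,
# not part of the repository.

# Succeeds when the given source directory (default: .) has no configure script.
needs_bootstrap() {
    [ ! -f "${1:-.}/configure" ]
}

# Runs the three autotools commands from above, stopping at the first failure.
# Does nothing when ./configure already exists.
bootstrap_if_needed() {
    if needs_bootstrap .; then
        aclocal &&
        automake --add-missing --force &&
        autoconf -f -i -v
    fi
}

# Usage, from the top of the source tree:
#   bootstrap_if_needed && ./configure
```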


GNU / Linux:
-----------

$ ./configure
$ make
$ sudo make install

The lemmatizer will be installed to /usr/local/lemmatizer, with an additional symlink at /usr/lemmatizer.


CygWin:
------

To compile statically linked executables and library objects on CygWin, you
will need libpcre's shared objects for static linking.

To compile them, run your CygWin setup-x86.exe, or setup-x86_64.exe to install
the necessary packages.  Make sure you have the 'Src?' column ticked for
'libpcre1'.  Also, make sure to install autoconf, cygport, g++, zlib-devel,
libbz2-devel.

Once your packages are installed:

$ cd /usr/src/pcre-*.src # Depending on the pcre version number.
$ cygport pcre.cygport prep
$ cygport pcre.cygport compile

After the libpcre build successfully completes, go back to your gogo_lemmatizer
directory.

$ ./configure --with-pcre-objs=/usr/src/pcre-*.src/pcre-*.x86_64/build/.libs/ # Depending on the pcre version number.

$ make

Then, you should be able to do:

# make install

See the Run-time Configuration section for how to make sure a binary can find
its dictionary files and DLLs.


EXAMPLES:
=========

You can find usage examples in the examples directory.
 - examples/c/firstform.c - example of using pure C lemmatizer (poor interface actually)
 - examples/cpp/lemmatize.cpp - example of using LemInterface.

To build the examples, you may use their Makefiles.
Building these examples requires that lemmatizer is _already installed_.

 - examples/cpp/lemmatize2 - Modification of lemmatize.cpp to process words
   from a file stream, and output JSON.

This example does not require that lemmatizer is installed, or that you have
`pkg-config`.  However, you need to set the 'RML' environment variable, or run
from an appropriate working directory, and make sure the linker can find your
shared libraries.  See Run-time Configuration.


Run-time Configuration
======================

The lemmatizer library searches for the <Dicts> directory to load the lemma
dictionaries (compiled during the build process).  To run an executable linked
against the lemmatizer library, you can set the 'RML' environment variable to
the lemmatizer path that contains the <Dicts> directory.  If 'RML' is not set,
a hard-coded list of directory paths is tried (see `getRMLDirectory()` in
<Source/common/utilit.cpp>):

	./lemmatizer
	/usr/local/lemmatizer
	/usr/lemmatizer


Shared Libraries
----------------

Make sure your dynamic linker can find the lemmatizer shared libraries.  If you
did not install them in a directory searched by your OS's linker, add the path
of the <Bin> directory to LD_LIBRARY_PATH on Unix and Linux systems, and to
your PATH on Windows systems.
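
As a rough illustration, the following snippet prepends the <Bin> directory to
the appropriate variable for the current platform.  `LEMMATIZER_BIN` is a
made-up variable name, and its default value is only an assumption about where
the <Bin> directory lives; adjust it to your build or install location.

```shell
#!/bin/sh
# LEMMATIZER_BIN is a hypothetical variable; the default path below is an
# assumption, not something the build is known to produce.
LEMMATIZER_BIN="${LEMMATIZER_BIN:-/usr/local/lemmatizer/Bin}"

case "$(uname -s)" in
    CYGWIN*|MINGW*|MSYS*)
        # Windows-style environments locate DLLs through PATH.
        PATH="$LEMMATIZER_BIN:$PATH"
        export PATH
        ;;
    *)
        # Unix and Linux: prepend, keeping any existing value.
        LD_LIBRARY_PATH="$LEMMATIZER_BIN${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
        export LD_LIBRARY_PATH
        ;;
esac
```

Source this snippet in the shell that will launch the executable, so that the
exported variable is visible to it.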


Running CygWin-Compiled Binaries Outside CygWin
-----------------------------------------------

The binaries produced by CygWin's GCC will depend on a number of CygWin DLLs to
run.  You can add your CygWin's <bin> directory to your PATH environment
variable to find them, or ship them in the same directory as your executables.

You will likely need at least the following (version numbers may vary):

	cyggcc_s-seh-1.dll cygiconv-2.dll cygpcre-1.dll cygstdc++-6.dll
	cygwin1.dll
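
If you choose to ship the DLLs next to your executables, the copying can be
scripted.  The sketch below uses a hypothetical helper, `bundle_dlls`, and
assumes the DLLs live in CygWin's /usr/bin; the exact set of DLLs your binaries
need may differ from the list above.

```shell
#!/bin/sh
# bundle_dlls is a hypothetical helper, not part of the repository.
# Usage: bundle_dlls SRC_DIR DEST_DIR DLL...
# Copies each named DLL from SRC_DIR into DEST_DIR and reports missing ones.
bundle_dlls() {
    src=$1
    dest=$2
    shift 2
    mkdir -p "$dest" || return 1
    for dll in "$@"; do
        if [ -f "$src/$dll" ]; then
            cp "$src/$dll" "$dest/" || return 1
        else
            echo "missing: $src/$dll" >&2
        fi
    done
}

# Example, run from a CygWin shell (the ./dist directory is an assumption):
#   bundle_dlls /usr/bin ./dist cyggcc_s-seh-1.dll cygiconv-2.dll \
#       cygpcre-1.dll cygstdc++-6.dll cygwin1.dll
```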


DOCUMENTATION:
==============
Here are links to the installation notes:
1. Morphology: Docs/Morph_UNIX.txt
2. Syntax: Docs/Syntax_UNIX.txt
3. Concordance: Docs/DDC_UNIX.txt
