Tagger and lemmatizer HOWTO

Installation

> git clone https://github.com/ufal/morphodita
> cd morphodita/src/
> vim Makefile.builtem
-  C_FLAGS += -std=c++11 -W -Wall -mtune=generic -msse -msse2 -mfpmath=sse -fvisibility=hidden -U_FORTIFY_SOURCE
+  C_FLAGS += -std=c++11 -W -Wall -march=native -fvisibility=hidden -U_FORTIFY_SOURCE
> make

Models

Run tagger

echo "Červený střízlíček a střapatá žluva ďobali šťavnaté ocúny" \
| ./run_tagger czech-morfflex-pdt-131112-raw_lemmas.tagger-best_accuracy

Run lemmatizer

echo "Červený střízlíček a střapatá žluva ďobali šťavnaté ocúny." \
| ./run_tagger --input=untokenized --output=vertical \
czech-morfflex-pdt-131112-pos_only-raw_lemmas.tagger 2>/dev/null \
| cut -f 2 | tr "\n" " "
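The post-processing at the end of that pipeline can be wrapped in a small helper. This is only a sketch: the function name lemmas_from_vertical is made up for illustration, and it assumes the vertical output is tab-separated with the lemma in the second column, which is what the cut call in the pipeline implies.

```shell
# Extract the lemma column from vertical output read on stdin.
# Assumption: columns are tab-separated and the lemma is column 2.
lemmas_from_vertical() {
    # Keep column 2 only, then join the lines with spaces.
    cut -f 2 | tr '\n' ' '
}

# Usage with placeholder rows (not real tagger output):
# printf 'form1\tlemma1\ttag1\nform2\tlemma2\ttag2\n' | lemmas_from_vertical
```

Empty lines in the input pass through cut unchanged, so they show up as extra spaces in the joined result.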

Problems

Loading a large model takes several seconds, but the tagging itself is very fast. Newer versions include a REST server, so the model can be loaded once and then serve multiple requests.
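Without the server, the load time can still be amortized by sending many inputs through a single tagger process instead of starting one process per sentence. The following is a minimal sketch of that idea, not part of MorphoDiTa: the TAGGER_CMD variable and the tag_batch function are hypothetical names, and the default command simply reuses the invocation from the "Run tagger" section.

```shell
# Command used to start the tagger; TAGGER_CMD is a hypothetical variable,
# override it to match your build directory and model path.
TAGGER_CMD=${TAGGER_CMD:-./run_tagger czech-morfflex-pdt-131112-raw_lemmas.tagger-best_accuracy}

# Tag several input files with one tagger process, so the model is loaded once.
tag_batch() {
    # Concatenate all files given as arguments and pipe them to the tagger.
    # $TAGGER_CMD is left unquoted on purpose so it splits into command and arguments.
    cat -- "$@" | $TAGGER_CMD
}

# Usage (file names are placeholders):
# tag_batch part1.txt part2.txt > tagged.txt
```

Because of the unquoted expansion, this sketch only works when the model path contains no spaces.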

 