LX-UTagger


LX-UTagger is a part-of-speech (POS) tagger for Portuguese that adopts the Universal Part-of-Speech tagset (UPOS) associated with the Universal Dependencies framework, with an initial performance of 99.06% under a ten-fold cross-validation scheme.
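For readers unfamiliar with the evaluation scheme mentioned above, the following is a minimal sketch of ten-fold cross-validation for a tagger. It assumes that the reported figure is token-level accuracy averaged over the folds, which this record does not state explicitly. The function names, the toy corpus and the most-frequent-tag baseline are all hypothetical stand-ins for illustration; none of them is part of LX-UTagger or of the CINTIL-UPos corpus.

```python
from collections import Counter, defaultdict


def k_fold_splits(n_items, k=10):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


def train_baseline(sentences):
    """Hypothetical stand-in tagger: store the most frequent tag of each word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def token_accuracy(model, sentences, default_tag="NOUN"):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    correct = 0
    total = 0
    for sentence in sentences:
        for word, tag in sentence:
            if model.get(word.lower(), default_tag) == tag:
                correct += 1
            total += 1
    return correct / total if total else 0.0


def cross_validate(sentences, k=10):
    """Train on k-1 folds, score on the held-out fold, and average the scores."""
    scores = []
    for train_idx, test_idx in k_fold_splits(len(sentences), k):
        model = train_baseline([sentences[i] for i in train_idx])
        held_out = [sentences[i] for i in test_idx]
        scores.append(token_accuracy(model, held_out))
    return sum(scores) / len(scores)


# Toy corpus of (word, UPOS tag) pairs, invented for this example.
toy_corpus = [[("o", "DET"), ("gato", "NOUN"), ("dorme", "VERB")]] * 10
mean_accuracy = cross_validate(toy_corpus, k=10)
print(f"mean accuracy over 10 folds: {mean_accuracy:.4f}")
```

Each sentence is held out exactly once, so every token contributes to the final score while never being seen by the model that tags it.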

It is described in this article:

António Branco, João Ricardo Silva, Luís Gomes and João Rodrigues, 2022, "Universal Grammatical Dependencies for Portuguese with CINTIL Data, LX Processing and CLARIN support", In Proceedings, 13th Conference on Language Resources and Evaluation (LREC2022).

which should be used as its canonical citation, and to which interested users are referred for detailed information.

This tagger is trained on its companion CINTIL-UPos corpus, which contains around 1 million manually annotated tokens and can be obtained here: https://hdl.handle.net/21.11129/0000-000E-8B30-F.

You may also be interested in the following related resources that can also be found in this repository: LX-USuite (https://hdl.handle.net/21.11129/0000-000F-327C-E), LX-UDParser (https://hdl.handle.net/21.11129/0000-000E-8B31-E), LX-Suite (https://hdl.handle.net/21.11129/0000-000E-5991-A), LX-Tagger (https://hdl.handle.net/21.11129/0000-000B-D325-D), LX-DepParser (https://hdl.handle.net/21.11129/0000-000E-598D-0), LX-Parser (https://hdl.handle.net/21.11129/0000-000E-5999-2).

Identifier
PID https://hdl.handle.net/21.11129/0000-000E-8B2F-2
Metadata Access https://portulanclarin.net/repository/oaipmh/?verb=GetRecord&metadataPrefix=olac&identifier=8209ff367fd011ec9b5802420a87011bc00c66db3fe94c6b9f0ce1d040550447
Provenance
Publisher CLARIN
Contributor António Branco, antonio.branco[at]di.fc.ul.pt, University of Lisbon, Faculty of Sciences
Publication Year 2022
Rights CC-BY-NC-ND; Restrictions of Use: academic-nonCommercialUse, attribution, noDerivatives; User Nature: academic
OpenAccess true
Contact https://portulanclarin.net/contact/
Representation
Resource Type Software
Discipline Linguistics