NeMo Punctuation and Capitalisation service RSDO-DS2-P&C-API 1.0

Punctuation and capitalisation service for NeMo models. For more details about building such models, see the official NVIDIA NeMo documentation (https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/punctuation_and_capitalization.html) and the NVIDIA NeMo GitHub repository (https://github.com/NVIDIA/NeMo). A model for restoring punctuation and capitalisation in lowercased, unpunctuated Slovene text can be downloaded from http://hdl.handle.net/11356/1735.

The service accepts as input either a single string or a list of strings for which punctuation and capitalisation should be restored. The result is returned in the same format as the request: a single string or a list of strings. The maximum accepted text length is 5000 characters. Note that on a CPU, restoring punctuation and capitalisation for one 5000-character text block uses all available cores and may take about 30 s (on a system with 24 vCPUs). See the service README.md for further details.
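Because of the 5000-character limit, longer texts have to be split on the client side before they are submitted. The following is a minimal Python sketch of one way to do that, not part of the service itself: the names `MAX_CHARS` and `split_text` are hypothetical, and the sketch assumes that breaking at whitespace (and collapsing runs of whitespace into single spaces) is acceptable for the input. The actual request format and endpoint are described in the service README.md and are not reproduced here.

```python
from typing import List

MAX_CHARS = 5000  # limit taken from the service description above


def split_text(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters.

    Chunks are broken at whitespace; runs of whitespace are collapsed
    into single spaces (an assumption of this sketch).
    """
    chunks: List[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit is cut into hard slices.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            # The word does not fit; close the current chunk and start a new one.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

The resulting list can be sent as a list of strings, so the response comes back as a list of the same shape and can be joined with spaces. Note that each chunk is punctuated independently, so sentence boundaries that fall on a chunk edge may be restored less reliably.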

Identifier
PID http://hdl.handle.net/11356/1738
Related Identifier https://rsdo.slovenscina.eu/en/speech-technologies
Related Identifier https://github.com/clarinsi/Slovene_punctuator
Metadata Access http://www.clarin.si/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:www.clarin.si:11356/1738
Provenance
Creator Lebar Bajec, Iztok; Bajec, Marko; Bajec, Žan
Publisher Faculty of Computer and Information Science, University of Ljubljana
Publication Year 2022
Rights Apache License 2.0; https://opensource.org/licenses/Apache-2.0; PUB
OpenAccess true
Contact info(at)clarin.si
Representation
Resource Type toolService
Format text/plain; charset=utf-8; application/octet-stream; downloadable_files_count: 1
Discipline Linguistics