Dataset - B2FIND

xLiMe Twitter Corpus XTC 1.0.1

The xLiMe Twitter Corpus contains tweets in German, Italian and Spanish manually annotated with part-of-speech, named entities, and message-level sentiment polarity. In total,...

The sentiment corpus of parliamentary debates ParlaSent-BCS v1.0

The dataset consists of mid-length sentences from the Bosnian, Croatian and Serbian parliamentary proceedings, annotated with a 6-level sentiment schema (defined below). The...

Manually sentiment annotated Slovenian news corpus SentiNews 1.0

Between 2 and 6 annotators independently annotated the sentiment of a stratified random sample of 10,427 documents from the Slovenian news portals 24ur, Dnevnik, Finance, Rtvslo, and...

Automatically sentiment annotated Slovenian news corpus AutoSentiNews 1.0

The corpus contains 256,567 documents from the Slovenian news portals 24ur, Dnevnik, Finance, Rtvslo, and Žurnal24. These portals contain political, business, economic and...

Emoji Sentiment Ranking 1.0

A lexicon of 751 emoji characters with automatically assigned sentiment. The sentiment is computed from 70,000 tweets, labeled by 83 human annotators in 13 European languages....

Sentiment Annotated Dataset of Croatian News

We present a collection of sentiment annotations for news articles (article links) in the Croatian language. A set of 2025 news articles was gathered from 24sata, one of the leading...

Twitter sentiment for 15 European languages

The dataset contains over 1.6 million tweets (tweet IDs), labeled with sentiment by human annotators. There are 15 Twitter corpora for the corresponding 15 European languages....

EMBEDDIA tools output example corpus of Estonian, Croatian and Latvian news a...

This dataset contains articles from EMBEDDIA Media partners with various information added by the tools developed within the EMBEDDIA project: - 12,390 Estonian articles from...

Slovene corpus for aspect-based sentiment analysis - SentiCoref 1.0

The SentiCoref 1.0 corpus consists of 837 documents selected from the SentiNews 1.0 corpus (http://hdl.handle.net/11356/1110). The documents were selected based on the number of...

9 datasets found