Sarcastic Soulmates: Intimacy and irony markers in social media messaging

We study the use of sarcasm on Twitter and show that a computer has more difficulty detecting sarcasm shared among peers than sarcasm shared with any interested audience. This data set contains the data used to train the machine learning classifiers, along with annotations of their output.

  • Usercategory (and User) indicates whether the element is a feature of the classifier based on USER-tweets or on NOUSER-tweets.
  • Featurerankwithincategory indicates the importance of the feature for the respective classifier.
  • Frequencyelementwithinusertweets indicates how often the feature was observed in the complete set of USER-tweets.
  • Frequencyelementwithinnouser indicates how often the feature was observed in the complete set of NOUSER-tweets.
  • Totalamountofmarkers is the sum of the irony markers (Hyperbole, Interjections, Repetition, Hashtag, Capitals, Punctuation Marks and Emoticons).
  • The value of each marker indicates whether the marker is present: 1 indicates presence and 0 indicates absence. The marker ‘polarity’ is an exception: a value of 0 indicates no evaluation, 1 indicates a negative polarity of the evaluation and 2 indicates a positive polarity of the evaluation.

Radboud University supplied the 'top_features_annotations' file in .xlsx format. For preservation purposes, DANS added the .csv format.
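
The marker coding described above can be illustrated with a minimal Python sketch. It decodes one annotation row, checks that Totalamountofmarkers equals the sum of the seven binary markers, and maps the polarity code to a label. The column headers, the `decode_row` helper and the sample row are assumptions made for illustration; the actual headers in the 'top_features_annotations' .csv file may be spelled differently, and the sample values are not taken from the data set.

```python
import csv
import io

# Assumed column headers; the real CSV may spell these differently.
MARKER_COLUMNS = [
    "Hyperbole", "Interjections", "Repetition", "Hashtag",
    "Capitals", "Punctuation Marks", "Emoticons",
]
TOTAL_COLUMN = "Totalamountofmarkers"
POLARITY_COLUMN = "Polarity"

# Polarity is the one marker that is not binary.
POLARITY_LABELS = {0: "no evaluation", 1: "negative", 2: "positive"}


def decode_row(row):
    """Decode one annotation row (hypothetical helper, not part of the data set)."""
    markers = {name: int(row[name]) for name in MARKER_COLUMNS}
    for name, value in markers.items():
        if value not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1, got {value}")
    return {
        "markers_present": [n for n, v in markers.items() if v == 1],
        # The total covers only the seven binary markers, not polarity.
        "total_matches": sum(markers.values()) == int(row[TOTAL_COLUMN]),
        "polarity": POLARITY_LABELS[int(row[POLARITY_COLUMN])],
    }


# Made-up sample row, for illustration only.
SAMPLE = (
    "Hyperbole,Interjections,Repetition,Hashtag,Capitals,"
    "Punctuation Marks,Emoticons,Totalamountofmarkers,Polarity\n"
    "1,0,0,1,0,1,0,3,1\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
decoded = decode_row(rows[0])
print(decoded)
```

To run the same check against the deposited file, `io.StringIO(SAMPLE)` could be replaced with an opened file handle, after adjusting the assumed column names to match the actual headers.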
Identifier
DOI https://doi.org/10.17026/dans-24j-68qr
Metadata Access https://ssh.datastations.nl/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.17026/dans-24j-68qr
Provenance
Creator FA Kunneman
Publisher DANS Data Station Social Sciences and Humanities
Contributor Florian Kunneman; K Hallmann (Radboud University)
Publication Year 2016
Rights CC0 1.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/publicdomain/zero/1.0
OpenAccess true
Contact Florian Kunneman (Radboud University)
Representation
Resource Type Dataset
Format application/octet-stream; application/zip; text/plain; charset=US-ASCII; text/plain; application/pdf; text/csv; application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Size 158; 152; 19148; 4936; 6987481; 2557729; 548209; 191384; 163587; 18697; 18210
Version 2.0
Discipline Humanities