FabulaNet Literary Quality Dataset

Dataset

A dataset designed to study the quality of literary fiction as a multidimensional construct, using several proxies for reception and assessment (e.g. Goodreads scores, library holdings, selection for award longlists, and presence in canonical anthologies). The dataset contains metadata and dozens of linguistic features for more than 9,000 contemporary novels, making it suitable for studies of literary quality and reception. An extensive description of the measures annotated in the corpus is available on GitHub: https://github.com/centre-for-humanities-computing/chicago_corpus/blob/main/data/corpus_description.md

Identifier
PID	http://hdl.handle.net/20.500.12115/56
Related Identifier	https://aclanthology.org/2024.lrec-main.71/
Related Identifier	https://centre-for-humanities-computing.github.io/fabula-net/
Metadata Access	http://repository.clarin.dk/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:repository.clarin.dk:20.500.12115/56
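
The Metadata Access link above is an OAI-PMH GetRecord request that returns this record in Dublin Core (oai_dc). The following is a minimal sketch of how such a request URL could be assembled from the handle and how the Dublin Core fields in a response could be read. The function names and SAMPLE_RESPONSE are hypothetical: the sample is a hand-written fragment for illustration, not the repository's actual output, and no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Endpoint taken from the Metadata Access link in this record.
OAI_ENDPOINT = "http://repository.clarin.dk/repository/oai/request"
# Standard Dublin Core element namespace, in ElementTree's {uri} notation.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def build_get_record_url(handle: str) -> str:
    """Assemble an OAI-PMH GetRecord URL for a handle such as '20.500.12115/56'."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": "oai_dc",
        "identifier": f"oai:repository.clarin.dk:{handle}",
    }
    # Leave ':' and '/' unescaped so the URL matches the form shown in the record.
    return f"{OAI_ENDPOINT}?{urlencode(params, safe=':/')}"


def parse_dublin_core(xml_text: str) -> dict[str, list[str]]:
    """Collect every non-empty Dublin Core element as {field name: [values]}."""
    root = ET.fromstring(xml_text)
    fields: dict[str, list[str]] = {}
    for elem in root.iter():
        if elem.tag.startswith(DC_NS) and elem.text and elem.text.strip():
            name = elem.tag[len(DC_NS):]
            fields.setdefault(name, []).append(elem.text.strip())
    return fields


# Hypothetical, hand-written fragment used only to exercise the parser.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord><record><metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>FabulaNet Literary Quality Dataset</dc:title>
      <dc:date>2024</dc:date>
    </oai_dc:dc>
  </metadata></record></GetRecord>
</OAI-PMH>
"""

if __name__ == "__main__":
    print(build_get_record_url("20.500.12115/56"))
    print(parse_dublin_core(SAMPLE_RESPONSE))
```

To work with the live record, the URL returned by build_get_record_url could be fetched with any HTTP client and the response body passed to parse_dublin_core; which fields the server actually returns is not specified in this record.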

Provenance
Creator	Bizzoni, Yuri; Feldkamp, Pascale; Nielbo, Kristoffer; Lassen, Ida-Marie; Thomsen, Mads
Publisher	University Århus
Publication Year	2024
Rights	Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0); http://creativecommons.org/licenses/by-nc-nd/4.0/; PUB
OpenAccess	true
Contact	info(at)clarin.dk

Representation
Language	English
Resource Type	corpus
Format	application/vnd.openxmlformats-officedocument.spreadsheetml.sheet; text/plain; charset=utf-8; downloadable_files_count: 1
Discipline	Linguistics