Carnatic varnam dataset

Audio music content----- The recordings feature 7 varnams in 7 raagas, sung by 5 young professional singers who have each received more than 15 years of training. All are set to Adi taala. Measuring intonation variations requires absolutely clean pitch contours, so all the varnams were recorded without accompanying instruments, except the drone.

Taala annotations----- The recordings are annotated with taala cycles, each annotation marking the start of a cycle. Each cycle was later divided automatically into 8 equal parts. The annotations are made available as Sonic Visualiser annotation layers. Each annotation has the format m.n, where m is the cycle number and n is the division within the cycle. All m.1 annotations were made manually, whereas m.[2-8] were labelled automatically.
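The equal division described above can be sketched as follows. This is a minimal illustration, not the script used to build the dataset: the function name `subdivide_cycles`, the input as a plain list of cycle start times in seconds, and the choice to treat the last timestamp only as the end boundary of the preceding cycle are all assumptions.

```python
def subdivide_cycles(cycle_starts, parts=8):
    """Split each taala cycle into `parts` equal divisions.

    cycle_starts: sorted start times (seconds) of consecutive cycles,
    i.e. the manually annotated m.1 positions (hypothetical input form).
    Returns a list of (time, label) pairs with labels of the form "m.n".
    The final timestamp is used only to close the previous cycle.
    """
    labelled = []
    pairs = zip(cycle_starts, cycle_starts[1:])
    for m, (start, end) in enumerate(pairs, start=1):
        step = (end - start) / parts
        for n in range(1, parts + 1):
            # n = 1 reproduces the manual annotation; n = 2..8 are derived.
            labelled.append((start + (n - 1) * step, f"{m}.{n}"))
    return labelled


# Two cycles of 8 seconds each, closed by a third timestamp.
marks = subdivide_cycles([0.0, 8.0, 16.0])
```

With these inputs, `marks` holds 16 entries, from `(0.0, "1.1")` through `(15.0, "2.8")`.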

Notations----- The notations for the 7 varnams were obtained from an archive curated by Shivkumar, in Word document format. They were manually converted to a machine-readable format (YAML). Each file is essentially a dictionary with the section names of the composition as keys. Each section is represented as a list of cycles, and each cycle in turn holds a list of divisions.
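The nesting described above (sections, then cycles, then divisions) can be walked with a short generator. This is a sketch under stated assumptions: the section names, the svara strings used as division contents, and the per-section cycle numbering are hypothetical, and a plain Python dictionary stands in for the result of loading one of the YAML files.

```python
def iter_divisions(notation):
    """Yield (section, "m.n", content) for every division in a notation.

    notation: dict mapping section name -> list of cycles,
    where each cycle is a list of divisions.
    Cycle numbers m restart at 1 in each section (an assumption).
    """
    for section, cycles in notation.items():
        for m, divisions in enumerate(cycles, start=1):
            for n, content in enumerate(divisions, start=1):
                yield section, f"{m}.{n}", content


# Hypothetical stand-in for a parsed notation file (2 divisions per
# cycle here for brevity; the dataset uses 8).
example = {
    "pallavi": [["S R", "G M"], ["P D", "N S"]],
    "anupallavi": [["G M", "P D"]],
}
rows = list(iter_divisions(example))
```

Labelling divisions as m.n mirrors the format of the taala annotations, which may make it easier to pair notation with audio positions, although how the two numbering schemes correspond should be checked against the actual files.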

Possible uses of the dataset----- The distinct advantage of this dataset is the free availability of the audio content. Together with the annotations, it can be used for melodic analyses: characterizing intonation, motif discovery and tonic identification. The availability of machine-readable notation files also allows the dataset to be used for audio-score alignment.

The Carnatic varnam dataset is a collection of 28 solo vocal recordings, made for our research on intonation analysis of Carnatic raagas. The collection contains the audio recordings, taala cycle annotations and notations in a machine-readable format.

Identifier
DOI	https://doi.org/10.34810/data457
Metadata Access	https://dataverse.csuc.cat/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.34810/data457

Provenance
Creator	CompMusic
Publisher	CORA. Repositori de Dades de Recerca
Publication Year	2023
Funding Reference	European Commission EC/FP7/267583
Rights	Custom Dataset Terms; info:eu-repo/semantics/openAccess; https://dataverse.csuc.cat/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.34810/data457
OpenAccess	true

Representation
Resource Type	Other; Dataset
Format	audio/mpeg; text/plain
Size	5927114; 15084063; 5768288; 4605526; 5310623; 19518610; 6704935; 10855269; 9195892; 13775851; 7124148; 7293839; 7009038; 6340475; 15526055; 6065875; 7401674; 7011299; 4994228; 7914509; 10306311; 8623053; 14057973; 7621519; 7170541; 10521460; 7828409; 9960837; 4804; 140941
Version	1.0
Discipline	Fine Arts, Music, Theatre and Media Studies; Humanities; Music