How2Sign: a large-scale multimodal dataset for continuous American Sign Language

Dataset

How2Sign consists of a parallel corpus of 80 hours of sign language videos (collected with multi-view RGB and depth sensor data) with corresponding speech transcriptions and gloss annotations. In addition, a three-hour subset was further recorded in a geodesic dome setup using hundreds of cameras and sensors, which enables detailed 3D reconstruction and pose estimation and paves the way for vision systems to understand the 3D geometry of sign language.

The videos were selected from the existing How2 dataset:

Ramon Sanabria, Ozan Caglayan, Shruti Palaskar, Desmond Elliott, Loïc Barrault, Lucia Specia, and Florian Metze. How2: a large-scale dataset for multimodal language understanding. arXiv preprint arXiv:1811.00347, 2018.

https://github.com/srvk/how2-dataset

A script is provided for users who want to download the files directly to their own servers (preferably via wget): https://github.com/how2sign/how2sign.github.io/blob/main/download_how2sign.sh
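Note that the link above points to the GitHub page that displays the script, so fetching it as-is with wget would most likely save an HTML page rather than the shell script. The sketch below shows one way to derive the raw-file URL from it. The helper name github_blob_to_raw is hypothetical, and the rewrite relies on GitHub's usual raw.githubusercontent.com layout and on the branch name ("main") containing no slash; neither is stated in this record.

```python
from urllib.parse import urlsplit, urlunsplit

# URL of the download script exactly as given in this record.
BLOB_URL = "https://github.com/how2sign/how2sign.github.io/blob/main/download_how2sign.sh"


def github_blob_to_raw(url: str) -> str:
    """Rewrite a github.com 'blob' page URL into its raw-file URL.

    Hypothetical helper; assumes the branch/tag name contains no '/'.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected path shape: <owner>/<repo>/blob/<ref>/<path to file>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {url}")
    owner, repo, _, ref, *file_path = segments
    raw_path = "/" + "/".join([owner, repo, ref, *file_path])
    return urlunsplit(("https", "raw.githubusercontent.com", raw_path, "", ""))


RAW_URL = github_blob_to_raw(BLOB_URL)
print(RAW_URL)
```

The printed URL can then be passed to wget on the target server. Reading the script before running it is advisable, since its options and behavior are not described in this record.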

Identifier
DOI	https://doi.org/10.34810/data33
Metadata Access	https://dataverse.csuc.cat/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.34810/data33
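The Metadata Access link is an OAI-PMH GetRecord request. As a minimal sketch, the same URL can be assembled from the DOI as follows; the function name oai_get_record_url is hypothetical, and only the endpoint, verb, metadata prefix and identifier format visible in the link above are used.

```python
from urllib.parse import urlencode

# Endpoint taken from the Metadata Access link in this record.
OAI_ENDPOINT = "https://dataverse.csuc.cat/oai"


def oai_get_record_url(doi: str, metadata_prefix: str = "oai_datacite") -> str:
    """Build an OAI-PMH GetRecord URL for a DOI (hypothetical helper)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": f"doi:{doi}",
        },
        safe=":/",  # keep ':' and '/' unescaped, matching the link in this record
    )
    return f"{OAI_ENDPOINT}?{query}"


record_url = oai_get_record_url("10.34810/data33")
print(record_url)
```

Requesting this URL should return the record as XML in the oai_datacite format, which can be parsed with any standard XML library.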

Provenance
Creator	Cardoso Duarte, Amanda (ORCID: 0000-0002-9340-958X); Giró Nieto, Xavier; Palaskar, Shruti; Ghadiyaram, Deepti; Haan, Kenneth de; Metze, Florian; Torres Viñals, Jordi
Publisher	CORA.Repositori de Dades de Recerca
Contributor	Giró-i-Nieto, Xavier; 520(MH); 160(RI)
Publication Year	2024
Funding Reference	AGAUR. Agència de Gestió d'Ajuts Universitaris i de Recerca 2017 SGR 1414; Spanish Ministry of Economy and Competitiveness TEC2016-75976-R; Spanish Secretary of State for Universities. Research, Development and Innovation TIN2015-65316-P; Spanish State Research Agency PID2019-107255GB-C22
Rights	Custom Dataset Terms; info:eu-repo/semantics/openAccess; https://dataverse.csuc.cat/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.34810/data33
OpenAccess	true
Contact	Giró-i-Nieto, Xavier (Universitat Politècnica de Catalunya)

Representation
Resource Type	Other; Dataset
Format	text/csv; text/plain; application/zip; application/octet-stream
Size	423682; 5607385; 311419; 423280; 5603209; 311131; 7810; 24941280047; 25030909928; 32212254720; 21645045293; 21437195200; 17528664599; 17158080370
Version	1.0
Discipline	Other
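
Before downloading, it can help to estimate the required disk space from the Size field. The sketch below assumes that each value in that field is the size of one file in bytes; the record does not state the unit, so treat the result as an estimate. The helper human_readable is hypothetical.

```python
# Values copied from the Size field of this record (assumed to be bytes per file).
SIZES = [
    423682, 5607385, 311419, 423280, 5603209, 311131, 7810,
    24941280047, 25030909928, 32212254720, 21645045293,
    21437195200, 17528664599, 17158080370,
]


def human_readable(num_bytes: int) -> str:
    """Format a byte count using binary units (hypothetical helper)."""
    value = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < 1024 or unit == "TiB":
            return f"{value:.2f} {unit}"
        value /= 1024


total_bytes = sum(SIZES)
print(f"{len(SIZES)} files, {total_bytes} bytes in total ({human_readable(total_bytes)})")
```

Under that assumption, the seven largest entries account for nearly all of the total, so a partial download of selected archives would need considerably less space.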