Researchdata.se

Corpus of spoken isiXhosa

https://doi.org/10.23695/XRSG-MP07

The Corpus of Spoken isiXhosa consists of transcribed and annotated recordings of spoken Xhosa [xho]. The recordings have been made in the Eastern Cape, South Africa, from 2015 onwards. The transcribed texts are annotated with morpheme-by-morpheme glosses, part-of-speech tags, and free English translations.

The recordings and annotations were made as part of three research projects led by Senior Lecturer Eva-Marie Bloom Ström at the University of Gothenburg. All projects, including the ongoing ‘How do words get in order? The role of speaker-hearer interaction in languages of southern Africa’, were funded by the Swedish Research Council. The Corpus has been developed in collaboration with Språkbanken Text.

A user guide and more extensive information about the corpus data can be found in the Corpus of Spoken isiXhosa Manual [PDF].

For more on annotation, preparation of data, and acknowledgements, see: Bloom Ström, E.-M., Slater, O., Zahran, A., Berdicevskis, A., & Schumacher, A. (2023). Preparing a corpus of spoken Xhosa. Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD), 62–67. https://aclanthology.org/2023.clasp-1.7

For questions about the corpus, contact Eva-Marie Bloom Ström at eva-marie.strom@gu.se. If you notice any errors or inconsistencies in the annotations, please report them to the same address.

Main contributors:
Eva-Marie Bloom Ström, Senior Lecturer, University of Gothenburg
Onelisa Slater, MA, Rhodes University
Aron Zahran, PhD, Inalco/Llacan (CNRS) & Ghent University



sprakbanken-text
Göteborgs universitet