SweDiagnostics
https://doi.org/10.23695/YEPN-SE26
I. IDENTIFYING INFORMATION
Title*
SuperLim Diagnostic Dataset, v1.1
Subtitle
Created by*
Felix Morger, Gothenburg University (felix.morger@gu.se)
Publisher(s)*
Språkbanken Text (sb-info@svenska.gu.se)
Link(s) / permanent identifier(s)*
https://spraakbanken.gu.se/en/resources/superlim
License(s)*
CC BY 4.0
Abstract*
Manual Swedish translation of all 1106 sentence pairs of the SuperGLUE diagnostic dataset.
Funded by*
Vinnova (grants no. 2020-02523, 2021-04165)
Cite as
Related datasets
SuperLim, SuperGLUE diagnostic dataset, FraCaS test suite
II. USAGE
Key applications
Fine-grained analysis of system performance on a broad range of linguistic phenomena.
Intended task(s)/usage(s)
Natural language inference.
Recommended evaluation measures
Krippendorff's alpha (the official SuperLim measure), Matthews' correlation coefficient.
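As an illustration of the two measures named above, the following is a minimal sketch in plain Python. It assumes that gold labels and system predictions are treated as two coders for nominal Krippendorff's alpha, and it uses the multiclass generalisation of Matthews' correlation coefficient. Whether the official SuperLim scoring computes alpha in exactly this way is not stated in this document, and the function names and the labels "A" and "B" are hypothetical.

```python
import math
from collections import Counter


def matthews_corrcoef(gold, pred):
    """Multiclass Matthews correlation coefficient (covers the binary case)."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equally long")
    s = len(gold)                                  # number of items
    c = sum(g == p for g, p in zip(gold, pred))    # correctly labelled items
    t = Counter(gold)                              # per-class gold counts
    p = Counter(pred)                              # per-class predicted counts
    numerator = c * s - sum(t[k] * p[k] for k in t)
    denominator = math.sqrt(
        (s * s - sum(v * v for v in p.values()))
        * (s * s - sum(v * v for v in t.values()))
    )
    # Assumed convention: report 0.0 when the coefficient is undefined.
    return numerator / denominator if denominator else 0.0


def krippendorff_alpha_nominal(gold, pred):
    """Nominal Krippendorff's alpha, treating gold and pred as two coders."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equally long")
    n = 2 * len(gold)                              # total number of label values
    disagreements = sum(g != p for g, p in zip(gold, pred))
    totals = Counter(gold) + Counter(pred)         # overall use of each label
    expected = n * n - sum(v * v for v in totals.values())
    if expected == 0:
        raise ValueError("alpha is undefined when only one label occurs")
    # alpha = 1 - observed disagreement / expected disagreement
    return 1.0 - (n - 1) * 2 * disagreements / expected


# Hypothetical toy data: four items, one wrong prediction.
gold = ["A", "A", "B", "B"]
pred = ["A", "A", "B", "A"]
mcc = matthews_corrcoef(gold, pred)
alpha = krippendorff_alpha_nominal(gold, pred)
print(f"MCC = {mcc:.3f}, alpha = {alpha:.3f}")
```

On this toy input the sketch yields an MCC of about 0.577 and an alpha of about 0.533. Both measures equal 1 for perfect agreement and are close to 0 for chance-level agreement.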
Dataset function(s)
Diagnostics
Recommended split(s)
Test only
III. DATA
Primary data*
Text
Language*
Swedish
Dataset in numbers*
1106 sentence pairs
Nature of the content*
Pairs of sentences annotated with their inference relation and the linguistic phenomena that account for their differences.
Format*
JSONL and TSV. Nine columns/objects: id; four columns with information about the relevant linguistic phenomena; domain; label; premise; hypothesis.
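A minimal sketch of reading the JSONL variant, assuming one JSON object per line that uses the five field names listed above as keys. The names of the four linguistic-phenomenon fields are not given in this document, so the sketch treats every other key as a phenomenon field. The keys "p1" to "p4", the helper names and all sample values are hypothetical.

```python
import json

# Field names taken from the format description above. The four
# linguistic-phenomenon fields are not named there, so they are handled
# as "every key that is not one of these five".
NAMED_FIELDS = ("id", "domain", "label", "premise", "hypothesis")


def parse_jsonl(text):
    """Parse JSONL text into a list of dicts and check the named fields."""
    records = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = [f for f in NAMED_FIELDS if f not in record]
        if missing:
            raise ValueError(f"line {line_no}: missing fields {missing}")
        records.append(record)
    return records


def phenomenon_fields(record):
    """Return the keys that are not among the five named fields."""
    return [key for key in record if key not in NAMED_FIELDS]


# Hypothetical two-record sample; keys "p1".."p4" and all values are made up.
SAMPLE = "\n".join([
    json.dumps({"id": 1, "p1": "", "p2": "", "p3": "x", "p4": "",
                "domain": "d", "label": "A",
                "premise": "Katten sover.", "hypothesis": "Ett djur sover."}),
    json.dumps({"id": 2, "p1": "y", "p2": "", "p3": "", "p4": "",
                "domain": "d", "label": "B",
                "premise": "Hon kom i dag.", "hypothesis": "Hon kom i tisdags."}),
])

records = parse_jsonl(SAMPLE)
print(len(records), phenomenon_fields(records[0]))
```

For the real files, the same function could be applied to the text of the downloaded JSONL file; the TSV variant could be read analogously with `csv.DictReader` and `delimiter="\t"`.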
Data source(s)*
SuperGLUE Diagnostic Dataset: Wang, Alex & Pruksachatkun, Yada & Nangia, Nikita & Singh, Amanpreet & Michael, Julian & Hill, Felix & Levy, Omer & Bowman, Samuel R. (2019). SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems.
Data collection method(s)*
See original source.
Data selection and filtering*
See original source.
Data preprocessing*
See original source.
Data labeling*
Some data labels (annotations) were changed to fit the Swedish examples, but in general the aim was to keep such changes to a minimum.
Annotator characteristics
IV. ETHICS AND CAVEATS
Ethical considerations
See original data source.
Things to watch out for
See original data source.
V. ABOUT DOCUMENTATION
Data last updated*
2023-03-01, v1.1
Which changes have been made, compared to the previous version*
Minor format changes
Access to previous versions
This document created*
2021-06-04, Felix Morger.
This document last updated*
2023-04-02, Aleksandrs Berdicevskis.
Where to look for further details
Documentation template version*
v1.1
VI. OTHER
Related projects
References
