<codeBook xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:schemaLocation="ddi:codebook:2_5 http://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd" xmlns="ddi:codebook:2_5">
  <docDscr>
    <citation>
      <titlStmt>
        <titl xml:lang="sv">SweDiagnostics</titl>
        <parTitl xml:lang="en">SweDiagnostics</parTitl>
        <IDNo agency="SND">doi-10-23695-yepn-se26-0</IDNo>
        <IDNo agency="DOI">https://doi.org/10.23695/YEPN-SE26</IDNo>
      </titlStmt>
      <prodStmt>
        <producer xml:lang="en" abbr="SND">Swedish National Data Service</producer>
        <producer xml:lang="sv" abbr="SND">Svensk nationell datatjänst</producer>
      </prodStmt>
      <holdings URI="https://doi.org/10.23695/YEPN-SE26">Landing page</holdings>
    </citation>
  </docDscr>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl xml:lang="sv">SweDiagnostics</titl>
        <parTitl xml:lang="en">SweDiagnostics</parTitl>
        <IDNo agency="SND">doi-10-23695-yepn-se26-0</IDNo>
        <IDNo agency="DOI">https://doi.org/10.23695/YEPN-SE26</IDNo>
      </titlStmt>
      <rspStmt />
      <prodStmt />
      <distStmt>
        <distrbtr xml:lang="en" abbr="SND" URI="https://snd.se">Swedish National Data Service</distrbtr>
        <distrbtr xml:lang="sv" abbr="SND" URI="https://snd.se">Svensk nationell datatjänst</distrbtr>
        <distDate xml:lang="en" date="2024-01-01" />
      </distStmt>
      <verStmt>
        <version elementVersion="0" elementVersionDate="2024-01-01" />
      </verStmt>
      <holdings URI="https://doi.org/10.23695/YEPN-SE26">Landing page</holdings>
    </citation>
    <stdyInfo>
      <subject />
      <abstract xml:lang="en" contentType="abstract">I. IDENTIFYING INFORMATION

Title*
SuperLim Diagnostic Dataset, v1.1

Subtitle

Created by*
Felix Morger, University of Gothenburg (felix.morger@gu.se)

Publisher(s)*
Språkbanken Text (sb-info@svenska.gu.se)

Link(s) / permanent identifier(s)*
https://spraakbanken.gu.se/en/resources/superlim 

License(s)*
CC BY 4.0

Abstract*
Manual Swedish translation of all 1106 sentence pairs of the SuperGLUE diagnostic dataset.

Funded by*
Vinnova (grants no. 2020-02523, 2021-04165)

Cite as

Related datasets
SuperLim, SuperGLUE diagnostic dataset, FraCaS test suite

II. USAGE

Key applications
Fine-grained analysis of system performance on a broad range of linguistic phenomena.

Intended task(s)/usage(s)
Natural language inference.

Recommended evaluation measures
Krippendorff's alpha (the official SuperLim measure), Matthews' correlation coefficient.

Dataset function(s)
Diagnostics

Recommended split(s)
Test only

III. DATA

Primary data*
Text

Language*
Swedish

Dataset in numbers*
1106

Nature of the content*
Pairs of sentences annotated with their inference relation and the linguistic phenomena that account for their differences

Format*
JSONL and TSV. Nine columns/objects: id; four columns with information about the relevant linguistic phenomena; domain; label; premise; hypothesis

Data source(s)*
SuperGLUE Diagnostic Dataset: Wang, Alex &amp; Pruksachatkun, Yada &amp; Nangia, Nikita &amp; Singh, Amanpreet &amp; Michael, Julian &amp; Hill, Felix &amp; Levy, Omer &amp; Bowman, Samuel. (2019). SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems.

Data collection method(s)*
See original source. 

Data selection and filtering*
See original source. 

Data preprocessing*
See original source. 

Data labeling*
Some data labels (annotations) were changed to fit the Swedish examples, but in general the aim was to keep such changes to a minimum.

Annotator characteristics

IV. ETHICS AND CAVEATS

Ethical considerations
See original data source.

Things to watch out for
See original data source.

V. ABOUT DOCUMENTATION

Data last updated*
2023-03-01, v1.1

Which changes have been made, compared to the previous version*
Minor format changes

Access to previous versions

This document created*
2021-06-04, Felix Morger.

This document last updated*
2023-04-02, Aleksandrs Berdicevskis.

Where to look for further details

Documentation template version*
v1.1

VI. OTHER

Related projects

References</abstract>
      <abstract xml:lang="sv" contentType="abstract">I. IDENTIFYING INFORMATION

Title*
SuperLim Diagnostic Dataset, v1.1

Subtitle

Created by*
Felix Morger, University of Gothenburg (felix.morger@gu.se)

Publisher(s)*
Språkbanken Text (sb-info@svenska.gu.se)

Link(s) / permanent identifier(s)*
https://spraakbanken.gu.se/en/resources/superlim 

License(s)*
CC BY 4.0

Abstract*
Manual Swedish translation of all 1106 sentence pairs of the SuperGLUE diagnostic dataset.

Funded by*
Vinnova (grants no. 2020-02523, 2021-04165)

Cite as

Related datasets
SuperLim, SuperGLUE diagnostic dataset, FraCaS test suite

II. USAGE

Key applications
Fine-grained analysis of system performance on a broad range of linguistic phenomena.

Intended task(s)/usage(s)
Natural language inference.

Recommended evaluation measures
Krippendorff's alpha (the official SuperLim measure), Matthews' correlation coefficient.

Dataset function(s)
Diagnostics

Recommended split(s)
Test only

III. DATA

Primary data*
Text

Language*
Swedish

Dataset in numbers*
1106

Nature of the content*
Pairs of sentences annotated with their inference relation and the linguistic phenomena that account for their differences

Format*
JSONL and TSV. Nine columns/objects: id; four columns with information about the relevant linguistic phenomena; domain; label; premise; hypothesis

Data source(s)*
SuperGLUE Diagnostic Dataset: Wang, Alex &amp; Pruksachatkun, Yada &amp; Nangia, Nikita &amp; Singh, Amanpreet &amp; Michael, Julian &amp; Hill, Felix &amp; Levy, Omer &amp; Bowman, Samuel. (2019). SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems.

Data collection method(s)*
See original source. 

Data selection and filtering*
See original source. 

Data preprocessing*
See original source. 

Data labeling*
Some data labels (annotations) were changed to fit the Swedish examples, but in general the aim was to keep such changes to a minimum.

Annotator characteristics

IV. ETHICS AND CAVEATS

Ethical considerations
See original data source.

Things to watch out for
See original data source.

V. ABOUT DOCUMENTATION

Data last updated*
2023-03-01, v1.1

Which changes have been made, compared to the previous version*
Minor format changes

Access to previous versions

This document created*
2021-06-04, Felix Morger.

This document last updated*
2023-04-02, Aleksandrs Berdicevskis.

Where to look for further details

Documentation template version*
v1.1

VI. OTHER

Related projects

References</abstract>
      <sumDscr />
    </stdyInfo>
    <method>
      <dataColl />
    </method>
    <dataAccs>
      <useStmt>
        <restrctn xml:lang="en">Access to data through an external actor. </restrctn>
        <restrctn xml:lang="sv">Åtkomst till data via extern aktör. </restrctn>
      </useStmt>
    </dataAccs>
    <othrStdyMat />
  </stdyDscr>
</codeBook>