<codeBook xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:schemaLocation="ddi:codebook:2_5 http://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd" xmlns="ddi:codebook:2_5">
  <docDscr>
    <citation>
      <titlStmt>
        <titl xml:lang="sv">Svensk EAT: frågeklassifikation</titl>
        <parTitl xml:lang="en">Swedish EAT: question classification</parTitl>
        <IDNo agency="SND">doi-10-23695-zgkb-s720-0</IDNo>
        <IDNo agency="DOI">https://doi.org/10.23695/ZGKB-S720</IDNo>
      </titlStmt>
      <prodStmt>
        <producer xml:lang="en" abbr="SND">Swedish National Data Service</producer>
        <producer xml:lang="sv" abbr="SND">Svensk nationell datatjänst</producer>
      </prodStmt>
      <holdings URI="https://doi.org/10.23695/ZGKB-S720">Landing page</holdings>
    </citation>
  </docDscr>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl xml:lang="sv">Svensk EAT: frågeklassifikation</titl>
        <parTitl xml:lang="en">Swedish EAT: question classification</parTitl>
        <IDNo agency="SND">doi-10-23695-zgkb-s720-0</IDNo>
        <IDNo agency="DOI">https://doi.org/10.23695/ZGKB-S720</IDNo>
      </titlStmt>
      <rspStmt>
        <AuthEnty xml:lang="en" affiliation="">Språkbanken Text</AuthEnty>
      </rspStmt>
      <prodStmt />
      <distStmt>
        <distrbtr xml:lang="en" abbr="SND" URI="https://snd.se">Swedish National Data Service</distrbtr>
        <distrbtr xml:lang="sv" abbr="SND" URI="https://snd.se">Svensk nationell datatjänst</distrbtr>
        <distDate xml:lang="en" date="2024-01-01" />
      </distStmt>
      <verStmt>
        <version elementVersion="0" elementVersionDate="2024-01-01" />
      </verStmt>
      <holdings URI="https://doi.org/10.23695/ZGKB-S720">Landing page</holdings>
    </citation>
    <stdyInfo>
      <subject />
      <abstract xml:lang="en" contentType="abstract">I. IDENTIFYING INFORMATION

Title*
Swedish EAT v1.0

Subtitle

Created by*
Jonatan Cerwall (jonatancerwall@gmail.com)

Publisher(s)*
Språkbanken Text

Link(s) / permanent identifier(s)*

License(s)*

Abstract*

      This dataset is a Swedish translation of the QAQC dataset
      (https://cogcomp.seas.upenn.edu/Data/QA/QC/) for expected-answer-type
      (EAT) classification. The labels follow the Li and Roth taxonomy, which
      is available from the same source.

Funded by*

Cite as

      Cerwall, J. (2021). What the BERT? Fine-tuning KB-BERT for Question
      Classification. Unpublished manuscript, School of Electrical Engineering
      and Computer Science, KTH.

Related datasets

II. USAGE

Key applications
Machine learning, expected-answer-type (EAT) classification

Intended task(s)/usage(s)
Evaluating models on standard classification of questions by expected answer type

Recommended evaluation measures
Accuracy
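
As an illustration of the recommended measure, here is a minimal sketch of accuracy as the share of questions whose predicted label exactly matches the gold label. The function name and the label lists are assumptions made for this example; the labels are invented and are not taken from the dataset.

```python
def accuracy(gold: list[str], predicted: list[str]) -> float:
    """Return the fraction of predictions that exactly match the gold labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty list")
    # Each exact match counts as 1, each mismatch as 0.
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Invented labels in COARSE:fine form, for illustration only.
gold = ["LOC:city", "HUM:ind", "NUM:date", "LOC:city"]
predicted = ["LOC:city", "HUM:ind", "NUM:count", "LOC:other"]
print(accuracy(gold, predicted))  # 0.5
```

The same function can be applied to the coarse labels alone, which typically yields a higher score than the fine-grained labels because there are fewer classes to confuse.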

Dataset function(s)
Testing

Recommended split(s)
Test only

III. DATA

Primary data*
Text

Language*
Swedish

Dataset in numbers*
5451 questions in training set, 500 in test set.

Nature of the content*
Open-ended factoid questions.

Format*
Comma-separated, four columns:

text -- the open-ended factoid question

verbose label -- the coarse-grained and fine-grained labels combined, formatted as COARSE:fine

coarse label -- coarse-grained label

fine label -- fine-grained label
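
The four-column layout above could be read roughly as follows. This is a sketch under stated assumptions: it assumes the columns appear in the order listed, that the file has no header row, and that fields are quoted in the usual CSV way. The names Question and read_questions, and the sample row, are hypothetical and not part of the dataset.

```python
import csv
import io
from typing import NamedTuple


class Question(NamedTuple):
    text: str
    verbose_label: str
    coarse_label: str
    fine_label: str


def read_questions(handle) -> list[Question]:
    """Read four-column rows (assumed order: text, verbose, coarse, fine)."""
    questions = []
    for row in csv.reader(handle):
        if len(row) != 4:
            raise ValueError(f"expected 4 columns, got {len(row)}: {row!r}")
        question = Question(*row)
        # The verbose label is documented as COARSE:fine, so it should
        # agree with the separate coarse and fine columns.
        coarse, _, fine = question.verbose_label.partition(":")
        if (coarse, fine) != (question.coarse_label, question.fine_label):
            raise ValueError(f"inconsistent labels in row: {row!r}")
        questions.append(question)
    return questions


# Invented sample row, not taken from the dataset.
sample = io.StringIO('"Vad heter Sveriges huvudstad?",LOC:city,LOC,city\n')
questions = read_questions(sample)
print(questions[0].coarse_label, questions[0].fine_label)  # LOC city
```

If the distributed files do include a header row, csv.DictReader would be the more natural choice and the column-order assumption could be dropped.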

Data source(s)*

      Translated from the QAQC dataset
      (https://cogcomp.seas.upenn.edu/Data/QA/QC/)

Data collection method(s)*
--

Data selection and filtering*
--

Data preprocessing*
--

Data labeling*
--

Annotator characteristics

IV. ETHICS AND CAVEATS

Ethical considerations

      Some questions reflect an outdated treatment of women (e.g. "Vilka är de
      sexigaste kvinnorna i världen?", in English "Who are the sexiest women
      in the world?").

Things to watch out for

V. ABOUT DOCUMENTATION

Data last updated*
2021-07-27

Which changes have been made, compared to the previous version*
First version

Access to previous versions

This document created*
2021-07-27

This document last updated*
2023-06-08

Where to look for further details

Documentation template version*

VI. OTHER

Related projects

References</abstract>
      <sumDscr />
    </stdyInfo>
    <method>
      <dataColl />
    </method>
    <dataAccs>
      <useStmt>
        <restrctn xml:lang="en">Access to data through an external actor. </restrctn>
        <restrctn xml:lang="sv">Åtkomst till data via extern aktör. </restrctn>
      </useStmt>
    </dataAccs>
    <othrStdyMat />
  </stdyDscr>
</codeBook>