<ddi:DDIInstance xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="ddi:instance:3_3 http://ddialliance.org/Specification/DDI-Lifecycle/3.3/XMLSchema/instance.xsd" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:ddi="ddi:instance:3_3" xmlns:r="ddi:reusable:3_3" xmlns:s="ddi:studyunit:3_3" xmlns:d="ddi:datacollection:3_3" xmlns:a="ddi:archive:3_3" xmlns:c="ddi:conceptualcomponent:3_3" xmlns:cm="ddi:comparative:3_3" xmlns:g="ddi:group:3_3" xmlns:l="ddi:logicalproduct:3_3" xmlns:p="ddi:physicaldataproduct:3_3" xmlns:pi="ddi:physicalinstance:3_3" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:xml="http://www.w3.org/XML/1998/namespace" isMaintainable="true" scopeOfUniqueness="Agency">
  <r:URN>urn:ddi:se.researchdata:doi-10-23695-zgkb-s720:0</r:URN>
  <r:Agency>SND</r:Agency>
  <r:ID>doi-10-23695-zgkb-s720</r:ID>
  <r:Version>0</r:Version>
  <g:ResourcePackage>
    <r:URN>urn:ddi:se.researchdata:doi-10-23695-zgkb-s720.ResourcePackage:2.0</r:URN>
    <r:OtherMaterialScheme>
      <r:URN>urn:ddi:se.researchdata:doi-10-23695-zgkb-s720.OtherMaterialScheme:2.0</r:URN>
    </r:OtherMaterialScheme>
    <a:OrganizationScheme>
      <r:URN>urn:ddi:se.researchdata:doi-10-23695-zgkb-s720.OrganizationScheme-0:2.0</r:URN>
      <a:Organization>
        <r:URN>urn:ddi:se.researchdata:doi-10-23695-zgkb-s720.Organization-0:2.0</r:URN>
        <a:OrganizationIdentification>
          <a:OrganizationName>
            <r:String xml:lang="en">Språkbanken Text</r:String>
          </a:OrganizationName>
        </a:OrganizationIdentification>
      </a:Organization>
    </a:OrganizationScheme>
  </g:ResourcePackage>
  <s:StudyUnit>
    <r:URN>urn:ddi:se.researchdata:doi-10-23695-zgkb-s720.StudyUnit:2.0</r:URN>
    <r:UserID typeOfUserID="datasetIdentifier">doi-10-23695-zgkb-s720</r:UserID>
    <r:Citation>
      <r:Title>
        <r:String xml:lang="sv">Svensk EAT: frågeklassifikation</r:String>
        <r:String xml:lang="en">Swedish EAT: question classification</r:String>
      </r:Title>
      <r:Creator>
        <r:CreatorReference>
          <r:URN>urn:ddi:se.researchdata:doi-10-23695-zgkb-s720.Individual-0:2.0</r:URN>
          <r:TypeOfObject>Individual</r:TypeOfObject>
        </r:CreatorReference>
      </r:Creator>
      <r:Publisher>
        <r:PublisherName>
          <r:String xml:lang="sv">Göteborgs universitet</r:String>
          <r:String xml:lang="en">University of Gothenburg</r:String>
        </r:PublisherName>
      </r:Publisher>
      <r:PublicationDate>
        <r:SimpleDate>2024-01-01</r:SimpleDate>
      </r:PublicationDate>
      <r:InternationalIdentifier>
        <r:IdentifierContent>10.23695/ZGKB-S720</r:IdentifierContent>
        <r:ManagingAgency controlledVocabularyAgencyName="DOI">DOI</r:ManagingAgency>
      </r:InternationalIdentifier>
    </r:Citation>
    <r:Abstract>
      <r:Content xml:lang="sv">I. IDENTIFYING INFORMATION

Title*
Swedish EAT v1.0

Subtitle

Created by*
Jonatan Cerwall (jonatancerwall@gmail.com)

Publisher(s)*
Språkbanken Text

Link(s) / permanent identifier(s)*

License(s)*

Abstract*

      This dataset is a Swedish translation of the QAQC dataset
      (https://cogcomp.seas.upenn.edu/Data/QA/QC/) for expected-answer-type
      (EAT) classification. The taxonomy is the Li and Roth taxonomy, also
      available from https://cogcomp.seas.upenn.edu/Data/QA/QC/.

Funded by*

Cite as

      Cerwall, J. (2021). What the BERT? Fine-tuning KB-BERT for Question
      Classification. Unpublished manuscript, School of Electrical Engineering
      and Computer Science, KTH.

Related datasets

II. USAGE

Key applications
Machine learning, EAT Classification

Intended task(s)/usage(s)
Evaluate models by standard classification

Recommended evaluation measures
Accuracy

Dataset function(s)
Testing

Recommended split(s)
Test only

III. DATA

Primary data*
Text

Language*
Swedish

Dataset in numbers*
5451 questions in the training set, 500 in the test set.

Nature of the content*
Open-ended factoid questions.

Format*
Comma-separated, four columns:

text -- the open-ended factoid question

verbose label -- both the coarse-grained and the fine-grained label, formatted as COARSE:fine

coarse label -- coarse-grained label

fine label -- fine-grained label

Data source(s)*

      Translated from the QAQC dataset
      (https://cogcomp.seas.upenn.edu/Data/QA/QC/)

Data collection method(s)*
--

Data selection and filtering*
--

Data preprocessing*
--

Data labeling*
--

Annotator characteristics

IV. ETHICS AND CAVEATS

Ethical considerations

      Some outdated treatment of women (e.g. "Vilka är de sexigaste kvinnorna
      i världen?").

Things to watch out for

V. ABOUT DOCUMENTATION

Data last updated*
2021-07-27

Which changes have been made, compared to the previous version*
First version

Access to previous versions

This document created*
2021-07-27

This document last updated*
2023-06-08

Where to look for further details

Documentation template version*

VI. OTHER

Related projects

References</r:Content>
      <r:Content xml:lang="en">I. IDENTIFYING INFORMATION

Title*
Swedish EAT v1.0

Subtitle

Created by*
Jonatan Cerwall (jonatancerwall@gmail.com)

Publisher(s)*
Språkbanken Text

Link(s) / permanent identifier(s)*

License(s)*

Abstract*

      This dataset is a Swedish translation of the QAQC dataset
      (https://cogcomp.seas.upenn.edu/Data/QA/QC/) for expected-answer-type
      (EAT) classification. The taxonomy is the Li and Roth taxonomy, also
      available from https://cogcomp.seas.upenn.edu/Data/QA/QC/.

Funded by*

Cite as

      Cerwall, J. (2021). What the BERT? Fine-tuning KB-BERT for Question
      Classification. Unpublished manuscript, School of Electrical Engineering
      and Computer Science, KTH.

Related datasets

II. USAGE

Key applications
Machine learning, EAT Classification

Intended task(s)/usage(s)
Evaluate models by standard classification

Recommended evaluation measures
Accuracy

Dataset function(s)
Testing

Recommended split(s)
Test only

III. DATA

Primary data*
Text

Language*
Swedish

Dataset in numbers*
5451 questions in the training set, 500 in the test set.

Nature of the content*
Open-ended factoid questions.

Format*
Comma-separated, four columns:

text -- the open-ended factoid question

verbose label -- both the coarse-grained and the fine-grained label, formatted as COARSE:fine

coarse label -- coarse-grained label

fine label -- fine-grained label

Data source(s)*

      Translated from the QAQC dataset
      (https://cogcomp.seas.upenn.edu/Data/QA/QC/)

Data collection method(s)*
--

Data selection and filtering*
--

Data preprocessing*
--

Data labeling*
--

Annotator characteristics

IV. ETHICS AND CAVEATS

Ethical considerations

      Some outdated treatment of women (e.g. "Vilka är de sexigaste kvinnorna
      i världen?").

Things to watch out for

V. ABOUT DOCUMENTATION

Data last updated*
2021-07-27

Which changes have been made, compared to the previous version*
First version

Access to previous versions

This document created*
2021-07-27

This document last updated*
2023-06-08

Where to look for further details

Documentation template version*

VI. OTHER

Related projects

References</r:Content>
    </r:Abstract>
    <r:Coverage>
      <r:TopicalCoverage>
        <r:URN>urn:ddi:se.researchdata:doi-10-23695-zgkb-s720.TopicalCoverage:2.0</r:URN>
        <r:Subject xml:lang="en" controlledVocabularyID="10208" controlledVocabularyName="Standard för svensk indelning av forskningsämnen 2025">Natural Language Processing</r:Subject>
        <r:Subject xml:lang="sv" controlledVocabularyID="10208" controlledVocabularyName="Standard för svensk indelning av forskningsämnen 2025">Språkbehandling och datorlingvistik</r:Subject>
      </r:TopicalCoverage>
      <r:SpatialCoverage />
    </r:Coverage>
    <a:Archive>
      <r:URN>urn:ddi:se.researchdata:doi-10-23695-zgkb-s720.Archive:2.0</r:URN>
      <a:ArchiveSpecific>
        <a:Item>
          <a:Access>
            <r:URN>urn:ddi:se.researchdata:doi-10-23695-zgkb-s720.Archive-ArchiveSpecificType-AccessType:2.0</r:URN>
            <a:TypeOfAccess controlledVocabularyName="info:eu-repo-Access-Terms vocabulary"></a:TypeOfAccess>
          </a:Access>
          <a:DataFileQuantity>0</a:DataFileQuantity>
        </a:Item>
      </a:ArchiveSpecific>
    </a:Archive>
  </s:StudyUnit>
</ddi:DDIInstance>