<codeBook xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:schemaLocation="ddi:codebook:2_5 http://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd" xmlns="ddi:codebook:2_5">
  <docDscr>
    <citation>
      <titlStmt>
        <titl xml:lang="sv">Engelsk-svensk-turkisk korpus</titl>
        <parTitl xml:lang="en">English-Swedish-Turkish Corpus</parTitl>
        <IDNo agency="SND">ext0078-1-1</IDNo>
      </titlStmt>
      <prodStmt>
        <producer xml:lang="en" abbr="SND">Swedish National Data Service</producer>
        <producer xml:lang="sv" abbr="SND">Svensk nationell datatjänst</producer>
      </prodStmt>
      <holdings URI="http://stp.lingfil.uu.se/~bea/publ/megyesi-etal-lrec10-final.pdf">http://stp.lingfil.uu.se/~bea/publ/megyesi-etal-lrec10-final.pdf</holdings>
      <holdings URI="http://swepub.kb.se/bib/swepub:oai:DiVA.org:uu-121758?tab2=abs&amp;language=en">http://swepub.kb.se/bib/swepub:oai:DiVA.org:uu-121758?tab2=abs&amp;language=en</holdings>
    </citation>
  </docDscr>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl xml:lang="sv">Engelsk-svensk-turkisk korpus</titl>
        <parTitl xml:lang="en">English-Swedish-Turkish Corpus</parTitl>
        <IDNo agency="SND">ext0078-1-1</IDNo>
        <IDNo agency="URN">urn:nbn:se:uu:diva-121758</IDNo>
      </titlStmt>
      <rspStmt>
        <AuthEnty xml:lang="en" affiliation="Department of Linguistics and Philology, Uppsala University">Megyesi, Beáta</AuthEnty>
        <AuthEnty xml:lang="sv" affiliation="Institutionen för lingvistik och filologi, Uppsala universitet">Megyesi, Beáta</AuthEnty>
        <AuthEnty xml:lang="en" affiliation="Department of Linguistics and Philology, Uppsala University">Csató Johanson, Éva</AuthEnty>
        <AuthEnty xml:lang="sv" affiliation="Institutionen för lingvistik och filologi, Uppsala universitet">Csató Johanson, Éva</AuthEnty>
        <AuthEnty xml:lang="en" affiliation="Department of Linguistics and Philology, Uppsala University">Dahlqvist, Bengt</AuthEnty>
        <AuthEnty xml:lang="sv" affiliation="Institutionen för lingvistik och filologi, Uppsala universitet">Dahlqvist, Bengt</AuthEnty>
        <AuthEnty xml:lang="en" affiliation="Department of Linguistics and Philology, Uppsala University">Nivre, Joakim</AuthEnty>
        <AuthEnty xml:lang="sv" affiliation="Institutionen för lingvistik och filologi, Uppsala universitet">Nivre, Joakim</AuthEnty>
        <AuthEnty xml:lang="en" affiliation="Department of Linguistics and Philology, Uppsala University">Pettersson, Eva</AuthEnty>
        <AuthEnty xml:lang="sv" affiliation="Institutionen för lingvistik och filologi, Uppsala universitet">Pettersson, Eva</AuthEnty>
      </rspStmt>
      <prodStmt />
      <distStmt>
        <distrbtr xml:lang="en" abbr="SND" URI="https://snd.se">Swedish National Data Service</distrbtr>
        <distrbtr xml:lang="sv" abbr="SND" URI="https://snd.se">Svensk nationell datatjänst</distrbtr>
        <distDate xml:lang="en" date="2020-05-13" />
      </distStmt>
      <verStmt>
        <version elementVersion="1" elementVersionDate="2020-05-13" />
      </verStmt>
      <holdings URI="http://stp.lingfil.uu.se/~bea/publ/megyesi-etal-lrec10-final.pdf">http://stp.lingfil.uu.se/~bea/publ/megyesi-etal-lrec10-final.pdf</holdings>
      <holdings URI="http://swepub.kb.se/bib/swepub:oai:DiVA.org:uu-121758?tab2=abs&amp;language=en">http://swepub.kb.se/bib/swepub:oai:DiVA.org:uu-121758?tab2=abs&amp;language=en</holdings>
    </citation>
    <stdyInfo>
      <subject>
        <topcClas xml:lang="en" vocab="CESSDA Topic Classification" vocabURI="https://vocabularies.cessda.eu/vocabulary/TopicClassification?code=MediaCommunicationAndLanguage.LanguageAndLinguistics">Language and linguistics</topcClas>
        <topcClas xml:lang="sv" vocab="CESSDA Topic Classification" vocabURI="https://vocabularies.cessda.eu/vocabulary/TopicClassification?code=MediaCommunicationAndLanguage.LanguageAndLinguistics">Språk och lingvistik</topcClas>
      </subject>
      <abstract xml:lang="en" contentType="abstract">We describe a syntactically annotated parallel corpus containing typologically partly different languages, namely English, Swedish and Turkish. The corpus consists of approximately 300 000 tokens in Swedish, 160 000 in Turkish and 150 000 in English, containing both fiction and technical documents. We build the corpus by using the Uplug toolkit for automatic structural markup, such as tokenization and sentence segmentation, as well as sentence and word alignment. In addition, we use basic language resource kits for the linguistic analysis of the languages involved. The annotation is carried on various layers from morphological and part of speech analysis to dependency structures. The tools used for linguistic annotation, e.g., HunPos tagger and MaltParser, are freely available data-driven resources, trained on existing corpora and treebanks for each language. The parallel treebank is used in teaching and linguistic research to study the relationship between the structurally different languages. In order to study the treebank, several tools have been developed for the visualization of the annotation and alignment, allowing search for linguistic patterns.

Purpose:

The main goal of the project is to promote research and teaching in the Turkish language. More specifically, the aim is to build a language resource for Turkish, Swedish and English allowing contrastive studies between the involved languages.</abstract>
      <abstract xml:lang="sv" contentType="abstract">We describe a syntactically annotated parallel corpus containing typologically partly different languages, namely English, Swedish and Turkish. The corpus consists of approximately 300 000 tokens in Swedish, 160 000 in Turkish and 150 000 in English, containing both fiction and technical documents. We build the corpus by using the Uplug toolkit for automatic structural markup, such as tokenization and sentence segmentation, as well as sentence and word alignment. In addition, we use basic language resource kits for the linguistic analysis of the languages involved. The annotation is carried on various layers from morphological and part of speech analysis to dependency structures. The tools used for linguistic annotation, e.g., HunPos tagger and MaltParser, are freely available data-driven resources, trained on existing corpora and treebanks for each language. The parallel treebank is used in teaching and linguistic research to study the relationship between the structurally different languages. In order to study the treebank, several tools have been developed for the visualization of the annotation and alignment, allowing search for linguistic patterns.

Syfte:

Det övergripande syftet med projektet är att främja forskning och undervisning i turkiska. Mer specifikt syftar projektet till att bygga upp språkteknologiska basresurser för turkiska, svenska och engelska med kontrastiva frågeställningar i fokus.</abstract>
      <sumDscr>
        <dataKind xml:lang="en">Text</dataKind>
        <dataKind xml:lang="en">Interactive resource</dataKind>
      </sumDscr>
    </stdyInfo>
    <method>
      <dataColl />
    </method>
    <dataAccs>
      <useStmt>
        <restrctn xml:lang="en">Access to data through an external actor. Access to data is restricted.</restrctn>
        <restrctn xml:lang="sv">Åtkomst till data via extern aktör. Tillgång till data är begränsad.</restrctn>
        <conditions elementVersion="info:eu-repo-Access-Terms vocabulary">restrictedAccess</conditions>
      </useStmt>
    </dataAccs>
    <othrStdyMat>
      <relPubl>
        <citation>
          <titlStmt>
            <titl xml:lang="sv">Megyesi, Beáta, Dahlqvist, Bengt, Csató, Éva Á. &amp; Nivre, Joakim, 'The English-Swedish-Turkish Parallel Treebank', Proceedings of Language Resources and Evaluation (LREC 2010), 2010</titl>
            <parTitl xml:lang="en">Megyesi, Beáta, Dahlqvist, Bengt, Csató, Éva Á. &amp; Nivre, Joakim, 'The English-Swedish-Turkish Parallel Treebank', Proceedings of Language Resources and Evaluation (LREC 2010), 2010</parTitl>
            <IDNo agency="URN">urn:nbn:se:uu:diva-121758</IDNo>
          </titlStmt>
          <distStmt>
            <distrbtr URI="http://stp.lingfil.uu.se/~bea/publ/megyesi-etal-lrec10-final.pdf" />
            <distDate date="2010">2010</distDate>
          </distStmt>
        </citation>
      </relPubl>
    </othrStdyMat>
  </stdyDscr>
</codeBook>