<codeBook xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:schemaLocation="ddi:codebook:2_5 http://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd" xmlns="ddi:codebook:2_5">
  <docDscr>
    <citation>
      <titlStmt>
        <titl xml:lang="sv"></titl>
        <parTitl xml:lang="en">Synthetic - FEGA Sweden Heilsa synthetic dataset December 2023</parTitl>
        <IDNo agency="SND">fega-sweden-egad50000000119-html-0</IDNo>
      </titlStmt>
      <prodStmt>
        <producer xml:lang="en" abbr="SND">Swedish National Data Service</producer>
        <producer xml:lang="sv" abbr="SND">Svensk nationell datatjänst</producer>
      </prodStmt>
    </citation>
  </docDscr>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl xml:lang="sv"></titl>
        <parTitl xml:lang="en">Synthetic - FEGA Sweden Heilsa synthetic dataset December 2023</parTitl>
        <IDNo agency="SND">fega-sweden-egad50000000119-html-0</IDNo>
      </titlStmt>
      <rspStmt>
        <AuthEnty xml:lang="en" affiliation="">FEGA Sweden</AuthEnty>
      </rspStmt>
      <prodStmt />
      <distStmt>
        <distrbtr xml:lang="en" abbr="SND" URI="https://snd.se">Swedish National Data Service</distrbtr>
        <distrbtr xml:lang="sv" abbr="SND" URI="https://snd.se">Svensk nationell datatjänst</distrbtr>
        <distDate xml:lang="en" date="2023-12-11" />
      </distStmt>
      <verStmt>
        <version elementVersion="0" elementVersionDate="2023-12-11" />
      </verStmt>
    </citation>
    <stdyInfo>
      <subject />
      <abstract xml:lang="en" contentType="abstract">Synthetic - This submission contains a subset of a synthetic dataset derived from the project Heilsa Tryggvedottir - a Nordic collaboration on sharing sensitive human data. Heilsa Tryggvedottir is funded by the Nordic e-Infrastructure Collaboration (NeIC), the ELIXIR nodes of Finland, Norway, and Sweden, Computerome in Denmark, and the Estonian Scientific Computing Infrastructure (ETAIS).

In the synthetic data creation process, it was attempted to strike a fine balance between the usability of the datasets (e.g. technical FEGA development, testing, user training, and basic bioinformatics) and compliance with GDPR. File names and file content (e.g. headers in fastq) are anonymized. Moreover, the X, Y, and mitochondrial sequences have been discarded from the original data since these data can be used for maternal, paternal, or ethnic origin tracing. The dataset does not follow natural haplotype distribution (inherent to imputation panels). The only inputs derived from real sequence data are variant distribution density per chromosome and learning sequencing error models.

The synthetic dataset consists of two fastq files, a cram file, a vcf file, and two index files. 

This dataset is 1 of 1 included in the study titled Synthetic - FEGA Sweden Heilsa synthetic dataset December 2023, http://identifiers.org/ega.study:EGAS50000000086.</abstract>
      <sumDscr />
    </stdyInfo>
    <method>
      <dataColl />
    </method>
    <dataAccs>
      <useStmt>
        <restrctn xml:lang="en">Access to data through an external actor. </restrctn>
        <restrctn xml:lang="sv">Åtkomst till data via extern aktör. </restrctn>
      </useStmt>
    </dataAccs>
    <othrStdyMat />
  </stdyDscr>
</codeBook>