<codeBook xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:schemaLocation="ddi:codebook:2_5 http://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd" xmlns="ddi:codebook:2_5">
  <docDscr>
    <citation>
      <titlStmt>
        <titl xml:lang="sv">Whole-genome sequencing of follicular thyroid carcinomas reveal recurrent mutations in microRNA processing subunit DGCR8</titl>
        <parTitl xml:lang="en">Whole-genome sequencing of follicular thyroid carcinomas reveal recurrent mutations in microRNA processing subunit DGCR8</parTitl>
        <IDNo agency="SND">2021-108-1-1</IDNo>
        <IDNo agency="DOI">https://doi.org/10.5878/6fcv-1795</IDNo>
      </titlStmt>
      <prodStmt>
        <producer xml:lang="en" abbr="SND">Swedish National Data Service</producer>
        <producer xml:lang="sv" abbr="SND">Svensk nationell datatjänst</producer>
      </prodStmt>
      <holdings URI="https://doi.org/10.5878/6fcv-1795">Landing page</holdings>
    </citation>
  </docDscr>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl xml:lang="sv">Whole-genome sequencing of follicular thyroid carcinomas reveal recurrent mutations in microRNA processing subunit DGCR8</titl>
        <parTitl xml:lang="en">Whole-genome sequencing of follicular thyroid carcinomas reveal recurrent mutations in microRNA processing subunit DGCR8</parTitl>
        <IDNo agency="SND">2021-108-1-1</IDNo>
        <IDNo agency="DOI">https://doi.org/10.5878/6fcv-1795</IDNo>
        <IDNo agency="SwePub">oai:DiVA.org:uu-459861</IDNo>
        <IDNo agency="DOI">10.1210/clinem/dgab471</IDNo>
      </titlStmt>
      <rspStmt>
        <AuthEnty xml:lang="en" affiliation="Department of Oncology-Pathology, Karolinska Institutet">Juhlin, Christofer</AuthEnty>
        <AuthEnty xml:lang="sv" affiliation="Institutionen för Onkologi och Patologi, Karolinska Institutet">Juhlin, Christofer</AuthEnty>
        <AuthEnty xml:lang="en" affiliation="Department of Oncology-Pathology, Karolinska Institutet">Paulsson, Johan</AuthEnty>
        <AuthEnty xml:lang="sv" affiliation="Institutionen för Onkologi och Patologi, Karolinska Institutet">Paulsson, Johan</AuthEnty>
      </rspStmt>
      <prodStmt />
      <distStmt>
        <distrbtr xml:lang="en" abbr="SND" URI="https://snd.se">Swedish National Data Service</distrbtr>
        <distrbtr xml:lang="sv" abbr="SND" URI="https://snd.se">Svensk nationell datatjänst</distrbtr>
        <distDate xml:lang="en" date="2021-06-24" />
      </distStmt>
      <verStmt>
        <version elementVersion="1" elementVersionDate="2021-06-24" />
      </verStmt>
      <holdings URI="https://doi.org/10.5878/6fcv-1795">Landing page</holdings>
    </citation>
    <stdyInfo>
      <subject>
        <keyword xml:lang="en" vocab="ELSST" vocabURI="https://elsst.cessda.eu/id/6/4214bde5-2261-45db-84ce-e36c1eb09011">HUMAN GENETICS</keyword>
        <keyword xml:lang="sv" vocab="ELSST" vocabURI="https://elsst.cessda.eu/id/6/4214bde5-2261-45db-84ce-e36c1eb09011">HUMANGENETIK</keyword>
        <keyword xml:lang="en" vocab="MeSH" vocabURI="http://id.nlm.nih.gov/mesh/D004701">Endocrine Gland Neoplasms</keyword>
        <keyword xml:lang="sv" vocab="MeSH" vocabURI="http://id.nlm.nih.gov/mesh/D004701">Tumörer i endokrina körtlar</keyword>
        <keyword xml:lang="en" vocab="MeSH" vocabURI="http://id.nlm.nih.gov/mesh/D013964">Thyroid Neoplasms</keyword>
        <keyword xml:lang="sv" vocab="MeSH" vocabURI="http://id.nlm.nih.gov/mesh/D013964">Sköldkörteltumörer</keyword>
        <keyword xml:lang="en" vocab="MeSH" vocabURI="http://id.nlm.nih.gov/mesh/D016606">Thyroid Nodule</keyword>
        <keyword xml:lang="sv" vocab="MeSH" vocabURI="http://id.nlm.nih.gov/mesh/D016606">Sköldkörtelknuta</keyword>
      </subject>
      <abstract xml:lang="en" contentType="abstract">Data availability
This dataset can only be shared within Sweden due to legal restrictions.

Background
The genomic and transcriptomic landscape of widely invasive follicular thyroid carcinomas (wiFTCs) is poorly characterized, and a large subset of these tumours lack information on credible genetic driver events. The aim of this study was to bridge this gap. 
Methods
We performed whole-genome and RNA sequencing, followed by bioinformatic analyses, of 13 wiFTCs with a particularly poor prognosis and of matched normal tissue.
Results
Ten out of thirteen (77%) tumours exhibited one or several mutations in established genes ranked among the top 20 mutated genes in thyroid cancer, including TERT (n=4), NRAS (n=3), HRAS, KRAS, AKT, PTEN, PIK3CA, MUTYH and MEN1 (n=1 each). Recurrent somatic mutations in three genes were annotated as significant according to MutSig2CV: FAM72D (n=3), TP53 (n=3) and EIF1AX (n=3), with DGCR8 (n=2) as borderline significant. Of interest, both DGCR8 mutations were recurrent p.E518K missense alterations, a mutation known to cause familial multinodular goiter (MNG) via disruption of microRNA (miRNA) processing. Expression analyses indicated a trend towards reduced DGCR8 mRNA expression in FTCs in general. Copy number analyses revealed recurrent gains of loci on chromosomes 4, 6 and 10, and fusion gene analyses revealed 27 high-quality events. Based on the transcriptome data, FTCs formed two principal clusters with significant differences in the expression of genes associated with metabolic pathways.
Conclusion
In summary, we describe the genomic and transcriptomic landscape of wiFTCs, identify novel recurrent mutations and copy number alterations with possible driver properties, and lay the foundation for future studies.

The dataset consists of tables and lists containing underlying data, and supplementary figures for a manuscript submitted to "Journal of Clinical Endocrinology &amp; Metabolism". It includes 8 tables and 3 figures: 

File name: T1_Detailed-characteristics-of-the-study-cohort.csv
Contains "Table 1: Detailed characteristics of the study cohort." 

File name: T2_List-of-Somatic-SNVs.csv
Contains "Table 2: List of Somatic SNV's (Small nucleotide variants)." 

File name: T3_MutSig2CV-input-genes.csv
Contains "Table 3: MutSig2CV input genes." 

File name: T4_MutSig2CV-genes-ranked-by-p-value.csv
Contains "Table 4: MutSig2CV genes ranked by p-value."

File name: T5_Genes-in-copy-number-altered-minimal-region-of-amplification.csv
Contains "Table 5: List of genes in copy number altered minimal region of amplification." 

File name: T6_Aberrant-cell-fraction-and-ploidy-as-determined-by-ASCAT.csv
Contains "Table 6: Aberrant cell fraction and ploidy as determined by ASCAT." 

File name: T7_High-confidence-structural-variations-in-the-tumor-cohort.csv
Contains "Table 7: List of high-confidence structural variations in the tumor cohort." 

File name: T8_Significant-differentially-expressed-genes-in-tumor-vs-normal-thyroid.csv
Contains "Table 8: List of significant differentially expressed genes in tumor versus normal thyroid."

File name: List_of_variables.pdf
Contains the list of variables: metadata and explanations of abbreviations for Tables 1-8.

File name: Whole-genome-sequencing-follicular-thyroid-carcinomas_Figures.pdf 
Contains Supplementary Figure S1-S3:
- Supplementary Figure S1: Somatic mutational overview in the WGS cohort. 
- Supplementary Figure S2: Normalized DGCR8 mRNA expression in tumours with or without loss of heterozygosity (LOH) of the DGCR8 locus. 
- Supplementary Figure S3: Gene set enrichment analysis (GSEA).</abstract>
      <abstract xml:lang="sv" contentType="abstract">Tillgänglighet för data
Datasetet kan endast delas inom Sverige på grund av juridiska restriktioner.

Bakgrund
Det fullständiga genomiska och transkriptomiska landskapet i widely invasive follikulära tyreoideacancrar är ännu ej helt kartlagt och en stor andel av dessa tumörer har ingen identifierad driver. Målet med denna studie var att identifiera fler drivers.
Metod
Studien innefattar helgenom- och transkriptomsekvensering samt bioinformatiska analyser av 13 stycken fall av widely invasive follikulära tyreoideacancrar med parad normal vävnad.
Resultat
Tio av tretton tumörer visade mutationer i tyreoideacancer-relaterade gener, TERT (n=4), NRAS (n=3), HRAS, KRAS, AKT, PTEN, PIK3CA, MUTYH och MEN1 (n=1 vardera). MutSig2CV-analysen visade signifikant återkommande mutationer i FAM72D (n=3), TP53 (n=3), EIF1AX (n=3), och DGCR8 (n=2). Båda DGCR8-mutationerna var p.E518K missense som är en mutation som visats orsaka ärftlig multinodös struma genom dysreglering av mikro-RNA-maskineriet. Inga fler DGCR8-mutationer hittades i en utökad kohort av follikulära tumörer men expressionsanalys visade signifikant nedreglerat DGCR8-uttryck i maligna jämfört med benigna follikulära tumörer. Vidare visade kopieantalsanalys återkommande amplifiering av cytoband på kromosom 4, 6 och 10.
Konklusion
Sammanfattningsvis presenterar vi det fullständiga genomiska och transkriptomiska landskapet i widely invasive follikulära tyreoideacancrar och vi identifierade återkommande mutationer och kopieantalsförändringar som kan utgöra viktiga faktorer i tumörutvecklingen av dessa tumörer.

Datasetet består av tabeller och listor med underliggande data samt kompletterande bilder, för ett manuskript skickat till "Journal of Clinical Endocrinology &amp; Metabolism". Det innehåller 8 tabeller och 3 bilder:

Filnamn: T1_Detailed-characteristics-of-the-study-cohort.csv
Innehåller "Table 1: Detailed characteristics of the study cohort." 

Filnamn: T2_List-of-Somatic-SNVs.csv
Innehåller "Table 2: List of Somatic SNV's (Small nucleotide variants)." 

Filnamn: T3_MutSig2CV-input-genes.csv
Innehåller "Table 3: MutSig2CV input genes." 

Filnamn: T4_MutSig2CV-genes-ranked-by-p-value.csv
Innehåller "Table 4: MutSig2CV genes ranked by p-value."

Filnamn: T5_Genes-in-copy-number-altered-minimal-region-of-amplification.csv
Innehåller "Table 5: List of genes in copy number altered minimal region of amplification." 

Filnamn: T6_Aberrant-cell-fraction-and-ploidy-as-determined-by-ASCAT.csv
Innehåller "Table 6: Aberrant cell fraction and ploidy as determined by ASCAT." 

Filnamn: T7_High-confidence-structural-variations-in-the-tumor-cohort.csv
Innehåller "Table 7: List of high-confidence structural variations in the tumor cohort." 

Filnamn: T8_Significant-differentially-expressed-genes-in-tumor-vs-normal-thyroid.csv
Innehåller "Table 8: List of significant differentially expressed genes in tumor versus normal thyroid."

Filnamn: List_of_variables.pdf
Innehåller Variabellista med metadata och förkortningsuttydningar för Table 1-8.

Filnamn: Whole-genome-sequencing-follicular-thyroid-carcinomas_Figures.pdf 
Innehåller Supplementary Figure S1-S3:
- Supplementary Figure S1: Somatic mutational overview in the WGS cohort. 
- Supplementary Figure S2: Normalized DGCR8 mRNA expression in tumours with or without loss of heterozygosity (LOH) of the DGCR8 locus. 
- Supplementary Figure S3: Gene set enrichment analysis (GSEA).</abstract>
      <sumDscr>
        <nation xml:lang="en" abbr="SE">Sweden</nation>
        <nation xml:lang="sv" abbr="SE">Sverige</nation>
        <universe xml:lang="en">The study included 13 patients with follicular thyroid carcinoma, all diagnosed at Karolinska University Hospital, Stockholm.</universe>
        <universe xml:lang="sv">Studien inkluderade 13 fall av patienter med follikulär tyreoideacancer. Samtliga fall diagnosticerades på Karolinska Universitetssjukhuset.</universe>
        <dataKind xml:lang="en">Text</dataKind>
        <dataKind xml:lang="en">Still image</dataKind>
      </sumDscr>
    </stdyInfo>
    <method>
      <dataColl>
        <collMode xml:lang="en">Measurements and tests<concept vocab="DDI Mode of Collection" vocabURI="https://vocabularies.cessda.eu/v2/vocabularies/ModeOfCollection/5.0.0?languageVersion=en-5.0.0">Measurements and tests</concept></collMode>
        <collMode xml:lang="sv">Mätningar och tester<concept vocab="DDI Mode of Collection" vocabURI="https://vocabularies.cessda.eu/v2/vocabularies/ModeOfCollection/5.0.0?languageVersion=sv-5.0.0">Mätningar och tester</concept></collMode>
      </dataColl>
    </method>
    <dataAccs>
      <useStmt>
        <restrctn xml:lang="en">Access to data through SND. Access to data is restricted.</restrctn>
        <restrctn xml:lang="sv">Åtkomst till data via SND. Tillgång till data är begränsad.</restrctn>
        <conditions elementVersion="info:eu-repo-Access-Terms vocabulary">restrictedAccess</conditions>
      </useStmt>
    </dataAccs>
    <othrStdyMat>
      <relPubl>
        <citation>
          <titlStmt>
            <titl xml:lang="sv">Paulsson, J. O., Rafati, N., DiLorenzo, S., Chen, Y., Haglund, F., Zedenius, J., &amp; Juhlin, C. C. (2021). Whole-genome Sequencing of Follicular Thyroid Carcinomas Reveal Recurrent Mutations in MicroRNA Processing Subunit DGCR8. In Journal of Clinical Endocrinology and Metabolism (Vol. 106, Issue 11, pp. 3265–3282). https://doi.org/10.1210/clinem/dgab471</titl>
            <parTitl xml:lang="en">Paulsson, J. O., Rafati, N., DiLorenzo, S., Chen, Y., Haglund, F., Zedenius, J., &amp; Juhlin, C. C. (2021). Whole-genome Sequencing of Follicular Thyroid Carcinomas Reveal Recurrent Mutations in MicroRNA Processing Subunit DGCR8. In Journal of Clinical Endocrinology and Metabolism (Vol. 106, Issue 11, pp. 3265–3282). https://doi.org/10.1210/clinem/dgab471</parTitl>
            <IDNo agency="DOI">10.1210/clinem/dgab471</IDNo>
            <IDNo agency="SWEPUB">oai:DiVA.org:uu-459861</IDNo>
          </titlStmt>
          <distStmt>
            <distDate date="2021">2021</distDate>
          </distStmt>
          <any xml:lang="en" xmlns="http://purl.org/dc/elements/1.1/">oai:DiVA.org:uu-459861</any>
        </citation>
      </relPubl>
    </othrStdyMat>
  </stdyDscr>
</codeBook>