<codeBook xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xsi:schemaLocation="ddi:codebook:2_5 http://www.ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd" xmlns="ddi:codebook:2_5">
  <docDscr>
    <citation>
      <titlStmt>
        <titl xml:lang="sv"></titl>
        <parTitl xml:lang="en">ATP-cone sequence clusters</parTitl>
        <IDNo agency="SND">doi-10-17045-sthlmuni-7886641-0</IDNo>
        <IDNo agency="DOI">https://doi.org/10.17045/STHLMUNI.7886641</IDNo>
      </titlStmt>
      <prodStmt>
        <producer xml:lang="en" abbr="SND">Swedish National Data Service</producer>
        <producer xml:lang="sv" abbr="SND">Svensk nationell datatjänst</producer>
      </prodStmt>
      <holdings URI="https://doi.org/10.17045/STHLMUNI.7886641">Landing page</holdings>
    </citation>
  </docDscr>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl xml:lang="sv"></titl>
        <parTitl xml:lang="en">ATP-cone sequence clusters</parTitl>
        <IDNo agency="SND">doi-10-17045-sthlmuni-7886641-0</IDNo>
        <IDNo agency="DOI">https://doi.org/10.17045/STHLMUNI.7886641</IDNo>
      </titlStmt>
      <rspStmt />
      <prodStmt />
      <distStmt>
        <distrbtr xml:lang="en" abbr="SND" URI="https://snd.se">Swedish National Data Service</distrbtr>
        <distrbtr xml:lang="sv" abbr="SND" URI="https://snd.se">Svensk nationell datatjänst</distrbtr>
        <distDate xml:lang="en" date="2019-03-25" />
      </distStmt>
      <verStmt>
        <version elementVersion="0" elementVersionDate="2019-03-25" />
      </verStmt>
      <holdings URI="https://doi.org/10.17045/STHLMUNI.7886641">Landing page</holdings>
    </citation>
    <stdyInfo>
      <subject />
      <abstract xml:lang="en" contentType="abstract">The NCBI RefSeq database (2019-03-19; Haft et al. 2018 https://doi.org/10.1093/nar/gkx1068) was searched with Pfam's ATP-cone profile (accno: PF03477; Finn et al. 2010 https://doi.org/10.1093/nar/gkp985), returning 44367 NCBI accessions. Ribonucleotide reductase proteins were identified using HMMER (Eddy 2011 https://doi.org/10.1371/journal.pcbi.1002195) profiles from the RNRdb database (http://rnrdb.pfitmap.org). Subsequently, sequences were clustered with UCLUST (Edgar 2010 https://doi.org/10.1093/bioinformatics/btq461) at identity to remove sequence duplicates (24477 sequences remaining).
All sequences were pairwise aligned to each other using LAST (Kiełbasa et al. 2011 https://doi.org/10.1101/gr.113985.110) and a bitscore matrix was constructed. The bitscore matrix was clustered with MCL (Enright, Dongen &amp; Ouzounis 2002 https://doi.org/10.1093/nar/30.7.1575) using the Cluster Maker 2 (Morris et al. 2011 https://doi.org/10.1186/1471-2105-12-436) Cytoscape (Shannon et al. 2003 https://doi.org/10.1101/gr.1239303) app. Bitscores &lt; 200 were not included in the initial network and an inflation parameter of 2.5 was used.
The file "atp-cone_mcl_clustering.tsv" contains all information necessary to recreate the clustering, as well as the cluster numbers assigned to each sequence.

Column names: SUID: Cytoscape's id, accno: NCBI's accession number, mcl2.0ewc200-mcl3.0: cluster assignments with inflation parameters and bitscore cutoff (ewc; when used), name: sequence identifier composed of accno plus cone number, outer_inner: "inner", "middle" or "outer" when more than one cone is present in the full sequence, pclass and psubclass: RNR class and subclass, ptype: protein type, taxon, tdomain: taxonomic domain, title: NCBI's description of the sequence.
The "precluster_assignments.tsv" file contains the results of the preclustering with USEARCH, i.e. which USEARCH cluster (first column) each accession number (second column) belongs to.</abstract>
      <sumDscr />
    </stdyInfo>
    <method>
      <dataColl />
    </method>
    <dataAccs>
      <useStmt>
        <restrctn xml:lang="en">Access to data through an external actor. </restrctn>
        <restrctn xml:lang="sv">Åtkomst till data via extern aktör. </restrctn>
      </useStmt>
    </dataAccs>
    <othrStdyMat />
  </stdyDscr>
</codeBook>