MInAS

This repository holds the combined base MIxS schema, plus the various extensions and combinations generated in the scope of the MInAS project.

The source YAML file is generated by merging each of the individual YAML files into a single one with linkml-toolkit (the previously used yq tool is deprecated).

This merging happens once per extension release, and the resulting file is then used to generate the JSON schema and TSV using LinkML tools and scripts.

The resulting output files are stored in the src/ directory, in file-format-specific subdirectories.

Schema Generation Instructions

Required tools and dependencies

The following tools are required to generate the schema files:

  • linkml-toolkit
    • Not yet available on pip, conda, etc.; it must be installed manually
  • linkml (Version 1.8.1)
    • Available on pip: pip install linkml==1.8.1

Deprecated

  • yq (version 4.44.2)
    • Note: this version is not available on conda or pip; it requires a binary or OS distribution installation
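
Before starting a release, it can help to confirm that the required commands are actually on the PATH. The following is a minimal sketch, not part of the repository: `check_tools` is a hypothetical helper name, and the command names in the usage comment are assumed from the steps in this README.

```shell
# Report which of the given commands are missing from PATH.
# Returns 0 if all are present, 1 if at least one is missing.
check_tools() {
  ct_missing=0
  for ct_tool in "$@"; do
    if ! command -v "$ct_tool" >/dev/null 2>&1; then
      echo "missing: $ct_tool" >&2
      ct_missing=1
    fi
  done
  return "$ct_missing"
}

# Usage (command names assumed from the release steps):
#   check_tools curl lmtk linkml gen-json-schema python3 || echo "install the missing tools first"
```

The function only checks that a command exists; it does not verify the installed version.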

Merging the YAML files

To generate the combined YAML file, we use curl to download specific tagged releases from the MIxS and various MInAS repositories, and then merge them with linkml-toolkit.
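
Each download addresses a file at a specific tag through raw.githubusercontent.com. As a rough sketch of that pattern, the hypothetical helpers below (`raw_url` and `fetch_release` are illustrative names, not part of this repository) build the URL from a repository, tag and path, and download it with curl's `-f` flag so that a wrong tag produces an error instead of saving an error page as a YAML file.

```shell
# Build the raw.githubusercontent.com URL for a file at a given tag.
#   $1 = owner/repo, $2 = tag (e.g. v6.2.0), $3 = path inside the repository
raw_url() {
  printf 'https://raw.githubusercontent.com/%s/%s/%s\n' "$1" "$2" "$3"
}

# Download one tagged file to a local path ($4).
# -f: fail on HTTP errors, -sS: quiet but still print errors, -L: follow redirects.
fetch_release() {
  curl -fsSL -o "$4" "$(raw_url "$1" "$2" "$3")"
}

# Usage (values taken from the release steps in this README):
#   fetch_release GenomicsStandardsConsortium/mixs "v$MIXS_VERSION" \
#     src/mixs/schema/mixs.yaml "src/mixs/schema/mixs-v$MIXS_VERSION.yaml"
```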

Release

For a release (making sure to update the versions in the variables):

  1. Specify release variables

    ## Set versions
    MIXS_VERSION=6.2.0
    EXTANCIENT_VERSION=0.7.0
    EXTRADIOCARBONDATING_VERSION=0.2.1
    COMBINATIONS_VERSION=0.2.1
  2. Download schemas

    ## Core MIxS Schema
    curl -o src/mixs/schema/mixs-v$MIXS_VERSION.yaml "https://raw.githubusercontent.com/GenomicsStandardsConsortium/mixs/v$MIXS_VERSION/src/mixs/schema/mixs.yaml" ## Base MIxS schema
    
    ## MInAS Extensions
    curl -o src/mixs/schema/ancient-v$EXTANCIENT_VERSION.yaml "https://raw.githubusercontent.com/MIxS-MInAS/extension-ancient/v$EXTANCIENT_VERSION/src/mixs/schema/ancient.yml" ## Ancient DNA extension
    curl -o src/mixs/schema/radiocarbon-dating-v$EXTRADIOCARBONDATING_VERSION.yaml "https://raw.githubusercontent.com/MIxS-MInAS/extension-radiocarbon-dating/v$EXTRADIOCARBONDATING_VERSION/src/mixs/schema/radiocarbon-dating.yml" ## Radiocarbon extension
    curl -o src/mixs/schema/minas-combinations-v$COMBINATIONS_VERSION.yaml "https://raw.githubusercontent.com/MIxS-MInAS/minas-combinations/refs/tags/v$COMBINATIONS_VERSION/src/mixs/schema/minas-combinations.yml" ## Combinations
  3. Merge together with linkml-toolkit

    ## Merge together
    lmtk combine --mode merge --schema src/mixs/schema/mixs-v$MIXS_VERSION.yaml \
      -a src/mixs/schema/ancient-v$EXTANCIENT_VERSION.yaml \
      -a src/mixs/schema/radiocarbon-dating-v$EXTRADIOCARBONDATING_VERSION.yaml \
      -a src/mixs/schema/minas-combinations-v$COMBINATIONS_VERSION.yaml \
      --output src/mixs/schema/mixs-minas.yaml
    
    ## Fix some metadata to ensure passes linting
    sed -i 's#source: https://github.com/MIxS-MInAS/extension-radiocarbon-dating/raw/main/proposals/0.1.0/extension-radiocarbon-dating-v0_1_0.csv#source: https://github.com/MIxS-MInAS/MInAS/#g' src/mixs/schema/mixs-minas.yaml
  4. Validate that all new YAML files (extensions, combinations) are represented in the combined schema

    for i in permit_scope localised_reservoir_offset_sd mims_symbiontassociated_ancient_data; do
      if [[ $(grep -c "$i" src/mixs/schema/mixs-minas.yaml) -ge 2 ]]; then echo "$i: true"; else echo "$i: false"; fi
    done

    [!WARNING] The loop should contain one search string per input YAML file. Be aware that these strings may change between releases.

  5. Lint the newly extended MIxS schema and validate that it is valid LinkML

    linkml lint --validate src/mixs/schema/mixs-minas.yaml
  6. Generate the JSON schema version using the LinkML package's gen-json-schema:

    gen-json-schema src/mixs/schema/mixs-minas.yaml > src/mixs/schema/mixs-minas.json
  7. Generate the old-style MIxS TSV files using the python3 script in the scripts/ directory:

    python3 ./scripts/linkml2class_tsvs.py --schema-file src/mixs/schema/mixs-minas.yaml --output-dir project/class-model-tsvs/

    [!NOTE] This script has been copied and modified very slightly to include the python3 shebang, and is kept under scripts/ until it is properly packaged for the MIxS project.

    The script appears to need only python3 and no other dependencies.

  8. Update the CITATION.cff file with the new version of the schema and any new major contributors.

  9. Commit and push to GitHub.

  10. Make a release on GitHub, using previous releases as a template.
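
The representation check in step 4 can also be wrapped in a small reusable function. This is only a sketch under assumptions: `check_term` is a hypothetical helper name, it matches the term as a fixed string rather than a regular expression, and the default threshold of 2 matching lines mirrors the loop in step 4.

```shell
# Print "<term>: true" if <term> occurs on at least <min> lines of <file>,
# otherwise print "<term>: false" and return 1.
#   $1 = file, $2 = term (fixed string), $3 = minimum line count (default 2)
check_term() {
  ck_file=$1
  ck_term=$2
  ck_min=${3:-2}
  # grep -c prints the number of matching lines; it exits non-zero on zero matches.
  ck_count=$(grep -cF -- "$ck_term" "$ck_file" 2>/dev/null || true)
  ck_count=${ck_count:-0}
  if [ "$ck_count" -ge "$ck_min" ]; then
    echo "$ck_term: true"
  else
    echo "$ck_term: false"
    return 1
  fi
}

# Usage (search strings taken from step 4; they may change per release):
#   for i in permit_scope localised_reservoir_offset_sd mims_symbiontassociated_ancient_data; do
#     check_term src/mixs/schema/mixs-minas.yaml "$i"
#   done
```

Returning a non-zero status for a missing term lets the check stop a scripted release early, for example with `check_term ... || exit 1`.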
