This repository holds the combined base MIxS schema, plus the various extensions and combinations generated in the scope of the MInAS project.
The source YAML file is generated by merging each of the individual YAML files into a single one (currently with linkml-toolkit; yq was used previously and is now deprecated).
This merging happens once per extension release, and the resulting file is then used to generate the JSON schema and TSV using LinkML tools and scripts.
The resulting output files are stored in the src/ directory, in file-format-specific subdirectories.
The following tools are required to generate the schema files:
- linkml-toolkit
  - Not yet on pip/conda etc! Will need to manually install
- linkml (version 1.8.1)
  - Available on pip: `pip install linkml==1.8.1`

Deprecated:

- yq (version 4.44.2)
  - Note: version not on conda or pip; requires binary or OS distribution installation
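Before starting a release, it can help to confirm that the required commands are on the `PATH`. The following is a minimal sketch: `check_tools` is a hypothetical helper, not part of this repository, and the command names passed to it are the ones invoked in the release steps of this README.

```bash
# Report which of the given commands are missing from PATH.
# Returns 0 when all are found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: MISSING"
            missing=1
        fi
    done
    return "$missing"
}

# Command names as used in the release steps of this README
check_tools curl lmtk linkml gen-json-schema python3 sed \
    || echo "Install the missing tools before continuing"
```

This only checks that each command exists; it does not verify the installed versions.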
To generate the combined YAML file, we use curl to download specific tagged releases from the MIxS and various MInAS repositories, and then merge them with linkml-toolkit.
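The MIxS and extension downloads in the steps below follow one URL pattern on raw.githubusercontent.com, built from the repository, the release tag, and the file path. A minimal sketch of that pattern, where `release_url` is a hypothetical helper (not part of this repository) and the version is given without its leading `v`:

```bash
# Build the raw.githubusercontent.com URL for a file at a tagged release.
# Arguments: <org/repo> <version without the leading v> <path inside the repo>
release_url() {
    printf 'https://raw.githubusercontent.com/%s/v%s/%s\n' "$1" "$2" "$3"
}

MIXS_VERSION=6.2.0
release_url GenomicsStandardsConsortium/mixs "$MIXS_VERSION" src/mixs/schema/mixs.yaml
```

Printing the URL before downloading makes it easy to spot a version variable that was not updated for the release.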
For a release (making sure to update the versions in the variables):
- Specify release variables

      ## Set versions
      MIXS_VERSION=6.2.0
      EXTANCIENT_VERSION=0.7.0
      EXTRADIOCARBONDATING_VERSION=0.2.1
      COMBINATIONS_VERSION=0.2.1

- Download schemas

      ## Core MIxS Schema
      curl -o src/mixs/schema/mixs-v$MIXS_VERSION.yaml "https://raw.githubusercontent.com/GenomicsStandardsConsortium/mixs/v$MIXS_VERSION/src/mixs/schema/mixs.yaml" ## Base MIxS schema

      ## MInAS Extensions
      curl -o src/mixs/schema/ancient-v$EXTANCIENT_VERSION.yaml "https://raw.githubusercontent.com/MIxS-MInAS/extension-ancient/v$EXTANCIENT_VERSION/src/mixs/schema/ancient.yml" ## Ancient DNA extension
      curl -o src/mixs/schema/radiocarbon-dating-v$EXTRADIOCARBONDATING_VERSION.yaml "https://raw.githubusercontent.com/MIxS-MInAS/extension-radiocarbon-dating/v$EXTRADIOCARBONDATING_VERSION/src/mixs/schema/radiocarbon-dating.yml" ## Radiocarbon extension
      curl -o src/mixs/schema/minas-combinations-v$COMBINATIONS_VERSION.yaml "https://raw.githubusercontent.com/MIxS-MInAS/minas-combinations/refs/tags/v$COMBINATIONS_VERSION/src/mixs/schema/minas-combinations.yml" ## Combinations
- Merge together with `linkml-toolkit`

      ## Merge together
      lmtk combine --mode merge --schema src/mixs/schema/mixs-v$MIXS_VERSION.yaml \
        -a src/mixs/schema/ancient-v$EXTANCIENT_VERSION.yaml \
        -a src/mixs/schema/radiocarbon-dating-v$EXTRADIOCARBONDATING_VERSION.yaml \
        -a src/mixs/schema/minas-combinations-v$COMBINATIONS_VERSION.yaml \
        --output src/mixs/schema/mixs-minas.yaml

      ## Fix some metadata to ensure passes linting
      sed -i 's#source: https://github.com/MIxS-MInAS/extension-radiocarbon-dating/raw/main/proposals/0.1.0/extension-radiocarbon-dating-v0_1_0.csv#source: https://github.com/MIxS-MInAS/MInAS/#g' src/mixs/schema/mixs-minas.yaml
- Validate that all new YAML files (extensions, combinations) are represented in the combined schema

      for i in permit_scope localised_reservoir_offset_sd mims_symbiontassociated_ancient_data; do
        if [[ $(grep "$i" src/mixs/schema/mixs-minas.yaml | wc -l) -ge 2 ]]; then
          echo "$i: true"
        else
          echo "$i: false"
        fi
      done

  > [!WARNING]
  > There should be one string per input YAML file. Be aware that these strings may change per release.
- Lint the newly extended MIxS schema and validate that it is valid LinkML

      linkml lint --validate src/mixs/schema/mixs-minas.yaml
- Generate the JSON schema version using the LinkML package's `gen-json-schema`:

      gen-json-schema src/mixs/schema/mixs-minas.yaml > src/mixs/schema/mixs-minas.json

- Generate the old-style MIxS TSV files using the python3 script in the `scripts/` directory:

      python3 ./scripts/linkml2class_tsvs.py --schema-file src/mixs/schema/mixs-minas.yaml --output-dir project/class-model-tsvs/

  > [!NOTE]
  > This script has been copied and modified very slightly to include the python3 shebang, and is placed under `scripts/` until properly packaged for the MIxS project.
  > To use this script, you only need python3 and no other dependencies (it seems).
- Update the `CITATION.cff` file with the new version of the schema and any new major contributors.

- Commit and push to GitHub.
- Make a release on GitHub using previous releases as a template.
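The string check from the validation step can also be wrapped in a function, so that a missing string produces a non-zero exit status instead of only a printed `false`. This is a sketch under stated assumptions: `check_markers` is a hypothetical helper, not part of this repository, and it keeps the same "at least two matching lines" threshold as the loop in the validation step.

```bash
# Print "<marker>: true" when the marker occurs on at least two lines of the
# file and "<marker>: false" otherwise. Returns non-zero if any marker fails
# or the file cannot be read.
check_markers() {
    file=$1
    shift
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    rc=0
    for marker in "$@"; do
        # grep -c prints the number of matching lines (and exits 1 on zero)
        count=$(grep -c -- "$marker" "$file" || true)
        if [ "${count:-0}" -ge 2 ]; then
            echo "$marker: true"
        else
            echo "$marker: false"
            rc=1
        fi
    done
    return "$rc"
}
```

With the strings used in this README, the call would be `check_markers src/mixs/schema/mixs-minas.yaml permit_scope localised_reservoir_offset_sd mims_symbiontassociated_ancient_data`. As noted in the validation step, the strings may need updating per release.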