MDverse/mdverse_database

MDverse database schema

Setup environment

We use uv to manage dependencies and the project environment.

Clone the GitHub repository:

git clone git@github.com:MDverse/mdverse_data_schema.git
cd mdverse_data_schema

Sync dependencies:

uv sync

Retrieve data

Download parquet files from Zenodo to build the database:

uv run src/download_data.py

Files will be downloaded to data/parquet_files:

data
└── parquet_files
    ├── datasets.parquet
    ├── files.parquet
    ├── gromacs_gro_files.parquet
    ├── gromacs_mdp_files.parquet
    └── gromacs_xtc_files.parquet
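
Before building the database, it can help to confirm that all five files were downloaded. The following is a minimal sketch and not part of the repository: the function name `missing_parquet_files` is hypothetical, and the file list is copied from the tree above.

```python
from pathlib import Path

# File names copied from the directory tree above.
EXPECTED_FILES = [
    "datasets.parquet",
    "files.parquet",
    "gromacs_gro_files.parquet",
    "gromacs_mdp_files.parquet",
    "gromacs_xtc_files.parquet",
]


def missing_parquet_files(directory="data/parquet_files"):
    """Return the expected parquet files that are absent from `directory`."""
    base = Path(directory)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_parquet_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All parquet files are present.")
```

The check only looks at file names; it does not validate the content of the parquet files.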

Build the database

Create the empty database:

uv run src/create_database.py

Populate the tables with the data from parquet files:

uv run src/ingest_data.py

Information on the database

Report the number of rows and columns of each table in the database:

uv run report.py

This writes the results to the file report.log.
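
The kind of summary such a report contains can be illustrated with a short sketch. It assumes an SQLite database, which this README does not state, and it builds a throwaway in-memory table instead of reading the real MDverse database. The function `table_shapes` and the `demo` table are hypothetical, and the actual report.py may work differently.

```python
import sqlite3


def table_shapes(connection):
    """Return {table_name: (row_count, column_count)} for every user table."""
    shapes = {}
    tables = connection.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (name,) in tables:
        # Quote the identifier so unusual table names do not break the query.
        quoted = '"' + name.replace('"', '""') + '"'
        rows = connection.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        columns = len(connection.execute(f"PRAGMA table_info({quoted})").fetchall())
        shapes[name] = (rows, columns)
    return shapes


if __name__ == "__main__":
    # Hypothetical demo table, used only to show the output format.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO demo (name) VALUES (?)", [("a",), ("b",), ("c",)])
    for table, (rows, columns) in table_shapes(conn).items():
        print(f"{table}: {rows} rows, {columns} columns")
```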

Re-ingesting simulation data

To re-ingest the data of a single table, run the script that matches it. The tables concerned are:

  • TopologyFile
  • ParameterFile
  • TrajectoryFile

The corresponding commands, in the same order, are:

uv run src/ingest_topol_files.py

or

uv run src/ingest_param_files.py

or

uv run src/ingest_traj_files.py
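
When several tables need refreshing, the commands above can be chained from a small driver. This is a sketch under assumptions: the table-to-script mapping is inferred from the script names, the helper names `build_commands` and `reingest` are hypothetical, and the scripts are assumed to be independent of one another.

```python
import subprocess

# Mapping inferred from the script names above (an assumption,
# not confirmed by the repository).
SCRIPTS = {
    "TopologyFile": "src/ingest_topol_files.py",
    "ParameterFile": "src/ingest_param_files.py",
    "TrajectoryFile": "src/ingest_traj_files.py",
}


def build_commands(tables):
    """Return one `uv run` command (as an argument list) per requested table."""
    unknown = [t for t in tables if t not in SCRIPTS]
    if unknown:
        raise ValueError(f"No re-ingestion script for: {', '.join(unknown)}")
    return [["uv", "run", SCRIPTS[t]] for t in tables]


def reingest(tables):
    """Run the re-ingestion scripts in order, stopping at the first failure."""
    for command in build_commands(tables):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in build_commands(["TopologyFile", "TrajectoryFile"]):
        print(" ".join(command))
```

Calling `reingest(["TopologyFile", "TrajectoryFile"])` from the repository root would execute the two scripts in that order. Passing `check=True` makes a failing script raise an exception, so later tables are not re-ingested after an error.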
