Add python data gathering tools #199

exequeryphil · 2025-12-17T21:45:30Z

This is a set of Python scripts that should quickly provide some basic data for the many missing models, largely by scraping Hugging Face. I could have just pushed the new YAML files, but we might want to run this again at some point or make minor adjustments to the operations.

To try it out, pull the branch down locally, install the Python dependencies in tools-py, and then run the interactive script:

batch_scrape_missing.sh

It should compare our current models to the most active models on Hugging Face, build a list of the models to gather, and then write the gathered data to YAML files in the ./models folder. There is also some additional logic for determining the correct repository value.
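For reviewers who want a feel for the comparison step, here is a minimal sketch of the "which models are we missing" logic. It is not the code in this PR: the function name `find_missing_models`, the case-insensitive matching, and the example model IDs are all assumptions made for illustration, and the list of active models is passed in directly rather than fetched from Hugging Face.

```python
from typing import Iterable, List


def find_missing_models(existing_ids: Iterable[str], active_ids: Iterable[str]) -> List[str]:
    """Return the active model IDs that have no local entry, keeping their order.

    Hypothetical helper for illustration; the scripts in this PR may work differently.
    """
    # Assumption: IDs are compared case-insensitively, so "Org/Model" and
    # "org/model" count as the same entry.
    known = {model_id.lower() for model_id in existing_ids}
    missing = []
    for model_id in active_ids:
        key = model_id.lower()
        if key not in known:
            missing.append(model_id)
            known.add(key)  # also drops duplicates within active_ids
    return missing


if __name__ == "__main__":
    # Made-up IDs; in practice these would come from ./models and from Hugging Face.
    existing = ["example-org/model-a", "example-org/model-b"]
    active = [
        "example-org/model-b",
        "example-org/model-c",
        "Example-Org/Model-A",
        "other-org/model-d",
    ]
    print(find_missing_models(existing, active))
```

Keeping the order of the active list means the most active missing models are gathered first, which seems useful if a run is interrupted partway through.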

Signed-off-by: Phil Williams <phil.williams@ibm.com>

Add python data gathering tools

2e2f957

Signed-off-by: Phil Williams <phil.williams@ibm.com>

exequeryphil mentioned this pull request Dec 17, 2025

We should have an agent gathering data automatically #196

Open
