@open-thoughts

OpenThoughts

Open collaborations on data-centric research

https://openthoughts.ai

A community effort to curate the best open post-training datasets.

We are currently working on OpenThoughts-Agent, a collaboration building the best open agent training datasets.

Our first project was curating open reasoning data recipes. OpenThoughts3, our best reasoning dataset recipe, is detailed in our release blog and the full paper.

About us

We are a team of researchers and engineers from Bespoke Labs, Stanford, University of California, Berkeley, University of Washington, Juelich Supercomputing Center (JSC), LAION, UCLA, UNC Chapel Hill, and Toyota Research Institute, united around building the best datasets (and thus the best models). See our previous work at datacomp.ai and mlfoundations.

Open Thoughts is supported by Bespoke Labs, Lambda Labs, NSF IFML, Juelich Supercomputing Center, and Toyota Research Institute.

Pinned

  1. open-thoughts Public

    Fully open data curation for reasoning models

    Python, 2.2k stars, 179 forks

Repositories

  • open-thoughts-website
    MDX 2 0 0 0 Updated Dec 6, 2025
  • OpenThoughts-Agent Public

    Data recipes and robust infrastructure for training AI agents

    Python 2 Apache-2.0 0 0 0 Updated Dec 6, 2025
  • .github Public
    0 0 0 0 Updated Dec 6, 2025
  • open-thoughts Public

    Fully open data curation for reasoning models

    Python 2,160 Apache-2.0 179 6 0 Updated Dec 3, 2025
