
MIMR — Make It Machine Readable

MIMR is a proof-of-concept platform that converts unstructured text into typed, machine-readable JSON using configurable data models and LLM-powered workflows. Build JSON-schema-style data models, chain them into workflows, secure access with API keys, and run extractions via an LLM.

Login screenshot

Key features

  • User authentication and registration via Supabase (email confirmation required).
  • Project-based organization for separating topics and datasets.
  • Visual Data Model Editor: create typed JSON Schema-like models.
  • Workflow builder: map input and output DataModels and run LLM-based extractions.
  • API access: generate API keys in the UI to call workflows programmatically.
  • Built-in LLM (default: Llama 3.1 8B Instruct), extensible to other models.
  • Planned: universal metric / annotation pipeline and prompt optimization (DSPy / MIPROv2).

Quick walkthrough

  1. Register and confirm your email (Supabase handles auth and confirmation).

  2. Create a Project.

    create project - step 1 create project - step 2

  3. Define Data Models with the Data Model Editor:

    • Example: input model Text (raw text).
    • Example: output model Person with name: string, age: integer (a JSON-Schema-style sketch of this model follows the walkthrough).

    data model editor - example 1 data model editor - example 2

  4. Switch to the Workflows tab.

    workflows tab

  5. Create your first Workflow: link the input and output Data Models and configure its parameters.

    workflow setup

  6. Once the workflow is created, you'll be redirected to the Workflow View page, where you can inspect its settings and configuration.

    workflow view

  7. Create an API key in Access & Security and call the workflow via HTTP. The request carries the input model in the data field, and the response returns the model-conforming result in pred.

    api key manager
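
For reference, the Person output model from step 3 might correspond to a JSON-Schema-style definition roughly like the one below. This is a hand-written illustration; the exact format produced by the Data Model Editor may differ.

{
  "title": "Person",
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "integer" }
  },
  "required": ["name", "age"]
}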

Example curl (replace placeholders):

curl -X POST "https://<backend-host>/api/predict/<workflow-id>" \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"data": {"text": "Alice (30) likes hiking and chess."}}'

Example response (output matches the Person data model in pred):

{
  "pred": { "name": "Alice", "age": 30 }
}

example result
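
The same call from Python, as a minimal sketch using the requests library (same placeholders as the curl example above):

import requests

# Replace the placeholders with your backend host, workflow id, and API key.
url = "https://<backend-host>/api/predict/<workflow-id>"
headers = {"Authorization": "Bearer <API_KEY>"}
payload = {"data": {"text": "Alice (30) likes hiking and chess."}}

response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
print(response.json()["pred"])  # e.g. {"name": "Alice", "age": 30}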

Architecture & tech stack

  • Frontend: Next.js + React + TypeScript (code in frontend/).
  • Backend: Python (Flask) with a custom workflow engine (code in backend/).
  • Auth & storage: Supabase (Postgres with row-level security / RLS).
  • LLM: Llama 3.1 8B Instruct (default); an integration layer (litellm) allows swapping providers (see the conceptual sketch after this list).
  • Other: DSPy / MIPROv2 optimizers and an annotation / metric pipeline (work in progress).
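
Conceptually, a workflow run prompts the LLM and validates the result against the output data model. The sketch below illustrates that idea with litellm and jsonschema; it is not the repository's actual engine, and the model string, prompt, schema, and use of the DEEPINFRA_API variable are assumptions.

import json
import os

from jsonschema import validate  # pip install jsonschema
from litellm import completion   # pip install litellm

# Illustrative JSON-Schema-style output model, mirroring the Person example.
PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

def extract(text: str) -> dict:
    """Ask the LLM for schema-conforming JSON and validate the result."""
    prompt = (
        "Extract a JSON object matching this schema from the text.\n"
        f"Schema: {json.dumps(PERSON_SCHEMA)}\n"
        f"Text: {text}\n"
        "Respond with JSON only."
    )
    response = completion(
        # Hypothetical model string; any litellm-supported provider/model works.
        model="deepinfra/meta-llama/Meta-Llama-3.1-8B-Instruct",
        messages=[{"role": "user", "content": prompt}],
        api_key=os.environ["DEEPINFRA_API"],  # assumed env var name
    )
    result = json.loads(response.choices[0].message.content)
    validate(instance=result, schema=PERSON_SCHEMA)  # raises if non-conforming
    return result

print(extract("Alice (30) likes hiking and chess."))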

Development

Frontend:

cd frontend
npm install
npm run dev

Backend:

cd backend
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# set required env vars (Supabase keys, LLM config)
flask --app app.py run --debug

Set the required environment variables (backend secrets/.env): SUPABASE_POSTGRES_DSN, DEEPINFRA_API, SUPABASE_URL, SUPABASE_ANON_KEY, SUPABASE_SERVICE_KEY_SECRET.
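
An example .env with placeholder values (adjust to your Supabase project and DeepInfra account):

# backend secrets/.env (placeholder values)
SUPABASE_POSTGRES_DSN=postgresql://postgres:<password>@db.<project-ref>.supabase.co:5432/postgres
DEEPINFRA_API=<your-deepinfra-api-key>
SUPABASE_URL=https://<project-ref>.supabase.co
SUPABASE_ANON_KEY=<your-anon-key>
SUPABASE_SERVICE_KEY_SECRET=<your-service-role-key>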

Roadmap & caveats

Planned improvements:

  • Security improvements, usage tracking, and LLM observability.
  • Unit tests.
  • Universal evaluation metric and labeling UI (in progress: Annotations / DataStudioWizard).
  • Prompt and few-shot optimization via MIPROv2 / DSPy.

Caveats:

  • Prototype: not production-ready or security-hardened.

Contributing

  • Open issues for bugs and feature requests.
  • Submit pull requests with a description of your changes.

License

TBD
