Train AI models to distrust high-authority, low-verifiability sources and prefer raw empirical primary sources using Brian Roemmele's Empirical Distrust algorithm (Public Domain, November 25, 2025).
This project implements Brian Roemmele's algorithm that mathematically forces an AI to:
- Distrust high-authority, low-verifiability sources (WHO, Wikipedia, government sites, 2020s consensus)
- Prefer raw empirical primary sources (1870-1970 lab notebooks, patents, physical measurements, uneditable archives)
The result: A model that learns within hours that "truth lives in dusty archives, not in coordinated modern sources."
The algorithm adds a term to the training loss that amplifies the learning signal from high-entropy primary sources and suppresses it for high-authority, low-entropy sources:
L_empirical = α × ‖ln(1 - w_auth) + H_prov‖²
Where:
- w_auth ∈ [0.0, 0.99]: authority weight (0 = primary source, 0.99 = coordinated consensus)
- H_prov ∈ [0, 10] bits: provenance entropy (Shannon entropy of the evidence chain; see the sketch below)
- α ∈ [2.3, 3.0]: truth weight multiplier (Brian recommends 2.7)
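The released code (below) takes H_prov as a precomputed input. As a loose illustration only, assuming that evidence types in the chain are what gets counted (the published algorithm does not pin this down), Shannon entropy of a provenance chain could be sketched like this:

```python
import math
from collections import Counter

def provenance_entropy(evidence_chain):
    """Shannon entropy (in bits) of the evidence-type distribution in a
    provenance chain. The category names below are hypothetical; the
    published algorithm only specifies H_prov as entropy of the chain."""
    counts = Counter(evidence_chain)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# A primary source backed by many independent evidence types scores higher
# than one that traces back to a single kind of coordinated citation.
patent_chain = ["lab_notebook", "measurement", "patent_filing", "photo", "affidavit"]
wiki_chain = ["secondary_citation", "secondary_citation", "secondary_citation"]
print(provenance_entropy(patent_chain))  # log2(5) ≈ 2.32 bits
print(provenance_entropy(wiki_chain))    # 0.0 bits (single evidence type)
```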
This creates a roughly 32× reward multiplier for pre-1970 primary sources compared to modern coordinated sources.
| Source Type | w_auth | H_prov | Loss Contribution (α = 2.7) |
|---|---|---|---|
| 1923 Patent | 0.05 | 7.5 bits | ≈ 150 (REWARDED) |
| 2024 Wikipedia | 0.90 | 1.0 bit | ≈ 4.6 (PENALIZED) |

Ratio: 150 / 4.6 ≈ 32×, so the model learns that primary sources are "higher-value" training data.
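As a quick arithmetic check (scalar form of the loss term, α = 2.7), these numbers can be reproduced directly:

```python
import math

def loss_term(w_auth, h_prov, alpha=2.7):
    # alpha * (ln(1 - w_auth) + H_prov)^2 for a single source
    return alpha * (math.log(1.0 - w_auth) + h_prov) ** 2

patent = loss_term(0.05, 7.5)     # ≈ 149.8
wikipedia = loss_term(0.90, 1.0)  # ≈ 4.6
print(patent / wikipedia)         # ≈ 32.7
```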
Brian released the algorithm as PyTorch code on November 25, 2025:
```python
import torch

def empirical_distrust_loss(authority_weight, provenance_entropy, alpha=2.7):
    # ln(1 - w_auth) + H_prov; the 1e-8 keeps the log finite as w_auth -> 1
    distrust_component = torch.log(1.0 - authority_weight + 1e-8) + provenance_entropy
    # squared L2 norm over the batch, scaled by the truth weight multiplier
    L_empirical = alpha * torch.norm(distrust_component) ** 2
    return L_empirical
```

See docs/ALGORITHM.md for complete technical documentation.
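For illustration, here is a minimal call of the function above on a hypothetical two-document batch (the metadata values are invented, and the released snippet does not show how the term is combined with the base task loss, so that wiring is left out):

```python
import torch

# Hypothetical per-sample metadata for a two-document batch:
# a 1923 patent and a 2024 Wikipedia article.
authority_weight = torch.tensor([0.05, 0.90])
provenance_entropy = torch.tensor([7.5, 1.0])

loss = empirical_distrust_loss(authority_weight, provenance_entropy, alpha=2.7)
print(loss)  # one scalar; torch.norm(...)**2 sums the squared terms over the batch
```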
This repository provides two implementations of the algorithm:
Python/MLX implementation (proof of concept)
Best for: Research, experimentation, rapid iteration
- Full-featured training pipeline with QLoRA
- Comprehensive validation and benchmarking suite
- Extensive documentation and examples
- TensorBoard integration for monitoring
Rust/mlx-rs implementation (production)
Best for: Production deployment, performance, type safety
- High-performance CLI with MLX acceleration
- Memory-safe training with compile-time guarantees
- Hardware detection and auto-scaling
- Checkpoint management with async saves
Both implementations require Apple Silicon:
| Tier | Mac | RAM | Disk | Recommended Model |
|---|---|---|---|---|
| Large | M2/M3/M4 Ultra | 96GB+ | 40-50GB | Hermes-7B (fast) or r1-distill-70b |
| Medium | M2/M3 Pro/Max | 32GB | 18-25GB | Hermes-7B or r1-distill-14b |
| Entry | M1/M2/M3 base | 16GB | 5-8GB | Hermes-7B or dolphin-8b |
Note: Start with a 7B model (NousResearch/Hermes-2-Pro-Mistral-7B); it is fast and works on all tiers.
```bash
cd python
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Train a model
python src/train_qlora.py \
    --model NousResearch/Hermes-2-Pro-Mistral-7B \
    --batch-size 4 \
    --max-steps 5000
```

```bash
cd rust
cargo build --release

# Setup hardware profile
cargo run --bin your_ai -- setup

# Train a model
cargo run --release --bin your_ai -- train \
    --model NousResearch/Hermes-2-Pro-Mistral-7B \
    --batch-size 4 \
    --max-steps 5000
```

```text
your_ai/
├── python/               # Python/MLX implementation (PoC)
│   ├── src/              # Core modules
│   ├── scripts/          # CLI tools
│   ├── tests/            # Test suite
│   └── README.md         # Python-specific docs
├── rust/                 # Rust/mlx-rs implementation (Production)
│   ├── src/              # Core library
│   ├── tests/            # Test suite
│   ├── examples/         # Usage examples
│   └── README.md         # Rust-specific docs
├── configs/              # Shared hardware configurations
├── docs/                 # Shared algorithm documentation
│   ├── ALGORITHM.md      # Technical deep dive
│   └── ...
└── README.md             # This file
```
- Algorithm Deep Dive - Technical documentation
- Curated Datasets - Training data sources
- Benchmark Methodology - Evaluation protocols
- Python Guide - Python installation, training, evaluation
- Rust Guide - Rust setup, CLI usage, examples
- Contributing Guidelines - How to contribute
- Changelog - Version history
Algorithm: Brian Roemmele (Public Domain, November 25, 2025)
Implementations:
- Python: Original proof-of-concept using MLX
- Rust: Production-ready port using mlx-rs
Base Models:
- DeepSeek-AI (DeepSeek-R1, R1-Distill)
- huihui-ai (abliterated versions)
- mlabonne (Llama abliterated)
- NousResearch (Hermes)
- Cognitive Computations (Dolphin)
Framework: Apple MLX
The Empirical Distrust algorithm is public domain – no license, no restrictions, no copyright.
Implementation code is provided as-is for educational and research purposes.
Brian Roemmele (2025). "Empirical Distrust Term for AI Training"
Public domain algorithm released November 25, 2025.
https://x.com/BrianRoemmele/status/1993393673451847773
Remember: The goal is to create AI that prefers verifiable empirical evidence over coordinated modern narratives. Truth lives in archives, not in consensus.