Note: this is currently an AI-generated, experimental code base. Use at your own risk!

Yellowstone

Cypher-to-KQL translator for Microsoft Sentinel, enabling graph query capabilities for security analysts.

License: MIT | Python 3.11+

Overview

Yellowstone translates graph queries (Cypher and Gremlin) into KQL (Kusto Query Language) for Microsoft Sentinel. Security analysts can use familiar graph query syntax to investigate relationships between entities like users, devices, and security events.

Supported Languages: Cypher, Gremlin

Status: Core translation functional for both languages.

Quick Start

Prerequisites

  • Python 3.11 or higher
  • Microsoft Sentinel workspace access (optional, for execution)
  • Claude API key (optional, for AI-enhanced translation)

Installation

git clone https://github.com/rysweet/Yellowstone.git
cd Yellowstone
pip install -e .

Basic Usage

from yellowstone.models import CypherQuery, TranslationContext
from yellowstone.main_translator import CypherTranslator

# Works with both Cypher and Gremlin
cypher_query = "MATCH (u:User) WHERE u.age > 25 RETURN u.name"
gremlin_query = "g.V().hasLabel('User').has('age',gt(25)).values('name')"

translator = CypherTranslator()
context = TranslationContext(user_id="analyst", tenant_id="org", permissions=[])

# Translate Cypher
result = translator.translate(CypherQuery(query=cypher_query), context)
print(result.query)

# Translate Gremlin (automatically detected)
result = translator.translate(CypherQuery(query=gremlin_query), context)
print(result.query)

Output:

IdentityInfo
| make-graph AccountObjectId with_node_id=AccountObjectId
| graph-match (u:User)
| where u.age > 25
| project u.name

Features

Implemented

  • Cypher Parsing: Full MATCH, WHERE, RETURN clause support
  • KQL Generation: Uses native make-graph and graph-match operators
  • Schema Mapping: Maps Cypher labels to Sentinel tables (IdentityInfo, DeviceInfo, SecurityEvent, etc.)
  • Property Access: Node and relationship property filtering
  • AI Enhancement: Optional Claude-powered translation for complex queries
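
The schema mapping feature can be pictured as a lookup from Cypher node labels to Sentinel tables. The following is a minimal sketch, not the code in yellowstone.schema: the names `LABEL_TO_TABLE` and `table_for_label` are hypothetical, the User and Device entries follow the examples in this README, and the Event entry is an assumption.

```python
# Hypothetical label-to-table lookup. The real mapping lives in
# yellowstone.schema and may differ in names and structure.
LABEL_TO_TABLE = {
    "User": "IdentityInfo",    # as in the README examples
    "Device": "DeviceInfo",    # as in the README examples
    "Event": "SecurityEvent",  # assumed label for the SecurityEvent table
}


def table_for_label(label: str) -> str:
    """Return the Sentinel table assumed to back a Cypher node label."""
    try:
        return LABEL_TO_TABLE[label]
    except KeyError:
        # Fail loudly instead of guessing a table for an unknown label.
        raise ValueError(f"No table mapping for label {label!r}") from None


print(table_for_label("User"))
```

Raising on an unknown label, rather than falling back to a default table, keeps a mistyped label from silently querying the wrong data.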

Current Limitations

  • Variable-length paths (-[*1..3]->) not fully validated
  • Multi-hop queries (>3 hops) have limited testing
  • Schema mappings are generic and may need tuning for specific environments
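
Before translating, a caller may want to flag queries that fall under the first two limitations. The sketch below is a rough pre-check under stated assumptions: `limitation_warnings` is a hypothetical helper, not part of Yellowstone, and it uses regular expressions rather than a real Cypher parser, so it only recognizes bracketed relationships such as `-[:REL]->` and ignores bare `-->` arrows.

```python
import re

# Bracketed relationship containing a `*`, e.g. -[*1..3]-> or -[:REL*]->.
VAR_LENGTH = re.compile(r"-\[[^\]]*\*[^\]]*\]-")
# Any bracketed relationship, e.g. -[:LOGGED_IN]-> or <-[:CREATED_BY]-.
RELATIONSHIP = re.compile(r"-\[[^\]]*\]-")


def limitation_warnings(cypher: str, max_hops: int = 3) -> list[str]:
    """Return warnings for patterns the README lists as lightly tested."""
    warnings = []
    if VAR_LENGTH.search(cypher):
        warnings.append("variable-length path: not fully validated")
    # Each bracketed relationship counts as one hop.
    hops = len(RELATIONSHIP.findall(cypher))
    if hops > max_hops:
        warnings.append(
            f"{hops} hops: queries beyond {max_hops} hops have limited testing"
        )
    return warnings


print(limitation_warnings("MATCH (u:User)-[*1..3]->(d:Device) RETURN u.name"))
```

The check is advisory only: it does not block translation, and a query that passes it is not guaranteed to translate correctly.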

Architecture

flowchart TD
    A[Cypher Query] --> B[Parser]
    B --> C{Translation Routing}
    C -->|85% Fast Path| D[Direct KQL Operators]
    C -->|10% AI Path| E[Claude SDK]
    C -->|5% Fallback| F[Join-based Translation]
    D --> G[KQL Output]
    E --> G
    F --> G
    G --> H[Microsoft Sentinel]

Key Components:

  • yellowstone.parser: Cypher parsing and AST generation
  • yellowstone.translator: KQL generation and query assembly
  • yellowstone.schema: Sentinel table and relationship mappings
  • yellowstone.ai_translator: Claude SDK integration (optional)

For detailed architecture documentation, see docs/ARCHITECTURE.md.
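
The routing step in the diagram can be sketched as a three-way decision. This is an illustration only: the README does not state how Yellowstone picks a path, so the `Route` enum, the `choose_route` function, its two boolean inputs, and the order of the checks are all assumptions.

```python
from enum import Enum


class Route(Enum):
    """The three translation paths named in the architecture diagram."""
    FAST = "direct KQL operators"
    AI = "Claude SDK"
    FALLBACK = "join-based translation"


def choose_route(has_direct_mapping: bool, ai_enabled: bool) -> Route:
    """Pick a path. The criteria here are assumed, not taken from the project."""
    if has_direct_mapping:
        # Pattern maps straight onto make-graph / graph-match.
        return Route.FAST
    if ai_enabled:
        # AI translation is optional, so it is only tried when configured.
        return Route.AI
    return Route.FALLBACK


print(choose_route(has_direct_mapping=False, ai_enabled=False).value)
```

Under these assumptions, a deployment without a Claude API key never takes the AI path and sends every non-direct query to the join-based fallback.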

Usage Examples

Basic Node Query

Cypher:

MATCH (u:User) WHERE u.age > 30 RETURN u.name LIMIT 10

Generated KQL:

IdentityInfo
| make-graph AccountObjectId with_node_id=AccountObjectId
| graph-match (u:User)
| where u.age > 30
| project u.name
| limit 10
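
The generated KQL above is a pipeline of one source table followed by `|` stages. As a minimal sketch of that assembly, the hypothetical helper `build_node_query` below reproduces the shape of the output shown; it is not Yellowstone's generator, and the result has not been validated against a Sentinel workspace.

```python
def build_node_query(table, node_id, pattern, where=None, project=(), limit=None):
    """Assemble a single-node KQL pipeline in the shape shown in this README."""
    lines = [
        table,
        f"| make-graph {node_id} with_node_id={node_id}",
        f"| graph-match {pattern}",
    ]
    # Optional stages are appended only when the Cypher query had that clause.
    if where:
        lines.append(f"| where {where}")
    if project:
        lines.append("| project " + ", ".join(project))
    if limit is not None:
        lines.append(f"| limit {limit}")
    return "\n".join(lines)


kql = build_node_query(
    table="IdentityInfo",
    node_id="AccountObjectId",
    pattern="(u:User)",
    where="u.age > 30",
    project=["u.name"],
    limit=10,
)
print(kql)
```

Keeping each stage as a separate list entry makes it easy to omit the `where` or `limit` stage when the source query has no WHERE or LIMIT clause.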

Relationship Query

Cypher:

MATCH (u:User)-[:LOGGED_IN]->(d:Device)
WHERE d.os_type = "Windows"
RETURN u.name, d.device_id

Generated KQL:

IdentityInfo
| join kind=inner (DeviceInfo) on AccountName == DeviceName
| make-graph AccountObjectId with_node_id=AccountObjectId
| graph-match (u:User)-[:LOGGED_IN]->(d:Device)
| where d.os_type == "Windows"
| project u.name, d.device_id

Multi-Node Pattern

Cypher:

MATCH (u:User)-[:ACCESSED]->(f:File)<-[:CREATED_BY]-(p:Process)
RETURN u.name, f.path, p.name

Generated KQL:

IdentityInfo
| join kind=inner (FileInfo) on AccountObjectId == FileOwnerId
| join kind=inner (ProcessInfo) on FileId == ProcessCreatedFileId
| make-graph AccountObjectId with_node_id=AccountObjectId
| graph-match (u:User)-[:ACCESSED]->(f:File)<-[:CREATED_BY]-(p:Process)
| project u.name, f.path, p.name

Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=src/yellowstone --cov-report=html

# Run specific test suite
pytest tests/integration

Documentation

Research and planning documents are in the context/ directory.

Development

Setup Development Environment

# Clone repository
git clone https://github.com/rysweet/Yellowstone.git
cd Yellowstone

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Format and lint code
black src/
ruff check src/

Project Structure

src/yellowstone/
├── parser/              # Cypher parsing (ANTLR)
├── translator/          # KQL generation
├── schema/              # Sentinel schema mappings
├── ai_translator/       # Claude SDK integration
└── security/            # Input validation

tests/
├── integration/         # End-to-end tests
└── sentinel_integration/ # Azure validation tests

docs/                    # Detailed documentation
context/                 # Research and planning

Contributing

This project is still in development. Contribution guidelines will be published with the initial release.

For questions or discussions, open an issue on GitHub.

License

MIT License - see LICENSE file for details.

Contact

Project Lead: Ryan Sweet (@rysweet)
