Implement vision #159

bezo97 · 2025-11-09T00:04:44Z

Hi, I added vision to GLaDOS. Please review whether my solution is acceptable.

See vision.md for usage notes and technical details. I'm open to changing things or improving it before merging.

Summary by CodeRabbit

New Features
- Vision module now available: capture and process camera frames with Vision Language Models
- Configurable camera input with adjustable capture intervals and image resolution
- Vision observations seamlessly integrated into LLM conversation flow
Documentation
- README updated to highlight vision module completion
- Added comprehensive vision usage and configuration guide
Dependencies
- Added opencv-python for camera and image processing support

coderabbitai · 2025-11-09T00:04:52Z

Walkthrough

This PR introduces a vision module to Glados, enabling the system to process camera input via a Vision Language Model. Changes include a new VisionProcessor class that captures frames and queries a VLM, configuration schemas, system prompts for vision-aware LLM responses, integration into the core engine orchestration, updated documentation, and a new dependency on OpenCV.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation Updates: `README.md`, `vision.md` | Added vision feature update section and comprehensive vision module documentation describing configuration, usage, implementation details, and known issues. |
| Configuration & Dependencies: `pyproject.toml`, `configs/glados_vision_config.yaml` | Added opencv-python>=4.12.0 dependency and created comprehensive vision configuration file with VLM settings, camera parameters, and persona prompts. |
| Vision Module - Exports & Constants: `src/glados/vision/__init__.py`, `src/glados/vision/constants.py` | Created vision package with public exports of `VisionConfig` and `VisionProcessor`; added system prompts for vision message handling and VLM scene description instructions. |
| Vision Module - Configuration & Processing: `src/glados/vision/vision_config.py`, `src/glados/vision/vision_processor.py` | Introduced `VisionConfig` Pydantic model for validation and `VisionProcessor` class implementing threaded camera capture, frame encoding, VLM querying, and LLM queue integration. |
| Core Engine Integration: `src/glados/core/engine.py` | Integrated vision module into Glados orchestration with `vision_config` parameter, system prompt augmentation for vision handling, and VisionProcessor thread lifecycle management. |
| Legacy Package Cleanup: `src/glados/Vision/__init__.py` | Removed module docstring from legacy Vision package. |

Sequence Diagram

sequenceDiagram
    participant Main as Glados Engine
    participant Vision as VisionProcessor
    participant Camera as Camera Input
    participant VLM as Vision LLM
    participant LLM as LLM Queue
    
    Main->>Vision: Create with config
    Main->>Vision: Start thread
    loop Every capture_interval_seconds
        Vision->>Vision: Wait for processing_active_event
        Vision->>Camera: Initialize/grab frame
        Camera-->>Vision: Frame data
        Vision->>Vision: Preprocess & encode to JPEG
        Vision->>VLM: POST image with system prompt
        VLM-->>Vision: Scene description
        Vision->>Vision: Prefix with [vision]
        Vision->>LLM: Enqueue description
        LLM->>Main: Feed to LLM processing
        Main->>Main: Append vision handling system prompt
        Main-->>LLM: LLM response
    end
    Main->>Vision: Shutdown event
    Vision->>Vision: Release camera, cleanup
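
The loop in the diagram can be sketched roughly as follows. This is a minimal illustration and not the PR's `VisionProcessor`: the class name `VisionLoop`, the `grab_frame` and `describe` callables (standing in for the camera and the VLM request), and the exact `"[vision] "` prefix format are assumptions. Only `capture_interval_seconds`, `processing_active_event`, the shutdown event, and the `[vision]` prefix are taken from the walkthrough.

```python
import queue
import threading
import time
from typing import Callable


class VisionLoop(threading.Thread):
    """Hypothetical stand-in for VisionProcessor; not the PR's actual code."""

    def __init__(
        self,
        grab_frame: Callable[[], bytes],   # assumed camera callable
        describe: Callable[[bytes], str],  # assumed VLM callable
        llm_queue: "queue.Queue[str]",
        processing_active_event: threading.Event,
        shutdown_event: threading.Event,
        capture_interval_seconds: float,
    ) -> None:
        super().__init__(daemon=True)
        self.grab_frame = grab_frame
        self.describe = describe
        self.llm_queue = llm_queue
        self.processing_active_event = processing_active_event
        self.shutdown_event = shutdown_event
        self.capture_interval_seconds = capture_interval_seconds

    def run(self) -> None:
        while not self.shutdown_event.is_set():
            # Wait until processing is active, waking regularly to notice shutdown.
            if not self.processing_active_event.wait(timeout=0.1):
                continue
            loop_started = time.monotonic()
            frame = self.grab_frame()
            description = self.describe(frame)
            # Prefix format is an assumption; the diagram only says "[vision]".
            self.llm_queue.put(f"[vision] {description}")
            # Sleep out the rest of the interval, returning early on shutdown.
            elapsed = time.monotonic() - loop_started
            remaining = self.capture_interval_seconds - elapsed
            self.shutdown_event.wait(timeout=max(0.0, remaining))
```

Waiting on the shutdown event instead of calling `time.sleep` lets the thread exit promptly when the engine signals shutdown, even with a long capture interval.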

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~40 minutes

Extra attention areas:
- src/glados/vision/vision_processor.py: Dense logic involving threading, camera I/O, frame encoding, HTTP communication, and queue management. Verify error handling, resource cleanup, and thread safety.
- src/glados/core/engine.py: System prompt augmentation and thread lifecycle integration. Confirm vision config propagation through YAML loading and initialization paths.
- Frame preprocessing and JPEG encoding pipeline in VisionProcessor for correctness and performance.

Poem

🐰 Hop, hop—a vision module appears,
Through frames and cameras, Glados now sees!
VLM whispers scenes in [bracketed] verse,
While threading through images, for better or worse.
Config and constants, all bundled with care,
GLaDOS now gazes—what magic laid bare! 🎥✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 36.36% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title "Implement vision" directly matches the primary purpose of the PR: adding vision functionality to GLaDOS, as confirmed by the comprehensive vision-related changes across configuration, core engine integration, and new vision module files. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4a813a3 and e229eab.

📒 Files selected for processing (1)

src/glados/Vision/__init__.py (0 hunks)

💤 Files with no reviewable changes (1)

src/glados/Vision/__init__.py


coderabbitai

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00549d4 and 618d6c9.

📒 Files selected for processing (9)

README.md (2 hunks)
configs/glados_vision_config.yaml (1 hunks)
pyproject.toml (1 hunks)
src/glados/Vision/__init__.py (1 hunks)
src/glados/Vision/constants.py (1 hunks)
src/glados/Vision/vision_config.py (1 hunks)
src/glados/Vision/vision_processor.py (1 hunks)
src/glados/core/engine.py (7 hunks)
vision.md (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-09-14T08:22:38.799Z

Learnt from: Leandro4002
Repo: dnhkng/GLaDOS PR: 154
File: src/glados/default_configs/glados_config.yaml:11-18
Timestamp: 2025-09-14T08:22:38.799Z
Learning: User Leandro4002 prefers to keep the original GLaDOS persona examples in src/glados/default_configs/glados_config.yaml, including the dark humor references, as part of maintaining the character's authentic personality from the Portal game series.

Applied to files:

README.md
configs/glados_vision_config.yaml

🧬 Code graph analysis (3)

src/glados/Vision/__init__.py (2)

src/glados/Vision/vision_config.py (1)

VisionConfig (4-11)

src/glados/Vision/vision_processor.py (1)

VisionProcessor (20-183)

src/glados/core/engine.py (3)

src/glados/Vision/vision_config.py (1)

VisionConfig (4-11)

src/glados/Vision/vision_processor.py (2)

VisionProcessor (20-183)

run (43-88)

src/glados/core/llm_processor.py (1)

run (123-230)

src/glados/Vision/vision_processor.py (1)

src/glados/Vision/vision_config.py (1)

VisionConfig (4-11)

🪛 LanguageTool

vision.md

[style] ~3-~3: As a shorter alternative for ‘able to’, consider using “can”.
Context: # Glados vision module Glados is able to capture the world with a camera and rea...

(BE_ABLE_TO)

[style] ~24-~24: As a shorter alternative for ‘able to’, consider using “can”.
Context: ...other processors. I made so that glados is able to react to changes in the environment. - ...

(BE_ABLE_TO)

[style] ~26-~26: Who is ‘not sure’? Consider being more precise.
Context: ...ions, even when it's instructed not to. Not sure whether this is a problem with qwen3:4b...

(WHO_NOT_SURE)

🪛 markdownlint-cli2 (0.18.1)

vision.md

6-6: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.14.3)

src/glados/Vision/vision_processor.py

82-82: Do not catch blind exception: Exception

(BLE001)

171-171: Do not catch blind exception: Exception

(BLE001)
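
One way to address BLE001 is to catch only the exception types the guarded call is expected to raise. The sketch below is an illustration under assumptions: the PR's HTTP client and the code at lines 82 and 171 are not shown in this thread, so the standard library's `urllib` is used as a stand-in and `post_json` is a hypothetical helper name.

```python
import json
import logging
import urllib.request
from typing import Optional

logger = logging.getLogger(__name__)


def post_json(url: str, payload: dict, timeout: float = 10.0) -> Optional[dict]:
    """Hypothetical helper: POST JSON, return the decoded reply or None on failure."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except OSError as exc:
        # urllib.error.URLError, refused connections and socket timeouts
        # are all OSError subclasses.
        logger.warning("Vision request failed: %s", exc)
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        logger.warning("Vision reply was not valid JSON: %s", exc)
    return None
```

Anything outside these types (for example a programming error inside the `try` body) still propagates, which is the behaviour the lint rule is meant to preserve.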

src/glados/core/engine.py

src/glados/Vision/vision_processor.py

vision.md

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/glados/vision/vision_processor.py (1)
68-82: Skip the VLM call when the LLM queue is already busy

Right now we still capture, encode, and post to the VLM even if the downstream queue is backed up, only to drop the result immediately afterward. Moving the `qsize()` backpressure check ahead of `_post_vision_query` avoids unnecessary camera work and HTTP calls while preserving the existing throttling behaviour.
-                description = self._post_vision_query(payload_image)
-
-                if self.llm_queue.qsize() >= 1: # LLM is busy, avoid flooding the queue with vision updates
+                if self.llm_queue.qsize() >= 1:  # LLM is busy, avoid flooding the queue with vision updates
                     logger.info("VisionProcessor: Skipped a vision update.")
                     self._sleep(loop_started)
                     continue
+
+                description = self._post_vision_query(payload_image)
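
The reordering can be shown in isolation with a small sketch. `maybe_describe` and `query_vlm` are hypothetical names standing in for the loop body and `_post_vision_query`; only the `qsize() >= 1` check comes from the diff above.

```python
import queue
from typing import Callable, Optional


def maybe_describe(
    llm_queue: "queue.Queue[str]",
    query_vlm: Callable[[bytes], str],  # stand-in for _post_vision_query
    image: bytes,
) -> Optional[str]:
    """Return a scene description, or None when the update is skipped."""
    # Check backpressure first so a busy LLM costs no VLM request.
    if llm_queue.qsize() >= 1:
        return None
    return query_vlm(image)
```

Note that `Queue.qsize()` is only approximate when other threads are adding or removing items, so this is a throttle and not a guarantee that the queue holds at most one item.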

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 618d6c9 and 4a813a3.

📒 Files selected for processing (4)

src/glados/vision/__init__.py (1 hunks)
src/glados/vision/constants.py (1 hunks)
src/glados/vision/vision_config.py (1 hunks)
src/glados/vision/vision_processor.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

src/glados/vision/__init__.py (2)

src/glados/vision/vision_config.py (1)

VisionConfig (4-11)

src/glados/vision/vision_processor.py (1)

VisionProcessor (20-183)

src/glados/vision/vision_processor.py (1)

src/glados/vision/vision_config.py (1)

VisionConfig (4-11)

🪛 Ruff (0.14.3)

src/glados/vision/vision_processor.py

82-82: Do not catch blind exception: Exception

(BLE001)

171-171: Do not catch blind exception: Exception

(BLE001)

src/glados/vision/vision_processor.py

bezo97 added 4 commits November 8, 2025 22:31

implement vision module

d9d2632

tune vision config

d2b98a8

update documentation

be7fc2d

fix broken embed in readme

618d6c9

coderabbitai bot reviewed Nov 9, 2025

src/glados/core/engine.py (resolved)

src/glados/Vision/vision_processor.py (outdated, resolved)

vision.md (resolved)

bezo97 added 2 commits November 9, 2025 01:20

fix folder name

4a813a3

fix folder name, amend

e229eab

coderabbitai bot reviewed Nov 9, 2025

src/glados/vision/vision_processor.py (resolved)
