Skip to content

Conversation

@bezo97
Copy link

@bezo97 bezo97 commented Nov 9, 2025

Hi, I added vision to glados. Please review whether my solution is acceptable.

See vision.md for usage notes and technical details. I'm open to change stuff / make it better before merging.

Demo: Youtube video

Summary by CodeRabbit

  • New Features

    • Vision module now available: capture and process camera frames with Vision Language Models
    • Configurable camera input with adjustable capture intervals and image resolution
    • Vision observations seamlessly integrated into LLM conversation flow
  • Documentation

    • README updated to highlight vision module completion
    • Added comprehensive vision usage and configuration guide
  • Dependencies

    • Added opencv-python for camera and image processing support

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 9, 2025

Walkthrough

This PR introduces a vision module to Glados, enabling the system to process camera input via a Vision Language Model. Changes include a new VisionProcessor class that captures frames and queries a VLM, configuration schemas, system prompts for vision-aware LLM responses, integration into the core engine orchestration, updated documentation, and a new dependency on OpenCV.

Changes

Cohort / File(s) Summary
Documentation Updates
README.md, vision.md
Added vision feature update section and comprehensive vision module documentation describing configuration, usage, implementation details, and known issues.
Configuration & Dependencies
pyproject.toml, configs/glados_vision_config.yaml
Added opencv-python>=4.12.0 dependency and created comprehensive vision configuration file with VLM settings, camera parameters, and persona prompts.
Vision Module - Exports & Constants
src/glados/vision/__init__.py, src/glados/vision/constants.py
Created vision package with public exports of VisionConfig and VisionProcessor; added system prompts for vision message handling and VLM scene description instructions.
Vision Module - Configuration & Processing
src/glados/vision/vision_config.py, src/glados/vision/vision_processor.py
Introduced VisionConfig Pydantic model for validation and VisionProcessor class implementing threaded camera capture, frame encoding, VLM querying, and LLM queue integration.
Core Engine Integration
src/glados/core/engine.py
Integrated vision module into Glados orchestration with vision_config parameter, system prompt augmentation for vision handling, and VisionProcessor thread lifecycle management.
Legacy Package Cleanup
src/glados/Vision/__init__.py
Removed module docstring from legacy Vision package.

Sequence Diagram

sequenceDiagram
    participant Main as Glados Engine
    participant Vision as VisionProcessor
    participant Camera as Camera Input
    participant VLM as Vision LLM
    participant LLM as LLM Queue
    
    Main->>Vision: Create with config
    Main->>Vision: Start thread
    loop Every capture_interval_seconds
        Vision->>Vision: Wait for processing_active_event
        Vision->>Camera: Initialize/grab frame
        Camera-->>Vision: Frame data
        Vision->>Vision: Preprocess & encode to JPEG
        Vision->>VLM: POST image with system prompt
        VLM-->>Vision: Scene description
        Vision->>Vision: Prefix with [vision]
        Vision->>LLM: Enqueue description
        LLM->>Main: Feed to LLM processing
        Main->>Main: Append vision handling system prompt
        Main-->>LLM: LLM response
    end
    Main->>Vision: Shutdown event
    Vision->>Vision: Release camera, cleanup
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~40 minutes

  • Extra attention areas:
    • src/glados/vision/vision_processor.py: Dense logic involving threading, camera I/O, frame encoding, HTTP communication, and queue management. Verify error handling, resource cleanup, and thread safety.
    • src/glados/core/engine.py: System prompt augmentation and thread lifecycle integration. Confirm vision config propagation through YAML loading and initialization paths.
    • Frame preprocessing and JPEG encoding pipeline in VisionProcessor for correctness and performance.

Poem

🐰 Hop, hop—a vision module appears,
Through frames and cameras, Glados now sees!
VLM whispers scenes in [bracketed] verse,
While threading through images, for better or worse.
Config and constants, all bundled with care,
GLaDOS now gazes—what magic laid bare! 🎥✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 36.36% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title "Implement vision" directly matches the primary purpose of the PR: adding vision functionality to GLaDOS, as confirmed by the comprehensive vision-related changes across configuration, core engine integration, and new vision module files.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4a813a3 and e229eab.

📒 Files selected for processing (1)
  • src/glados/Vision/__init__.py (0 hunks)
💤 Files with no reviewable changes (1)
  • src/glados/Vision/init.py

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00549d4 and 618d6c9.

📒 Files selected for processing (9)
  • README.md (2 hunks)
  • configs/glados_vision_config.yaml (1 hunks)
  • pyproject.toml (1 hunks)
  • src/glados/Vision/__init__.py (1 hunks)
  • src/glados/Vision/constants.py (1 hunks)
  • src/glados/Vision/vision_config.py (1 hunks)
  • src/glados/Vision/vision_processor.py (1 hunks)
  • src/glados/core/engine.py (7 hunks)
  • vision.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-14T08:22:38.799Z
Learnt from: Leandro4002
Repo: dnhkng/GLaDOS PR: 154
File: src/glados/default_configs/glados_config.yaml:11-18
Timestamp: 2025-09-14T08:22:38.799Z
Learning: User Leandro4002 prefers to keep the original GLaDOS persona examples in src/glados/default_configs/glados_config.yaml, including the dark humor references, as part of maintaining the character's authentic personality from the Portal game series.

Applied to files:

  • README.md
  • configs/glados_vision_config.yaml
🧬 Code graph analysis (3)
src/glados/Vision/__init__.py (2)
src/glados/Vision/vision_config.py (1)
  • VisionConfig (4-11)
src/glados/Vision/vision_processor.py (1)
  • VisionProcessor (20-183)
src/glados/core/engine.py (3)
src/glados/Vision/vision_config.py (1)
  • VisionConfig (4-11)
src/glados/Vision/vision_processor.py (2)
  • VisionProcessor (20-183)
  • run (43-88)
src/glados/core/llm_processor.py (1)
  • run (123-230)
src/glados/Vision/vision_processor.py (1)
src/glados/Vision/vision_config.py (1)
  • VisionConfig (4-11)
🪛 LanguageTool
vision.md

[style] ~3-~3: As a shorter alternative for ‘able to’, consider using “can”.
Context: # Glados vision module Glados is able to capture the world with a camera and rea...

(BE_ABLE_TO)


[style] ~24-~24: As a shorter alternative for ‘able to’, consider using “can”.
Context: ...other processors. I made so that glados is able to react to changes in the environment. - ...

(BE_ABLE_TO)


[style] ~26-~26: Who is ‘not sure’? Consider being more precise.
Context: ...ions, even when it's instructed not to. Not sure whether this is a problem with qwen3:4b...

(WHO_NOT_SURE)

🪛 markdownlint-cli2 (0.18.1)
vision.md

6-6: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.14.3)
src/glados/Vision/vision_processor.py

82-82: Do not catch blind exception: Exception

(BLE001)


171-171: Do not catch blind exception: Exception

(BLE001)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/glados/vision/vision_processor.py (1)

68-82: Skip the VLM call when the LLM queue is already busy

Right now we still capture, encode, and post to the VLM even if the downstream queue is backed up, only to drop the result immediately afterward. Moving the qsize() backpressure check ahead of _post_vision_query avoids unnecessary camera work and HTTP calls while preserving the existing throttling behaviour.

-                description = self._post_vision_query(payload_image)
-
-                if self.llm_queue.qsize() >= 1: # LLM is busy, avoid flooding the queue with vision updates
+                if self.llm_queue.qsize() >= 1:  # LLM is busy, avoid flooding the queue with vision updates
                     logger.info("VisionProcessor: Skipped a vision update.")
                     self._sleep(loop_started)
                     continue
+
+                description = self._post_vision_query(payload_image)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 618d6c9 and 4a813a3.

📒 Files selected for processing (4)
  • src/glados/vision/__init__.py (1 hunks)
  • src/glados/vision/constants.py (1 hunks)
  • src/glados/vision/vision_config.py (1 hunks)
  • src/glados/vision/vision_processor.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/glados/vision/__init__.py (2)
src/glados/vision/vision_config.py (1)
  • VisionConfig (4-11)
src/glados/vision/vision_processor.py (1)
  • VisionProcessor (20-183)
src/glados/vision/vision_processor.py (1)
src/glados/vision/vision_config.py (1)
  • VisionConfig (4-11)
🪛 Ruff (0.14.3)
src/glados/vision/vision_processor.py

82-82: Do not catch blind exception: Exception

(BLE001)


171-171: Do not catch blind exception: Exception

(BLE001)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant