@nishika26 (Collaborator) commented Dec 26, 2025

Summary

Explain the motivation for making this change. What existing problem does the pull request solve?
The DocumentPublic response model includes a signed_url field, but the collection info endpoint had no logic to populate it: the include_url parameter existed but was never used. This pull request implements the missing URL generation by fetching storage and calling build_document_schemas with the include_url parameter.

Checklist

Before submitting a pull request, please ensure that you mark these tasks.

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
  • If you've fixed a bug or added code, it is tested and has test cases.

Summary by CodeRabbit

  • New Features

    • Added include_url query parameter to the collection info endpoint to return signed document URLs when requested.
    • Replaced skip with limit (max 500) to control document retrieval size.
  • Documentation

    • Updated API docs to explain include_url behavior and the new limit parameter.
  • Tests

    • Added an API test verifying include_docs + include_url returns documents with HTTPS signed URLs.

coderabbitai bot commented Dec 26, 2025

📝 Walkthrough

collection_info now accepts an include_url flag and an optional limit (max 500). When include_docs is true, documents are loaded and passed to build_document_schemas; cloud storage is acquired only if include_url is true and documents exist. Documentation and a test were added.
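
The flow described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's code: `Document`, `FakeStorage`, `acquire_storage`, `build_schemas`, and `collection_documents` are hypothetical stand-ins for the real models, `get_cloud_storage`, `build_document_schemas`, and the route body, whose actual signatures are not shown in this thread.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical stand-in for a stored document record."""
    id: int
    fname: str


class FakeStorage:
    """Hypothetical stand-in for the client returned by get_cloud_storage."""

    def signed_url(self, doc: Document) -> str:
        return f"https://storage.example.com/{doc.fname}?signature=fake"


def acquire_storage(calls: list[str]) -> FakeStorage:
    """Record each acquisition so the laziness is observable."""
    calls.append("get_cloud_storage")
    return FakeStorage()


def build_schemas(
    documents: list[Document],
    storage: FakeStorage | None,
    include_url: bool,
) -> list[dict]:
    """Attach a signed URL only when requested and a storage client exists."""
    result = []
    for doc in documents:
        url = storage.signed_url(doc) if include_url and storage else None
        result.append({"id": doc.id, "fname": doc.fname, "signed_url": url})
    return result


def collection_documents(
    documents: list[Document],
    include_url: bool,
    calls: list[str],
) -> list[dict]:
    """Acquire storage only if URLs are requested and there are documents."""
    storage = None
    if include_url and documents:
        storage = acquire_storage(calls)
    return build_schemas(documents, storage, include_url)
```

Passing a `calls` list makes it easy to confirm that storage is never touched when `include_url` is false or the collection has no documents.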

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation Update: backend/app/api/docs/collections/info.md | Reordered include_docs description and added include_url behavior: when include_url=true a signed URL is included in document responses; omitting it yields no URL. |
| Collections API Handler: backend/app/api/routes/collections.py | Signature: removed skip, added limit (optional, max 500) and include_url (bool). When include_docs is true, documents are fetched with read(collection, skip=None, limit=limit). Per-document DocumentPublic validation removed; documents built via build_document_schemas(documents, storage, include_url). get_cloud_storage invoked only if include_url is true and documents exist; removed DocumentPublic import. |
| Documents Route Imports: backend/app/api/routes/documents.py | Removed unused HttpUrl import, moved get_cloud_storage import, and minor formatting change. |
| Tests: backend/app/tests/api/routes/collections/test_collection_info.py | Added test_collection_info_include_docs_and_url to assert that include_docs=true and include_url=true return documents with an HTTPS signed_url. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Route as collection_info (Route)
    participant DB as Collections Store
    participant Helper as build_document_schemas
    participant Storage as Cloud Storage

    Client->>Route: GET /collections/{id}?include_docs&include_url&limit
    Route->>DB: read(collection, skip=None, limit)
    DB-->>Route: documents[]
    alt include_url == true AND documents not empty
        Route->>Storage: get_cloud_storage(session, project_id)
        Storage-->>Route: storage
        Route->>Helper: build_document_schemas(documents, storage, include_url=true)
    else
        Route->>Helper: build_document_schemas(documents, None, include_url=false)
    end
    Helper-->>Route: documents_with_schemas
    Route-->>Client: 200 collection_info response (documents as built)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I hopped through docs and route tonight,

I only fetch a URL when asked just right,
I bundle docs in tidy rows and curl,
Storage stirs only when the flag's unfurled,
A tiny signed ribbon in a digital whirl.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'collection: include signed url of documents' accurately describes the main change in the PR: implementing signed URL generation in the collection info endpoint response. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 448a851 and 4f38697.

📒 Files selected for processing (3)
  • backend/app/api/docs/collections/info.md
  • backend/app/api/routes/collections.py
  • backend/app/api/routes/documents.py
🧰 Additional context used
📓 Path-based instructions (3)
backend/app/api/docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Store endpoint documentation files in backend/app/api/docs/<domain>/<action>.md

Files:

  • backend/app/api/docs/collections/info.md
backend/app/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

backend/app/api/**/*.py: Define FastAPI REST endpoints in backend/app/api/ organized by domain
Load Swagger endpoint descriptions from external markdown files instead of inline strings using load_description("domain/action.md")

Files:

  • backend/app/api/routes/collections.py
  • backend/app/api/routes/documents.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always add type hints to all function parameters and return values in Python code
Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}")
Use Python 3.11+ with type hints throughout the codebase
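
A minimal sketch of the logging convention above. The project's real `mask_string` is not shown in this thread, so the version here is a hypothetical helper that keeps only the last few characters; `rotate_key` is likewise an invented example function.

```python
import logging

logger = logging.getLogger(__name__)


def mask_string(value: str, visible: int = 4) -> str:
    """Hypothetical masking helper: keep the last `visible` characters."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def rotate_key(api_key: str) -> str:
    """Example function that logs with the [function_name] prefix."""
    message = f"[rotate_key] Rotating key {mask_string(api_key)}"
    logger.info(message)
    return message
```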

Files:

  • backend/app/api/routes/collections.py
  • backend/app/api/routes/documents.py
🧬 Code graph analysis (2)
backend/app/api/routes/collections.py (4)
backend/app/core/cloud/storage.py (1)
  • get_cloud_storage (261-278)
backend/app/services/documents/helpers.py (1)
  • build_document_schemas (130-142)
backend/app/crud/document_collection.py (1)
  • read (25-48)
backend/app/models/auth.py (1)
  • project_ (32-36)
backend/app/api/routes/documents.py (1)
backend/app/core/cloud/storage.py (1)
  • get_cloud_storage (261-278)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: checks (3.12, 6)
🔇 Additional comments (4)
backend/app/api/routes/documents.py (2)

29-36: LGTM!

The import reorganization is clean. get_cloud_storage is correctly imported from app.core.cloud, and build_document_schemas is added alongside build_document_schema for consistent document schema construction across endpoints.


263-264: LGTM!

The blank line before the return statement is a minor formatting change that improves readability.

backend/app/api/routes/collections.py (2)

15-15: LGTM!

Imports are correctly added for cloud storage integration and document schema building.

Also applies to: 35-35


203-215: LGTM!

The implementation correctly:

  1. Fetches documents with the optional limit parameter
  2. Initializes storage conditionally only when include_url is true and documents exist (avoiding unnecessary cloud storage calls)
  3. Uses build_document_schemas with proper keyword arguments for consistent document schema construction

This aligns well with the pattern in documents.py and the helper function signature from backend/app/services/documents/helpers.py.

@nishika26 added the bug (Something isn't working) label Dec 26, 2025
codecov bot commented Dec 26, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (1)
backend/app/api/routes/collections.py (1)

188-197: Type annotation has been corrected.

The limit parameter now correctly uses int | None to match its default value of None. This addresses the type annotation issue raised in the previous review.
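
A plain-Python sketch of the contract that `limit` carries: `None` means no cap, otherwise the value must satisfy 0 < limit <= 500. In the route these bounds are declared through FastAPI's `Query(None, gt=0, le=500)`; `validate_limit` and `MAX_LIMIT` below are hypothetical names that only mirror those bounds for illustration.

```python
from __future__ import annotations

MAX_LIMIT = 500  # upper bound stated in the endpoint description


def validate_limit(limit: int | None = None) -> int | None:
    """Return limit unchanged if valid; None means 'return all documents'."""
    if limit is None:
        return None
    if not 0 < limit <= MAX_LIMIT:
        raise ValueError(f"limit must be in 1..{MAX_LIMIT}, got {limit}")
    return limit
```

Annotating the parameter as `int | None` matters because the default is `None`; a bare `int` annotation would misdescribe the values the function actually receives.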

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 4f38697 and aa4e1a1.

📒 Files selected for processing (3)
  • backend/app/api/docs/collections/info.md
  • backend/app/api/routes/collections.py
  • backend/app/tests/api/routes/collections/test_collection_info.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/app/api/docs/collections/info.md
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always add type hints to all function parameters and return values in Python code
Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}")
Use Python 3.11+ with type hints throughout the codebase

Files:

  • backend/app/tests/api/routes/collections/test_collection_info.py
  • backend/app/api/routes/collections.py
backend/app/tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use factory pattern for test fixtures in backend/app/tests/

Files:

  • backend/app/tests/api/routes/collections/test_collection_info.py
backend/app/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

backend/app/api/**/*.py: Define FastAPI REST endpoints in backend/app/api/ organized by domain
Load Swagger endpoint descriptions from external markdown files instead of inline strings using load_description("domain/action.md")

Files:

  • backend/app/api/routes/collections.py
🧬 Code graph analysis (2)
backend/app/tests/api/routes/collections/test_collection_info.py (3)
backend/app/tests/conftest.py (3)
  • client (65-68)
  • db (28-45)
  • user_api_key_header (90-92)
backend/app/tests/utils/utils.py (1)
  • get_project (54-73)
backend/app/tests/utils/collection.py (1)
  • get_collection (25-47)
backend/app/api/routes/collections.py (3)
backend/app/core/cloud/storage.py (1)
  • get_cloud_storage (261-278)
backend/app/services/documents/helpers.py (1)
  • build_document_schemas (130-142)
backend/app/crud/document_collection.py (1)
  • read (25-48)
🪛 Ruff (0.14.10)
backend/app/tests/api/routes/collections/test_collection_info.py

202-202: Local variable document is assigned to but never used

Remove assignment to unused variable document

(F841)
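
Ruff's F841 fires when a local name is bound but never read. A minimal sketch of the two usual fixes follows; `create_document` is a hypothetical fixture helper, not the one used in the test suite.

```python
def create_document(name: str) -> dict:
    """Hypothetical fixture helper returning a stored document."""
    return {"id": 1, "fname": name}


def setup_without_binding() -> None:
    # Fix 1: call for the side effect only and drop the unused assignment.
    create_document("report.pdf")


def setup_with_assertion() -> int:
    # Fix 2: keep the binding and actually use it, e.g. in an assertion.
    document = create_document("report.pdf")
    assert document["fname"] == "report.pdf"
    return document["id"]
```

In a test, the second form is usually preferable, since asserting on the created object also strengthens what the test verifies.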

🔇 Additional comments (1)
backend/app/api/routes/collections.py (1)

208-216: Conditional storage initialization and centralized schema building look good.

The logic correctly initializes storage only when include_url is true and documents exist, avoiding unnecessary overhead. The centralized build_document_schemas helper properly handles document transformation with optional signed URLs.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/app/api/routes/collections.py (1)

180-198: Add return type hint to align with coding guidelines.

The function signature lacks a return type annotation. As per coding guidelines, "Always add type hints to all function parameters and return values in Python code."

While the response_model decorator documents the API contract, an explicit return type improves type-checking and IDE support.

🔎 Proposed fix
 def collection_info(
     session: SessionDep,
     current_user: AuthContextDep,
     collection_id: UUID = FastPath(description="Collection to retrieve"),
     include_docs: bool = Query(
         True,
         description="If true, include documents linked to this collection",
     ),
     include_url: bool = Query(
         False, description="Include a signed URL to access the document"
     ),
     limit: int
     | None = Query(
         None,
         gt=0,
         le=500,
         description="Limit number of documents returned (default: all, max: 500)",
     ),
-):
+) -> APIResponse[CollectionWithDocsPublic]:

Based on coding guidelines.

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between aa4e1a1 and af9d9b8.

📒 Files selected for processing (2)
  • backend/app/api/routes/collections.py
  • backend/app/tests/api/routes/collections/test_collection_info.py
🧰 Additional context used
📓 Path-based instructions (3)
backend/app/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

backend/app/api/**/*.py: Define FastAPI REST endpoints in backend/app/api/ organized by domain
Load Swagger endpoint descriptions from external markdown files instead of inline strings using load_description("domain/action.md")

Files:

  • backend/app/api/routes/collections.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always add type hints to all function parameters and return values in Python code
Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}")
Use Python 3.11+ with type hints throughout the codebase

Files:

  • backend/app/api/routes/collections.py
  • backend/app/tests/api/routes/collections/test_collection_info.py
backend/app/tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use factory pattern for test fixtures in backend/app/tests/

Files:

  • backend/app/tests/api/routes/collections/test_collection_info.py
🧬 Code graph analysis (2)
backend/app/api/routes/collections.py (3)
backend/app/core/cloud/storage.py (1)
  • get_cloud_storage (261-278)
backend/app/services/documents/helpers.py (1)
  • build_document_schemas (130-142)
backend/app/crud/document_collection.py (1)
  • read (25-48)
backend/app/tests/api/routes/collections/test_collection_info.py (3)
backend/app/tests/conftest.py (3)
  • client (65-68)
  • db (28-45)
  • user_api_key_header (90-92)
backend/app/tests/utils/utils.py (1)
  • get_project (54-73)
backend/app/tests/utils/collection.py (1)
  • get_collection (25-47)
🔇 Additional comments (2)
backend/app/tests/api/routes/collections/test_collection_info.py (1)

190-226: LGTM! Well-structured test with comprehensive validation.

The test correctly validates the new include_url functionality:

  • Proper setup with linked document
  • API call with both include_docs and include_url parameters
  • Verification that the specific linked document is returned with a valid signed URL

All previous review feedback has been addressed—the return type hint is present and the linked document is properly verified in the response.
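
The kind of check this test performs can be sketched with the standard library alone. The response shape (`payload["data"]["documents"]`) and the helper name `assert_document_has_https_url` are assumptions for illustration; the actual test body is not reproduced in this thread.

```python
from urllib.parse import urlparse


def assert_document_has_https_url(payload: dict, document_id: str) -> None:
    """Check the linked document is present and carries an HTTPS signed_url."""
    documents = payload["data"]["documents"]  # assumed response shape
    matches = [d for d in documents if d["id"] == document_id]
    assert len(matches) == 1, "linked document missing from response"
    url = matches[0].get("signed_url")
    assert isinstance(url, str), "signed_url was not populated"
    parsed = urlparse(url)
    assert parsed.scheme == "https" and parsed.netloc, "signed_url is not HTTPS"
```

Matching on the document id first, rather than asserting on whichever document comes back, is what ties the check to the specific linked document.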

backend/app/api/routes/collections.py (1)

206-216: LGTM! Efficient implementation with correct parameter passing.

The implementation correctly addresses previous review feedback:

  • Explicit keyword arguments (skip=None, limit=limit) ensure proper pagination behavior
  • Storage is lazily initialized only when include_url=True and documents exist, avoiding unnecessary overhead
  • The build_document_schemas call properly delegates document schema construction with appropriate parameters
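
The thread does not show the earlier call, so the following is only a sketch of one way positional pagination arguments can go wrong. `read` here is a hypothetical function over a plain list, assuming a `(skip, limit)` parameter order like the one in the call `read(collection, skip=None, limit=limit)`.

```python
from __future__ import annotations


def read(
    items: list[int],
    skip: int | None = None,
    limit: int | None = None,
) -> list[int]:
    """Hypothetical paginated read: drop `skip` items, then keep `limit`."""
    start = skip or 0
    end = None if limit is None else start + limit
    return items[start:end]


items = list(range(10))

# Positional call: 3 silently binds to `skip`, not `limit`.
positional = read(items, 3)

# Keyword call: the intent is unambiguous.
keyword = read(items, skip=None, limit=3)
```

With keywords, a later reordering of the parameters cannot silently change which one receives the value.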

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/app/api/routes/collections.py (1)

188-216: Implementation looks correct; consider adding return type annotation.

The implementation properly addresses the missing signed URL functionality:

  • Parameters are correctly typed (int | None for limit)
  • Documents are fetched with explicit skip=None, limit=limit (avoiding the previous bug)
  • Storage is efficiently initialized only when needed (include_url=True and documents exist)
  • build_document_schemas is called with correct keyword arguments

The logic flow is sound and handles edge cases appropriately.

Optional: Add return type annotation per coding guidelines

According to the coding guidelines, all functions should have return type annotations. While FastAPI's response_model in the decorator specifies the response type, adding an explicit annotation improves type checking:

 def collection_info(
     session: SessionDep,
     current_user: AuthContextDep,
     collection_id: UUID = FastPath(description="Collection to retrieve"),
     include_docs: bool = Query(
         True,
         description="If true, include documents linked to this collection",
     ),
     include_url: bool = Query(
         True, description="Include a signed URL to access the document"
     ),
     limit: int
     | None = Query(
         None,
         gt=0,
         le=500,
         description="Limit number of documents returned (default: all, max: 500)",
     ),
-):
+) -> APIResponse[CollectionWithDocsPublic]:

This is a very minor refinement and can be deferred.

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between af9d9b8 and ade4344.

📒 Files selected for processing (1)
  • backend/app/api/routes/collections.py
🧰 Additional context used
📓 Path-based instructions (2)
backend/app/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

backend/app/api/**/*.py: Define FastAPI REST endpoints in backend/app/api/ organized by domain
Load Swagger endpoint descriptions from external markdown files instead of inline strings using load_description("domain/action.md")

Files:

  • backend/app/api/routes/collections.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always add type hints to all function parameters and return values in Python code
Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}")
Use Python 3.11+ with type hints throughout the codebase

Files:

  • backend/app/api/routes/collections.py
🧬 Code graph analysis (1)
backend/app/api/routes/collections.py (3)
backend/app/core/cloud/storage.py (1)
  • get_cloud_storage (261-278)
backend/app/services/documents/helpers.py (1)
  • build_document_schemas (130-142)
backend/app/crud/document_collection.py (1)
  • read (25-48)
Labels: bug (Something isn't working), ready-for-review

3 participants