collection: include signed url of documents #509

nishika26 · 2025-12-26T08:35:53Z

Summary

Explain the motivation for making this change. What existing problem does the pull request solve?
The DocumentPublic response model includes a signed_url field, but the collection info endpoint had no logic to populate it, and the include_url parameter existed but was never used. This pull request implements the missing URL generation logic by fetching storage and calling build_document_schemas with the include_url parameter.

Checklist

Before submitting a pull request, please ensure that you mark these tasks.

Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
If you've fixed a bug or added code that is tested and has test cases.

Summary by CodeRabbit

New Features
- Added include_url query parameter to the collection info endpoint to return signed document URLs when requested.
- Replaced skip with limit (max 500) to control document retrieval size.
Documentation
- Updated API docs to explain include_url behavior and the new limit parameter.
Tests
- Added an API test verifying include_docs + include_url returns documents with HTTPS signed URLs.

coderabbitai · 2025-12-26T08:36:01Z

📝 Walkthrough

Walkthrough

collection_info now accepts an include_url flag and an optional limit (max 500). When include_docs is true, documents are loaded and passed to build_document_schemas; cloud storage is acquired only if include_url is true and documents exist. Documentation and a test were added.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation Update `backend/app/api/docs/collections/info.md` | Reordered `include_docs` description and added `include_url` behavior: when `include_url=true` a signed URL is included in document responses; omitting it yields no URL. |
| Collections API Handler `backend/app/api/routes/collections.py` | Signature: removed `skip`, added `limit` (optional, max 500) and `include_url` (bool). When `include_docs` is true, documents are fetched with `read(collection, skip=None, limit=limit)`. Per-document `DocumentPublic` validation removed; documents are built via `build_document_schemas(documents, storage, include_url)`. `get_cloud_storage` is invoked only if `include_url` is true and documents exist; removed `DocumentPublic` import. |
| Documents Route Imports `backend/app/api/routes/documents.py` | Removed unused `HttpUrl` import, moved `get_cloud_storage` import, and minor formatting change. |
| Tests `backend/app/tests/api/routes/collections/test_collection_info.py` | Added `test_collection_info_include_docs_and_url` to assert that `include_docs=true` and `include_url=true` return documents with an HTTPS `signed_url`. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Route as collection_info (Route)
    participant DB as Collections Store
    participant Helper as build_document_schemas
    participant Storage as Cloud Storage

    Client->>Route: GET /collections/{id}?include_docs&include_url&limit
    Route->>DB: read(collection, skip=None, limit)
    DB-->>Route: documents[]
    alt include_url == true AND documents not empty
        Route->>Storage: get_cloud_storage(session, project_id)
        Storage-->>Route: storage
        Route->>Helper: build_document_schemas(documents, storage, include_url=true)
    else
        Route->>Helper: build_document_schemas(documents, None, include_url=false)
    end
    Helper-->>Route: documents_with_schemas
    Route-->>Client: 200 collection_info response (documents as built)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I hopped through docs and route tonight,

I only fetch a URL when asked just right,
I bundle docs in tidy rows and curl,
Storage stirs only when the flag's unfurled,
A tiny signed ribbon in a digital whirl.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'collection: include signed url of documents' accurately describes the main change in the PR: implementing signed URL generation in the collection info endpoint response. |

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 448a851 and 4f38697.

📒 Files selected for processing (3)

backend/app/api/docs/collections/info.md
backend/app/api/routes/collections.py
backend/app/api/routes/documents.py

🧰 Additional context used

📓 Path-based instructions (3)

backend/app/api/docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Store endpoint documentation files in backend/app/api/docs/<domain>/<action>.md

Files:

backend/app/api/docs/collections/info.md

backend/app/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

backend/app/api/**/*.py: Define FastAPI REST endpoints in backend/app/api/ organized by domain
Load Swagger endpoint descriptions from external markdown files instead of inline strings using load_description("domain/action.md")

Files:

backend/app/api/routes/collections.py
backend/app/api/routes/documents.py

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always add type hints to all function parameters and return values in Python code
Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}")
Use Python 3.11+ with type hints throughout the codebase

Files:

backend/app/api/routes/collections.py
backend/app/api/routes/documents.py
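
The logging guideline above (function name in square brackets, sensitive values passed through `mask_string`) might look like this in practice. The sketch is illustrative only: the body of `mask_string` is a hypothetical stand-in, since the project's real helper is not shown in this PR, and `fetch_collection` is an invented example function.

```python
import logging

logger = logging.getLogger(__name__)


def mask_string(value: str, visible: int = 4) -> str:
    """Hypothetical stand-in: hide all but the last `visible` characters."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def fetch_collection(collection_id: str, api_key: str) -> str:
    # The prefix names the enclosing function; the key is masked before logging.
    logger.info(
        f"[fetch_collection] Fetching {collection_id} with key {mask_string(api_key)}"
    )
    return collection_id
```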

🧬 Code graph analysis (2)

backend/app/api/routes/collections.py (4)

backend/app/core/cloud/storage.py (1)

get_cloud_storage (261-278)

backend/app/services/documents/helpers.py (1)

build_document_schemas (130-142)

backend/app/crud/document_collection.py (1)

read (25-48)

backend/app/models/auth.py (1)

project_ (32-36)

backend/app/api/routes/documents.py (1)

backend/app/core/cloud/storage.py (1)

get_cloud_storage (261-278)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: checks (3.12, 6)

🔇 Additional comments (4)

backend/app/api/routes/documents.py (2)

29-36: LGTM!

The import reorganization is clean. get_cloud_storage is correctly imported from app.core.cloud, and build_document_schemas is added alongside build_document_schema for consistent document schema construction across endpoints.

263-264: LGTM!

The blank line added before the return statement is a minor formatting change that improves readability.

backend/app/api/routes/collections.py (2)

15-15: LGTM!

Imports are correctly added for cloud storage integration and document schema building.

Also applies to: 35-35

203-215: LGTM!

The implementation correctly:

- Fetches documents with the optional limit parameter
- Initializes storage conditionally only when include_url is true and documents exist (avoiding unnecessary cloud storage calls)
- Uses build_document_schemas with proper keyword arguments for consistent document schema construction

This aligns well with the pattern in documents.py and the helper function signature from backend/app/services/documents/helpers.py.

backend/app/api/docs/collections/info.md

backend/app/api/routes/collections.py

codecov · 2025-12-26T08:40:28Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (1)

backend/app/api/routes/collections.py (1)

188-197: Type annotation has been corrected.

The limit parameter now correctly uses int | None to match its default value of None. This addresses the type annotation issue raised in the previous review.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f38697 and aa4e1a1.

📒 Files selected for processing (3)

backend/app/api/docs/collections/info.md
backend/app/api/routes/collections.py
backend/app/tests/api/routes/collections/test_collection_info.py

🚧 Files skipped from review as they are similar to previous changes (1)

backend/app/api/docs/collections/info.md

🧰 Additional context used

📓 Path-based instructions (3)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always add type hints to all function parameters and return values in Python code
Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}")
Use Python 3.11+ with type hints throughout the codebase

Files:

backend/app/tests/api/routes/collections/test_collection_info.py
backend/app/api/routes/collections.py

backend/app/tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use factory pattern for test fixtures in backend/app/tests/

Files:

backend/app/tests/api/routes/collections/test_collection_info.py

backend/app/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

backend/app/api/**/*.py: Define FastAPI REST endpoints in backend/app/api/ organized by domain
Load Swagger endpoint descriptions from external markdown files instead of inline strings using load_description("domain/action.md")

Files:

backend/app/api/routes/collections.py

🧬 Code graph analysis (2)

backend/app/tests/api/routes/collections/test_collection_info.py (3)

backend/app/tests/conftest.py (3)

client (65-68)

db (28-45)

user_api_key_header (90-92)

backend/app/tests/utils/utils.py (1)

get_project (54-73)

backend/app/tests/utils/collection.py (1)

get_collection (25-47)

backend/app/api/routes/collections.py (3)

backend/app/core/cloud/storage.py (1)

get_cloud_storage (261-278)

backend/app/services/documents/helpers.py (1)

build_document_schemas (130-142)

backend/app/crud/document_collection.py (1)

read (25-48)

🪛 Ruff (0.14.10)

backend/app/tests/api/routes/collections/test_collection_info.py

202-202: Local variable document is assigned to but never used

Remove assignment to unused variable document

(F841)
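
A minimal sketch of what F841 flags and one way to resolve it, by actually using the variable in an assertion. The helper `link_document` and the dictionary-based collection are hypothetical and only mimic the shape of such a test; they are not taken from the project's test suite.

```python
def link_document(collection: dict, name: str) -> dict:
    """Hypothetical helper: create a document and link it to the collection."""
    document = {"id": len(collection["documents"]) + 1, "name": name}
    collection["documents"].append(document)
    return document


def check_flagged(collection: dict) -> None:
    # Ruff reports F841 here: `document` is assigned but never read.
    document = link_document(collection, "a.pdf")
    assert len(collection["documents"]) == 1


def check_fixed(collection: dict) -> None:
    # Using the variable removes the warning and tightens the check.
    document = link_document(collection, "a.pdf")
    returned_ids = [d["id"] for d in collection["documents"]]
    assert document["id"] in returned_ids
```

The alternative fix is to drop the assignment and call the helper for its side effect only.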

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: checks (3.12, 6)

🔇 Additional comments (1)

backend/app/api/routes/collections.py (1)

208-216: Conditional storage initialization and centralized schema building look good.

The logic correctly initializes storage only when include_url is true and documents exist, avoiding unnecessary overhead. The centralized build_document_schemas helper properly handles document transformation with optional signed URLs.

backend/app/api/routes/collections.py

backend/app/tests/api/routes/collections/test_collection_info.py

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

backend/app/api/routes/collections.py (1)
180-198: Add return type hint to align with coding guidelines.

The function signature lacks a return type annotation. As per coding guidelines, "Always add type hints to all function parameters and return values in Python code."

While the response_model decorator documents the API contract, an explicit return type improves type-checking and IDE support.
🔎 Proposed fix
 def collection_info(
     session: SessionDep,
     current_user: AuthContextDep,
     collection_id: UUID = FastPath(description="Collection to retrieve"),
     include_docs: bool = Query(
         True,
         description="If true, include documents linked to this collection",
     ),
     include_url: bool = Query(
         False, description="Include a signed URL to access the document"
     ),
     limit: int
     | None = Query(
         None,
         gt=0,
         le=500,
         description="Limit number of documents returned (default: all, max: 500)",
     ),
-):
+) -> APIResponse[CollectionWithDocsPublic]:
Based on coding guidelines.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa4e1a1 and af9d9b8.

📒 Files selected for processing (2)

backend/app/api/routes/collections.py
backend/app/tests/api/routes/collections/test_collection_info.py

🧰 Additional context used

📓 Path-based instructions (3)

backend/app/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

backend/app/api/**/*.py: Define FastAPI REST endpoints in backend/app/api/ organized by domain
Load Swagger endpoint descriptions from external markdown files instead of inline strings using load_description("domain/action.md")

Files:

backend/app/api/routes/collections.py

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always add type hints to all function parameters and return values in Python code
Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}")
Use Python 3.11+ with type hints throughout the codebase

Files:

backend/app/api/routes/collections.py
backend/app/tests/api/routes/collections/test_collection_info.py

backend/app/tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use factory pattern for test fixtures in backend/app/tests/

Files:

backend/app/tests/api/routes/collections/test_collection_info.py

🧬 Code graph analysis (2)

backend/app/api/routes/collections.py (3)

backend/app/core/cloud/storage.py (1)

get_cloud_storage (261-278)

backend/app/services/documents/helpers.py (1)

build_document_schemas (130-142)

backend/app/crud/document_collection.py (1)

read (25-48)

backend/app/tests/api/routes/collections/test_collection_info.py (3)

backend/app/tests/conftest.py (3)

client (65-68)

db (28-45)

user_api_key_header (90-92)

backend/app/tests/utils/utils.py (1)

get_project (54-73)

backend/app/tests/utils/collection.py (1)

get_collection (25-47)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: checks (3.12, 6)

🔇 Additional comments (2)

backend/app/tests/api/routes/collections/test_collection_info.py (1)

190-226: LGTM! Well-structured test with comprehensive validation.

The test correctly validates the new include_url functionality:

- Proper setup with linked document
- API call with both include_docs and include_url parameters
- Verification that the specific linked document is returned with a valid signed URL

All previous review feedback has been addressed—the return type hint is present and the linked document is properly verified in the response.
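
The kind of verification described above can be sketched with the standard library alone. The payload shape (`data` containing `documents`, each with `id` and `signed_url`) and the helper name `assert_signed_url` are assumptions made for illustration; the real test works against the API client and is not reproduced here.

```python
from urllib.parse import urlparse


def assert_signed_url(payload: dict, expected_id: str) -> None:
    """Check that the linked document is present and carries an HTTPS URL."""
    documents = payload["data"]["documents"]
    matching = [d for d in documents if d["id"] == expected_id]
    assert len(matching) == 1, "linked document missing from response"
    parsed = urlparse(matching[0]["signed_url"])
    assert parsed.scheme == "https", "signed_url is not HTTPS"
    assert parsed.netloc, "signed_url has no host"
```

Parsing the URL rather than checking a string prefix also catches values such as `https://` with no host.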

backend/app/api/routes/collections.py (1)

206-216: LGTM! Efficient implementation with correct parameter passing.

The implementation correctly addresses previous review feedback:

- Explicit keyword arguments (skip=None, limit=limit) ensure proper pagination behavior
- Storage is lazily initialized only when include_url=True and documents exist, avoiding unnecessary overhead
- The build_document_schemas call properly delegates document schema construction with appropriate parameters

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

backend/app/api/routes/collections.py (1)
188-216: Implementation looks correct; consider adding return type annotation.

The implementation properly addresses the missing signed URL functionality:

- Parameters are correctly typed (int | None for limit)
- Documents are fetched with explicit skip=None, limit=limit (avoiding the previous bug)
- Storage is efficiently initialized only when needed (include_url=True and documents exist)
- build_document_schemas is called with correct keyword arguments

The logic flow is sound and handles edge cases appropriately.
Optional: Add return type annotation per coding guidelines

According to the coding guidelines, all functions should have return type annotations. While FastAPI's response_model in the decorator specifies the response type, adding an explicit annotation improves type checking:
 def collection_info(
     session: SessionDep,
     current_user: AuthContextDep,
     collection_id: UUID = FastPath(description="Collection to retrieve"),
     include_docs: bool = Query(
         True,
         description="If true, include documents linked to this collection",
     ),
     include_url: bool = Query(
         True, description="Include a signed URL to access the document"
     ),
     limit: int
     | None = Query(
         None,
         gt=0,
         le=500,
         description="Limit number of documents returned (default: all, max: 500)",
     ),
-):
+) -> APIResponse[CollectionWithDocsPublic]:
This is a very minor refinement and can be deferred.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af9d9b8 and ade4344.

📒 Files selected for processing (1)

backend/app/api/routes/collections.py

🧰 Additional context used

📓 Path-based instructions (2)

backend/app/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

backend/app/api/**/*.py: Define FastAPI REST endpoints in backend/app/api/ organized by domain
Load Swagger endpoint descriptions from external markdown files instead of inline strings using load_description("domain/action.md")

Files:

backend/app/api/routes/collections.py

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Always add type hints to all function parameters and return values in Python code
Prefix all log messages with the function name in square brackets: logger.info(f"[function_name] Message {mask_string(sensitive_value)}")
Use Python 3.11+ with type hints throughout the codebase

Files:

backend/app/api/routes/collections.py

🧬 Code graph analysis (1)

backend/app/api/routes/collections.py (3)

backend/app/core/cloud/storage.py (1)

get_cloud_storage (261-278)

backend/app/services/documents/helpers.py (1)

build_document_schemas (130-142)

backend/app/crud/document_collection.py (1)

read (25-48)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: checks (3.12, 6)

collection: include signed url of documents

4f38697

coderabbitai bot reviewed Dec 26, 2025

backend/app/api/docs/collections/info.md (outdated, resolved)

backend/app/api/routes/collections.py (resolved)

nishika26 added the bug Something isn't working label Dec 26, 2025

pr review and adding test case

aa4e1a1

coderabbitai bot reviewed Dec 26, 2025

backend/app/api/routes/collections.py (outdated, resolved)

backend/app/tests/api/routes/collections/test_collection_info.py (outdated, resolved)

backend/app/tests/api/routes/collections/test_collection_info.py (resolved)

nishika26 added the ready-for-review label Dec 26, 2025

nishika26 self-assigned this Dec 26, 2025

nishika26 requested review from Prajna1999 and kartpop December 26, 2025 09:03

coderabbit reviews

af9d9b8

coderabbitai bot reviewed Dec 26, 2025

include url default to true

ade4344

coderabbitai bot reviewed Dec 26, 2025

Prajna1999 approved these changes Dec 26, 2025

3 participants
