fix(models): Add PDF support for Claude models #3825
- Add _is_pdf_part() helper function to detect PDF parts
- Add PDF handling in part_to_message_block() function
- PDFs are encoded as base64 and sent as document blocks to Anthropic API
- Update return type annotation to include DocumentBlockParam for PDF document blocks
- Add test for PDF support

Fixes google#3614
Code Review
This pull request refactors the Anthropic LLM integration by renaming the AnthropicLlm class to Claude and updating its default model to claude-3-5-sonnet-v2@20241022. The changes include migrating from AsyncAnthropic and AsyncAnthropicVertex clients to the synchronous AnthropicVertex client, which involved adjusting the messages.create call to remove the await keyword. A new feature was added to support PDF documents, introducing a helper function _is_pdf_part and modifying part_to_message_block to convert PDF types.Part objects into anthropic_types.DocumentBlockParam using base64 encoding. Correspondingly, the AnthropicLlm class and its specific test case were removed, and a new test was added to validate the PDF handling.
Hi @sarojrout, thank you for your contribution! We appreciate you taking the time to submit this pull request. Your PR has been received by the team and is currently under review. We will provide feedback as soon as we have an update to share.
Fixes #3614
Please ensure you have read the contribution guide before creating a pull request.
Link to Issue or Description of Change
1. Link to an existing issue (if applicable):
2. Or, if no issue exists, describe the change:
If applicable, please follow the issue templates to provide as much detail as
possible.
Problem:
When using Claude models (e.g., Claude Sonnet 4.5) in ADK with PDF files, the code throws NotImplementedError: Not supported yet. The part_to_message_block() function in anthropic_llm.py handles text, images, function calls, and function responses, but not PDF documents, so a part with mime_type="application/pdf" falls through to that error.
Solution:
Added PDF support by:
- Creating a _is_pdf_part() helper function (similar to _is_image_part()) to detect PDF parts by checking for mime_type == "application/pdf"
- Adding PDF handling in the part_to_message_block() function that:
  - Detects PDF parts using the new helper function
  - Encodes PDF data as base64 (same as images)
  - Returns a document block dictionary with type="document" and the base64-encoded PDF data
- Updating the return type annotation to include dict[str, Any] for PDF document blocks
- Adding a unit test to verify that PDF handling works correctly
This solution follows the same pattern used for image handling and leverages Anthropic's API support for PDF documents as document blocks.
Testing Plan
Please describe the tests that you ran to verify your changes. This is required
for all PRs that are not small documentation or typo fixes.
Unit Tests:
Please include a summary of passed pytest results.
Manual End-to-End (E2E) Tests:
Setup:
Test Steps:
Expected Result:
Checklist
Additional context
Code Changes Summary:
File: src/google/adk/models/anthropic_llm.py
Added _is_pdf_part() helper function (lines 79-85)
Added PDF handling in part_to_message_block() (lines 147-159)
Updated return type annotation (line 95)
File: tests/unittests/models/test_anthropic_llm.py
Added test_part_to_message_block_with_pdf() test (lines 467-496)
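A test along these lines might look like the sketch below. The actual test body is not shown in this PR description, so everything here is an assumption: convert_pdf_part() is a hypothetical stand-in for the PDF branch of part_to_message_block(), and SimpleNamespace objects stand in for the real part types.

```python
import base64
from types import SimpleNamespace


def convert_pdf_part(part):
    """Hypothetical stand-in for the PDF branch of part_to_message_block()."""
    data = base64.b64encode(part.inline_data.data).decode()
    return {
        "type": "document",
        "source": {
            "type": "base64",
            "media_type": part.inline_data.mime_type,
            "data": data,
        },
    }


def test_part_to_message_block_with_pdf():
    # A fake payload; only the bytes matter, not PDF validity.
    pdf_bytes = b"%PDF-1.4\nfake payload for testing"
    part = SimpleNamespace(
        inline_data=SimpleNamespace(mime_type="application/pdf", data=pdf_bytes)
    )

    block = convert_pdf_part(part)

    # The block should be a document with a base64 PDF source.
    assert block["type"] == "document"
    assert block["source"]["type"] == "base64"
    assert block["source"]["media_type"] == "application/pdf"
    # Decoding the payload should give back the original bytes.
    assert base64.b64decode(block["source"]["data"]) == pdf_bytes
```

Checking the decoded payload against the original bytes, rather than comparing against a hard-coded base64 string, keeps the test readable and independent of the sample data.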
Technical Details:
PDFs are handled similarly to images: base64-encoded and sent as document blocks
The implementation follows Anthropic's API specification for document blocks
The fix is backward compatible - existing functionality (text, images, function calls) remains unchanged