chore: Add Sys test large obj #1676

chandra-siri · 2025-12-18T14:15:33Z

chore: Add Sys test large obj

…torage into bidi_writes_checksum

gemini-code-assist · 2025-12-18T14:16:01Z

Summary of Changes

Hello @chandra-siri, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on improving the robustness and integrity of asynchronous storage operations, particularly for large objects. It introduces a standardized mechanism to enforce the use of the C-accelerated google-crc32c library, crucial for efficient data integrity checks. Furthermore, it enhances the AsyncAppendableObjectWriter to proactively include checksums during uploads and significantly expands system tests to validate the handling of large objects and segmented writes, ensuring reliable performance and data consistency.

Highlights

CRC32C C-Extension Enforcement: Introduced a new utility function raise_if_no_fast_crc32c to ensure the C-accelerated google-crc32c implementation is available, preventing silent performance degradation.
Checksum Inclusion in Writes: The AsyncAppendableObjectWriter now includes CRC32C checksums with data chunks during object writes, enhancing data integrity.
Expanded System Tests: Added new system tests in test_zonal.py to cover large object uploads and downloads, including scenarios where objects are written in multiple slices, and verifying their checksums.
Refactored CRC32C Checks: Consolidated the CRC32C implementation check into the new utility function, which is now called by both AsyncAppendableObjectWriter and AsyncMultiRangeDownloader.
Updated Unit Tests: Modified unit tests for AsyncAppendableObjectWriter and AsyncMultiRangeDownloader to reflect the new FailedPrecondition exception type when the C-accelerated CRC32C is not found.
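
To make the "checksum per chunk" idea above concrete, here is a minimal sketch that splits a payload into chunks and pairs each chunk with its CRC32C. It uses a small pure-Python CRC32C so that it runs without extra packages. The PR itself relies on the C-accelerated google-crc32c library, and the names `crc32c`, `chunks_with_checksums`, and `chunk_size` are illustrative assumptions, not the writer's actual API.

```python
from typing import Iterator, List, Tuple

# Reflected form of the Castagnoli polynomial used by CRC32C.
_CRC32C_POLY = 0x82F63B78


def _build_table() -> List[int]:
    """Precompute the 256-entry lookup table for byte-at-a-time CRC32C."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ _CRC32C_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c(data: bytes, crc: int = 0) -> int:
    """Return the CRC32C of `data`, optionally continuing from a prior value."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def chunks_with_checksums(
    data: bytes, chunk_size: int
) -> Iterator[Tuple[bytes, int]]:
    """Yield (chunk, crc32c_of_chunk) pairs (hypothetical helper)."""
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        yield chunk, crc32c(chunk)
```

A pure-Python loop like this is far slower than the C extension on large objects, which is the kind of silent degradation the new check is meant to rule out.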

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

chandra-siri · 2025-12-18T14:17:06Z

/gcbrun(2b1d6ce)

gemini-code-assist

Code Review

This pull request introduces a check for the C-accelerated google-crc32c implementation to prevent silent performance degradation, refactoring this check into a shared utility function. It also adds CRC32C checksum validation for data chunks in AsyncAppendableObjectWriter. To support these changes, extensive system tests for large object uploads (both single and multi-part) have been added, along with corresponding unit tests. My feedback includes a minor correction to a docstring and a suggestion to reduce code duplication in the new system tests for better maintainability.

I am having trouble creating individual review comments. Click here to see my feedback.

google/cloud/storage/_experimental/asyncio/_utils.py (20-28)

The docstring for raise_if_no_fast_crc32c incorrectly states that the function returns a boolean value. This function either raises an exception or returns None. The docstring should be updated to remove the returns and rtype sections. I've also taken the liberty to format the raises section to follow common style guides.

    """Check if the C-accelerated version of google-crc32c is available.

    If not, raise an error to prevent silent performance degradation.

    Raises:
        google.api_core.exceptions.FailedPrecondition: If the C extension is not available.
    """
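
For context, a function carrying that docstring might look roughly like the sketch below. This is not the PR's code: the `FailedPrecondition` class here is a local stand-in for `google.api_core.exceptions.FailedPrecondition`, and the backend name is passed in as a parameter (an assumption made so the example runs without google-crc32c installed). In real use one would presumably pass `google_crc32c.implementation`, which is understood to be `"c"` when the extension is loaded and `"python"` otherwise.

```python
class FailedPrecondition(Exception):
    """Local stand-in for google.api_core.exceptions.FailedPrecondition."""


def raise_if_no_fast_crc32c(implementation: str) -> None:
    """Check if the C-accelerated version of google-crc32c is available.

    If not, raise an error to prevent silent performance degradation.

    Args:
        implementation: Name of the active CRC32C backend (assumed to be
            "c" or "python"); a parameter only in this sketch.

    Raises:
        FailedPrecondition: If the C extension is not available.
    """
    if implementation != "c":
        raise FailedPrecondition(
            "The google-crc32c package is not using its C extension; "
            f"active implementation is {implementation!r}."
        )
```

The function returns `None` on success and raises otherwise, which is why the docstring needs a raises section but no returns or rtype sections.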

tests/system/test_zonal.py (131-168)

There is significant code duplication between test_basic_wrd and test_basic_wrd_in_slices. The logic for client instantiation, data generation, download, verification, and cleanup is identical in both tests.

To improve maintainability, you could refactor this common code into a helper function. The helper could accept a callback to perform the specific writing logic for each test case.

Here's an example of how you could structure it:

async def _perform_wrd_test(storage_client, blobs_to_delete, attempt_direct_path, object_size, write_callable):
    # ... common setup code ...
    object_name = f"test_basic_wrd-{str(uuid.uuid4())}"
    object_data = os.urandom(object_size)
    object_checksum = google_crc32c.value(object_data)
    grpc_client = AsyncGrpcClient(attempt_direct_path=attempt_direct_path).grpc_client

    writer = AsyncAppendableObjectWriter(grpc_client, _ZONAL_BUCKET, object_name)
    await writer.open()

    # Call the specific write logic
    await write_callable(writer, object_data)

    object_metadata = await writer.close(finalize_on_close=True)
    assert object_metadata.size == object_size
    # ... common verification and cleanup ...

# In test_basic_wrd:
await _perform_wrd_test(..., write_callable=lambda writer, data: writer.append(data))

# In test_basic_wrd_in_slices:
async def sliced_write(writer, data):
    mark1, mark2 = _get_equal_dist(0, len(data))
    await writer.append(data[0:mark1])
    await writer.append(data[mark1:mark2])
    await writer.append(data[mark2:])

await _perform_wrd_test(..., write_callable=sliced_write)

This approach would reduce code duplication and make the tests easier to read and maintain.
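
The callback pattern suggested above can be tried in isolation with the sketch below. Everything in it is a stand-in: `FakeWriter` is an in-memory substitute for AsyncAppendableObjectWriter, `split_points` is a hypothetical version of `_get_equal_dist`, and `zlib.crc32` (plain CRC-32, not CRC32C) replaces google-crc32c so that only the standard library is needed.

```python
import asyncio
import os
import zlib
from typing import Awaitable, Callable, Tuple


class FakeWriter:
    """In-memory stand-in for the real writer (hypothetical)."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    async def open(self) -> None:
        self._buffer.clear()

    async def append(self, data: bytes) -> None:
        self._buffer.extend(data)

    async def close(self) -> bytes:
        return bytes(self._buffer)


def split_points(start: int, end: int) -> Tuple[int, int]:
    """Return two marks that cut [start, end) into three roughly equal parts."""
    step = (end - start) // 3
    return start + step, start + 2 * step


WriteCallable = Callable[[FakeWriter, bytes], Awaitable[None]]


async def run_wrd(object_size: int, write_callable: WriteCallable) -> bytes:
    """Shared setup and verification; only the write step varies."""
    data = os.urandom(object_size)
    writer = FakeWriter()
    await writer.open()
    await write_callable(writer, data)  # test-specific write logic
    stored = await writer.close()
    assert len(stored) == object_size
    assert zlib.crc32(stored) == zlib.crc32(data)
    return stored


async def whole_write(writer: FakeWriter, data: bytes) -> None:
    await writer.append(data)


async def sliced_write(writer: FakeWriter, data: bytes) -> None:
    mark1, mark2 = split_points(0, len(data))
    await writer.append(data[:mark1])
    await writer.append(data[mark1:mark2])
    await writer.append(data[mark2:])
```

Each test then reduces to a single call such as `asyncio.run(run_wrd(1024, sliced_write))`, with the setup and verification written once in `run_wrd`.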

…test_large_obj

chandra-siri · 2025-12-19T12:18:13Z

Hi @suni72, samples 3.14 are blocked because of b/470276398. This is not related to these changes, so you can ignore them.

3.14 support was recently added and hasn't stabilized yet.

tests/system/test_zonal.py

chandra-siri added 8 commits December 18, 2025 08:40

feat: compute chunk wise checksum for bidi_writes

e2a595a

move common code to utils

4f3d18e

add and fix failing unit tests

62a2c89

Merge branch 'main' into bidi_writes_checksum

dddf6c2

add license info in _utils file

e89e205

Merge branch 'bidi_writes_checksum' of github.com:googleapis/python-s…

f1abcaa

…torage into bidi_writes_checksum

use FailedPreCondition instead of NotFound

9353e58

chore: add test cases for large objects

2b1d6ce

product-auto-label bot added the size: m and api: storage labels Dec 18, 2025

chandra-siri changed the base branch from main to bidi_writes_checksum December 18, 2025 14:16

gemini-code-assist bot reviewed Dec 18, 2025

chandra-siri marked this pull request as ready for review December 18, 2025 14:27

chandra-siri requested review from a team as code owners December 18, 2025 14:27

chandra-siri assigned suni72 Dec 18, 2025

Base automatically changed from bidi_writes_checksum to main December 19, 2025 10:51

Merge branch 'main' of github.com:googleapis/python-storage into sys_…

fad8a88

…test_large_obj

chandra-siri added the kokoro:run and kokoro:force-run labels Dec 19, 2025

yoshi-kokoro removed the kokoro:run and kokoro:force-run labels Dec 19, 2025

remove unused imports

5bc78db

suni72 reviewed Dec 19, 2025

tests/system/test_zonal.py (outdated, resolved)

remove parametrization on "attempt_direct_path"

3f92cda

chandra-siri requested a review from suni72 December 19, 2025 13:36

suni72 approved these changes Dec 19, 2025

chandra-siri merged commit a0668ec into main Dec 19, 2025
16 of 17 checks passed

chandra-siri deleted the sys_test_large_obj branch December 19, 2025 14:52
