chore: Add Sys test large obj#1676
Conversation
…torage into bidi_writes_checksum
|
/gcbrun(2b1d6ce) |
There was a problem hiding this comment.
Code Review
This pull request introduces a check for the C-accelerated google-crc32c implementation to prevent silent performance degradation, refactoring this check into a shared utility function. It also adds CRC32C checksum validation for data chunks in AsyncAppendableObjectWriter. To support these changes, extensive system tests for large object uploads (both single and multi-part) have been added, along with corresponding unit tests. My feedback includes a minor correction to a docstring and a suggestion to reduce code duplication in the new system tests for better maintainability.
I am having trouble creating individual review comments. Click here to see my feedback.
google/cloud/storage/_experimental/asyncio/_utils.py (20-28)
The docstring for raise_if_no_fast_crc32c incorrectly states that the function returns a boolean value. This function either raises an exception or returns None. The docstring should be updated to remove the returns and rtype sections. I've also taken the liberty to format the raises section to follow common style guides.
"""Check if the C-accelerated version of google-crc32c is available.
If not, raise an error to prevent silent performance degradation.
Raises:
google.api_core.exceptions.FailedPrecondition: If the C extension is not available.
"""tests/system/test_zonal.py (131-168)
There is significant code duplication between test_basic_wrd and test_basic_wrd_in_slices. The logic for client instantiation, data generation, download, verification, and cleanup is identical in both tests.
To improve maintainability, you could refactor this common code into a helper function. The helper could accept a callback to perform the specific writing logic for each test case.
Here's an example of how you could structure it:
async def _perform_wrd_test(storage_client, blobs_to_delete, attempt_direct_path, object_size, write_callable):
# ... common setup code ...
object_name = f"test_basic_wrd-{str(uuid.uuid4())}"
object_data = os.urandom(object_size)
object_checksum = google_crc32c.value(object_data)
grpc_client = AsyncGrpcClient(attempt_direct_path=attempt_direct_path).grpc_client
writer = AsyncAppendableObjectWriter(grpc_client, _ZONAL_BUCKET, object_name)
await writer.open()
# Call the specific write logic
await write_callable(writer, object_data)
object_metadata = await writer.close(finalize_on_close=True)
assert object_metadata.size == object_size
# ... common verification and cleanup ...
# In test_basic_wrd:
await _perform_wrd_test(..., write_callable=lambda writer, data: writer.append(data))
# In test_basic_wrd_in_slices:
async def sliced_write(writer, data):
mark1, mark2 = _get_equal_dist(0, len(data))
await writer.append(data[0:mark1])
await writer.append(data[mark1:mark2])
await writer.append(data[mark2:])
await _perform_wrd_test(..., write_callable=sliced_write)This approach would reduce code duplication and make the tests easier to read and maintain.

chore: Add Sys test large obj