Skip to content

feat(storage): full object checksum: parse finalize_time and server crc32c in async object stream#17261

Open
chandra-siri wants to merge 2 commits into
mainfrom
feat/stream-checksum-metadata
Open

feat(storage): full object checksum: parse finalize_time and server crc32c in async object stream#17261
chandra-siri wants to merge 2 commits into
mainfrom
feat/stream-checksum-metadata

Conversation

@chandra-siri
Copy link
Copy Markdown
Contributor

PR 1 of 3: Extract finalize_time and full_obj_server_crc32c from the response metadata in _AsyncReadObjectStream.open().

@chandra-siri chandra-siri requested a review from a team as a code owner May 26, 2026 20:53
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces tracking for object finalization and full object CRC32C checksums in the asynchronous read stream. The feedback focuses on simplifying the production code by removing logic added solely to accommodate unit test mocks (such as checking for a .seconds attribute). Instead, it is recommended to mock finalize_time as a standard datetime.datetime or None in the unit tests, which allows the production code to rely on standard isinstance checks.

Comment on lines +140 to +153
# Since full object checksum validation is only required for finalized objects,
# check finalize_time (which is DatetimeWithNanoseconds/datetime in production, or mocked in tests).
finalize_time = getattr(response.metadata, "finalize_time", None)
if finalize_time:
is_finalized_val = False
if isinstance(finalize_time, datetime.datetime):
is_finalized_val = True
elif hasattr(finalize_time, "seconds") and finalize_time.seconds > 0:
is_finalized_val = True

if is_finalized_val:
self.is_finalized = True
if hasattr(response.metadata, "checksums") and response.metadata.checksums:
self.full_obj_server_crc32c = response.metadata.checksums.crc32c
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Avoid adding production code logic (such as checking for a .seconds attribute) solely to satisfy unit test mocks. In production, finalize_time is a datetime.datetime (or DatetimeWithNanoseconds) object. By updating the unit tests to mock finalize_time as a datetime.datetime instance (or None when unfinalized), we can simplify this logic significantly and keep the production code clean.

Suggested change
# Since full object checksum validation is only required for finalized objects,
# check finalize_time (which is DatetimeWithNanoseconds/datetime in production, or mocked in tests).
finalize_time = getattr(response.metadata, "finalize_time", None)
if finalize_time:
is_finalized_val = False
if isinstance(finalize_time, datetime.datetime):
is_finalized_val = True
elif hasattr(finalize_time, "seconds") and finalize_time.seconds > 0:
is_finalized_val = True
if is_finalized_val:
self.is_finalized = True
if hasattr(response.metadata, "checksums") and response.metadata.checksums:
self.full_obj_server_crc32c = response.metadata.checksums.crc32c
# Since full object checksum validation is only required for finalized objects,
# check finalize_time (which is DatetimeWithNanoseconds/datetime in production).
finalize_time = getattr(response.metadata, "finalize_time", None)
if isinstance(finalize_time, datetime.datetime):
self.is_finalized = True
checksums = getattr(response.metadata, "checksums", None)
if checksums:
self.full_obj_server_crc32c = getattr(checksums, "crc32c", None)

Comment on lines +44 to +45
recv_response.metadata.finalize_time.seconds = 12345
recv_response.metadata.checksums.crc32c = 98765
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Mock finalize_time as an actual datetime.datetime instance instead of a mock with a .seconds attribute. This allows the production code to use standard isinstance checks without mock-specific workarounds. (Note: Ensure datetime is imported in this test file).

Suggested change
recv_response.metadata.finalize_time.seconds = 12345
recv_response.metadata.checksums.crc32c = 98765
recv_response.metadata.finalize_time = datetime.datetime(2024, 1, 1, tzinfo=datetime.timezone.utc)
recv_response.metadata.checksums.crc32c = 98765

recv_response.metadata = mock.MagicMock()
recv_response.metadata.generation = _TEST_GENERATION_NUMBER
recv_response.metadata.size = _TEST_OBJECT_SIZE
recv_response.metadata.finalize_time.seconds = 0 # NOT finalized!
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Set finalize_time to None to represent an unfinalized object, matching production behavior where the field is absent or unset.

Suggested change
recv_response.metadata.finalize_time.seconds = 0 # NOT finalized!
recv_response.metadata.finalize_time = None # NOT finalized!

@chandra-siri chandra-siri changed the title feat(storage): parse finalize_time and server crc32c in async object stream feat(storage): full object checksum: parse finalize_time and server crc32c in async object stream May 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant