Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors timestamp parsing in Firebase Database functions to use a centralized utility, improving support for various ISO 8601 formats, including those without microsecond precision. It updates db_fn.py to use timestamp_conversion, refactors get_precision_timestamp in util.py for better robustness, and adds comprehensive test cases. Feedback includes correcting the docstring for get_precision_timestamp to reflect its actual return type and improving the error handling in util.py by providing a more descriptive error message and using the walrus operator for conciseness.
ajperel
left a comment
There was a problem hiding this comment.
I think the logic in this is solid, but as I got help from AI reviewing it suggested we could fix a long-standing but and simply the code at the same time. Let me know what you think?
| """Converts a nanosecond timestamp and returns a datetime object of the current time in UTC""" | ||
|
|
||
| # Separate the date and time part from the nanoseconds. | ||
| datetime_str, nanosecond_str = time.replace("Z", "").replace("z", "").split(".") |
There was a problem hiding this comment.
Not strictly in this PR but as I used AI to review this PR it called out a bug here:
It aggressively strips Z/z and then hardcodes tzinfo=_dt.timezone.utc. If Firebase ever delivers a NANOSECONDS timestamp containing a numeric offset (e.g., 2023-01-01T12:34:56.123456789+05:30), nanosecond_str becomes 123456789+05:30. The code slices [:6] correctly to get 123456 microseconds, but then it forces tzinfo=utc without shifting the hours, meaning the parsed datetime will be wrong by 5 hours and 30 minutes.
Note: Firebase commonly sends Z for nanosecond timestamps, which is why this might not be breaking in production right now, but it's a clear correctness gap compared to the microsecond path which handles offsets properly.
Do you know if we ever don't send 'Z'? Could be good to fix this too?
More generally the same AI agent suggested drastically simplifying the datetime code, which as a side effect would fix this bug by replacing the enum and all the conversion methods with:
def normalize_timestamp_string(time: str) -> str:
"""Truncates sub-second digits to a maximum of 6 (microseconds) to allow standard strptime parsing."""
if "." not in time:
return time
prefix, suffix = time.split(".", 1)
# suffix contains digits followed by timezone, e.g., "123456789Z" or "123456+05:30"
match = _re.match(r"(\d+)(.*)", suffix)
if not match:
raise ValueError(f"Invalid timestamp format: {time}")
digits, tz = match.groups()
if len(digits) > 6:
digits = digits[:6] # Truncate nanoseconds to microseconds
return f"{prefix}.{digits}{tz}"
def timestamp_conversion(time: str) -> _dt.datetime:
"""Converts an ISO 8601 timestamp and returns a timezone-aware datetime object."""
normalized_time = normalize_timestamp_string(time)
# Choose format based on whether it has a sub-second fraction
if "." in normalized_time:
return _dt.datetime.strptime(normalized_time, "%Y-%m-%dT%H:%M:%S.%f%z")
else:
return _dt.datetime.strptime(normalized_time, "%Y-%m-%dT%H:%M:%S%z")
I'm relatively new to this code but... I think it's right? Do you agree? If so, let's use this opportunity to both fix the bug and simplify things.
Resolves #257
Fixes RTDB event timestamp parsing by centralizing ISO 8601 handling and supporting timestamps without microsecond precision.
Adds coverage for the affected timestamp formats to make sure they keep working.