fix(server): keep schema ~create_time as Date after reload#3026
Conversation
Normalize server-side schema ~create_time userdata in SchemaElement so serializer reloads and fromMap paths keep the Date contract. Add SchemaElement, TextSerializer, and BinarySerializer coverage. Closes apache#3013
There was a problem hiding this comment.
Pull request overview
This PR fixes a schema reload regression where internal schema userdata Userdata.CREATE_TIME ("~create_time") can round-trip through backend JSON serializers as a String and come back as String after reload, violating the server-side contract that it should be a java.util.Date.
Changes:
- Added
SchemaElement.normalizeUserdataValue(key, value)to re-type~create_timestring values back intoDate(and throw a clearer error on invalid values). - Routed both
SchemaElement.userdata(String, Object)andSchemaElement.userdata(Userdata)through the normalization logic (and added a null-argument check for the bulk setter). - Added regression tests for normalization behavior and for Text/Binary serializer round-trips, and registered them in
UnitTestSuite.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/schema/SchemaElement.java | Normalizes ~create_time userdata on set (single and bulk), ensuring reload paths restore Date instead of String. |
| hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/core/SchemaElementTest.java | Unit coverage for normalization behavior, idempotency, invalid strings, and null bulk userdata validation. |
| hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/serializer/TextSerializerTest.java | Regression test verifying TextSerializer schema round-trip keeps CREATE_TIME as Date. |
| hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/serializer/BinarySerializerTest.java | Regression test verifying BinarySerializer schema round-trip keeps CREATE_TIME as Date. |
| hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/UnitTestSuite.java | Registers the new unit tests in the suite. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
imbajin
left a comment
There was a problem hiding this comment.
Reviewed correctness, design, and tests. No blocking issues — fix is correct and narrowly scoped, all backend reload paths (Text / Binary / Mysql / Cassandra / HBase / Postgres / Palo / Hstore + the four fromMap paths) route through the new helper, and DateUtil.parse is symmetric with the "yyyy-MM-dd HH:mm:ss.SSS" format used by JsonUtil/HugeGraphSONModule. Inline comments cover important refinements (helper placement, public setter docs, test depth). A few additional minor notes below that don't have a clean inline target:
- 🧹
removeUserdata(Userdata)(SchemaElement.java:114-118) lacks a null check — addingE.checkArgumentNotNull(userdata, ...)keeps the bulk setters symmetric with the new check onuserdata(Userdata). - 🧹 Behavior change worth a release note: any REST client that posts
~create_timeas a JSON string in user-data now silently receives aDate(or anIllegalArgumentExceptionif malformed).~create_timeis a server-reserved key (~prefix, written bySchemaTransaction.setCreateTimeIfNeeded), so risk is small, but flagging it in the PR body / changelog is helpful. - 🧹 Follow-up tracking:
Userdata.DEFAULT_VALUEhas the same Jackson type-erasure problem (e.g., a property key whose data type isDatewill reload its default value asString). The PR description already defers thehugegraph-structmirror — please addDEFAULT_VALUEto the same follow-up list, since the root cause and fix shape are identical.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #3026 +/- ##
============================================
- Coverage 36.15% 32.44% -3.72%
- Complexity 403 493 +90
============================================
Files 812 803 -9
Lines 68307 68053 -254
Branches 8927 8907 -20
============================================
- Hits 24694 22077 -2617
- Misses 40978 43498 +2520
+ Partials 2635 2478 -157 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
imbajin
left a comment
There was a problem hiding this comment.
No blocking issues. The core fix (normalizing CREATE_TIME String→Date on deserialization) is correct and the main round-trip paths are tested. Four non-blocking observations below.
| "was " + (value == null ? "null" : value.getClass()), | ||
| value instanceof Date); | ||
| Assert.assertEquals(created, value); | ||
| } |
There was a problem hiding this comment.
MysqlSerializer and CassandraSerializer are also affected by the same CREATE_TIME deserialization gap:
"
userdata(String, Object)covers serializer reload paths such as TextSerializer, BinarySerializer, MysqlSerializer, and CassandraSerializer"
Only TextSerializer and BinarySerializer have round-trip tests added. Adding equivalent testPropertyKeyUserdataCreateTimeRoundTripsAsDate tests for MysqlSerializer and CassandraSerializer would close this coverage gap for the stated fix surface.
There was a problem hiding this comment.
Holding off, with reasoning:
MysqlSerializer.readUserdata (L167) and CassandraSerializer.readUserdata
(L222) both call the same schema.userdata(key, value) setter as
TextSerializer/BinarySerializer. That setter now routes through the
single Userdata.put normalization path, which is already exercised by:
TextSerializerTest/BinarySerializerTestround-tripsSchemaElementTest(constructor, single setter, bulk,fromMap)
There's no existing unit-test harness for the SQL serializers — adding one
means backend/mock infra for zero new code paths. The marginal coverage
doesn't justify it.
Better closure for the shared logic: a direct UserdataTest.normalizeValue
unit test, serializer-agnostic. Happy to add that as a follow-up; proposing
not to gate this PR on it. Pushback welcome if you see a path difference
I'm missing.
…holder The builder accumulates userdata via Userdata.put() before eliminate() runs, so `.userdata(CREATE_TIME, "").eliminate()` parsed "" as a date and threw before the key-only removal path. Pass a blank ~create_time through unchanged; non-empty malformed values still throw on the add path, so the existing contract is unchanged.
|
@imbajin WDYT about this approach? |
imbajin
left a comment
There was a problem hiding this comment.
LGTM, thanks for addressing the eliminate placeholder case. The current approach looks good to me.
One small follow-up: the PR description still mentions SchemaElement.normalizeUserdataValue, but the final implementation is centered on Userdata.normalizeValue / Userdata.put(). It would be good to update the description so it matches the current code.
|
Changed |
|
Thanks @imbajin for lead the direction of this PR with your sharp and precise review |
close [Bug] schema ~create_time userdata round-trips Date as String after cache reload #3013
Userdata.CREATE_TIME("~create_time") is written asjava.util.Dateby
SchemaTransaction.setCreateTimeIfNeeded(), then persisted through backendserializers as JSON. On reload, userdata is deserialized through raw
Map.classpaths, so Jackson cannot restore the value type andschemaElement.userdata().get(Userdata.CREATE_TIME)can come back asString.That breaks callers and existing tests that expect
~create_timeto keep theserver-side
Datecontract after a schema is reloaded from backend storage.Main Changes
Added
Userdata.normalizeValue(key, value)as a static helper on theUserdataclass.Userdata.CREATE_TIMEstring values back withDateUtil.parse(...).Datevalues and all other userdata keys unchanged.IllegalArgumentExceptionif a~create_timestring is invalid.Routed both userdata setters through
Userdata.put(key, value), which appliesnormalization before storing:
userdata(String, Object)covers serializer reload paths such asTextSerializer,BinarySerializer,MysqlSerializer, andCassandraSerializer.userdata(Userdata)covers schemafromMap()hydration paths forPropertyKey,VertexLabel,EdgeLabel, andIndexLabel.Added a null check to
userdata(Userdata)for consistent argument validation.Added regression coverage for schema userdata normalization and serializer
round trips.
This PR is intentionally limited to the server-side
SchemaElement. The mirrorclass in
hugegraph-structhas a similar userdata shape and can be handled in aseparate follow-up PR.
Verifying these changes
SchemaElementTestcovers single-key normalization, bulk normalization,idempotency for
Date, invalid~create_timestrings, null bulk userdata,and non-
CREATE_TIMEkeys.TextSerializerTest#testPropertyKeyUserdataCreateTimeRoundTripsAsDateverifies the text serializer schema round trip keeps
CREATE_TIMEasDate.BinarySerializerTest#testPropertyKeyUserdataCreateTimeRoundTripsAsDateverifies the binary serializer schema round trip keeps
CREATE_TIMEasDate.UnitTestSuite.Does this PR potentially affect the following parts?
The public SchemaElement.userdata(...) signatures are unchanged. The behavior change is limited to internal userdata normalization for Userdata.CREATE_TIME, handled via
Userdata.normalizeValue/Userdata.put().Documentation Status
Notes
Userdata.CREATE_TIME(~create_time) only.~create_timeis a server-reserved userdata key; valid string values are normalized back tojava.util.Date, while malformed values now fail withIllegalArgumentException.Userdata.DEFAULT_VALUE(~default_value) reloads and checking the same raw userdata deserialization gap inhugegraph-struct.