Skip to content

Conversation

@pengpeng-lu
Copy link
Contributor

@pengpeng-lu pengpeng-lu commented Oct 2, 2025

Modify TypeRepository to allow registering 2 types (only different in nullability) in the same schema template.
Use Case:
If we have struct A, and array of A in the same template, A is nullable, while A in the array is non-nullable. This should be allowed.

@pengpeng-lu pengpeng-lu added the enhancement New feature or request label Oct 16, 2025
*/
@Nullable
private Type findCanonicalTypeForStructure(@Nonnull final Type type) {
for (Map.Entry<Type, String> entry : typeToNameMap.entrySet()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now introduces an Θ(n) step into creating the type repository, which would mean that creating a repository with n) types is now Θ(n2).

Did you consider an alternative design where we enforce that only the notNullable (or nullable--it doesn't matter except for certain edge cases like Type.Null) variants of each type are kept in the TypeRepository? From my reading of the type repository, we don't actually need to keep both the nullable and non-nullable variants of the type at all, as we only use it to associate a type to a Protobuf descriptor, and those are the same for the two different variants.

I think the basic shape of the solution would look something like:

  1. Modify the builder to store only the (non) nullable version of each type
  2. Wrap the access paths to the typeToName bi-map with logic to first transform the type to its (non) nullable variant


if (!targetMap.containsKey(type)) {
// Check if the opposite nullability variant exists and get its protobuf name
final Type oppositeNullabilityType = type.isNullable() ? type.notNullable() : type.withNullability(true);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be simplified to:

final Type oppositeNullabilityType = type.withNullability(!type.isNullable());

*/
@Nonnull
private final Map<Type, String> typeToNameMap;
private final BiMap<Type, String> nonNullableTypeToNameMap;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think splitting the maps has made the logic in registerTypeToTypeNameMapping and addTypeIfNeeded a little too complicated. My suggestion was actually to keep everything in one map, but to only put the nullable variants into it. Then I think the only thing that needs to change is that anything that accesses the map needs to call nullable() to create the nullable variant before it goes to look things up from the map.

Comment on lines 491 to 497
// Type.Null is special - it can only be nullable and has no non-nullable variant
if (type instanceof Type.Null) {
if (!nullableTypeToNameMap.containsKey(type)) {
type.defineProtoType(this);
}
return this;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe Relations and Any are also special in that they also have a fixed nullability. (Relations are always non-null, and Any is always nullable.)

It's a bit weird, as none of those types actually override defineProtoType, so they won't end up adding anything to the type repository anyway. But I suppose it doesn't hurt to make sure we cover for them. Potentially a method like canonicalizeNullability(Type) would be useful that returned (1) a non-nullable Relation and None, and (2) a nullable value for everything else. Then that would be the type we probed for.

Comment on lines 573 to 575
final Type existingTypeForName = targetMap.inverse().get(protoTypeName);
if (existingTypeForName != null && !existingTypeForName.equals(type)) {
throw new IllegalArgumentException(String.format(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is actually already handled by targetMap.put, though I'm not necessarily opposed to overriding the error message.

What I think we are missing here is coverage of the case:

  1. You insert a (let's say) nullable type t with a name x
  2. You insert a type u that is not nullable and !u.nullable().equals(t) but you also assign it the name x

I think you'd need to add a check that oppositeMap.inverse().get(protoTypeName) was either null or equal to oppositeNullabilityType. I also think that this would naturally get covered if there were just one BiMap with a canonicalized nullability

import com.google.common.collect.HashBiMap;
import com.google.common.collect.ImmutableMap;
import com.google.common.collect.Maps;
import com.google.common.collect.ImmutableBiMap;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commenting here some unit tests that should be added to TypeRepositoryTest:

  1. A simple unit test adding both a nullable and non-nullable variant of the same type with the same name
  2. A test like addSameTypeMultipleTimesShouldNotCreateMultipleMessageTypes, but it adds a nullable and non-nullable variant of the same type
  3. Ideally, a variant of the above with an enum
  4. Check that if we can't add two copies of the same type (even if they have different nullability) with different names. (I think this one is handled correctly by the code already.)
  5. Check what happens if we add two types with the same name and different nullability and different contents. This should be an error, but I think it's currently not handled correctly
  6. Creating a builder using one nullability of a type, and then calling getProtoTypeName or newMessageBuilder with the other nullability. I believe this currently throws an error, though I'm not sure it needs to
  7. Something like addPrimitiveTypeIsNotAllowed, but also including Any, Null, None, and Relation types. It would be good to also update addPrimitiveTypeIsNotAllowed while you're there, because the assertion is off. That doesn't throw anymore, so having the catch is confusing and muddies what the assertion actually is

Maybe these should be in some schema file, but it is probably going to be easier to get the coverage in unit testing

Copy link
Collaborator

@alecgrieser alecgrieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I had a few minor things that could be addressed if you feel like it, but I like the way this has shaped up!

if (type.getTypeCode() == Type.TypeCode.RELATION || type.getTypeCode() == Type.TypeCode.NONE) {
return type.notNullable();
} else {
return type.withNullability(true);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit:

Suggested change
return type.withNullability(true);
return type.nullable();

}

// Test special types that should not be allowed
List<Type> unsupportedTypes = List.of(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason that primitives and unsupported types are being separated out by this test case? Unless I'm missing something, it looks like the asserts are the same. Is it just that the primitive types put their type inside the loop?

It would be nice to have the asserts shared between the two test cases. One way to do this would be to have this be a parameterized test based on the Type, and then have a helper method generate the different primitive and "unsupported" types. But it would also be fine just to collect them in a list, and then have them all call the asserts. It may also make sense to add (1) both nullable and non-null primitives and (2) Type.UUID_NULL_INSTANCE and Type.UUID_NON_NULL_INSTANCE

List<Type> unsupportedTypes = List.of(
Type.any(), // ANY type
Type.nullType(), // NULL type
new Type.None(), // NONE type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor:

Suggested change
new Type.None(), // NONE type
Type.NONE, // NONE type

@github-actions
Copy link

📊 Metrics Diff Analysis Report

Summary

  • New queries: 5
  • Dropped queries: 0
  • Plan changed + metrics changed: 0
  • Plan unchanged + metrics changed: 0
ℹ️ About this analysis

This automated analysis compares query planner metrics between the base branch and this PR. It categorizes changes into:

  • New queries: Queries added in this PR
  • Dropped queries: Queries removed in this PR. These should be reviewed to ensure we are not losing coverage.
  • Plan changed + metrics changed: The query plan has changed along with planner metrics.
  • Metrics only changed: Same plan but different metrics

The last category in particular may indicate planner regressions that should be investigated.

New Queries

Count of new queries by file:

  • yaml-tests/src/test/resources/struct-type-nullability-variants.metrics.yaml: 5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants