Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
86 changes: 85 additions & 1 deletion cloudevent.go
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,14 @@
package cloudevent

import (
"encoding/base64"
"encoding/json"
"fmt"
"mime"
"strings"
"time"

"github.com/tidwall/sjson"
)
Comment on lines +9 to 13
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sjson is newly imported/used here, which conflicts with the PR goal of removing the tidwall/sjson dependency. If the intent is dependency removal, RawEvent.MarshalJSON needs to be rewritten without sjson (e.g., marshal header to map and set data/data_base64 before json.Marshal).

Copilot uses AI. Check for mistakes.

const (
Expand Down Expand Up @@ -98,7 +104,85 @@ type CloudEvent[A any] struct {
}

// RawEvent is a cloudevent with a json.RawMessage data field.
type RawEvent = CloudEvent[json.RawMessage]
// It supports both "data" and "data_base64" (CloudEvents JSON spec).
type RawEvent struct {
CloudEventHeader
Data json.RawMessage `json:"data,omitempty"`

// DataBase64 is the raw "data_base64" string when the event was received with
// data_base64 (CloudEvents spec). When set, MarshalJSON emits data_base64 for
// round-trip; otherwise wire form is chosen from DataContentType and Data.
DataBase64 string `json:"data_base64,omitempty"`
Comment on lines +110 to +115
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RawEvent struct has JSON struct tags on both Data and DataBase64 fields (lines 107 and 112), but it also implements custom MarshalJSON/UnmarshalJSON methods. The struct tags will be ignored during marshaling/unmarshaling because of the custom methods.

While this isn't technically a bug (custom marshalers take precedence), having the tags could be misleading to developers who might expect them to be used. Consider removing the tags or adding a comment indicating they're documentation-only since custom marshalers are used.

Copilot uses AI. Check for mistakes.
}
Comment on lines 106 to +116
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change from type RawEvent = CloudEvent[json.RawMessage] to a full struct is a breaking API change. Code that previously constructed RawEvent using the CloudEvent[json.RawMessage] type will need to be updated.

For example, code like var re RawEvent = CloudEvent[json.RawMessage]{...} or direct field access patterns that relied on RawEvent being a type alias will break. While there are no internal usages in this repository, external consumers of this library will be affected.

Consider:

  1. Documenting this breaking change in a CHANGELOG or migration guide
  2. Bumping the major version if using semantic versioning
  3. Providing clear upgrade instructions for library consumers

Copilot uses AI. Check for mistakes.

// BytesForSignature returns the bytes that were signed (wire form of data or data_base64).
// Use for signature verification; not the same as Data when the CE used data_base64.
func (r RawEvent) BytesForSignature() []byte {
if r.DataBase64 != "" {
return []byte(r.DataBase64)
Comment on lines +118 to +122
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BytesForSignature is new behavior but there are no tests asserting its output for both wire forms (data vs data_base64). Adding tests here would lock in the intended semantics for signature verification and prevent regressions (especially around base64 vs decoded bytes).

Copilot uses AI. Check for mistakes.
}
return r.Data
}

// UnmarshalJSON implements json.Unmarshaler so that both "data" and "data_base64"
// are supported; Data is always set to the resolved payload bytes.
func (r *RawEvent) UnmarshalJSON(data []byte) error {
var dataRaw json.RawMessage
var dataBase64 string
header, err := unmarshalCloudEventWithPayload(data, func(d json.RawMessage, b64 string) error {
dataRaw = d
dataBase64 = b64
return nil
})
if err != nil {
return err
}
r.CloudEventHeader = header
if dataRaw != nil && dataBase64 != "" {
return fmt.Errorf("cloudevent: both \"data\" and \"data_base64\" present; only one allowed")
}
if dataBase64 != "" {
decoded, err := base64.StdEncoding.DecodeString(dataBase64)
Comment on lines +141 to +145
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RawEvent treats presence of "data_base64" as dataBase64 != "", so a JSON payload with "data_base64":"" (present-but-empty) is treated as if the field were absent. That loses round-trip fidelity and affects signature bytes. Consider tracking presence separately (e.g., *string or a boolean) so empty base64 payloads are still recognized and preserved.

Copilot uses AI. Check for mistakes.
if err != nil {
return err
}
r.Data = decoded
r.DataBase64 = dataBase64
} else {
r.Data = dataRaw
r.DataBase64 = ""
}
Comment on lines 129 to 154
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RawEvent.UnmarshalJSON does not detect the invalid case where both "data" and "data_base64" are present. CloudEvents JSON requires that at most one of these fields is set; consider returning an error when both are provided to avoid ambiguous precedence and signature verification mismatches.

Copilot uses AI. Check for mistakes.
return nil
}

// IsJSONDataContentType returns true if the MIME type indicates a JSON payload.
// Matches "application/json" and any "+json" suffix type (e.g. "application/cloudevents+json").
func IsJSONDataContentType(ct string) bool {
parsed, _, err := mime.ParseMediaType(strings.TrimSpace(ct))
return err == nil && (parsed == "application/json" || strings.HasSuffix(parsed, "+json"))
Comment on lines +158 to +162
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RawEvent wire-form selection (IsJSONDataContentType + MarshalJSON choosing "data" vs "data_base64", including handling application/json; charset=utf-8 and application/*+json) is new behavior but isn’t covered by tests in cloudevent_test.go. Add marshal/unmarshal tests that assert the chosen field name and round-trip preservation.

Copilot uses AI. Check for mistakes.
}

// MarshalJSON implements json.Marshaler. Uses DataContentType to choose wire form:
// application/json -> "data"; otherwise -> "data_base64" (CloudEvents spec).
func (r RawEvent) MarshalJSON() ([]byte, error) {
data, err := json.Marshal(r.CloudEventHeader)
if err != nil {
return nil, err
}
if len(r.Data) > 0 || r.DataBase64 != "" {
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When both Data is empty (len(r.Data) == 0) and DataBase64 is empty (""), the entire if block is skipped and neither "data" nor "data_base64" is added to the output. This is correct per CloudEvents spec (both fields are optional).

However, there's a subtle case: if DataBase64 is set to "" explicitly (empty string) and len(r.Data) > 0, it will enter the block but the DataBase64 != "" check on line 163 will be false, proceeding to the else clause. This is correct behavior, but the condition on line 162 could be simplified to just check len(r.Data) > 0 since an empty DataBase64 string is equivalent to it not being set.

Copilot uses AI. Check for mistakes.
if r.DataBase64 != "" {
data, err = sjson.SetBytes(data, "data_base64", r.DataBase64)
} else if IsJSONDataContentType(r.DataContentType) || (r.DataContentType == "" && json.Valid(r.Data)) {
data, err = sjson.SetRawBytes(data, "data", r.Data)
} else {
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition on line 166 checks two scenarios: IsJSONDataContentType(r.DataContentType) || (r.DataContentType == "" && json.Valid(r.Data)). The second part (r.DataContentType == "" && json.Valid(r.Data)) means that when DataContentType is empty and the data happens to be valid JSON, it will emit "data" instead of "data_base64".

However, this creates an ambiguity: if someone has binary data that happens to be valid JSON (e.g., the bytes [1,2,3] which is both valid binary and valid JSON), the marshaling behavior changes based on whether DataContentType is set. This might surprise users and could break round-trip guarantees for edge cases.

Consider documenting this behavior clearly, or requiring explicit DataContentType to avoid ambiguity.

Suggested change
} else {
} else {
// NOTE: When DataContentType is empty and r.Data happens to be valid JSON,
// we emit a structured JSON "data" field instead of "data_base64". This means
// that for byte slices which are both valid JSON and valid binary data
// (for example: []byte("[1,2,3]")), the wire representation changes based on
// whether DataContentType is set:
// - DataContentType == "" -> "data"
// - DataContentType is non-JSON -> "data_base64"
//
// This is intentional for convenience, but can be surprising and may affect
// strict round-trip guarantees for edge cases. Callers that care about
// treating such payloads as opaque binary data should set a non-JSON
// DataContentType explicitly to force "data_base64".

Copilot uses AI. Check for mistakes.
data, err = sjson.SetBytes(data, "data_base64", base64.StdEncoding.EncodeToString(r.Data))
}
if err != nil {
return nil, err
}
Comment on lines 172 to 182
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the CloudEvents JSON format specification, "data" and "data_base64" are mutually exclusive attributes - only one should be present in a CloudEvent JSON representation. However, the MarshalJSON implementation here could theoretically allow both to be emitted if the headerToMap is modified externally or if there's a bug.

The current implementation chooses which field to emit based on conditions (lines 163-174), but it doesn't actively prevent both from being set in the output map. While this is unlikely given the current code flow, consider adding a safeguard or assertion to ensure mutual exclusivity is maintained.

Copilot uses AI. Check for mistakes.
}
return data, nil
}

// Equals returns true if the two CloudEventHeaders share the same IndexKey.
func (c *CloudEventHeader) Equals(other CloudEventHeader) bool {
Expand Down
106 changes: 86 additions & 20 deletions cloudevent_json.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ import (
"github.com/tidwall/sjson"
)

var definedCloudeEventHdrFields = getJSONFieldNames(reflect.TypeOf(CloudEventHeader{}))
var definedCloudeEventHdrFields = getJSONFieldNames(reflect.TypeFor[CloudEventHeader]())

type cloudEventHeader CloudEventHeader

Expand All @@ -19,9 +19,8 @@ func (c *CloudEvent[A]) UnmarshalJSON(data []byte) error {
return err
}

// MarshalJSON implements custom JSON marshaling for CloudEventHeader.
// MarshalJSON implements custom JSON marshaling for CloudEvent[A].
func (c CloudEvent[A]) MarshalJSON() ([]byte, error) {
// Marshal the base struct
data, err := json.Marshal(c.CloudEventHeader)
if err != nil {
return nil, err
Expand All @@ -42,21 +41,18 @@ func (c *CloudEventHeader) UnmarshalJSON(data []byte) error {

// MarshalJSON implements custom JSON marshaling for CloudEventHeader.
func (c CloudEventHeader) MarshalJSON() ([]byte, error) {
// Marshal the base struct
aux := (cloudEventHeader)(c)
aux.SpecVersion = SpecVersion
data, err := json.Marshal(aux)
if err != nil {
return nil, err
}
// Add all extras using sjson]
for k, v := range c.Extras {
data, err = sjson.SetBytes(data, k, v)
if err != nil {
return nil, err
}
}

return data, nil
}

Expand Down Expand Up @@ -90,30 +86,100 @@ func getJSONFieldNames(t reflect.Type) map[string]struct{} {

// unmarshalCloudEvent unmarshals the CloudEventHeader and data field.
func unmarshalCloudEvent(data []byte, dataFunc func(json.RawMessage) error) (CloudEventHeader, error) {
return unmarshalCloudEventWithPayload(data, func(dataRaw json.RawMessage, _ string) error {
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In unmarshalCloudEvent (used by CloudEvent[A]), payloadFunc is invoked even when the input only contains "data_base64". That results in dataFunc being called with a nil json.RawMessage, which makes CloudEvent[A].setDataField attempt json.Unmarshal(nil, &c.Data) and fail. Consider changing the wrapper to only call dataFunc when dataRaw is non-nil (and ignore data_base64 for typed CloudEvent), or explicitly decode data_base64 into a "data" RawMessage when you want CloudEvent[A] to support data_base64.

Suggested change
return unmarshalCloudEventWithPayload(data, func(dataRaw json.RawMessage, _ string) error {
return unmarshalCloudEventWithPayload(data, func(dataRaw json.RawMessage, _ string) error {
if dataRaw == nil {
// No "data" field present (possibly only "data_base64"); for typed CloudEvent[A],
// ignore this and do not call dataFunc to avoid unmarshalling nil.
return nil
}

Copilot uses AI. Check for mistakes.
return dataFunc(dataRaw)
})
}

// unmarshalCloudEventWithPayload unmarshals the CloudEventHeader and returns both
// "data" and "data_base64" for RawEvent.
// Single-pass: decode only into map[string]json.RawMessage, then fill header from raw fields.
func unmarshalCloudEventWithPayload(data []byte, payloadFunc func(dataRaw json.RawMessage, dataBase64 string) error) (CloudEventHeader, error) {
c := CloudEventHeader{}
aux := cloudEventHeader{}
// Unmarshal known fields directly into the struct
if err := json.Unmarshal(data, &aux); err != nil {
return c, err
}
aux.SpecVersion = SpecVersion
c = (CloudEventHeader)(aux)
// Create a map to hold all JSON fields
rawFields := make(map[string]json.RawMessage)
if err := json.Unmarshal(data, &rawFields); err != nil {
return c, err
}

// Separate known and unknown fields
// Populate known header fields from raw values (one small unmarshal per field).
if raw, ok := rawFields["id"]; ok && len(raw) > 0 {
if err := json.Unmarshal(raw, &c.ID); err != nil {
return c, err
}
}
if raw, ok := rawFields["source"]; ok && len(raw) > 0 {
if err := json.Unmarshal(raw, &c.Source); err != nil {
return c, err
}
}
if raw, ok := rawFields["producer"]; ok && len(raw) > 0 {
if err := json.Unmarshal(raw, &c.Producer); err != nil {
return c, err
}
}
c.SpecVersion = SpecVersion
if raw, ok := rawFields["subject"]; ok && len(raw) > 0 {
if err := json.Unmarshal(raw, &c.Subject); err != nil {
return c, err
}
}
if raw, ok := rawFields["time"]; ok && len(raw) > 0 {
if err := json.Unmarshal(raw, &c.Time); err != nil {
return c, err
}
}
if raw, ok := rawFields["type"]; ok && len(raw) > 0 {
if err := json.Unmarshal(raw, &c.Type); err != nil {
return c, err
}
}
if raw, ok := rawFields["datacontenttype"]; ok && len(raw) > 0 {
if err := json.Unmarshal(raw, &c.DataContentType); err != nil {
return c, err
}
}
if raw, ok := rawFields["dataschema"]; ok && len(raw) > 0 {
if err := json.Unmarshal(raw, &c.DataSchema); err != nil {
return c, err
}
}
if raw, ok := rawFields["dataversion"]; ok && len(raw) > 0 {
if err := json.Unmarshal(raw, &c.DataVersion); err != nil {
return c, err
}
}
if raw, ok := rawFields["signature"]; ok && len(raw) > 0 {
if err := json.Unmarshal(raw, &c.Signature); err != nil {
return c, err
}
}
if raw, ok := rawFields["tags"]; ok && len(raw) > 0 {
if err := json.Unmarshal(raw, &c.Tags); err != nil {
return c, err
}
}

var dataRaw json.RawMessage
var dataBase64 string
if raw, ok := rawFields["data_base64"]; ok && len(raw) > 0 {
if err := json.Unmarshal(raw, &dataBase64); err != nil {
return c, err
}
}
if raw, ok := rawFields["data"]; ok {
dataRaw = raw
}
if dataRaw != nil || dataBase64 != "" {
if err := payloadFunc(dataRaw, dataBase64); err != nil {
return c, err
}
}

for key, rawValue := range rawFields {
if _, ok := definedCloudeEventHdrFields[key]; ok {
// Skip defined fields
continue
}
if key == "data" {
if err := dataFunc(rawValue); err != nil {
return c, err
}
if key == "data" || key == "data_base64" {
continue
}
if c.Extras == nil {
Expand Down
35 changes: 35 additions & 0 deletions cloudevent_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -345,3 +345,38 @@ func TestCloudEventHeader_UnmarshalJSON(t *testing.T) {
})
}
}

func TestRawEvent_UnmarshalJSON_InvalidBase64(t *testing.T) {
t.Parallel()
jsonStr := `{"id":"1","source":"s","type":"t","data_base64":"$$not-base64$$"}`
var ev cloudevent.RawEvent
err := json.Unmarshal([]byte(jsonStr), &ev)
require.Error(t, err, "expected error for invalid base64 in data_base64")
}

func TestRawEvent_UnmarshalJSON_BothDataAndDataBase64(t *testing.T) {
t.Parallel()
jsonStr := `{"id":"1","source":"s","type":"t","data":{"x":1},"data_base64":"Zm9v"}`
var ev cloudevent.RawEvent
err := json.Unmarshal([]byte(jsonStr), &ev)
require.Error(t, err, "expected error when both data and data_base64 are present")
assert.Contains(t, err.Error(), "both")
}

func TestCloudEvent_UnmarshalJSON_InvalidTime(t *testing.T) {
t.Parallel()
jsonStr := `{"id":"1","source":"s","type":"t","time":12345,"data":{"message":"hi","count":1}}`
var ev cloudevent.CloudEvent[TestData]
err := json.Unmarshal([]byte(jsonStr), &ev)
require.Error(t, err, "expected error for invalid time field type")
}

func TestCloudEvent_UnmarshalJSON_NoDataField(t *testing.T) {
t.Parallel()
jsonStr := `{"id":"1","source":"s","type":"t","subject":"sub","time":"2025-01-01T00:00:00Z"}`
var ev cloudevent.CloudEvent[TestData]
err := json.Unmarshal([]byte(jsonStr), &ev)
require.NoError(t, err, "CloudEvent without data field should succeed")
assert.Equal(t, "1", ev.ID)
assert.Equal(t, TestData{}, ev.Data)
}