While testing out the DMS transformation, I made a few observations I want to share. Each issue still needs to be specified further with proper steps to reproduce, this is more of a brain dump meta collection. All observations were made on a PostgreSQL RDS source.
DELETE without primary key
Issuing a DELETE FROM my_table (aka truncate) on the source fails to replicate with Unable to invoke DML operation without primary key information. Currently, a WHERE clause with the primary key is required:
https://github.com/daq-tools/commons-codec/blob/e9f17e90bd4ad044cb6e4d942868dd868af26e1b/src/commons_codec/transform/aws_dms.py#L88
ALTER TABLE ADD COLUMN
I tried an ALTER TABLE my_table ADD COLUMN last_updated TIMESTAMP DEFAULT, which was ignored as being an unsupported operation. Subsequent UPDATE my_table SET last_updated = NOW() WHERE id = 1; therefore also fail.
What can be a strategy to handle such situations? Do we need to manually migrate the CrateDB table structure?
Similarly, if my PostgreSQL table looks like this:
CREATE TABLE public.locations (
device_id TEXT NOT NULL PRIMARY KEY,
location TEXT,
attributes JSONB
);
I also cannot have the addition of a new JSON property replicated (again no primary key): UPDATE locations SET attributes['last_updated'] = TO_JSONB(NOW());
Data types within OBJECT
If there is a JSON in PostgreSQL with a timestamp, such as {"last_updated": "2024-08-09T11:54:36.815606Z"}, the OBJECT in CrateDB recognizes last_updated as TEXT. Since there is no type information in the JSON, there is probably nothing the CDC translator can do. For real-world use cases, we need to think of some strategy (run ALTER TABLE ADD COLUMN in CrateDB upfront?).
While testing out the DMS transformation, I made a few observations I want to share. Each issue still needs to be specified further with proper steps to reproduce, this is more of a brain dump meta collection. All observations were made on a PostgreSQL RDS source.
DELETE without primary key
Issuing a
DELETE FROM my_table(aka truncate) on the source fails to replicate withUnable to invoke DML operation without primary key information. Currently, aWHEREclause with the primary key is required:https://github.com/daq-tools/commons-codec/blob/e9f17e90bd4ad044cb6e4d942868dd868af26e1b/src/commons_codec/transform/aws_dms.py#L88
ALTER TABLE ADD COLUMN
I tried an
ALTER TABLE my_table ADD COLUMN last_updated TIMESTAMP DEFAULT, which was ignored as being an unsupported operation. SubsequentUPDATE my_table SET last_updated = NOW() WHERE id = 1;therefore also fail.What can be a strategy to handle such situations? Do we need to manually migrate the CrateDB table structure?
Similarly, if my PostgreSQL table looks like this:
I also cannot have the addition of a new JSON property replicated (again no primary key):
UPDATE locations SET attributes['last_updated'] = TO_JSONB(NOW());Data types within OBJECT
If there is a
JSONin PostgreSQL with a timestamp, such as{"last_updated": "2024-08-09T11:54:36.815606Z"}, theOBJECTin CrateDB recognizeslast_updatedasTEXT. Since there is no type information in the JSON, there is probably nothing the CDC translator can do. For real-world use cases, we need to think of some strategy (runALTER TABLE ADD COLUMNin CrateDB upfront?).