# Conflicts: # .github/workflows/build.yaml # pom.xml
Copied from the readme
Overview
The purpose of this service is to filter unchanged specimen and media events from the source system.
While changed data should progress through the ingestion pipeline, unchanged events only need their
last_checked value to be updated.
This service sits between the translator and the name usage service (or processing service) in the
ingestion pipeline.
The translator's purpose is to translate incoming data from the source system to openDS. It includes
the original data in the specimen and media events, as well as the translated openDS version.
This service first checks if an incoming specimen/media exists or not, using the object's unique
local identifier. For specimens, it is the normalised physical specimen id. For media, it is the
access URI.
If the relevant object exists in the database, we need to check whether the event is an update from
the source system or whether the data has not changed. This service assesses whether data has changed
by comparing the incoming original data with the current value in the original_data column in the
database. This column is updated every time the source system data changes; it is the raw data from
the source system as captured by the translator. If the incoming original_data differs from what is
in the database, we consider this an update.
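The check described above can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: the class name OriginalDataCheck, the Outcome enum, and the use of an in-memory Map as a stand-in for the database lookup are all assumptions made for the example. The original data is modelled as a plain Map for simplicity.

```java
import java.util.Map;

// Hypothetical sketch: names and types are illustrative, not taken from the service.
final class OriginalDataCheck {

    /** Possible results for one incoming specimen or media object. */
    enum Outcome { NEW, CHANGED, UNCHANGED }

    /**
     * Looks up the stored original data by the object's local identifier
     * (normalised physical specimen id for specimens, access URI for media)
     * and compares it with the incoming original data.
     *
     * @param localId              unique local identifier of the incoming object
     * @param incomingOriginalData raw source-system data carried by the event
     * @param storedByLocalId      stand-in for the original_data column, keyed by local id
     */
    static Outcome assess(String localId,
                          Map<String, Object> incomingOriginalData,
                          Map<String, Map<String, Object>> storedByLocalId) {
        Map<String, Object> stored = storedByLocalId.get(localId);
        if (stored == null) {
            // No existing object with this identifier.
            return Outcome.NEW;
        }
        // Any difference in the raw source data counts as an update.
        return stored.equals(incomingOriginalData) ? Outcome.UNCHANGED : Outcome.CHANGED;
    }
}
```

Under these assumptions, an UNCHANGED result corresponds to the case where only last_checked needs updating, while NEW and CHANGED objects continue through the pipeline.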
RabbitMQ Queues
Consumes from:
source-system-data-checker-queue (from translator)
Publishes to:
name-usage-service-queue (to name usage service)
Distinguishing between changes in Specimens and Media
A specimen event may include zero or more media objects. An unchanged specimen does not necessarily
mean its media are unchanged, and vice versa. To address this, we may publish objects to one of
three queues:
stops here.
ingestion pipeline,
name-usage-service to the
media-queue, bypassing the name usage service
the
name-usage-service-specimen queue.