feat(ingest): Add host field for INSDC sequences during ingest #6534

Draft
maverbiest wants to merge 7 commits into main from host-field-ingest

@maverbiest
Contributor

@maverbiest maverbiest commented May 29, 2026

resolves #6533 #6295

Screenshot

PR Checklist

  • All necessary documentation has been adapted.
  • The implemented feature is covered by appropriate, automated tests.
  • Any manual testing that has been done is documented (i.e. what exactly was tested?)

🚀 Preview: Add preview label to enable

@claude claude Bot added the ingest Ingest pipeline label May 29, 2026
@anna-parker
Contributor

anna-parker commented May 29, 2026

@maverbiest I would do this in three steps:

  1. have ingest submit new sequences with host instead of hostTaxonId and hostScientificName (adjust the hash code so that the hash doesn't change)
  2. run a DB surgery with an SQL command to modify the data in the code (this shouldn't be in DB migrations IMO, as this is code for data that not every Loculus user will have)
  3. remove the prepro code that handles the INSDC special case.

@maverbiest
Contributor Author

@anna-parker okay cool, thanks! I will get rid of the DB migration then and we can run it as a surgery. Regarding your point 1., could you clarify what you mean by 'modify the hash'? I thought the aim was to ensure the hash doesn't change so we don't trigger revisions? What I'm currently doing on this branch is collapsing the taxon id and scientific name fields into a single host field after the hash is computed, but it could be that I misunderstood

@anna-parker
Contributor

'modify the hash'? I thought the aim was to ensure the hash doesn't change so we don't trigger revisions?

yes I was unclear, I meant to modify the code for the hash calculation so that the hash doesn't change
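
For illustration, the approach discussed above (compute the hash over the legacy fields, then collapse them into a single host field) could be sketched roughly as follows. This is only a sketch and not the code on this branch: the function names, the choice of SHA-256 over canonical JSON, and the rule for which value ends up in host are all assumptions made for the example. Only the field names host, hostTaxonId and hostScientificName come from the thread.

```python
import hashlib
import json

# Field names taken from the discussion above.
LEGACY_HOST_FIELDS = ("hostTaxonId", "hostScientificName")


def metadata_hash(record: dict) -> str:
    """Hash a metadata record in a key-order-independent way.

    Assumption: the real ingest hash may use a different algorithm and
    field selection; SHA-256 over canonical JSON is used here for illustration.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def collapse_host(record: dict) -> dict:
    """Return a copy where the legacy host fields are replaced by `host`.

    Assumption (hypothetical rule): prefer the scientific name and fall
    back to the taxon id, or an empty string if neither is set.
    """
    collapsed = {k: v for k, v in record.items() if k not in LEGACY_HOST_FIELDS}
    name = record.get("hostScientificName")
    taxon_id = record.get("hostTaxonId")
    collapsed["host"] = str(name or taxon_id or "")
    return collapsed


def prepare_for_submission(record: dict) -> dict:
    """Hash first (over the legacy fields), then collapse into `host`."""
    digest = metadata_hash(record)  # computed before any field is renamed
    submitted = collapse_host(record)
    submitted["hash"] = digest  # hypothetical key for carrying the hash along
    return submitted
```

Because the digest is taken from the record as it looked before the change, a sequence whose metadata is otherwise unchanged keeps its previous hash and should not be flagged as a revision, assuming the hash comparison is what triggers revisions.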

Labels

ingest Ingest pipeline

Development

Successfully merging this pull request may close these issues.

Inconstent host processing for INSDC and direct submissions

3 participants