theo explorations #6366

Draft
theosanderson-agent wants to merge 11 commits into main from theo-query-service-exploration

theosanderson-agent commented May 7, 2026

What

Inserts a new query-service between the website / CLI / external callers
and the per-organism LAPIS deployments, and migrates everyone onto it.

Implementation

  • query-service/ — Python (FastAPI + httpx) reverse proxy, single
    deployment, one image. Routes /v1/<verb>?organism=<x> to the right
    loculus-lapis-service-<x>. Owns its own verb names (/v1/aggregated,
    /v1/details, /v1/mutations, /v1/sequences, …) and reserved control
    params (organism, format, download, fields, limit, offset,
    include). Accepts both JSON and form-encoded POST bodies.
  • Website — Zodios endpoints, LapisClient, DownloadUrlGenerator,
    lapisClientHooks, all /loculus-info callers migrated. Adds an
    "Include older versions and revocations" checkbox at the top of the
    search filter panel, writing ?include=all to the URL.
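
The routing rule above can be sketched as a small pure function. This is only an illustration under stated assumptions, not the code in query-service/: the function name `upstream_service` is hypothetical, and the only facts taken from the PR are the `loculus-lapis-service-<x>` naming pattern and the requirement that `organism` be present exactly once.

```python
from urllib.parse import parse_qs

# Naming pattern taken from the PR description.
SERVICE_PREFIX = "loculus-lapis-service-"


def upstream_service(query_string: str) -> str:
    """Return the per-organism LAPIS service name for a /v1/<verb> request.

    Hypothetical helper: the real proxy may resolve the upstream differently.
    """
    organisms = parse_qs(query_string).get("organism", [])
    if len(organisms) != 1:
        # ?organism= is required and single-valued.
        raise ValueError("exactly one organism= parameter is required")
    return SERVICE_PREFIX + organisms[0]


print(upstream_service("organism=ebola&limit=10"))
# loculus-lapis-service-ebola
```

A missing or repeated `organism` raises here; how the real service reports that case is not stated in this description.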

Behaviour changes

  • Defaults are applied centrally. Every search request gets
    versionStatus=LATEST_VERSION and isRevocation=false server-side
    unless the caller opts out. The website's hidden-default fields for
    these are gone; the CLI no longer silently shows revoked / older
    versions.
  • One opt-out for "show everything": ?include=all on the API,
    matched in the UI by the new toggle. ?include=revoked /
    ?include=older-versions are also available. An explicit
    versionStatus= / version= / accessionVersion= filter always wins.
  • Submitters' "released sequences" link now ships with
    ?include=all, so a revocation or revision you have just made is
    visible without flipping anything.
  • URL shape: lapis.<host>/<organism>/sample/aggregated is now
    <host>/v1/aggregated?organism=<organism> (verbs renamed, organism
    moved from path to query param). Filter columns and response shapes
    are unchanged.
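
For callers with hard-coded LAPIS URLs, the URL-shape change can be sketched as a rewrite function. This is a rough illustration, not a supported migration tool: `migrate_url` is a hypothetical name, the `lapis.` host prefix is stripped as the bullet above describes, and only `aggregated` and `details` are handled because the full old-to-new verb rename table is not listed here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Verbs known from this PR to keep their name; renamed verbs are not covered.
UNCHANGED_VERBS = {"aggregated", "details"}


def migrate_url(old_url: str) -> str:
    """Rewrite lapis.<host>/<organism>/sample/<verb> to <host>/v1/<verb>?organism=..."""
    parts = urlsplit(old_url)
    segments = parts.path.strip("/").split("/")
    if len(segments) != 3 or segments[1] != "sample":
        raise ValueError(f"not an old-style LAPIS path: {parts.path}")
    organism, _, verb = segments
    if verb not in UNCHANGED_VERBS:
        raise ValueError(f"verb {verb!r} may have been renamed; map it by hand")
    host = parts.netloc.removeprefix("lapis.")
    # organism moves from the path into the query string; filters are kept as-is.
    query = [("organism", organism)] + parse_qsl(parts.query, keep_blank_values=True)
    return urlunsplit((parts.scheme, host, f"/v1/{verb}", urlencode(query), ""))


print(migrate_url("https://lapis.example.org/ebola/sample/aggregated?country=Guinea"))
# https://example.org/v1/aggregated?organism=ebola&country=Guinea
```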

User-facing writeup: https://gist.github.com/theosanderson-agent/b1e5a7375b865949cc6e3e0be9d1754d

…f LAPIS

Adds a new single-deployment service (query-service/) that proxies
/{organism}/{path} to the corresponding loculus-lapis-service-{organism}.
For now it is a transparent passthrough; the point is to give us a
single hop where we can rewrite LAPIS responses in future iterations
without changing the website or LAPIS.

Wires the website (server-side lapisUrls), the lapis ingress, and the
public lapisUrls (used by the browser and CLI) through the new
service, so all external and internal LAPIS calls now go through it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@claude claude Bot added the deployment Code changes targetting the deployment infrastructure label May 7, 2026
@theosanderson theosanderson added the preview Triggers a deployment to argocd label May 7, 2026

claude Bot commented May 7, 2026

This PR may be related to: #6265

The query-service introduced here acts as a central reverse proxy routing /{organism}/{lapis_path} requests to per-organism LAPIS instances. This infrastructure could serve as the foundation for solving #6265, which requires calling every LAPIS instance to locate which organism an accession belongs to — a unified routing layer would be the natural place to implement cross-organism accession lookup.

theosanderson and others added 8 commits May 7, 2026 18:23
The query-service is now the only LAPIS-facing surface. The old
`/{organism}/{path}` passthrough is removed.

API shape (see query-service/README.md):
  - Owned verbs under /v1/: aggregated, details, mutations, aaMutations,
    insertions, aaInsertions, alignedSequences[/segment],
    unalignedSequences[/segment], aaSequences/<protein>, info,
    lineageDefinition.
  - `?organism=` is required and single-valued.
  - Reserved control keywords: organism, format (-> dataFormat),
    download (-> downloadAsFile), fields, limit, offset, include,
    reference. Anything else is a metadata-column filter.
  - Implicit defaults applied centrally: versionStatus=LATEST_VERSION,
    isRevocation=false. Override with `include=revoked|older-versions|all`.
    Explicit version filters drop the defaults.
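
A minimal sketch of how the control keywords and implicit defaults listed above might combine. The function name and several details are assumptions rather than the shipped logic: that `organism` and `include` are consumed by the proxy and not forwarded, that `include=revoked` lifts only the `isRevocation` default and `include=older-versions` only the `versionStatus` default, and that an explicit version filter drops both defaults.

```python
# Renames taken from the list above: format -> dataFormat, download -> downloadAsFile.
RENAMED = {"format": "dataFormat", "download": "downloadAsFile"}
DEFAULTS = {"versionStatus": "LATEST_VERSION", "isRevocation": "false"}
EXPLICIT_VERSION_FILTERS = {"versionStatus", "version", "accessionVersion"}
# Assumed meaning of each include= value: which defaults it switches off.
DROPPED_BY_INCLUDE = {
    None: set(),
    "revoked": {"isRevocation"},
    "older-versions": {"versionStatus"},
    "all": {"versionStatus", "isRevocation"},
}


def build_lapis_params(params: dict[str, str]) -> dict[str, str]:
    """Translate query-service parameters into the parameters sent to LAPIS."""
    include = params.get("include")
    if include not in DROPPED_BY_INCLUDE:
        raise ValueError(f"unknown include value: {include!r}")
    out = {}
    for key, value in params.items():
        if key in ("organism", "include"):
            continue  # assumed to be consumed by the proxy itself
        out[RENAMED.get(key, key)] = value
    # Explicit version filters win: no defaults are added at all.
    if not EXPLICIT_VERSION_FILTERS.intersection(params):
        for key, value in DEFAULTS.items():
            if key not in DROPPED_BY_INCLUDE[include] and key not in out:
                out[key] = value
    return out


print(build_lapis_params({"organism": "ebola", "format": "csv", "include": "revoked"}))
# {'dataFormat': 'csv', 'versionStatus': 'LATEST_VERSION'}
```

Keeping the include-to-default mapping in one table makes the opt-out behaviour easy to audit, which is the main reason for centralising the defaults in the first place.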

Helm:
  - Removed `lapisUrls` (per-organism map) from the website runtime config;
    replaced with a single `queryServiceUrl` plus an `organisms` list.
  - lapis-ingress is now a single rule that routes everything on the
    lapis hostname to query-service.

Website:
  - Renamed Zodios endpoints to /v1/<verb>; segment / proteinName are
    path components (Zodios needs unique paths per endpoint).
  - LapisClient now takes (queryServiceUrl, organism, schema) and adds
    `?organism=` to every call. Internal name unchanged to limit churn.
  - DownloadUrlGenerator builds /v1/... URLs and adds ?organism=.
  - Dropped manual versionStatus / isRevocation defaults from server-side
    callers (GroupPage, getSeqSetStatistics, getOrganismStatistics) — the
    query-service applies them.
  - /loculus-info now exposes hosts.queryService instead of hosts.lapis.

CLI:
  - Migrated to /v1/ paths with ?organism=.
  - get_lapis_url() retained as a thin wrapper that returns the
    query-service base URL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… POSTs

Adds the missing pieces to keep the prior search-page UX (where users
can clear the version-status / revocation hidden-field defaults to see
all versions) working against the new query-service defaults.

- lapisApi.ts: every /v1/ endpoint now accepts an optional `include`
  query parameter.
- serviceHooks.lapisClientHooks: takes an `options.include` and threads
  it onto every request as a query string. The search UI passes
  `'all'`; autocomplete and detail views leave it unset so query-service
  defaults still apply there.
- SearchFullUI / serversideSearch / DownloadUrlGenerator: pass
  `include=all`, since the search page manages its own version
  defaults via hiddenFieldValues.
- LapisClient.getAllSequenceEntryHistoryForAccession: passes
  `include=all` (version history needs every version + revocation).
- query-service:
  - reads `include=` from the query string for POSTs too (was only
    looking in the body), so the website's `?include=all` is honoured.
  - parses `application/x-www-form-urlencoded` bodies for the long-query
    download path that submits an HTML form. Uses `getlist` so repeated
    keys (`fields=a&fields=b&fields=c`) are preserved as a list, not
    collapsed to the first value.
- Adds python-multipart to requirements (Starlette's form parser
  dependency).
- Updates DownloadDialog.spec.tsx assertions for the new
  `?organism=ebola&include=all&...` URL prefix.
- Fixes one stale `/sample/details` reference in
  download.dependent.spec.ts.
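
The repeated-key point above is easy to reproduce with the standard library. This sketch uses `urllib.parse.parse_qs` as a stand-in for Starlette's form parser and its `getlist`, so it illustrates the behaviour rather than the query-service code; both function names are hypothetical.

```python
from urllib.parse import parse_qs


def parse_form_first_only(body: str) -> dict[str, str]:
    """Lossy variant: keeps only the first value of each repeated key."""
    return {key: values[0] for key, values in parse_qs(body).items()}


def parse_form_preserving_lists(body: str) -> dict[str, object]:
    """Keeps repeated keys as lists and unwraps keys that occur once."""
    return {
        key: values if len(values) > 1 else values[0]
        for key, values in parse_qs(body).items()
    }


body = "fields=a&fields=b&fields=c&organism=ebola"
print(parse_form_first_only(body))
# {'fields': 'a', 'organism': 'ebola'}
print(parse_form_preserving_lists(body))
# {'fields': ['a', 'b', 'c'], 'organism': 'ebola'}
```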

Integration suite: 98/98 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add .github/workflows/query-service-image.yml so the integration
  tests can pull commit-tagged images of the query-service. Mirrors
  preprocessing-dummy-image.yml: hash-based caching, multi-tag
  pushes, ARM build on main.
- Run prettier --write across files touched by the migration.
- Restore the versionStatus / isRevocation regex segments to the
  DownloadDialog spec assertions that exercise hiddenFieldValues —
  those filters _are_ sent in that flow (the test passes them
  explicitly), and the implicit defaults stay opt-out via include=all.
- Update vitest.setup.ts MSW mocks to point at the new sequence
  paths: /v1/alignedSequences[/segment], /v1/unalignedSequences[/segment].

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e-all toggle

Following up on the previous commit so the website stops shipping the
versionStatus / isRevocation hidden defaults that the query-service is
already applying:

- Remove the `hiddenFieldValues` containing `versionStatus=LATEST_VERSION`
  and `isRevocation=false` from the search and submission/released pages.
  The query-service applies those defaults centrally — no need to also
  send them on the wire from the website.
- Drop the blanket `?include=all` from `lapisClientHooks` /
  `serversideSearch` / `DownloadUrlGenerator`. Default search now relies
  on query-service's defaults (latest non-revoked).
- Add an explicit "Include older versions and revocations" checkbox at
  the top of the search form. It writes `?include=all` into the URL,
  which the website then forwards as the `include=` query param. Toggle
  off and the URL drops the param so defaults reapply.
- Update the override-hidden-fields integration test to flip the new
  toggle instead of clearing the (no longer present) hidden fields.
- Replace the "hidden field values are kept in URL params" unit test
  with one that asserts the new `include=` toggle round-trips through
  the URL.
- Bump query-service CPU request/limit a touch (200m / 2 cores).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… include=all

- SearchFullUI: stabilise the empty `hiddenFieldValues` default with
  `useMemo`. The previous `??= {}` reassigned a fresh object every
  render, which destabilised the `useMemo` chain feeding the
  search-results `useEffect` and caused it to refire — the user observed
  this as an infinite loop of /v1/aggregated requests on the search page.
- /[organism]/submission/[groupId]/released: default `?include=all` so
  submitters see every version they've released (including revocations
  they've just made). The `mySequencesPage` route helper builds the
  link with that param and the page itself injects it into
  `initialQueryDict` for direct navigation.
… state

The React URL-syncing hook overwrites SSR-injected state with whatever
is actually in window.location on hydration, so injecting include=all
into initialQueryDict alone gets clobbered. Redirect server-side instead
so the URL itself is the source of truth, and hydration sees include=all.
…do /released redirect

- The 'override hidden fields' test was specifically exercising the
  removed clear-the-hidden-default mechanism. The replacement coverage
  is in search.dependent.spec.ts ('include-all toggle puts include=all
  in the URL'); the autocomplete-timeout flake on the deleted test was
  not adding signal.
- Revert the server-side redirect to ?include=all on /released.
  Some tests (e.g. file-sharing 'bulk revise 2 seqs with files') do
  page.goto(page.url() + '?column_submissionId=true') after navigating
  to /released. With the redirect, page.url() already carried
  ?include=all, so the appended ?column_... corrupted the URL into
  ?include=all?column_submissionId=true. query-service rejected the
  malformed include= value with a 400, which then tripped the
  console-warnings fixture.
  The link helper (routes.mySequencesPage) still adds ?include=all so
  the in-page 'released sequences' link gives submitters the
  see-everything view they need; tests that navigate directly retain
  default behaviour and can append params with ? safely.
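
The URL corruption described above comes from appending `?key=value` to a URL that already has a query string. A small sketch of the failure and of a safer append, written in Python for illustration only (the affected tests are Playwright specs); `add_query_param` and the example URL are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query_param(url: str, key: str, value: str) -> str:
    """Append a parameter whether or not the URL already has a query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


url = "https://example.org/ebola/submission/1/released?include=all"

# Naive concatenation folds the second '?' into the value of include=.
naive = url + "?column_submissionId=true"
print(parse_qsl(urlsplit(naive).query))
# [('include', 'all?column_submissionId=true')]

print(add_query_param(url, "column_submissionId", "true"))
# https://example.org/ebola/submission/1/released?include=all&column_submissionId=true
```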
@theosanderson theosanderson added preview Triggers a deployment to argocd and removed preview Triggers a deployment to argocd labels May 8, 2026
Conflicts resolved:
- integration-tests/.../override-hidden-fields.spec.ts — kept the
  branch's deletion (the hidden-default override UX no longer exists;
  see search.dependent.spec.ts's 'include-all toggle' test).
- website/src/components/SearchPage/SearchForm.tsx — main introduced a
  multi-field-search variant in the Metadata Filters loop; reapplied
  the branch's organism prop on the SearchField inside it.
- website/src/components/SearchPage/fields/HierarchicalField.spec.tsx
  — main added this new spec; threaded the required `organism` prop
  into every <HierarchicalField> render.
@theosanderson theosanderson added preview Triggers a deployment to argocd and removed preview Triggers a deployment to argocd labels May 12, 2026
The new query-service exposes /v1/aggregated rather than
/{organism}/sample/aggregated, so multi-field-search interceptors were
matching nothing and observing undefined status.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@corneliusroemer corneliusroemer removed the preview Triggers a deployment to argocd label May 21, 2026