feat(graph): add MS Graph colon-syntax path lookup middleware#2688
feat(graph): add MS Graph colon-syntax path lookup middleware#2688dschmidt wants to merge 7 commits into
Conversation
Adds a chi middleware that detects MS Graph colon-syntax URLs and rewrites
them to the canonical /items/{itemID}/... form before chi performs route
matching. Existing handlers, routes, and GetDriveAndItemIDParam stay
unchanged.
Two URL shapes are recognized at both /v1.0 and /v1beta1:
/drives/{driveID}/root:/<path>[:/<suffix>][:]
/drives/{driveID}/items/{itemID}:/<relativePath>[:/<suffix>][:]
Path resolution runs as the request user via CS3 Stat. Both NOT_FOUND and
PERMISSION_DENIED collapse to a 404 response so existence isn't disclosed
to unauthorized callers. URLs without colon syntax fast-path through with
a single substring check. The original URL is stashed in request context
under OriginalPathContextKey for downstream tracing/logging.
The middleware is registered as a top-level mux.Use so it runs before any
route matching: chi middleware on a sub-router runs after the prefix is
matched but cannot redirect to a different leaf route. Top-level
middleware lets URL rewriting actually re-route the request.
Tests cover regex matching across versions, all rewrite variants
(root/items anchored, with/without suffix, with/without trailing colon,
deep paths), NOT_FOUND -> 404, PERMISSION_DENIED -> 404 (security: no
existence disclosure), and original-URL preservation in request context.
There was a problem hiding this comment.
Pull request overview
Adds a top-level chi middleware to support Microsoft Graph “colon-syntax” drive path addressing by resolving the path via CS3 Stat and rewriting requests to the canonical /drives/{driveID}/items/{itemID}... form before route matching.
Changes:
- Register
ResolveGraphPathas a top-level mux middleware so URL rewriting happens before chi route matching. - Implement colon-syntax detection, CS3
Stat-based path resolution, request path rewrite, and original-path preservation in context. - Add unit tests covering rewrite variants and “no existence disclosure” behavior for
PERMISSION_DENIED.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| services/graph/pkg/service/v0/service.go | Registers the new path-lookup middleware early in the mux pipeline. |
| services/graph/pkg/middleware/path_lookup.go | Implements colon-syntax matching, CS3 Stat path resolution, rewrite logic, and context preservation. |
| services/graph/pkg/middleware/path_lookup_test.go | Adds unit tests for matching/rewriting and security-related status behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Drop double-decoding of URL path components. r.URL.Path is already
decoded by net/http; calling url.PathUnescape again would let crafted
inputs like "%252F" become "/", changing path semantics. Path and
anchor-id strings now go straight to utils.MakeRelativePath /
storagespace.ParseID.
- Distinguish operational errors from "not found". Gateway selection
failure, RPC transport errors, and unexpected CS3 status codes now
surface as 500 (don't mask outages); only NOT_FOUND and PERMISSION_DENIED
collapse to 404 (no existence disclosure). Implemented via a sentinel
errPathNotFound + a 500 fall-through.
- Refactor the two regex-handling branches in rewriteColonPath to share a
resolution path via a small colonMatch struct + matchInto/extract helpers.
Removes the SonarCloud duplication finding without changing behavior.
- Update the docstring on ResolveGraphPath to reflect that the rewrite
preserves the requested API version (/{version}/...) rather than
hard-coding /v1beta1/...
- Test cleanup: register t.Cleanup to remove the per-subtest selector
entry from pool's global selectors map after the subtest, and update
the "unexpected status" case to expect 500 (matches the new error
semantics).
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Two follow-up Copilot review nits on the colon-syntax middleware: - Don't blank r.URL.RawPath after the rewrite. Graph.ServeHTTP sets RawPath = EscapedPath() as a workaround for chi's parameter-binding quirks with `$`/`!` in IDs (see go-chi/chi#641). Clearing RawPath for rewritten requests negates that workaround. Re-establish the same invariant after the rewrite. - Tests previously used log.NewLogger(), which mutates global zerolog state. Switch to log.NopLogger() — order-independent, no global side effects.
72bc29b to
1fc1bd4
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Three more Copilot review nits on the colon-syntax middleware:
- CS3 Stat returning UNAUTHENTICATED now surfaces as HTTP 401 (was 500
via the default-case fallback). Distinct sentinel + handler branch.
- Item-anchored form (/drives/{driveID}/items/{itemID}:/...) now
validates that driveID's storage/space prefix matches the itemID's,
short-circuiting to 400 InvalidRequest on mismatch instead of doing
a CS3 Stat that would only fail at the handler layer.
- ParseID failures on the anchor id now surface as 400 (was 404). In
practice this branch is defensive — storagespace.ParseID is lenient
and only errors on empty input, which the regex already filters out
— but the right semantic for malformed client input is 400, not 404.
Tests: added two new cases (UNAUTHENTICATED → 401, drive/item id
mismatch → 400). The "malformed drive id" case Copilot suggested
isn't reachable in practice given ParseID's leniency, so it's not
covered.
84da024 to
d177df0
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
d1bfb6d to
c31b772
Compare
Cover the rewrite shapes the middleware handles end-to-end against a real OpenCloud server: root-anchored, item-anchored, deep paths, trailing colon, and the "/<path>:/<suffix>" sub-route form. Also assert that NOT_FOUND and PERMISSION_DENIED both collapse to 404. The /permissions sub-route is registered only at /v1beta1, and the v1beta1 GetDriveItem handler is share-jail-only, so the v1beta1 mount of the middleware is exercised through the permissions scenario, since there is no other v1beta1 endpoint that works for regular personal-drive items.
c31b772 to
adce092
Compare
|
|
@dschmidt Thanks! @ScharfViktor I asked @dschmidt to add some API tests. Please review 😄 |
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 47 |
| Duplication | 2 |
AI Reviewer: first review requested successfully. AI can make mistakes. Always validate suggestions.
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
Pull Request Overview
This PR introduces a middleware to handle MS Graph's colon-syntax path lookups by performing internal CS3 Stat resolutions. Although the code is technically 'up to standards' according to Codacy, the review identifies significant risks: critical test scenarios for regex matching and error handling are missing, and there is no documentation for the new URL resolution behavior. The security requirement to collapse CS3 PERMISSION_DENIED errors into HTTP 404s must be verified to prevent sensitive path disclosure. Additionally, the high complexity (85) and presence of clones in services/graph/pkg/service/v0/service.go suggest potential maintainability issues in the service layer.
About this PR
- Documentation was not provided for the new colon-syntax URL resolution behavior. This is a significant change in how paths are resolved and should be documented.
Test suggestions
- Regex matching correctly identifies colon syntax for both v1.0 and v1beta1 versions
- URL rewriting handles root-anchored paths with and without suffixes/trailing colons
- URL rewriting handles item-anchored paths with relative paths
- CS3 PERMISSION_DENIED results in an HTTP 404 response to prevent existence disclosure
- CS3 NOT_FOUND results in an HTTP 404 response
- Original path is correctly preserved in the request context for downstream use
- URLs without colon syntax bypass the middleware logic quickly
Prompt proposal for missing tests
Consider implementing these tests if applicable:
1. Regex matching correctly identifies colon syntax for both v1.0 and v1beta1 versions
2. URL rewriting handles root-anchored paths with and without suffixes/trailing colons
3. URL rewriting handles item-anchored paths with relative paths
4. CS3 PERMISSION_DENIED results in an HTTP 404 response to prevent existence disclosure
5. CS3 NOT_FOUND results in an HTTP 404 response
6. Original path is correctly preserved in the request context for downstream use
7. URLs without colon syntax bypass the middleware logic quickly
Low confidence findings
- The middleware is registered at a top-level. Ensure that it includes a fast-path for URLs not matching the colon syntax to avoid performance degradation for unrelated routes.
TIP Improve review quality by adding custom instructions
TIP How was this review? Give us feedback
There was a problem hiding this comment.
Pull Request Overview
The PR successfully implements the MS Graph colon-syntax path lookup middleware, fulfilling all specified acceptance criteria. Codacy results indicate the PR is up to standards. However, a discrepancy exists in the URL rewriting logic: while the implementation notes that special characters like '$' and '!' must be percent-encoded for chi route matching, the current use of r.URL.EscapedPath() does not perform this encoding. This mismatch should be resolved to ensure robust routing. Additionally, services/graph/pkg/service/v0/service.go shows a significant increase in complexity and code duplication that should be monitored for technical debt.
About this PR
- High complexity (85) and new code clones (5) were detected in
services/graph/pkg/service/v0/service.go. Ensure these changes do not introduce excessive technical debt or maintainability issues in the service implementation.
Test suggestions
- Verify that non-colon URLs pass through the middleware without modification.
- Verify root-anchored colon path (e.g., /root:/Documents) rewrites to the canonical items URL.
- Verify item-anchored colon path (e.g., /items/{id}:/subfolder) rewrites to the canonical items URL.
- Verify that item-anchored requests return 400 if the item ID does not belong to the specified drive ID.
- Verify handling of optional trailing colons and suffixes like /children or /createLink.
- Verify that PERMISSION_DENIED from storage returns 404 (Security: no existence disclosure).
- Verify that UNAUTHENTICATED from storage returns 401.
- Verify that internal/unexpected CS3 errors return 500.
- Verify that the original URL is correctly stashed in the request context.
TIP Improve review quality by adding custom instructions
TIP How was this review? Give us feedback
Codacy flagged the colon-path middleware comment claiming both `$` and `!` need percent-encoding for chi's tree match, while the implementation only calls r.URL.RawPath = r.URL.EscapedPath() which does not encode either character per the suggestion's reading. In practice EscapedPath does encode `!` as `%21` (only `?` is the hardcoded escape, `!` is escaped because Go's net/url treats it as needing encoding outside specific contexts). It leaves `$` literal, which chi handles fine. chi.URLParam returns the encoded segment verbatim, and downstream OpenCloud handlers (parseIDParam, GetDriveAndItemIDParam) already PathUnescape before parsing IDs, so the round-trip works end-to-end. The acceptance tests on this branch already exercise this with real `$`/`!`-containing IDs. Add a focused unit test that mounts the middleware behind chi, sends a colon URL, and asserts the actual contract: - driveID (only `$`): chi.URLParam returns it literal - itemID (with `!`): PathUnescape(chi.URLParam(...)) == original Update the misleading comment so future readers (and reviewers) see what the encoding actually does and which downstream contract it relies on.


Description
Adds a chi middleware that detects MS Graph colon-syntax URLs and rewrites them to the canonical /items/{itemID}/... form before chi performs route matching. Existing handlers, routes, and GetDriveAndItemIDParam stay unchanged.
Two URL shapes are recognized at both /v1.0 and /v1beta1:
/drives/{driveID}/root:/[:/][:]
/drives/{driveID}/items/{itemID}:/[:/][:]
Path resolution runs as the request user via CS3 Stat. Both NOT_FOUND and PERMISSION_DENIED collapse to a 404 response so existence isn't disclosed to unauthorized callers. URLs without colon syntax fast-path through with a single substring check. The original URL is stashed in request context under OriginalPathContextKey for downstream tracing/logging.
The middleware is registered as a top-level mux.Use so it runs before any route matching: chi middleware on a sub-router runs after the prefix is matched but cannot redirect to a different leaf route. Top-level middleware lets URL rewriting actually re-route the request.
Tests cover regex matching across versions, all rewrite variants (root/items anchored, with/without suffix, with/without trailing colon, deep paths), NOT_FOUND -> 404, PERMISSION_DENIED -> 404 (security: no existence disclosure), and original-URL preservation in request context.
Unfortunately chi itself is not able to match this kind of routes (gorilla/mux has a regex matcher that could be used).
So this is the least intrusive change I could come up with.
I haven't found a clean way to document this in OpenAPI/Libre Graph Spec - so I'm suggesting to just put it in the route comments there, if you accept this change
Related Issue
Motivation and Context
It's annoying to implement path walking in a client and it's slower than a single internal cs3 path resolve call :)
How Has This Been Tested?
Screenshots (if appropriate):
Types of changes
Checklist: