diff --git a/README.md b/README.md
index 5897f41..558faaf 100644
--- a/README.md
+++ b/README.md
@@ -61,13 +61,13 @@ Implemented today: Lab API response contract, `/api/compare`, `/api/analyze` in-
 
 Runtime identity polish: when a Forge manifest is applied, Runtime now preserves the manifest `source_model.path` identity for comparison naming. A TensorRT artifact such as `model.engine` can therefore keep `compare_model_name=yolov8n` and `compare_key=yolov8n__b1__h640w640__fp32` instead of degrading to `model__...`. This is provenance/compare-readiness polish, not production SaaS infrastructure.
 
-Not implemented yet: real worker daemon, full automated Forge/Runtime execution from production Lab workers, DB/Redis/queue, file upload, SaaS frontend, and production auth/billing/deployment controls.
+Not implemented yet: real worker daemon, full automated Forge/Runtime execution from production Lab workers, DB/Redis/queue, file upload, production frontend beyond Local Studio, and production auth/billing/deployment controls.
 
 Portfolio entry points: [portfolio submission](docs/portfolio/inferedge_portfolio_submission.md) · [resume/interview summary](docs/portfolio/inferedge_resume_interview_summary.md) · [1-page architecture summary](docs/portfolio/inferedge_1page_architecture.md) · [pipeline status](docs/portfolio/inferedge_pipeline_status.md)
 
 Interview one-liner: **InferEdge is an end-to-end inference validation pipeline that converts, runs, compares, diagnoses, and decides whether an edge AI model candidate is ready to deploy.**
 
-Final interview angle: InferEdge has both macOS ONNX Runtime CPU smoke and Jetson Orin Nano TensorRT smoke evidence, while production worker daemon, persistent queue/database, frontend, auth, and billing remain future work.
+Final interview angle: InferEdge has both macOS ONNX Runtime CPU smoke and Jetson Orin Nano TensorRT smoke evidence, while production worker daemon, persistent queue/database, production frontend, auth, and billing remain future work.
 
 ---
 
@@ -84,6 +84,23 @@ TensorRT Jetson was 4.6x faster than ONNX Runtime CPU in this real image input b
 
 The benchmark uses end-to-end Runtime latency, not trtexec GPU-only latency. The full pipeline portfolio summary is available at [docs/portfolio/inferedge_pipeline_portfolio.md](docs/portfolio/inferedge_pipeline_portfolio.md), and the detailed Runtime comparison report is available at [docs/portfolio/runtime_compare_yolov8n.md](docs/portfolio/runtime_compare_yolov8n.md).
 
+## Local Studio Demo Evidence
+
+InferEdge Local Studio can replay the bundled portfolio evidence without requiring a live Jetson device during an interview walkthrough.
+The `Load Demo Evidence` flow imports the ONNX Runtime CPU and TensorRT Jetson Runtime JSON fixtures from [examples/studio_demo](examples/studio_demo), refreshes Compare View, and keeps the demo pair selectable in Recent jobs while the local server process is running.
+
+![InferEdge Local Studio demo evidence](assets/images/local-studio-demo-evidence.png)
+
+Verified demo fixture values:
+
+| Backend | Device | Mean ms | P99 ms | FPS | Compare Key |
+|---|---|---:|---:|---:|---|
+| ONNX Runtime | CPU | 45.4299 | 49.2128 | 22.0119 | `yolov8n__b1__h640w640__fp32` |
+| TensorRT | Jetson | 9.9375 | 15.5231 | 100.6293 | `yolov8n__b1__h640w640__fp32` |
+
+Studio reports this as a `4.57x` TensorRT speedup for the bundled demo pair.
+AIGuard remains optional in this local Studio path; if Guard evidence is not loaded, the deployment decision explains that the Lab comparison is available but diagnosis evidence is not provided.
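The `4.57x` figure quoted above follows directly from the two fixture means in the table; a minimal sanity-check sketch (plain Python, numbers copied from the demo fixtures):

```python
# Sanity check for the demo-pair speedup quoted above, using the fixture means.
onnx_mean_ms = 45.4299  # ONNX Runtime CPU mean latency from the bundled fixture
trt_mean_ms = 9.9375    # TensorRT Jetson mean latency from the bundled fixture

speedup = onnx_mean_ms / trt_mean_ms
print(f"{speedup:.2f}x TensorRT speedup")  # 4.57x
```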
+
 ---
 
 ## Reproducible Review Flow
 
@@ -372,7 +389,7 @@ More details: [FastAPI API usage guide](docs/api/api_usage.md)
 
 ## Local Studio
 
-InferEdge Local Studio is a local-first browser interface for inspecting the existing CLI workflow, API/job contracts, result metrics, and Lab-owned deployment decision structure.
+InferEdge Local Studio is a local-first browser interface for inspecting the existing CLI workflow, API/job contracts, Runtime evidence, Compare View, Jetson command helper, and Lab-owned deployment decision structure.
 It runs on the user's machine through the FastAPI server and is intended as a local workflow UI foundation, not a production SaaS dashboard or cloud dashboard.
 
 ### Run Local Studio
 
@@ -387,7 +404,17 @@ Open: http://localhost:8000/studio
 ```
 
-The first Studio skeleton uses local static assets only and renders demo placeholders for the pipeline flow, evidence summary, result metrics, and deployment decision. Future work can connect these cards to real `/api/jobs`, `/api/compare`, and `/api/analyze` responses while keeping DB/queue/upload/auth/billing outside the current scope.
+What works today:
+
+- Run creates an in-memory analyze job through the existing `/api/analyze` contract.
+- Import accepts a Runtime result JSON path or pasted JSON payload and adds it to the in-memory compare-ready evidence set.
+- Load Demo Evidence imports the bundled ONNX Runtime CPU and TensorRT Jetson fixtures for a stable browser demo.
+- Compare View shows TensorRT vs ONNX Runtime mean latency, p99, FPS, latency diff, and speedup when compatible evidence is loaded.
+- Jetson Helper shows the local command shape for running the Runtime on a Jetson device.
+- Deployment Decision stays Lab-owned; AIGuard is optional deterministic diagnosis evidence.
+
+Current non-goals remain unchanged: no DB, queue, upload service, production auth, billing, or production SaaS worker orchestration.
+Jobs and imported Studio evidence are in-memory and reset when the local server process restarts.
 
 ---
 
diff --git a/Roadmap.md b/Roadmap.md
index 03289b6..ebc25de 100644
--- a/Roadmap.md
+++ b/Roadmap.md
@@ -89,7 +89,7 @@ Improve usability, discoverability, and expansion paths beyond the core CLI work
 - [x] Provide richer CLI presentation with Rich
 - [x] Generate HTML benchmark and validation reports
 - [x] Run automated benchmark / validation checks in CI
-- [ ] Add a web dashboard mode
+- [x] Add a local-first Studio workflow UI for portfolio demo and browser inspection
 
 ---
 
@@ -124,6 +124,56 @@ Improve usability, discoverability, and expansion paths beyond the core CLI work
 ## 🔭 Future Direction
 
 - [ ] Complete full RKNN runtime backend integration so curated and runtime validation share one end-to-end device workflow
-- [ ] Evolve the current API adapter into a foundation for a web dashboard or SaaS-style validation surface
+- [ ] Keep Local Studio as a local-first workflow UI, and only later evaluate whether a production dashboard is justified
 - [ ] Add memory profiling so deployment decisions are informed by both latency and resource pressure
 - [ ] Explore multi-device distributed benchmarking for larger validation fleets and lab-scale experimentation
+
+---
+
+## Cross-Repository Roadmap
+
+The current portfolio boundary is intentionally local-first and evidence-driven. The items below are future development directions, not current claims.
+
+### InferEdgeForge
+
+Forge should stay focused on build provenance and artifact handoff.
+
+- [x] Emit manifest and metadata records for source model, artifact, backend, target, precision, shape, preset, and build id
+- [x] Provide worker/runtime summary data that can feed Lab and Runtime contracts
+- [ ] Add stronger build reproducibility checks across repeated artifact builds
+- [ ] Expand preset coverage for Jetson TensorRT and RKNN build targets
+- [ ] Add artifact package export suitable for sharing with Runtime without manual path coordination
+
+### InferEdgeRuntime
+
+Runtime should stay focused on real execution, profiling, and Lab-compatible result export.
+
+- [x] Provide C++ execution/result export boundary
+- [x] Validate Lab worker request payloads in dry-run mode
+- [x] Export compare-ready Runtime result JSON for ONNX Runtime CPU and TensorRT Jetson evidence
+- [x] Preserve source model identity for manifest-backed TensorRT engine results
+- [ ] Harden Runtime execution error reporting for failed engine/model loads
+- [ ] Add memory/resource profiling to complement latency, p99, and FPS
+- [ ] Complete RKNN runtime execution so curated RKNN evidence and live Runtime execution share one path
+
+### InferEdgeLab
+
+Lab should remain the comparison, reporting, API/job contract, Local Studio, and deployment decision owner.
+
+- [x] Compare Runtime result JSON by `compare_key` and `backend_key`
+- [x] Generate Markdown/HTML reports and API response bundles
+- [x] Provide in-memory `/api/analyze`, `/api/jobs/{job_id}`, and worker request/response mapping contracts
+- [x] Provide Local Studio for Run, Import, Demo Evidence, Compare View, Deployment Decision, and Jetson command helper
+- [ ] Add optional persisted result storage after the portfolio demo boundary is stable
+- [ ] Add production worker daemon integration only after Forge/Runtime handoff is reliable
+- [ ] Improve multi-model evidence browsing without turning Studio into a production SaaS surface
+
+### InferEdgeAIGuard
+
+AIGuard should stay optional and deterministic. It should explain evidence risks, not replace Lab's final decision ownership.
+
+- [x] Diagnose provenance mismatch with rule/evidence-based detectors
+- [x] Preserve `guard_analysis` in Lab reports/API/deployment decision bundles
+- [ ] Add more detector coverage for missing manifest fields, backend mismatch, precision mismatch, and suspicious result deltas
+- [ ] Add clearer guard evidence examples for interview demos
+- [ ] Keep AIGuard optional in Studio until the evidence contract is strong enough to justify a UI action
diff --git a/assets/images/local-studio-demo-evidence.png b/assets/images/local-studio-demo-evidence.png
new file mode 100644
index 0000000..abf2f38
Binary files /dev/null and b/assets/images/local-studio-demo-evidence.png differ
diff --git a/docs/portfolio/inferedge_1page_architecture.md b/docs/portfolio/inferedge_1page_architecture.md
index ffd3a14..ad1e17e 100644
--- a/docs/portfolio/inferedge_1page_architecture.md
+++ b/docs/portfolio/inferedge_1page_architecture.md
@@ -40,6 +40,7 @@ ONNX
 model
 - Lab `worker_request` / `worker_response` boundary
 - Lab -> Runtime dev-only minimal execution smoke using `yolov8n.onnx` (ONNX Runtime CPU, success, mean about 47.97 ms, p95 about 51.80 ms, about 20.85 FPS)
 - Jetson Orin Nano TensorRT Runtime smoke using Forge manifest + TensorRT engine artifact (success, manifest applied, mean about 14.00 ms, p99 about 15.50 ms, about 71.44 FPS)
+- Local Studio demo evidence replay at `/studio` using bundled ONNX Runtime CPU and TensorRT Jetson result fixtures: 45.4299 ms vs 9.9375 ms mean latency, 49.2128 ms vs 15.5231 ms p99, 22.0119 vs 100.6293 FPS, and a 4.57x TensorRT speedup for the demo pair
 - Runtime source-model identity polish for manifest-backed TensorRT engine results (`model.engine` can still keep `compare_model_name=yolov8n` and `compare_key=yolov8n__b1__h640w640__fp32`)
 - Runtime `worker_request` validation and `worker_response` dry-run export
 - Forge worker/runtime summary
@@ -53,7 +54,7 @@ ONNX
 model
 - full automated Forge/Runtime execution from a production Lab worker
 - database, Redis, or queue
 - file upload
-- frontend
+- production frontend beyond the local Studio workflow UI
 - production authentication, billing, and deployment controls
 
 ## Interview Explanation
diff --git a/docs/portfolio/inferedge_pipeline_status.md b/docs/portfolio/inferedge_pipeline_status.md
index 8669498..077eb6a 100644
--- a/docs/portfolio/inferedge_pipeline_status.md
+++ b/docs/portfolio/inferedge_pipeline_status.md
@@ -61,7 +61,7 @@ Current role:
 - runs compare, compare-latest, report, and deployment decision flows
 - exposes `/api/compare` with the SaaS API response contract
 - exposes in-memory `/api/analyze` and `/api/jobs/{job_id}` workflow stubs
-- exposes a local-first `/studio` skeleton that presents the CLI/API/job/deployment decision workflow in the browser
+- exposes a local-first `/studio` workflow UI for Run, Import, Compare View, Jetson command helper, demo evidence replay, and deployment decision inspection
 - maps analyze jobs to worker requests and worker responses back to job results
 - preserves optional AIGuard evidence while keeping Lab as the final decision owner
 
@@ -95,7 +95,7 @@ The current cross-repository loop is covered by documentation, fixtures, and smo
 - Forge summary-origin Lab worker request validation in Runtime
 - AIGuard worker provenance mismatch diagnosis
 - Lab deployment decision/report evidence smoke for AIGuard worker provenance diagnosis
-- Local Studio skeleton for viewing the Forge -> Runtime -> Lab -> optional AIGuard workflow, smoke evidence, metrics placeholders, and deployment decision ownership from a local browser
+- Local Studio local-first workflow UI for viewing Forge -> Runtime -> Lab -> optional AIGuard state, creating in-memory analyze jobs, importing Runtime result JSON, replaying bundled demo evidence, comparing backends, and inspecting Lab-owned deployment decision context
 
 This means the current product boundary is testable without running the production worker infrastructure.
 
@@ -124,7 +124,7 @@ Demo readiness: `scripts/demo_pipeline_full.sh` now provides a guided end-to-end
 - Manual Jetson TensorRT Runtime smoke using Forge manifest and TensorRT engine artifact
 - Runtime compare-key identity polish for manifest-backed engine artifacts
 - Guided end-to-end demo entrypoint for portfolio and interview walkthroughs
-- Local Studio skeleton at `/studio` for a local-first browser view of the workflow foundation
+- Local Studio at `/studio` for a local-first browser view of Run / Import / Demo Evidence / Compare / Decision / Jetson Helper workflows
 - Cross-repo fixture compatibility across Forge, Runtime, Lab, and AIGuard
 - Rule/evidence based provenance mismatch diagnosis
 
@@ -136,7 +136,7 @@ Demo readiness: `scripts/demo_pipeline_full.sh` now provides a guided end-to-end
 - database persistence
 - Redis, Celery, or another queue
 - file upload handling
-- production frontend beyond the local Studio skeleton
+- production frontend beyond the local Studio workflow UI
 - production authentication, billing, and deployment controls
 
 These gaps are intentional. The current project fixes the contracts first, then leaves infrastructure choices for later.
diff --git a/docs/portfolio/inferedge_portfolio_submission.md b/docs/portfolio/inferedge_portfolio_submission.md
index ef0ca7e..3d29176 100644
--- a/docs/portfolio/inferedge_portfolio_submission.md
+++ b/docs/portfolio/inferedge_portfolio_submission.md
@@ -14,7 +14,7 @@ InferEdge is not a benchmarking tool, but an end-to-end validation pipeline that
 - The full InferEdge flow consists of Forge build provenance -> Runtime real execution -> Lab compare/report/API/job/deployment_decision -> optional AIGuard diagnosis evidence.
 - Lab connects InferEdgeForge provenance metadata, InferEdge-Runtime C++ execution output, and optional InferEdgeAIGuard diagnostic evidence into a single validation bundle.
 - In the `yolov8n.onnx` manual smoke, the Lab -> C++ Runtime CLI -> ONNX Runtime CPU execution -> Lab job result ingestion path was verified as the dev-only minimal Runtime execution path.
-- The current state is a portfolio-grade pipeline foundation; production worker daemon, persistent queue/database, file upload, frontend, and auth/billing are clearly separated as future work.
+- The current state is a portfolio-grade pipeline foundation; production worker daemon, persistent queue/database, file upload, production frontend beyond Local Studio, and auth/billing are clearly separated as future work.
 
 Pipeline:
 
@@ -93,13 +93,14 @@ Rule + evidence diagnosis layer. Forge summary, Runtime worker_response, Lab res
 - Runtime worker_response compatibility ingest in Lab
 - AIGuard worker provenance mismatch diagnosis
 - AIGuard guard_analysis preservation in Lab deployment decision/report smoke
+- Local Studio browser workflow for Run, Import, Jetson command helper, demo evidence replay, Compare View, and Lab-owned Deployment Decision inspection
 - README pipeline summary sync across the four repositories
 
 ## 6. Validation Evidence
 
 Recent validation evidence:
 
-- InferEdgeLab: `poetry run python3 -m pytest -q` -> 245 passed
+- InferEdgeLab: `poetry run python3 -m pytest -q` -> 262 passed
 - InferEdgeForge: `python -m pytest -q` -> 89 passed
 - InferEdgeRuntime: `python3 tests/test_lab_worker_adapter_contract.py` -> 12 tests passed
 - InferEdgeRuntime: `scripts/smoke_default.sh` -> success
@@ -110,6 +111,7 @@ Recent validation evidence:
 - Jetson TensorRT Runtime smoke: on Jetson Orin Nano (`Linux 5.15.148-tegra`, `aarch64`), the C++ Runtime CLI in `~/InferEdge-Runtime` executed Forge manifest `/home/risenano01/InferEdgeForge/builds/yolov8n__jetson__tensorrt__jetson_fp16/manifest.json` and TensorRT engine artifact `/home/risenano01/InferEdgeForge/builds/yolov8n__jetson__tensorrt__jetson_fp16/model.engine`. The output `results/jetson/yolov8n_jetson_tensorrt_manifest_smoke.json` reported `success: true`, `engine_backend: tensorrt`, `device_name: jetson`, `manifest_applied: true`, input shape `[1, 3, 640, 640]`, output shape `[1, 84, 8400]`, mean latency about 14.00 ms, p99 about 15.50 ms, and about 71.44 FPS.
 - Runtime compare-key identity polish: InferEdgeRuntime now preserves Forge manifest source model identity for compare naming. If `manifest.source_model.path` is `models/onnx/yolov8n.onnx` and the explicit TensorRT artifact path is `model.engine`, Runtime can keep `compare_model_name=yolov8n` and `compare_key=yolov8n__b1__h640w640__fp32`.
 - Guided demo entrypoint: `scripts/demo_pipeline_full.sh` summarizes the full Forge -> Runtime -> Lab -> optional AIGuard flow and can print the Jetson TensorRT Runtime command without claiming production worker or SaaS readiness.
+- Local Studio demo evidence: `/studio` can load bundled ONNX Runtime CPU and TensorRT Jetson Runtime result fixtures from `examples/studio_demo`, keep the demo pair selectable in Recent jobs while the local server process is alive, and show the TensorRT Jetson vs ONNX Runtime CPU comparison in the browser. The fixture-backed evidence records ONNX Runtime CPU at mean 45.4299 ms / p99 49.2128 ms / 22.0119 FPS and TensorRT Jetson at mean 9.9375 ms / p99 15.5231 ms / 100.6293 FPS, a 4.57x TensorRT speedup for this demo pair.
 
 The direct Runtime execution result includes `deployment_decision`. Its `unknown` value is expected before Lab compare/report because the worker response has not yet been compared by Lab.
 
@@ -136,6 +138,7 @@ Forge summary
 - **SaaS-ready API + async job workflow:** Lab has API response contracts, in-memory async job stubs, and worker request/response mapping without prematurely adding DB/queue infrastructure.
 - **Deterministic rule-based diagnosis:** AIGuard uses rule + evidence detectors instead of vague LLM judgement.
 - **Deployment decision ownership:** Lab keeps final deploy/review/blocked ownership while preserving optional guard evidence.
+- **Local-first Studio demo:** The browser UI can replay real validation evidence locally without adding DB, queue, upload, auth, billing, or production SaaS infrastructure.
 
 ## 8. Current Limitations and Next Steps
 
@@ -147,7 +150,7 @@ Current planned production work:
 - full automated Forge/Runtime execution from a production Lab worker
 - database, Redis, or queue
 - file upload flow
-- SaaS frontend
+- production frontend beyond Local Studio
 - production authentication, billing, and deployment controls
 
 Next practical step:
 
@@ -164,5 +167,5 @@ Next practical step:
 - "We have both macOS ONNX Runtime CPU smoke and Jetson Orin Nano TensorRT smoke evidence; on Jetson, running the Forge manifest + TensorRT `model.engine` + C++ Runtime CLI produced mean about 14.00 ms, p99 about 15.50 ms, and about 71.44 FPS."
 - "After the Runtime source identity polish, manifest-backed TensorRT engine artifacts can also keep `compare_model_name=yolov8n` and `compare_key=yolov8n__b1__h640w640__fp32`."
 - "AIGuard is a deterministic detector structure that compares evidence such as artifact hash, source hash, precision, and shape, rather than relying on LLM guesswork."
-- "Production worker, DB/Redis/queue, frontend, and auth/billing are still clearly separated as planned work; we stabilized the contracts and smoke coverage first."
+- "Production worker, DB/Redis/queue, production frontend, and auth/billing are still clearly separated as planned work; we stabilized the contracts and smoke coverage first."
 - "From an AI inference engineer portfolio perspective, this project connects a C++ runtime, Python orchestration, schema contracts, provenance validation, and a SaaS API boundary into one product-shaped pipeline."
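The `compare_key` convention cited in these one-liners composes the model name, batch, spatial shape, and precision. A hypothetical helper that reproduces the documented example (the actual Runtime/Lab implementation is not shown in this change and may differ):

```python
# Hypothetical helper mirroring the documented compare_key shape;
# the real Runtime code that builds this key is not part of this diff.
def build_compare_key(model_name: str, batch: int, height: int, width: int, precision: str) -> str:
    return f"{model_name}__b{batch}__h{height}w{width}__{precision}"

print(build_compare_key("yolov8n", 1, 640, 640, "fp32"))
# yolov8n__b1__h640w640__fp32
```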
diff --git a/docs/portfolio/inferedge_resume_interview_summary.md b/docs/portfolio/inferedge_resume_interview_summary.md
index cc9069d..1e6a954 100644
--- a/docs/portfolio/inferedge_resume_interview_summary.md
+++ b/docs/portfolio/inferedge_resume_interview_summary.md
@@ -5,6 +5,7 @@
 - Built InferEdge, an end-to-end Edge AI inference validation pipeline that connects Forge build provenance, C++ Runtime execution, Lab comparison/report/API/job workflows, optional AIGuard diagnosis evidence, and Lab-owned deployment decisions.
 - Validated real execution paths on both macOS and edge hardware: `yolov8n.onnx` through Lab -> C++ Runtime -> ONNX Runtime CPU -> Lab job result ingestion, and Jetson Orin Nano TensorRT execution through Forge manifest + `model.engine` + C++ Runtime CLI.
 - Documented Jetson TensorRT smoke evidence with mean latency about 14.00 ms, p99 about 15.50 ms, and about 71.44 FPS on a Forge-generated TensorRT engine artifact.
+- Added Local Studio as a local-first browser workflow UI that can replay bundled ONNX Runtime CPU and TensorRT Jetson demo evidence, showing 45.4299 ms vs 9.9375 ms mean latency and a 4.57x TensorRT speedup without claiming production SaaS readiness.
 - Polished Runtime provenance readiness so manifest-backed TensorRT artifacts preserve source identity: `model.engine` can keep `compare_model_name=yolov8n` and `compare_key=yolov8n__b1__h640w640__fp32`.
 
 ## Role-Specific Resume Versions
 
@@ -19,7 +20,7 @@ Built a multi-repository edge inference validation workflow that connects model
 
 ### Backend / AI Platform
 
-Built the Lab-side orchestration and contract foundation for an edge AI validation platform. InferEdgeLab exposes compare/API/job/deployment-decision boundaries, maps analyze jobs to worker requests, ingests worker responses, preserves optional AIGuard evidence, and keeps production infrastructure concerns separate from the contract layer. Current scope is SaaS/API/job contract foundation plus dev-only Runtime execution smoke; production worker daemon, persistent queue/database, file upload, frontend, auth, and billing remain future work.
+Built the Lab-side orchestration and contract foundation for an edge AI validation platform. InferEdgeLab exposes compare/API/job/deployment-decision boundaries, maps analyze jobs to worker requests, ingests worker responses, preserves optional AIGuard evidence, and provides a local-first Studio UI for browser-based workflow inspection. Current scope is SaaS/API/job contract foundation plus dev-only Runtime execution smoke and local Studio demo evidence; production worker daemon, persistent queue/database, file upload, production frontend, auth, and billing remain future work.
 
 ## Interview: First 30 Seconds
diff --git a/examples/studio_demo/onnxruntime_cpu_result.json b/examples/studio_demo/onnxruntime_cpu_result.json
new file mode 100644
index 0000000..5dd20d4
--- /dev/null
+++ b/examples/studio_demo/onnxruntime_cpu_result.json
@@ -0,0 +1,38 @@
+{
+  "runtime_role": "runtime-result",
+  "model": "yolov8n.onnx",
+  "engine": "onnxruntime",
+  "engine_backend": "onnxruntime",
+  "device": "cpu",
+  "device_name": "cpu",
+  "precision": "fp32",
+  "batch": 1,
+  "height": 640,
+  "width": 640,
+  "mean_ms": 45.4299,
+  "p99_ms": 49.2128,
+  "fps_value": 22.0119,
+  "success": true,
+  "status": "success",
+  "timestamp": "2026-04-29T12:00:00Z",
+  "compare_key": "yolov8n__b1__h640w640__fp32",
+  "backend_key": "onnxruntime__cpu",
+  "system": {
+    "os": "macOS",
+    "machine": "arm64"
+  },
+  "run_config": {
+    "warmup": 1,
+    "runs": 5,
+    "mode": "image",
+    "task": "detection"
+  },
+  "accuracy": {},
+  "extra": {
+    "input_mode": "image",
+    "input_preprocess": "opencv_bgr_to_rgb_resize_float32_nchw",
+    "effective_batch": 1,
+    "effective_height": 640,
+    "effective_width": 640
+  }
+}
diff --git a/examples/studio_demo/tensorrt_jetson_result.json b/examples/studio_demo/tensorrt_jetson_result.json
new file mode 100644
index 0000000..8fb682b
--- /dev/null
+++ b/examples/studio_demo/tensorrt_jetson_result.json
@@ -0,0 +1,40 @@
+{
+  "runtime_role": "runtime-result",
+  "model": "yolov8n.onnx",
+  "engine": "tensorrt",
+  "engine_backend": "tensorrt",
+  "device": "jetson",
+  "device_name": "jetson",
+  "precision": "fp32",
+  "batch": 1,
+  "height": 640,
+  "width": 640,
+  "mean_ms": 9.9375,
+  "p99_ms": 15.5231,
+  "fps_value": 100.6293,
+  "success": true,
+  "status": "success",
+  "timestamp": "2026-04-29T12:01:00Z",
+  "compare_key": "yolov8n__b1__h640w640__fp32",
+  "backend_key": "tensorrt__jetson",
+  "system": {
+    "os": "Linux 5.15.148-tegra",
+    "machine": "aarch64"
+  },
+  "run_config": {
+    "warmup": 1,
+    "runs": 5,
+    "mode": "image",
+    "task": "detection"
+  },
+  "accuracy": {},
+  "extra": {
+    "input_mode": "image",
+    "input_preprocess": "opencv_bgr_to_rgb_resize_float32_nchw",
+    "manifest_applied": true,
+    "runtime_artifact_path": "~/InferEdgeForge/builds/yolov8n__jetson__tensorrt__jetson_fp16/model.engine",
+    "effective_batch": 1,
+    "effective_height": 640,
+    "effective_width": 640
+  }
+}
diff --git a/inferedgelab/studio/routes.py b/inferedgelab/studio/routes.py
index 6cdf126..5f9acb6 100644
--- a/inferedgelab/studio/routes.py
+++ b/inferedgelab/studio/routes.py
@@ -1,5 +1,7 @@
 from __future__ import annotations
 
+import json
+from datetime import datetime, timezone
 from pathlib import Path
 from typing import Any
 
@@ -9,6 +11,7 @@ from fastapi import Request
 from fastapi import Body
 from fastapi.responses import FileResponse
+from fastapi.responses import RedirectResponse
 
 from inferedgelab.compare.comparator import compare_results
 from inferedgelab.compare.judgement import judge_comparison
@@ -17,6 +20,12 @@ from inferedgelab.services.deployment_decision import build_deployment_decision
 
 STATIC_DIR = Path(__file__).resolve().parent / "static"
+DEMO_EVIDENCE_DIR = Path(__file__).resolve().parents[2] / "examples" / "studio_demo"
+DEMO_EVIDENCE_FILES = (
+    "onnxruntime_cpu_result.json",
+    "tensorrt_jetson_result.json",
+)
+DEMO_JOB_ID = "demo_yolov8n_trt_vs_onnx"
 STATIC_ASSETS = {
     "app.js": "application/javascript",
     "style.css": "text/css",
@@ -34,6 +43,11 @@ def studio_index() -> FileResponse:
     )
 
 
+@router.get("/studio로", include_in_schema=False)
+def studio_korean_particle_redirect() -> RedirectResponse:
+    return RedirectResponse(url="/studio", status_code=307)
+
+
 @router.get("/studio/static/{asset_name}", include_in_schema=False)
 def studio_static(asset_name: str) -> FileResponse:
     media_type = STATIC_ASSETS.get(asset_name)
@@ -51,11 +65,13 @@ def studio_jobs(request: Request) -> dict[str, Any]:
     store = _get_studio_job_store(request)
     jobs = []
     if store is not None:
-        jobs = sorted(
-            getattr(store, "_jobs", {}).values(),
-            key=lambda job: str(job.get("updated_at") or job.get("created_at") or ""),
-            reverse=True,
-        )
+        jobs.extend(getattr(store, "_jobs", {}).values())
+    jobs.extend(_get_demo_jobs(request).values())
+    jobs = sorted(
+        jobs,
+        key=lambda job: str(job.get("updated_at") or job.get("created_at") or ""),
+        reverse=True,
+    )
     return {
         "source": "/api/jobs",
         "count": len(jobs),
@@ -65,6 +81,10 @@
 
 @router.get("/studio/api/job/{job_id}", include_in_schema=False)
 def studio_job_detail(request: Request, job_id: str) -> dict[str, Any]:
+    demo_job = _get_demo_jobs(request).get(job_id)
+    if demo_job is not None:
+        return demo_job
+
     endpoint = _get_api_endpoint(request.app, "/api/jobs/{job_id}")
     return endpoint(job_id=job_id)
 
@@ -101,7 +121,15 @@ def studio_run(request: Request, payload: dict[str, Any] = Body(...)) -> dict[st
         raise HTTPException(status_code=400, detail="model_path is required")
 
     endpoint = _get_api_endpoint(request.app, "/api/analyze")
-    job = endpoint(payload={"model_path": model_path.strip(), "notes": "Created from Local Studio Run"})
+    analyze_payload: dict[str, Any] = {
+        "model_path": model_path.strip(),
+        "notes": "Created from Local Studio Run",
+    }
+    options = payload.get("options")
+    if isinstance(options, dict):
+        analyze_payload["options"] = dict(options)
+    job = endpoint(payload=analyze_payload)
+    job["display_name"] = _build_analyze_display_name(job)
     return {
         "status": "created",
         "source": "/api/analyze",
@@ -113,6 +141,7 @@ def studio_run(request: Request, payload: dict[str, Any] = Body(...)) -> dict[st
 
 @router.post("/studio/api/import", include_in_schema=False)
 def studio_import(request: Request, payload: dict[str, Any] = Body(...)) -> dict[str, Any]:
     result = _load_import_payload(payload)
+    result = _apply_backend_override(result, payload.get("backend_override"))
     imported_results = _get_imported_results(request)
     imported_results.append(result)
     return {
@@ -124,6 +153,27 @@ def studio_import(request: Request, payload: dict[str, Any] = Body(...)) -> dict
     }
 
 
+@router.get("/studio/api/demo-evidence", include_in_schema=False)
+def studio_demo_evidence(request: Request) -> dict[str, Any]:
+    results = [_load_demo_result(file_name) for file_name in DEMO_EVIDENCE_FILES]
+    imported_results = _get_imported_results(request)
+    imported_results.extend(results)
+    compare = _build_imported_compare_response(results[0], results[1])
+    demo_job = _build_demo_job(results, compare)
+    _get_demo_jobs(request)[DEMO_JOB_ID] = demo_job
+    return {
+        "status": "loaded",
+        "source": "examples/studio_demo",
+        "job_id": DEMO_JOB_ID,
+        "job": demo_job,
+        "count": len(results),
+        "results": results,
+        "compare_ready": True,
+        "compare": compare,
+        "deployment_decision": compare["deployment_decision"],
+    }
+
+
 @router.get("/studio/api/jetson-command", include_in_schema=False)
 def studio_jetson_command() -> dict[str, str]:
     command = "\n".join(
@@ -141,9 +191,19 @@ def studio_jetson_command() -> dict[str, str]:
     return {"command": command}
 
 
+@router.get("/studio{suffix:path}", include_in_schema=False)
+def studio_path_fallback(suffix: str) -> RedirectResponse:
+    if suffix.startswith("/api") or suffix.startswith("/static"):
+        raise HTTPException(status_code=404, detail="studio route not found")
+    return RedirectResponse(url="/studio", status_code=307)
+
+
 def register_studio(app: FastAPI, job_store: Any | None = None) -> None:
     app.state.studio_job_store = job_store
     app.state.studio_imported_results = []
+    app.state.studio_demo_jobs = {}
     app.include_router(router)
 
@@ -159,6 +219,14 @@
     return imported_results
 
 
+def _get_demo_jobs(request: Request) -> dict[str, dict[str, Any]]:
+    demo_jobs = getattr(request.app.state, "studio_demo_jobs", None)
+    if demo_jobs is None:
+        demo_jobs = {}
+        request.app.state.studio_demo_jobs = demo_jobs
+    return demo_jobs
+
+
 def _get_api_endpoint(app: FastAPI, path: str) -> Any:
     for route in app.routes:
         if getattr(route, "path", None) == path:
@@ -210,6 +278,73 @@ def _build_imported_compare_response(base: dict[str, Any], new: dict[str, Any])
     }
 
 
+def _load_demo_result(file_name: str) -> dict[str, Any]:
+    path = DEMO_EVIDENCE_DIR / file_name
+    try:
+        raw = json.loads(path.read_text(encoding="utf-8"))
+    except OSError as exc:
+        raise HTTPException(status_code=500, detail=f"demo evidence not found: {file_name}") from exc
+    except json.JSONDecodeError as exc:
+        raise HTTPException(status_code=500, detail=f"demo evidence is invalid JSON: {file_name}") from exc
+
+    try:
+        result = normalize_result_schema(raw)
+    except (TypeError, ValueError) as exc:
+        raise HTTPException(status_code=500, detail=f"demo evidence schema error: {file_name}") from exc
+    result.setdefault("legacy_result", False)
+    result["_source_path"] = str(path.relative_to(DEMO_EVIDENCE_DIR.parents[1]))
+    return _with_compare_keys(result)
+
+
+def _build_demo_job(results: list[dict[str, Any]], compare: dict[str, Any]) -> dict[str, Any]:
+    now = _utc_now_iso()
+    runtime_result = results[-1] if results else {}
+    return {
+        "job_id": DEMO_JOB_ID,
+        "display_name": "Demo: TensorRT vs ONNX Runtime",
+        "status": "completed",
+        "created_at": now,
+        "updated_at": now,
+        "input_summary": {
+            "workflow": "studio_demo_evidence",
+            "model_path": "examples/studio_demo/*.json",
+            "notes": "Bundled Local Studio demo evidence",
+        },
+        "result": {
+            "runtime_result": runtime_result,
+            "comparison": compare,
+            "deployment_decision": compare["deployment_decision"],
+            "summary": compare["judgement"]["summary"],
+        },
+        "error": None,
+        "links": {
+            "self": f"/studio/api/job/{DEMO_JOB_ID}",
+            "compare": "/studio/api/compare/latest",
+        },
+        "next_actions": ["review_compare"],
+    }
+
+
+def _build_analyze_display_name(job: dict[str, Any]) -> str:
+    input_summary = job.get("input_summary") or {}
+    model_path = _first_display_value(input_summary.get("model_path"), input_summary.get("artifact_path"))
+    model_name = Path(model_path).name if model_path else "analyze job"
+    options = input_summary.get("options") if isinstance(input_summary.get("options"), dict) else {}
+    backend = _first_display_value(options.get("backend"))
+    device = _first_display_value(options.get("device"))
+    suffix = f" ({backend}/{device})" if backend or device else ""
+    return f"Analyze {model_name}{suffix}"
+
+
+def _utc_now_iso() -> str:
+    return (
+        datetime.now(timezone.utc)
+        .replace(microsecond=0)
+        .isoformat()
+        .replace("+00:00", "Z")
+    )
+
+
 def _with_compare_keys(result: dict[str, Any]) -> dict[str, Any]:
     enriched = dict(result)
     if not enriched.get("backend_key"):
@@ -232,6 +367,29 @@
     return enriched
 
 
+def _apply_backend_override(result: dict[str, Any], override: Any) -> dict[str, Any]:
+    if not isinstance(override, str) or not override.strip():
+        return result
+
+    override = override.strip()
+    if override == "onnxruntime__cpu":
+        engine = "onnxruntime"
+        device = "cpu"
+    elif override == "tensorrt__jetson":
+        engine = "tensorrt"
+        device = "jetson"
+    else:
+        raise HTTPException(status_code=400, detail="unsupported backend_override")
+
+    enriched = dict(result)
+    enriched["engine"] = engine
+    enriched["engine_backend"] = engine
+    enriched["device"] = device
+    enriched["device_name"] = device
+    enriched["backend_key"] = override
+    return _with_compare_keys(enriched)
+
+
 def _first_display_value(*values: Any) -> str:
     for value in values:
         display_value = _display_value(value)
diff --git a/inferedgelab/studio/static/app.js b/inferedgelab/studio/static/app.js
index 87abdea..5cd2618 100644
--- a/inferedgelab/studio/static/app.js
+++ b/inferedgelab/studio/static/app.js
@@ -28,6 +28,7 @@ let selectedJobId = null;
 let compareData = null;
 let activeDecision = null;
 let importedResult = null;
+const importedResultsByJobId = {};
 
 function createElement(tagName, className, textContent) {
   const element = document.createElement(tagName);
@@ -119,6 +120,12 @@ function assertHttpStudio() {
   }
 }
 
+function markFileMode() {
+  if (window.location.protocol === "file:") {
+    document.body.classList.add("file-mode");
+  }
+}
+
 async function responseErrorMessage(response) {
   const fallback = `Request failed: ${response.status}`;
   try {
@@ -230,7 +237,10 @@ async function runModel() {
   setStatus("#run-status", "Loading: creating analyze job...", "loading");
   renderPipeline();
   try {
-    const payload = await postJson("/studio/api/run", { model_path: modelPath });
+    const payload = await postJson("/studio/api/run", {
+      model_path: modelPath,
+      options: runOptions(),
+    });
     selectedJobId = payload.job_id;
     selectedJob = payload.job || null;
     setStatus("#run-status", `Success: created ${payload.job_id}`, "success");
@@ -261,6 +271,7 @@ async function importResult() {
   try {
     const payload = await postJson("/studio/api/import", buildImportPayload(jsonPath, jsonPayload));
     importedResult = payload.result;
+    rememberImportedResultForSelectedJob(importedResult);
     setStatus(
       "#import-status",
       payload.compare_ready
@@ -283,14 +294,38 @@ async function importResult() {
 }
 
 function buildImportPayload(path,
jsonPayload) { + const payload = {}; if (jsonPayload) { try { - return { result: JSON.parse(jsonPayload) }; + payload.result = JSON.parse(jsonPayload); } catch (error) { throw new Error("JSON payload is not valid JSON."); } + } else { + payload.path = path; + } + const backendOverride = document.querySelector("#import-backend-preset")?.value || ""; + if (backendOverride) { + payload.backend_override = backendOverride; + } + if (selectedJobId) { + payload.job_id = selectedJobId; + } + return payload; +} + +function runOptions() { + return { + backend: document.querySelector("#run-backend")?.value || "onnxruntime", + device: document.querySelector("#run-device")?.value || "cpu", + }; +} + +function rememberImportedResultForSelectedJob(result) { + if (!selectedJobId || !result) { + return; } - return { path }; + importedResultsByJobId[selectedJobId] = result; } async function loadJetsonCommand() { @@ -320,6 +355,36 @@ async function copyJetsonCommand() { } } +async function loadDemoEvidence() { + const button = document.querySelector("#load-demo-evidence"); + button.disabled = true; + setState("#demo-state", "running"); + setStatus("#demo-status", "Loading: importing bundled Runtime evidence...", "loading"); + renderPipeline(); + try { + const payload = await fetchJson("/studio/api/demo-evidence"); + const results = Array.isArray(payload.results) ? 
payload.results : []; + importedResult = results[results.length - 1] || null; + compareData = payload.compare || null; + selectedJobId = payload.job_id || payload.job?.job_id || selectedJobId; + selectedJob = payload.job || selectedJob; + setState("#demo-state", "completed"); + setState("#import-state", "completed"); + setStatus("#demo-status", "Success: demo evidence loaded.", "success"); + setStatus("#import-status", "Success: demo ONNX Runtime + TensorRT evidence imported.", "success"); + renderImportEvidence({ result: importedResult }); + renderImportedResult(); + await loadJobs(selectedJobId); + await loadCompare(); + } catch (error) { + setState("#demo-state", "idle"); + setStatus("#demo-status", `Error: ${formatError(error)}`, "error"); + } finally { + button.disabled = false; + renderPipeline(); + } +} + function renderPipeline() { const target = document.querySelector("#pipeline-flow"); target.replaceChildren(); @@ -338,6 +403,7 @@ function renderPipeline() { card.append(top, title, detail); if (stage.optional) { card.append(createElement("span", "soft-label", "optional")); + card.append(createElement("p", "stage-note", "No guard run is required for local validation.")); } target.append(card); }); @@ -375,9 +441,24 @@ function renderRunPanel() { document.querySelector("#run-button").onclick = runModel; document.querySelector("#import-button").onclick = importResult; document.querySelector("#copy-jetson-command").onclick = copyJetsonCommand; + document.querySelector("#load-demo-evidence").onclick = loadDemoEvidence; setState("#run-state", "idle"); setState("#import-state", "idle"); setState("#jetson-state", "idle"); + setState("#demo-state", "idle"); +} + +function resetTransientInputs() { + ["#run-model-path", "#import-json-path", "#import-json-payload"].forEach((selector) => { + const target = document.querySelector(selector); + if (target) { + target.value = ""; + } + }); + const importPreset = document.querySelector("#import-backend-preset"); + if 
(importPreset) { + importPreset.value = ""; + } } function renderJobList() { @@ -391,7 +472,7 @@ function renderJobList() { return; } - currentJobs.forEach((job) => { + currentJobs.forEach((job, index) => { const row = createElement("button", "job-row"); row.type = "button"; if (selectedJobId === job.job_id) { @@ -401,14 +482,35 @@ function renderJobList() { const main = createElement("span", "job-main"); main.append( - createElement("strong", "", job.job_id || "-"), - createElement("span", "caption", job.updated_at || job.created_at || "-"), + createElement("strong", "", jobDisplayName(job, index)), + createElement("span", "caption", jobCaption(job)), ); row.append(main, createElement("span", `state-pill ${normalizeState(job.status)}`, job.status || "idle")); target.append(row); }); } +function jobDisplayName(job, index) { + if (job.display_name) { + return job.display_name; + } + const input = job.input_summary || {}; + const modelPath = input.model_path || input.artifact_path; + const modelName = modelPath ? modelPath.split("/").pop() : ""; + const prefix = modelName ? `Analyze ${modelName}` : `Analyze job ${index + 1}`; + const options = input.options || {}; + const backend = firstDisplayValue(options.backend); + const device = firstDisplayValue(options.device); + const suffix = backend || device ? 
` (${[backend, device].filter(Boolean).join("/")})` : ""; + return `${prefix}${suffix}`; +} + +function jobCaption(job) { + const timestamp = job.updated_at || job.created_at || "-"; + const jobId = job.job_id || "-"; + return `${jobId} · ${timestamp}`; +} + function renderJobDetail(emptyMessage = "Select a job or import a Runtime result.") { const target = document.querySelector("#job-detail"); const selectedStatus = document.querySelector("#selected-job-status"); @@ -425,30 +527,44 @@ function renderJobDetail(emptyMessage = "Select a job or import a Runtime result const result = selectedJob.result || {}; const runtimeResult = extractRuntimeResult(selectedJob); + const importedForJob = importedResultsByJobId[selectedJob.job_id] || {}; + const displayResult = hasRuntimeMetrics(runtimeResult) ? runtimeResult : importedForJob; const compareMetrics = result.comparison?.result?.metrics || result.data?.result?.metrics || {}; const input = selectedJob.input_summary || {}; + const inputOptions = input.options || {}; + const hasMetrics = hasRuntimeMetrics(displayResult); - const metrics = [ - ["model", runtimeModelName(runtimeResult) || input.model_path || input.artifact_path], - ["backend", runtimeBackendName(runtimeResult)], - ["device", runtimeDeviceName(runtimeResult)], - ["mean", runtimeResult.mean_ms ?? compareMetrics.mean_ms?.new], - ["p99", runtimeResult.p99_ms ?? compareMetrics.p99_ms?.new], - ["fps", runtimeResult.fps || runtimeResult.fps_value], - ["compare_key", runtimeResult.compare_key], - ["backend_key", runtimeResult.backend_key], + const identityMetrics = [ + ["model", runtimeModelName(displayResult) || input.model_path || input.artifact_path], + ["backend", runtimeBackendName(displayResult) || inputOptions.backend], + ["device", runtimeDeviceName(displayResult) || inputOptions.device], + ]; + const resultMetrics = [ + ["mean", displayResult.mean_ms ?? compareMetrics.mean_ms?.new], + ["p99", displayResult.p99_ms ?? 
compareMetrics.p99_ms?.new], + ["fps", displayResult.fps || displayResult.fps_value], + ["compare_key", displayResult.compare_key], + ["backend_key", displayResult.backend_key || normalizedBackendKey(displayResult)], ]; + const metrics = hasMetrics ? identityMetrics.concat(resultMetrics) : identityMetrics; metrics.forEach(([label, value]) => { target.append(metricTile(label, formatValue(value))); }); const status = String(selectedJob.status || "").toLowerCase(); - if (!selectedJob.result && status === "queued") { + if (!selectedJob.result && status === "queued" && !hasRuntimeMetrics(displayResult)) { target.append( detailNote( "Queued job", - "The local API accepted this analyze job. Runtime metrics will appear after a worker/dev completion flow or after importing Runtime result JSON.", + "This is a request record only. Runtime metrics are not attached to this job yet; use Import or Load Demo Evidence to inspect actual validation evidence.", + ), + ); + } else if (!selectedJob.result && hasRuntimeMetrics(displayResult)) { + target.append( + detailNote( + "Imported evidence linked in Studio", + "This queued analyze job is showing the latest Runtime result imported while the job was selected. 
The backend contract remains local and in-memory.", ), ); } else if (selectedJob.error) { @@ -507,8 +623,8 @@ function renderCompare() { const sameBackend = normalizedBackendKey(base) && normalizedBackendKey(base) === normalizedBackendKey(newer); target.append( - compareMetricCard("TensorRT", tensorRt?.mean_ms, normalizedBackendKey(tensorRt) || "tensorrt"), - compareMetricCard("ONNX Runtime", onnx?.mean_ms, normalizedBackendKey(onnx) || "onnxruntime"), + compareMetricCard("TensorRT", tensorRt, normalizedBackendKey(tensorRt) || "tensorrt"), + compareMetricCard("ONNX Runtime", onnx, normalizedBackendKey(onnx) || "onnxruntime"), compareSummaryCard(meanMetric, speedup, base, newer, sameBackend, judgement.overall), ); } @@ -522,7 +638,8 @@ function renderDecision(decision) { target.append( createElement("p", "caption", "Decision"), createElement("h3", "", "UNKNOWN"), - createElement("p", "body-text", "No deployment decision is available yet."), + createElement("p", "body-text", "No Lab comparison decision is available yet."), + createElement("p", "caption", "Load demo evidence or import two compatible Runtime results. AIGuard remains optional."), ); return; } @@ -533,7 +650,7 @@ function renderDecision(decision) { createElement("p", "caption", "Decision"), createElement("h3", "", decisionName.toUpperCase()), createElement("p", "body-text", decisionReason(decision)), - createElement("p", "caption", decision.notes || decision.recommended_action || ""), + createElement("p", "caption", decisionNotes(decision)), ); } @@ -554,16 +671,33 @@ function detailNote(title, message) { return note; } -function compareMetricCard(label, meanMs, backendKey) { +function compareMetricCard(label, result, backendKey) { const card = createElement("article", "compare-card"); + const meanMs = result?.mean_ms; card.append( createElement("p", "caption", backendKey), createElement("h3", "", label), createElement("strong", "compare-value", meanMs === undefined || meanMs === null ? 
"-" : `${formatNumber(meanMs)} ms`), + compareStatList(result), ); return card; } +function compareStatList(result = {}) { + const list = createElement("div", "compare-stat-list"); + list.append( + compareStat("p99", result?.p99_ms === undefined ? "-" : `${formatNumber(result.p99_ms)} ms`), + compareStat("fps", result?.fps_value ?? result?.fps ?? "-"), + ); + return list; +} + +function compareStat(label, value) { + const row = createElement("div", "compare-stat"); + row.append(createElement("span", "", label), createElement("strong", "", formatValue(value))); + return row; +} + function compareSummaryCard(metric, speedup, base, newer, sameBackend = false, overall = "unknown") { const card = createElement("article", `compare-card highlight ${compareTone(overall)}`); const diff = formatLatencyDiff(metric); @@ -608,11 +742,19 @@ function extractDecision(payload) { function decisionReason(decision) { const decisionName = String(decision?.decision || "unknown").toLowerCase(); if (decisionName === "unknown" && !decision?.guard_status) { - return "AIGuard evidence not provided."; + return "Lab comparison is available, but AIGuard diagnosis evidence was not loaded for this local demo."; } return decision?.reason || "-"; } +function decisionNotes(decision) { + const decisionName = String(decision?.decision || "unknown").toLowerCase(); + if (decisionName === "unknown" && !decision?.guard_status) { + return "This is expected: AIGuard is optional and only needed when guard-backed diagnosis evidence is part of the review."; + } + return decision?.notes || decision?.recommended_action || ""; +} + function extractRuntimeResult(job) { const result = job?.result; if (!result) { @@ -628,6 +770,18 @@ function extractRuntimeResult(job) { ); } +function hasRuntimeMetrics(result = {}) { + return Boolean( + result && + (result.mean_ms !== undefined || + result.p99_ms !== undefined || + result.fps !== undefined || + result.fps_value !== undefined || + result.compare_key || + 
normalizedBackendKey(result)), + ); +} + function pipelineStatus() { const anyRunning = currentJobs.some((job) => job.status === "queued" || job.status === "running"); const anyCompleted = currentJobs.some((job) => job.status === "completed") || Boolean(importedResult); @@ -800,6 +954,8 @@ function formatValue(value) { async function initLocalStudio() { try { + markFileMode(); + resetTransientInputs(); renderRunPanel(); renderPipeline(); renderJobDetail(); diff --git a/inferedgelab/studio/static/index.html b/inferedgelab/studio/static/index.html index 882bd4f..495feba 100644 --- a/inferedgelab/studio/static/index.html +++ b/inferedgelab/studio/static/index.html @@ -32,7 +32,6 @@ } .studio-header, - .section-heading, .card-header, .panel-title-row { display: flex; @@ -40,6 +39,12 @@ gap: 16px; } + .section-heading { + display: flex; + justify-content: flex-start; + gap: 14px; + } + .studio-header, .section-heading { align-items: start; @@ -132,7 +137,8 @@ } } - + +
@@ -145,6 +151,11 @@

InferEdge Local Studio

SaaS-ready (local mode)
+
+ Open Local Studio through the local server.
+

Use http://127.0.0.1:8000/studio after running inferedgelab serve. Direct file mode can preview layout only; Run, Import, Compare, and Jetson helpers require the local API.

+
+

01

@@ -175,7 +186,23 @@

Create analyze job

-
+
+
+
+
+
+
+
+
+
+
+

@@ -191,11 +218,18 @@

Runtime result JSON

-
+
+
+
@@ -219,6 +253,21 @@

Runtime command

+ +
+
+
+

Demo Evidence

+

Replay validation evidence

+
+ idle +
+

Load the bundled ONNX Runtime CPU and TensorRT Jetson results to reproduce Compare and Decision locally.

+
+ +

+
+
@@ -265,14 +314,14 @@

Compare View

05

Deployment Decision

-

Lab remains the final decision owner; AIGuard evidence is optional.

+

Lab's local gate for deploy, review, or block. AIGuard evidence is optional and not required for this Studio flow.

-
+

Later

Future Work

@@ -282,6 +331,7 @@

Future Work

-
+
+
diff --git a/inferedgelab/studio/static/style.css b/inferedgelab/studio/static/style.css
index a602fb1..4f6e681 100644
--- a/inferedgelab/studio/static/style.css
+++ b/inferedgelab/studio/static/style.css
@@ -47,6 +47,31 @@ body {
   margin-bottom: 28px;
 }
 
+.file-protocol-warning {
+  display: none;
+  border: 1px solid rgba(234, 179, 8, 0.38);
+  border-radius: 12px;
+  background: rgba(234, 179, 8, 0.08);
+  color: var(--ink);
+  margin-bottom: 18px;
+  padding: 14px 16px;
+}
+
+.file-protocol-warning strong {
+  display: block;
+  margin-bottom: 6px;
+}
+
+.file-protocol-warning p {
+  color: var(--muted);
+  line-height: 1.5;
+  margin: 0;
+}
+
+body.file-mode .file-protocol-warning {
+  display: block;
+}
+
 .title-block h1,
 .section-heading h2,
 .panel h3,
@@ -103,10 +128,39 @@ body {
 .section-heading {
   display: flex;
   align-items: start;
+  justify-content: flex-start;
   gap: 14px;
   margin-bottom: 14px;
 }
 
+.future-heading {
+  display: grid;
+  grid-template-columns: auto minmax(0, 1fr);
+  align-items: start;
+  gap: 16px;
+}
+
+.future-heading .section-kicker {
+  width: auto;
+  min-width: 62px;
+  padding-inline: 12px;
+}
+
+.future-heading h2 {
+  margin: 0;
+  font-size: 1.2rem;
+}
+
+.future-heading p:last-child {
+  margin: 6px 0 0;
+  color: var(--muted);
+  line-height: 1.5;
+}
+
+.section-heading > div {
+  min-width: 0;
+}
+
 .section-heading h2 {
   font-size: 1.28rem;
 }
@@ -119,6 +173,7 @@ body {
 
 .section-kicker {
   display: inline-grid;
+  flex: 0 0 auto;
   width: 38px;
   height: 28px;
   place-items: center;
@@ -165,7 +220,26 @@ body {
   transition:
     transform 160ms ease,
     border-color 160ms ease,
-    background 160ms ease;
+    background 160ms ease;
+}
+
+.demo-card {
+  display: grid;
+  grid-column: 1 / -1;
+  grid-template-columns: minmax(0, 1fr) minmax(240px, 320px);
+  align-items: center;
+  gap: 16px;
+}
+
+.demo-card .card-header,
+.demo-card .body-text {
+  grid-column: 1;
+}
+
+.demo-card .form-stack {
+  grid-column: 2;
+  grid-row: 1 / span 2;
+  margin-top: 0;
 }
 
 .panel:hover,
@@ -227,6 +301,13 @@ body {
   line-height: 1.5;
 }
 
+.stage-note {
+  margin-top: 10px;
+  color: var(--caption);
+  font-size: 0.78rem;
+  line-height: 1.45;
+}
+
 .state-pill {
   display: inline-flex;
   align-items: center;
@@ -280,7 +361,8 @@ body {
 }
 
 .form-stack input,
-.form-stack textarea {
+.form-stack textarea,
+.form-stack select {
   width: 100%;
   border: 1px solid var(--line-strong);
   border-radius: 10px;
@@ -292,10 +374,30 @@ body {
 }
 
 .form-stack input:focus,
-.form-stack textarea:focus {
+.form-stack textarea:focus,
+.form-stack select:focus {
   border-color: var(--primary);
 }
 
+.form-stack select {
+  appearance: none;
+  background:
+    linear-gradient(45deg, transparent 50%, var(--primary) 50%) calc(100% - 18px) 50% / 6px 6px no-repeat,
+    linear-gradient(135deg, var(--primary) 50%, transparent 50%) calc(100% - 13px) 50% / 6px 6px no-repeat,
+    #0b1220;
+}
+
+.inline-fields {
+  display: grid;
+  grid-template-columns: repeat(2, minmax(0, 1fr));
+  gap: 10px;
+}
+
+.inline-fields > div {
+  display: grid;
+  gap: 8px;
+}
+
 .form-stack textarea {
   min-height: 146px;
   resize: vertical;
@@ -502,6 +604,28 @@ body {
   line-height: 1;
 }
 
+.compare-stat-list {
+  display: grid;
+  gap: 8px;
+  margin-top: 14px;
+}
+
+.compare-stat {
+  display: flex;
+  justify-content: space-between;
+  gap: 12px;
+  border-top: 1px solid rgba(148, 163, 184, 0.14);
+  color: var(--muted);
+  font-size: 0.86rem;
+  padding-top: 8px;
+}
+
+.compare-stat strong {
+  color: var(--ink);
+  font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono",
+    monospace;
+}
+
 .decision-card {
   min-height: 156px;
 }
@@ -542,6 +666,23 @@ body {
     min-height: auto;
   }
 
+  .demo-card,
+  .demo-card .card-header,
+  .demo-card .body-text,
+  .demo-card .form-stack {
+    display: grid;
+    grid-column: auto;
+    grid-row: auto;
+  }
+
+  .demo-card {
+    grid-template-columns: 1fr;
+  }
+
+  .demo-card .form-stack {
+    margin-top: 0;
+  }
+
   .pipeline-card::after {
     content: "";
   }
@@ -565,4 +706,9 @@ body {
   .evidence-summary {
     grid-template-columns: 1fr;
   }
+
+  .inline-fields,
+  .future-heading {
+    grid-template-columns: 1fr;
+  }
 }
diff --git a/tests/test_studio_routes.py b/tests/test_studio_routes.py
index cf65750..625bb3b 100644
--- a/tests/test_studio_routes.py
+++ b/tests/test_studio_routes.py
@@ -4,6 +4,7 @@
 from types import SimpleNamespace
 
 from fastapi.responses import FileResponse
+from fastapi.responses import RedirectResponse
 
 import inferedgelab.api as api
@@ -59,10 +60,22 @@ def test_studio_route_returns_local_studio_html():
     assert "Import" in html
     assert "Jetson Helper" in html
     assert 'data-critical="studio-dark"' in html
-    assert 'href="/studio/static/style.css?v=9"' in html
-    assert 'src="/studio/static/app.js?v=9"' in html
-    assert 'value="results/latest.json"' in html
+    assert 'href="/studio/static/style.css?v=15"' in html
+    assert 'href="style.css?v=15"' in html
+    assert 'src="/studio/static/app.js?v=15"' in html
+    assert 'src="app.js?v=15"' in html
+    assert "file-protocol-warning" in html
+    assert 'placeholder="results/latest.json"' in html
+    assert 'value="results/latest.json"' not in html
     assert 'id="import-json-payload"' in html
+    assert 'autocomplete="off"' in html
+    assert 'id="run-backend"' in html
+    assert 'id="run-device"' in html
+    assert 'id="import-backend-preset"' in html
+    assert "TensorRT / Jetson" in html
+    assert "Lab's local gate" in html
+    assert "Load Demo Evidence" in html
+    assert 'id="demo-state"' in html
 
 
 def test_studio_static_assets_are_served():
@@ -97,21 +110,38 @@ def test_studio_static_assets_include_redesigned_ui_contracts():
     assert "DOMContentLoaded" in app_text
     assert "Open Studio from http://127.0.0.1:8000/studio" in app_text
     assert "responseErrorMessage" in app_text
+    assert "markFileMode" in app_text
     assert "parseJsonResponse" in app_text
     assert "renderImportEvidence" in app_text
-    assert "AIGuard evidence not provided" in app_text
+    assert "AIGuard diagnosis evidence was not loaded" in app_text
     assert "compareTone" in app_text
     assert "runtimeModelName" in app_text
     assert "Same backend" in app_text
     assert "hasImportedEvidence" in app_text
+    assert "importedResultsByJobId" in app_text
+    assert "rememberImportedResultForSelectedJob" in app_text
+    assert "runOptions" in app_text
+    assert "resetTransientInputs" in app_text
+    assert "No guard run is required" in app_text
+    assert "decisionNotes" in app_text
+    assert "request record only" in app_text
+    assert "loadDemoEvidence" in app_text
+    assert "/studio/api/demo-evidence" in app_text
+    assert "jobDisplayName" in app_text
+    assert "jobCaption" in app_text
+    assert "compareStatList" in app_text
     assert 'aiguard: hasGuardEvidence ? "completed" : "optional"' in app_text
     assert "#0b0f14" in style_text
     assert "grid-template-columns" in style_text
     assert ".form-stack button" in style_text
     assert ".tool-card" in style_text
     assert ".state-pill.optional" in style_text
+    assert ".file-protocol-warning" in style_text
     assert ".evidence-summary" in style_text
     assert ".compare-card.improvement" in style_text
+    assert ".demo-card" in style_text
+    assert ".compare-stat-list" in style_text
+    assert "justify-content: flex-start" in style_text
 
 
 def test_studio_app_preserves_selected_job_detail_contract():
@@ -126,8 +156,11 @@ def test_studio_app_preserves_selected_job_detail_contract():
     assert "selectedJobId" in app_text
     assert "loadJobs(payload.job_id)" in app_text
     assert "Queued job" in app_text
-    assert "Runtime metrics will appear" in app_text
+    assert "Runtime metrics are not attached" in app_text
     assert ".detail-note" in style_text
+    assert ".inline-fields" in style_text
+    assert ".future-heading" in style_text
+    assert "min-width: 62px" in style_text
 
 
 def test_studio_jobs_api_returns_json_structure():
@@ -143,6 +176,17 @@ def test_studio_jobs_api_returns_json_structure():
     assert response["jobs"] == []
 
 
+def test_studio_malformed_path_redirects_to_studio():
+    app = api.create_app()
+    route = _get_route(app, "/studio로")
+
+    response = route.endpoint()
+
+    assert isinstance(response, RedirectResponse)
+    assert response.status_code == 307
+    assert response.headers["location"] == "/studio"
+
+
 def test_studio_compare_latest_api_returns_json_structure():
     app = api.create_app()
     route = _get_route(app, "/studio/api/compare/latest")
@@ -163,12 +207,23 @@ def test_studio_run_api_creates_analyze_job():
     route = _get_route(app, "/studio/api/run")
     request = SimpleNamespace(app=app)
 
-    response = route.endpoint(request=request, payload={"model_path": "models/yolov8n.onnx"})
+    response = route.endpoint(
+        request=request,
+        payload={
+            "model_path": "models/yolov8n.onnx",
+            "options": {"backend": "tensorrt", "device": "jetson"},
+        },
+    )
 
     assert response["status"] == "created"
     assert response["source"] == "/api/analyze"
     assert response["job_id"].startswith("job_")
+    assert response["job"]["display_name"] == "Analyze yolov8n.onnx (tensorrt/jetson)"
     assert response["job"]["input_summary"]["model_path"] == "models/yolov8n.onnx"
+    assert response["job"]["input_summary"]["options"] == {
+        "backend": "tensorrt",
+        "device": "jetson",
+    }
 
 
 def test_studio_run_job_can_be_listed_and_selected():
@@ -184,6 +239,7 @@ def test_studio_run_job_can_be_listed_and_selected():
 
     assert jobs["count"] == 1
     assert jobs["jobs"][0]["job_id"] == created["job_id"]
+    assert jobs["jobs"][0]["display_name"] == "Analyze yolov8n.onnx"
     assert detail["job_id"] == created["job_id"]
     assert detail["status"] == "queued"
     assert detail["result"] is None
@@ -204,6 +260,24 @@
     assert response["compare_ready"] is False
 
 
+def test_studio_import_api_applies_backend_override():
+    app = api.create_app()
+    route = _get_route(app, "/studio/api/import")
+    request = SimpleNamespace(app=app)
+    result = _runtime_result(engine="onnxruntime", device="cpu", mean_ms=9.9)
+
+    response = route.endpoint(
+        request=request,
+        payload={"result": result, "backend_override": "tensorrt__jetson"},
+    )
+
+    assert response["status"] == "imported"
+    assert response["result"]["engine"] == "tensorrt"
+    assert response["result"]["engine_backend"] == "tensorrt"
+    assert response["result"]["device"] == "jetson"
+    assert response["result"]["backend_key"] == "tensorrt__jetson"
+
+
 def test_studio_import_api_generates_missing_compare_keys():
     app = api.create_app()
     route = _get_route(app, "/studio/api/import")
@@ -244,6 +318,55 @@ def test_studio_jetson_command_api_returns_command():
     assert "--output results/jetson/" in response["command"]
 
 
+def test_studio_demo_evidence_loads_compare_ready_pair():
+    app = api.create_app()
+    route = _get_route(app, "/studio/api/demo-evidence")
+    compare_route = _get_route(app, "/studio/api/compare/latest")
+    request = SimpleNamespace(app=app)
+
+    response = route.endpoint(request=request)
+    compare = compare_route.endpoint(request=request)
+
+    assert response["status"] == "loaded"
+    assert response["source"] == "examples/studio_demo"
+    assert response["job_id"] == "demo_yolov8n_trt_vs_onnx"
+    assert response["job"]["display_name"] == "Demo: TensorRT vs ONNX Runtime"
+    assert response["job"]["status"] == "completed"
+    assert response["count"] == 2
+    assert response["compare_ready"] is True
+    assert response["results"][0]["backend_key"] == "onnxruntime__cpu"
+    assert response["results"][1]["backend_key"] == "tensorrt__jetson"
+    assert response["results"][0]["mean_ms"] == 45.4299
+    assert response["results"][1]["mean_ms"] == 9.9375
+    assert response["compare"]["status"] == "ok"
+    assert response["compare"]["judgement"]["overall"] == "improvement"
+    assert response["deployment_decision"]["decision"] == "unknown"
+    assert compare["status"] == "ok"
+    assert compare["base"]["backend_key"] == "onnxruntime__cpu"
+    assert compare["new"]["backend_key"] == "tensorrt__jetson"
+
+
+def test_studio_demo_evidence_is_listed_and_selectable_as_job():
+    app = api.create_app()
+    request = SimpleNamespace(app=app)
+    demo_route = _get_route(app, "/studio/api/demo-evidence")
+    jobs_route = _get_route(app, "/studio/api/jobs")
+    detail_route = _get_route(app, "/studio/api/job/{job_id}")
+
+    demo = demo_route.endpoint(request=request)
+    jobs = jobs_route.endpoint(request=request)
+    detail = detail_route.endpoint(request=request, job_id=demo["job_id"])
+
+    assert jobs["count"] == 1
+    assert jobs["jobs"][0]["job_id"] == "demo_yolov8n_trt_vs_onnx"
+    assert jobs["jobs"][0]["display_name"] == "Demo: TensorRT vs ONNX Runtime"
+    assert detail["job_id"] == "demo_yolov8n_trt_vs_onnx"
+    assert detail["status"] == "completed"
+    assert detail["result"]["runtime_result"]["backend_key"] == "tensorrt__jetson"
+    assert detail["result"]["comparison"]["base"]["backend_key"] == "onnxruntime__cpu"
+    assert detail["result"]["comparison"]["new"]["backend_key"] == "tensorrt__jetson"
+
+
 def test_studio_importing_two_compatible_results_returns_compare_data():
     app = api.create_app()
     request = SimpleNamespace(app=app)