diff --git a/README.ko.md b/README.ko.md
index 72089b1..76c4ef1 100644
--- a/README.ko.md
+++ b/README.ko.md
@@ -16,6 +16,9 @@ ONNX model
 -> InferEdgeLab analysis/API/job/deployment_decision
 -> optional InferEdgeAIGuard deterministic diagnosis evidence
 -> deploy / review / blocked decision
+
+Supporting sidecar:
+InferEdgeEnv -> local-first run evidence registry / comparability checker
 ```

 ## Summary
@@ -24,6 +27,7 @@ ONNX model
 - Real device execution: Jetson TensorRT + ONNX Runtime CPU
 - Structured comparison: latency, accuracy, validation evidence
 - Deployment decision: deployable / review / blocked
+- Sidecar evidence registry: InferEdgeEnv는 Lab decision과 분리된 local benchmark evidence와 comparability를 기록
 - Local Studio: inference validation을 브라우저에서 확인하는 local-first workflow UI

 ## What Makes InferEdge Different?
@@ -109,6 +113,9 @@ bash scripts/demo_pipeline_full.sh --run-jetson-command-print
 - **InferEdge-Runtime:** Forge artifact 또는 Lab worker request를 받아 C++ 실행/검증 결과 JSON을 생성합니다.
 - **InferEdgeLab:** 결과를 비교/리포트/API/job/deployment decision으로 정리하는 owner입니다.
 - **InferEdgeAIGuard:** provenance mismatch나 suspicious result를 rule/evidence 기반으로 진단하는 optional evidence layer입니다.
+- **InferEdgeEnv:** Edge AI inference benchmark result를 local artifact와 SQLite registry로 고정하고 비교 가능성을 판정하는 local-first run evidence registry입니다.
+
+포트폴리오 경계: InferEdgeLab은 validation / decision layer이고, InferEdgeEnv는 run evidence registry / comparability layer입니다. InferEdge는 모델이 배포 가능한지 검증하고, InferEdgeEnv는 benchmark evidence가 신뢰 가능하고 비교 가능한 형태로 기록됐는지 관리합니다.

 ## 현재 범위와 future work

diff --git a/README.md b/README.md
index b478feb..7a3a11d 100644
--- a/README.md
+++ b/README.md
@@ -17,6 +17,7 @@ Language: English | [한국어](README.ko.md)
 - Real device execution: Jetson TensorRT + ONNX Runtime CPU
 - Structured comparison: latency, accuracy, and validation evidence
 - Deployment decision: deployable / review / blocked
+- Sidecar evidence registry: InferEdgeEnv records local benchmark evidence and comparability separately from Lab decisions
 - Local Studio: interactive workflow UI for inference validation

 ## What Makes InferEdge Different?
@@ -44,6 +45,9 @@ ONNX model
 -> InferEdgeLab compare / API / job workflow / deployment_decision
 -> optional InferEdgeAIGuard provenance diagnosis
 -> deploy / review / blocked decision
+
+Supporting sidecar:
+InferEdgeEnv -> local-first run evidence registry / comparability checker
 ```

 Repository roles are deliberately split:
@@ -52,6 +56,9 @@ Repository roles are deliberately split:
 - **InferEdgeRuntime:** C++ execution, profiling, result export, and worker response boundary.
 - **InferEdgeLab:** compare/report/API/job workflow and final deployment decision ownership.
 - **InferEdgeAIGuard:** optional rule + evidence based failure and provenance diagnosis.
+- **InferEdgeEnv:** local-first run evidence registry and comparability checker for Edge AI inference benchmark results.
+
+Portfolio boundary: InferEdgeLab is the validation / decision layer. InferEdgeEnv is the run evidence registry / comparability layer. InferEdge validates whether a model is deployable; InferEdgeEnv records whether benchmark evidence can be trusted and compared.

 Implemented today: Lab API response contract, `/api/compare`, `/api/analyze` in-memory jobs, worker request/response mappings, Runtime dry-run validation/export, Forge worker/runtime summary, AIGuard provenance mismatch diagnosis, Lab decision/report evidence smoke coverage, dev-only Lab -> Runtime ONNX Runtime smoke using `yolov8n.onnx`, manual Jetson TensorRT Runtime smoke using a Forge manifest plus TensorRT engine artifact, and Runtime source-model identity preservation for compare-ready TensorRT engine results.

diff --git a/docs/portfolio/inferedge_1page_architecture.md b/docs/portfolio/inferedge_1page_architecture.md
index 8252493..4c6727f 100644
--- a/docs/portfolio/inferedge_1page_architecture.md
+++ b/docs/portfolio/inferedge_1page_architecture.md
@@ -4,6 +4,8 @@

 InferEdge is an end-to-end Edge AI inference validation pipeline that builds deployment artifacts, runs edge inference, compares results, diagnoses provenance issues, and produces deployment decisions.

+Supporting sidecar: InferEdgeEnv is a local-first run evidence registry and comparability checker for Edge AI inference benchmark results.
+
 PDF-ready portfolio draft: [InferEdge Portfolio Submission](inferedge_portfolio_submission.md). Local PDF export uses pandoc + xelatex through `bash scripts/export_portfolio_pdf.sh`.

 ## Problem
@@ -23,6 +25,9 @@ ONNX model
 -> optional InferEdgeAIGuard
 -> rule + evidence provenance diagnosis
 -> deploy / review / blocked decision
+
+Supporting sidecar:
+InferEdgeEnv -> local-first run evidence registry / comparability checker
 ```

 ## Repository Roles
@@ -31,6 +36,9 @@ ONNX model
 - **InferEdgeRuntime:** C++ execution/result export layer. Validates or runs model/artifact inputs, measures runtime latency, exports Lab-compatible result JSON, and dry-run exports worker responses.
 - **InferEdgeLab:** analysis/API/job/deployment decision owner. Compares Runtime results, generates reports, exposes API/job workflow contracts, preserves optional guard evidence, and owns the final `deployment_decision`.
 - **InferEdgeAIGuard:** optional rule + evidence diagnosis layer. Detects provenance/artifact/config mismatches and returns deterministic `guard_analysis` evidence for Lab to consume.
+- **InferEdgeEnv:** run evidence registry / comparability checker. Records benchmark artifacts, SQLite registry entries, evidence bundles, and comparability judgement without owning Lab deployment decisions.
+
+Portfolio boundary: InferEdgeLab is the validation / decision layer. InferEdgeEnv is the run evidence registry / comparability layer.

 ## Implemented Evidence

diff --git a/docs/portfolio/inferedge_pipeline_status.md b/docs/portfolio/inferedge_pipeline_status.md
index 2f8fae2..a2f3f47 100644
--- a/docs/portfolio/inferedge_pipeline_status.md
+++ b/docs/portfolio/inferedge_pipeline_status.md
@@ -6,6 +6,8 @@ This document summarizes the current portfolio status of the InferEdge multi-rep

 InferEdge is an end-to-end edge AI inference validation pipeline. It is designed to turn an ONNX model into deployment evidence by connecting artifact build provenance, runtime profiling, Lab comparison/reporting, optional rule-based diagnosis, and a final deployment decision.

+Supporting sidecar: InferEdgeEnv is the local-first run evidence registry / comparability checker. InferEdgeLab remains the validation / decision layer; InferEdgeEnv records whether benchmark evidence can be trusted and compared.
+
 For a compressed recruiter/interviewer entry point, see [InferEdge 1-Page Architecture Summary](inferedge_1page_architecture.md).

 For the current portfolio completion checkpoint, see [InferEdge Final Validation Completion Pass](final_validation_completion.md).
@@ -23,6 +25,9 @@ ONNX model
 -> InferEdgeLab compare / API / job workflow / deployment_decision
 -> optional InferEdgeAIGuard provenance diagnosis
 -> deploy / review / blocked decision
+
+Supporting sidecar:
+InferEdgeEnv -> local-first run evidence registry / comparability checker
 ```

 The goal is not only to measure latency. The goal is to create reproducible evidence that can support deployment review.
@@ -77,6 +82,18 @@ Current role:
 - diagnoses artifact/source hash mismatch, precision/shape/backend/target mismatch, and missing provenance
 - emits `guard_analysis` that Lab can preserve in report/API bundles and reflect in deployment decisions

+### InferEdgeEnv
+
+Env owns local run evidence registry and comparability judgement.
+
+Current role:
+
+- stores Edge AI benchmark runs as local artifacts and SQLite registry rows
+- preserves result bundles through manifest/checksum based export/import
+- records sampler/resource metadata as supplemental evidence
+- judges same-condition / conditional / no comparability without producing a leaderboard or composite score
+- stays separate from Lab's validation / decision ownership
+
 ## Implemented Connections

 The current cross-repository loop is covered by documentation, fixtures, and smoke tests:
diff --git a/docs/portfolio/inferedge_portfolio_submission.md b/docs/portfolio/inferedge_portfolio_submission.md
index 4078783..42132cb 100644
--- a/docs/portfolio/inferedge_portfolio_submission.md
+++ b/docs/portfolio/inferedge_portfolio_submission.md
@@ -6,6 +6,8 @@ InferEdge는 edge AI 모델을 변환, 실행, 비교, 진단하고 최종 배

 InferEdge is not a benchmarking tool, but an end-to-end validation pipeline that connects artifact provenance, runtime behavior, and deployment decisions.

+InferEdgeEnv complements this pipeline as a local-first run evidence registry and comparability checker. Lab remains the validation / decision layer; Env records whether benchmark evidence can be trusted and compared.
+
 이 프로젝트는 단순 latency benchmark가 아니라 artifact provenance, runtime result compatibility, deployment decision까지 연결한다. 목표는 "빠른 숫자"를 보여주는 것이 아니라, 어떤 모델과 산출물이 어떤 환경에서 실행되었고 그 결과를 배포해도 되는지 검토 가능한 evidence로 남기는 것이다.

 채용 포트폴리오용 5줄 요약:
@@ -26,6 +28,9 @@ ONNX model
 -> InferEdgeLab compare / API / job workflow / deployment_decision
 -> optional InferEdgeAIGuard provenance diagnosis
 -> deploy / review / blocked decision
+
+Supporting sidecar:
+InferEdgeEnv -> local-first run evidence registry / comparability checker
 ```

 ## 2. Problem Statement
@@ -44,16 +49,17 @@ InferEdge는 이 질문들을 CLI, JSON schema, report, API contract, worker bou

 ## 3. System Architecture

-InferEdge는 4개 repository를 하나의 pipeline으로 분리한다.
+InferEdge는 4개 core repository를 하나의 validation/decision pipeline으로 분리하고, InferEdgeEnv를 supporting run evidence sidecar로 둔다.

 ```text
 Forge = build / provenance
 Runtime = C++ execution / result export
 Lab = compare / report / API / deployment decision
 AIGuard = optional rule + evidence diagnosis
+Env = local run evidence registry / comparability checker
 ```

-이 구조의 핵심은 responsibility boundary다. Forge는 artifact를 만들고 provenance를 남긴다. Runtime은 실제 실행과 profiling evidence를 만든다. Lab은 결과를 비교하고 report/API bundle과 deployment decision을 생성한다. AIGuard는 optional evidence로 provenance mismatch나 failure signal을 진단한다.
+이 구조의 핵심은 responsibility boundary다. Forge는 artifact를 만들고 provenance를 남긴다. Runtime은 실제 실행과 profiling evidence를 만든다. Lab은 결과를 비교하고 report/API bundle과 deployment decision을 생성한다. AIGuard는 optional evidence로 provenance mismatch나 failure signal을 진단한다. InferEdgeEnv는 Lab decision과 분리된 local benchmark artifact, registry row, evidence bundle, comparability judgement를 관리한다.

 ## 4. Repository Roles

@@ -65,6 +71,7 @@ AIGuard = optional rule + evidence diagnosis
 | InferEdge-Runtime | C++ runtime execution and result export layer for ONNX Runtime/TensorRT edge inference validation. |
 | InferEdgeLab | Analysis/API layer for end-to-end Edge AI inference validation, reports, jobs, and deployment decisions. |
 | InferEdgeAIGuard | Optional deterministic diagnosis layer for provenance mismatch and suspicious inference result evidence. |
+| InferEdgeEnv | Local-first run evidence registry and comparability checker for Edge AI inference benchmark results. |

 **InferEdgeForge**
 Build/provenance layer. ONNX 모델을 TensorRT/RKNN 등 edge deployment artifact로 변환하고, `metadata.json`, `manifest.json`, `worker_runtime_summary`로 source hash, artifact hash, backend, target, precision, shape, preset 정보를 보존한다.
@@ -78,6 +85,9 @@ Analysis/API/job/deployment decision owner. Runtime result JSON을 비교하고
 **InferEdgeAIGuard**
 Rule + evidence diagnosis layer. Forge summary, Runtime worker_response, Lab result를 기반으로 artifact/source hash mismatch, backend/target/precision/shape mismatch, insufficient provenance 등을 deterministic detector로 진단한다. AIGuard는 LLM 추측이 아니라 rule + evidence 기반 detector 구조다.

+**InferEdgeEnv**
+Run evidence registry / comparability checker. Edge AI inference benchmark result를 local artifact와 SQLite registry로 고정하고, same-condition / conditional / no comparability judgement를 제공한다. Env는 deployment decision을 소유하지 않으며, Lab의 validation / decision layer와 분리된 evidence portability boundary다.
+
 ## 5. Key Implemented Features

 - Lab API response contract
@@ -94,7 +104,8 @@ Rule + evidence diagnosis layer. Forge summary, Runtime worker_response, Lab res
 - AIGuard worker provenance mismatch diagnosis
 - AIGuard guard_analysis preservation in Lab deployment decision/report smoke
 - Local Studio browser workflow for Run, Import, Jetson command helper, demo evidence replay, Compare View, and Lab-owned Deployment Decision inspection
-- 4개 repository README pipeline summary sync
+- InferEdgeEnv run artifact bundle, SQLite registry, export/import, sampler metadata, resource lookup, and comparability-first report UX
+- Core repository README pipeline summary sync plus InferEdgeEnv sidecar positioning

 ## 6. Validation Evidence

diff --git a/docs/portfolio/inferedge_resume_interview_summary.md b/docs/portfolio/inferedge_resume_interview_summary.md
index fe85a07..936029f 100644
--- a/docs/portfolio/inferedge_resume_interview_summary.md
+++ b/docs/portfolio/inferedge_resume_interview_summary.md
@@ -3,6 +3,7 @@
 ## Final Resume Bullets

 - Built InferEdge, an end-to-end Edge AI inference validation pipeline that connects Forge build provenance, C++ Runtime execution, Lab comparison/report/API/job workflows, optional AIGuard diagnosis evidence, and Lab-owned deployment decisions.
+- Positioned InferEdgeEnv as the supporting local-first run evidence registry / comparability checker, separate from InferEdgeLab's validation / decision ownership.
 - Validated real execution paths on both macOS and edge hardware: `yolov8n.onnx` through Lab -> C++ Runtime -> ONNX Runtime CPU -> Lab job result ingestion, and Jetson Orin Nano TensorRT execution through Forge manifest + `model.engine` + C++ Runtime CLI.
 - Documented Jetson TensorRT FP16 evidence with 25W mean `10.066401 ms`, p99 `15.548438 ms`, FPS `99.340373`, plus 15W power-mode comparison evidence.
 - Added Local Studio as a local-first browser workflow UI that can replay bundled ONNX Runtime CPU and TensorRT Jetson demo evidence, showing 45.4299 ms vs 10.066401 ms mean latency and about a 4.51x TensorRT speedup without claiming production SaaS readiness.
@@ -24,7 +25,7 @@ Built the Lab-side orchestration and contract foundation for an edge AI validati

 ## Interview: First 30 Seconds

-InferEdge는 단순 benchmark tool이 아니라 edge AI 모델의 build provenance, 실제 Runtime execution, 비교/report, optional diagnosis evidence, deployment decision을 연결하는 end-to-end validation pipeline입니다. 저는 macOS에서 `yolov8n.onnx`를 Lab -> C++ Runtime -> ONNX Runtime CPU -> Lab job result로 검증했고, Jetson Orin Nano에서는 Forge manifest와 TensorRT `model.engine`를 C++ Runtime CLI로 실행해 FP16 25W mean 10.066401 ms, p99 15.548438 ms, FPS 99.340373의 evidence를 확보했습니다. 최근에는 Runtime이 manifest source model identity를 보존하도록 보완해, engine artifact도 source model 기반 `compare_key`를 유지할 수 있게 했습니다.
+InferEdge는 단순 benchmark tool이 아니라 edge AI 모델의 build provenance, 실제 Runtime execution, 비교/report, optional diagnosis evidence, deployment decision을 연결하는 end-to-end validation pipeline입니다. InferEdgeLab은 validation / decision layer이고, InferEdgeEnv는 run evidence registry / comparability layer입니다. 저는 macOS에서 `yolov8n.onnx`를 Lab -> C++ Runtime -> ONNX Runtime CPU -> Lab job result로 검증했고, Jetson Orin Nano에서는 Forge manifest와 TensorRT `model.engine`를 C++ Runtime CLI로 실행해 FP16 25W mean 10.066401 ms, p99 15.548438 ms, FPS 99.340373의 evidence를 확보했습니다.

 ## Interview: What Actually Works?

@@ -38,16 +39,19 @@ InferEdge는 단순 benchmark tool이 아니라 edge AI 모델의 build provenan

 - Built InferEdge, an end-to-end Edge AI Inference Validation Pipeline that connects model artifact provenance, real runtime execution, result analysis, diagnosis evidence, and deployment decisions.
 - Implemented InferEdgeLab as the analysis layer that turns Runtime benchmark outputs into comparison reports, API responses, async job results, and Lab-owned deployment decisions.
+- Positioned InferEdgeEnv as a local-first run evidence registry / comparability checker that records benchmark evidence separately from Lab deployment decisions.
 - Aligned Forge, Runtime, Lab, and AIGuard through JSON contracts: Forge for build/provenance, Runtime for C++ execution/result export, Lab for analysis/API/job/decision, and AIGuard for optional deterministic diagnosis evidence.
 - Validated two runtime evidence paths: Lab -> C++ Runtime -> ONNX Runtime CPU -> Lab job result ingestion on macOS with `yolov8n.onnx`, and Jetson Orin Nano TensorRT execution using a Forge manifest plus TensorRT engine artifact.
 - Current scope is a portfolio-grade SaaS API/job/worker contract foundation with dev-only Runtime execution smoke; production worker daemon, persistent queue/database, upload flow, frontend, auth, and billing remain future work.

 ## 2. Resume Project Entry: Detailed Version

-InferEdge is an end-to-end Edge AI Inference Validation Pipeline built across four repositories. The system is designed to answer a deployment-oriented question: not just "how fast did this model run?", but "which artifact was built, how was it executed, what evidence was produced, and is it safe to deploy?"
+InferEdge is an end-to-end Edge AI Inference Validation Pipeline built across four core repositories, with InferEdgeEnv as a supporting run evidence registry / comparability sidecar. The system is designed to answer a deployment-oriented question: not just "how fast did this model run?", but "which artifact was built, how was it executed, what evidence was produced, can that evidence be compared, and is it safe to deploy?"

 InferEdgeForge owns build artifact provenance. It records metadata and manifests such as source model hash, artifact hash, backend, target, precision, shape, preset, and build id. InferEdgeRuntime owns the C++ execution and result export boundary. It validates Lab worker requests, exports Lab-compatible Runtime results, and supports dry-run worker response export. InferEdgeLab owns comparison, reporting, SaaS API/job contracts, and the final deployment decision. InferEdgeAIGuard is optional and provides deterministic rule/evidence diagnosis for provenance or artifact mismatch cases.

+InferEdgeEnv is intentionally separate from Lab's validation / decision layer. Env records local benchmark artifacts, SQLite registry entries, export/import evidence bundles, sampler/resource metadata, and same-condition / conditional / no comparability judgement so benchmark evidence can be trusted and moved without turning it into a ranking system.
+
 The Lab side includes `/api/compare`, `/api/analyze`, in-memory job stubs, worker request/response mapping, API response contracts, deployment decision bundles, and report evidence preservation. A recent manual smoke validated a real dev-only Runtime execution path using `yolov8n.onnx`: Lab created an analyze job, invoked the C++ Runtime CLI through subprocess, ONNX Runtime CPU executed the model, and the result JSON was ingested back into the Lab job result. The smoke completed successfully with mean latency about 47.97 ms, p95/p99 about 51.80 ms, and about 20.85 FPS.

 I also validated a Jetson Orin Nano TensorRT Runtime smoke. On Linux `5.15.148-tegra` / `aarch64`, the C++ Runtime CLI in `~/InferEdge-Runtime` executed a Forge-generated manifest and TensorRT engine artifact from `yolov8n__jetson__tensorrt__jetson_fp16`. The current Jetson Evidence Track records FP16 25W at mean `10.066401 ms`, p99 `15.548438 ms`, FPS `99.340373`, and FP16 15W at mean `10.799106 ms`, p99 `15.529218 ms`, FPS `92.600262`. Runtime also preserves the Forge manifest source model identity for compare naming, so a `model.engine` artifact can keep `compare_model_name=yolov8n` and source-model-based `compare_key`.
@@ -62,7 +66,7 @@ InferEdge is my end-to-end Edge AI inference validation pipeline. It does more t

 InferEdge started from a simple edge inference benchmarking problem, but I expanded it into an end-to-end validation pipeline. In edge AI, a raw latency number is not enough. You need to know which model produced which artifact, what backend and precision were used, whether the runtime result is compatible with the analysis layer, and whether the result should be deployed, reviewed, or blocked.

-I split the system into four repositories with clear responsibilities. InferEdgeForge is the build/provenance layer. It records metadata, manifests, hashes, precision, target, shape, preset, and build id. InferEdgeRuntime is the C++ execution/result export layer. It validates worker request payloads and produces Lab-compatible runtime result or worker response JSON. InferEdgeLab is the analysis and decision owner. It provides compare/report flows, SaaS API response contracts, in-memory job workflow stubs, worker request/response mapping, and deployment decision output. InferEdgeAIGuard remains optional and performs rule/evidence based diagnosis, such as artifact or provenance mismatch detection.
+I split the system into four core repositories with clear responsibilities, plus InferEdgeEnv as a supporting evidence sidecar. InferEdgeForge is the build/provenance layer. It records metadata, manifests, hashes, precision, target, shape, preset, and build id. InferEdgeRuntime is the C++ execution/result export layer. It validates worker request payloads and produces Lab-compatible runtime result or worker response JSON. InferEdgeLab is the validation / decision owner. It provides compare/report flows, SaaS API response contracts, in-memory job workflow stubs, worker request/response mapping, and deployment decision output. InferEdgeAIGuard remains optional and performs rule/evidence based diagnosis, such as artifact or provenance mismatch detection. InferEdgeEnv records whether benchmark evidence can be trusted and compared.

 The important recent validation is that this is no longer only contract-level documentation. I ran a manual dev-only smoke using `yolov8n.onnx`: `/api/analyze` created a Lab job, `/api/jobs/{job_id}/run-runtime-dev` invoked the C++ Runtime CLI through subprocess, ONNX Runtime CPU executed the model, and the Runtime JSON was ingested back into the Lab job result. The result completed successfully, with mean latency about 47.97 ms, p95/p99 about 51.80 ms, and about 20.85 FPS. The deployment decision is `unknown` at that direct execution stage because the result has not yet gone through Lab compare/report, which is expected behavior.

@@ -116,6 +120,8 @@ So the accurate description is: InferEdgeLab has a SaaS/API/job contract foundat

 **InferEdgeRuntime** is the C++ execution/result export layer. It validates or executes model/artifact inputs and emits Lab-compatible runtime result JSON or worker response payloads.

-**InferEdgeLab** is the analysis, API/job workflow, report, and deployment decision owner. It consumes Runtime results, compares them, creates reports and API bundles, tracks job state, preserves optional diagnosis evidence, and owns the final deployment decision.
+**InferEdgeLab** is the validation / decision layer and owns analysis, API/job workflow, reports, and deployment decisions. It consumes Runtime results, compares them, creates reports and API bundles, tracks job state, preserves optional diagnosis evidence, and owns the final deployment decision.

 **InferEdgeAIGuard** is the optional deterministic diagnosis layer. It is not an LLM guessing system. It uses rule/evidence based detectors to identify artifact mismatch, source model mismatch, precision/shape/backend mismatch, and insufficient provenance evidence. Lab can consume its `guard_analysis`, but Lab remains the final decision owner.
+
+**InferEdgeEnv** is the run evidence registry / comparability layer. It stores Edge AI inference benchmark runs as local artifacts plus SQLite registry rows and judges whether two runs are same-condition comparable, conditionally comparable, or not comparable.
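
Review note: the docs in this diff describe InferEdgeEnv as storing runs in a SQLite registry and judging them as same-condition, conditional, or not comparable, but the diff does not show the schema or the rules. The sketch below is one possible reading, not InferEdgeEnv's actual implementation. The `RunRecord` fields, the `runs` table layout, the function names, and the split into identity fields versus condition fields are all assumptions made for illustration. Only the model name, backend, precision, power modes, and mean latencies are taken from the Jetson evidence quoted above.

```python
import sqlite3
from dataclasses import astuple, dataclass


# Hypothetical record layout; the real InferEdgeEnv schema is not shown in this diff.
@dataclass(frozen=True)
class RunRecord:
    run_id: str
    model: str
    backend: str
    precision: str
    device: str
    power_mode: str
    mean_ms: float


# Assumption: runs of different models are never comparable.
IDENTITY_FIELDS = ("model",)
# Assumption: a difference in execution conditions makes a comparison conditional.
CONDITION_FIELDS = ("backend", "precision", "device", "power_mode")


def judge_comparability(a, b):
    """Return (judgement, differing_fields) for two runs. No score, no ranking."""
    identity_diffs = [f for f in IDENTITY_FIELDS if getattr(a, f) != getattr(b, f)]
    if identity_diffs:
        return "not_comparable", identity_diffs
    condition_diffs = [f for f in CONDITION_FIELDS if getattr(a, f) != getattr(b, f)]
    if condition_diffs:
        return "conditional", condition_diffs
    return "same_condition", []


def open_registry(path=":memory:"):
    """Open (or create) a local SQLite registry with one row per run."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "run_id TEXT PRIMARY KEY, model TEXT, backend TEXT, precision TEXT, "
        "device TEXT, power_mode TEXT, mean_ms REAL)"
    )
    return conn


def register_run(conn, run):
    """Insert one run; a duplicate run_id raises sqlite3.IntegrityError."""
    conn.execute("INSERT INTO runs VALUES (?, ?, ?, ?, ?, ?, ?)", astuple(run))
    conn.commit()


def load_run(conn, run_id):
    """Fetch one run by id and rebuild the record."""
    row = conn.execute(
        "SELECT run_id, model, backend, precision, device, power_mode, mean_ms "
        "FROM runs WHERE run_id = ?",
        (run_id,),
    ).fetchone()
    if row is None:
        raise KeyError(run_id)
    return RunRecord(*row)


if __name__ == "__main__":
    conn = open_registry()
    # Run ids and device label are illustrative; latencies come from the docs above.
    register_run(conn, RunRecord("jetson-25w", "yolov8n", "tensorrt", "fp16",
                                 "jetson_orin_nano", "25W", 10.066401))
    register_run(conn, RunRecord("jetson-15w", "yolov8n", "tensorrt", "fp16",
                                 "jetson_orin_nano", "15W", 10.799106))
    a, b = load_run(conn, "jetson-25w"), load_run(conn, "jetson-15w")
    print(judge_comparability(a, b))  # ('conditional', ['power_mode'])
```

Returning the differing fields alongside the judgement keeps the output explanatory rather than ranked, which matches the stated goal of avoiding a leaderboard or composite score. A real checker would likely consider more conditions (input shape, runtime version, sampler settings) than this sketch does.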