-
Notifications
You must be signed in to change notification settings - Fork 0
feat(examples): living app (Arrow-first) #31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughAdds a new living-app example (TS ↔ Python) with profiling/drift logic, CI job and npm scripts, runtime serialization fallbacks, generator/mapper/config loader tweaks, and a TypeScript config-loading test. Changes
Sequence Diagram(s)sequenceDiagram
participant TS as Node/TypeScript (index.ts)
participant NB as NodeBridge runtime
participant PY as Python (living_app.app)
TS->>NB: initialize bridge (venv, timeout)
NB->>PY: import `living_app.app`
TS->>PY: write_synthetic_events_csv(baseline, rows, seed)
PY-->>TS: baseline.csv path
TS->>PY: write_synthetic_events_csv(current, rows, drift)
PY-->>TS: current.csv path
TS->>PY: profile_csv(baseline.csv, ProfileConfig)
PY-->>TS: DatasetProfile
TS->>PY: profile_csv(current.csv, ProfileConfig)
PY-->>TS: DatasetProfile
TS->>PY: drift_report(baseline.csv, current.csv, DriftConfig)
PY-->>TS: DriftReport
TS->>PY: top_users_by_spend(current.csv, top_n)
PY-->>TS: topUsers
TS->>NB: dispose bridge / clear decoders
NB-->>TS: cleanup complete
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
📜 Recent review detailsConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used🧬 Code graph analysis (1)src/config/index.ts (1)
🪛 GitHub Check: lintsrc/config/index.ts[warning] 317-317: 🔇 Additional comments (5)
✏️ Tip: You can disable this entire section by setting Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/core/mapper.ts (1)
347-365: Edge cases innormalizeCustomTypefor malformed dotted names.The logic handles well-formed dotted names like
"module.ClassName", but edge cases may produce unexpected results:
"foo."→{ name: "", module: "foo" }(empty name)".foo"→{ name: "foo", module: "" }(empty string module instead ofundefined)"."→{ name: "", module: "" }(both empty)If malformed inputs are possible, consider adding validation or normalizing empty strings to
undefined.🔧 Optional defensive fix
private normalizeCustomType(type: { name: string; module?: string }): { name: string; module?: string; rawName: string; } { const rawName = type.name; if (type.module) { return { name: type.name, module: type.module, rawName }; } if (rawName.includes('.')) { const parts = rawName.split('.'); + const extractedName = parts[parts.length - 1]; + const extractedModule = parts.slice(0, -1).join('.'); return { - name: parts[parts.length - 1] ?? rawName, - module: parts.slice(0, -1).join('.'), + name: extractedName || rawName, + module: extractedModule || undefined, rawName, }; } return { name: rawName, module: undefined, rawName }; }
🤖 Fix all issues with AI agents
In `@examples/living-app/living_app/app.py`:
- Around line 275-342: The function drift_report currently declares ->
DriftReport but returns report.model_dump(by_alias=True) (a dict); update the
signature to reflect the actual return type (e.g., -> dict[str, Any] or
Mapping[str, Any]) or alternatively return the DriftReport instance (return
report) to match the annotation; adjust the return expression and the function
annotation in drift_report (and mirror the same fix used for profile_csv) so the
declared type and the actual returned value match.
- Around line 168-260: The function profile_csv currently declares a return type
of DatasetProfile but actually returns a dict from
profile.model_dump(by_alias=True); update the signature and/or return to be
consistent: either change the annotated return type to dict (or a suitable
TypedDict/Mapping) if you intend to return the serialized dict, or return the
DatasetProfile instance (i.e., return profile) and let the caller/bridge handle
serialization; modify the annotation on profile_csv and any callers to match (or
remove .model_dump and return profile) and keep references to DatasetProfile,
profile.model_dump, and profile_csv when making the change.
In `@examples/living-app/requirements.txt`:
- Around line 1-3: The requirements pin for pandas (pandas==2.2.2) is vulnerable
to CVE-2024-42992; update the pandas entry in requirements.txt to a patched
release (e.g., pandas>=2.2.3 or a later secure version) so the dependency no
longer uses 2.2.2; leave numpy==1.26.4 and pydantic==2.7.4 unchanged.
In `@examples/living-app/src/index.ts`:
- Around line 36-38: The temp directory created with mkdtempSync (assigned to
tmp, used to build baselinePath and currentPath) is never removed; wrap the code
that uses baselinePath/currentPath in a try/finally (or ensure process exit
handlers) and remove the tmp directory in the finally using fs.rmSync or
fs.rmdirSync with recursive/force options to clean up the temp directory after
the script completes or on error.
In `@runtime/python_bridge.py`:
- Around line 13-20: The try/except block around os.getcwd() silently swallows
errors which hinders debugging; replace the bare "except Exception: pass" with a
narrowed exception catch (e.g., except OSError, FileNotFoundError) or catch
Exception as e and log the error using the module logger (or
logging.getLogger(__name__)) while keeping the insertion logic for cwd and
sys.path.insert(0, cwd) unchanged so failures are visible but do not crash the
bridge.
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (12)
.github/workflows/ci.ymlexamples/living-app/README.mdexamples/living-app/living_app/__init__.pyexamples/living-app/living_app/app.pyexamples/living-app/requirements.txtexamples/living-app/src/index.tsexamples/living-app/tsconfig.jsonexamples/living-app/tywrap.config.tspackage.jsonruntime/python_bridge.pysrc/core/generator.tssrc/core/mapper.ts
🧰 Additional context used
🧬 Code graph analysis (1)
examples/living-app/src/index.ts (1)
examples/living-app/living_app/app.py (2)
ProfileConfig(20-27)DriftConfig(75-80)
🪛 markdownlint-cli2 (0.18.1)
examples/living-app/README.md
23-23: Ordered list item prefix
Expected: 1; Actual: 2; Style: 1/1/1
(MD029, ol-prefix)
32-32: Ordered list item prefix
Expected: 1; Actual: 3; Style: 1/1/1
(MD029, ol-prefix)
38-38: Ordered list item prefix
Expected: 1; Actual: 4; Style: 1/1/1
(MD029, ol-prefix)
🪛 Ruff (0.14.11)
runtime/python_bridge.py
19-20: try-except-pass detected, consider logging the exception
(S110)
19-19: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: os (windows-latest)
🔇 Additional comments (10)
src/core/mapper.ts (1)
270-271: LGTM!Using the normalized type ensures consistent handling of dotted names and explicit modules when mapping custom Python types to TypeScript.
src/core/generator.ts (1)
148-155: LGTM!The loop boundary change (
<=instead of<) correctly generates an additional overload covering the case where all optional positional parameters are provided. This improves TypeScript type inference for callers using the full parameter list.examples/living-app/living_app/__init__.py (1)
1-2: LGTM!Simple and appropriate package initializer with a descriptive docstring.
examples/living-app/tsconfig.json (1)
1-13: LGTM!The TypeScript configuration is well-structured:
- Correctly extends the root config for consistency
- Includes both
src/andgenerated/directories as required by the tywrap workflow- Disables declarations appropriately for an example app
- Enables source maps for debugging
.github/workflows/ci.yml (1)
111-139: LGTM - well-structured CI job for the living-app smoke test.The job correctly:
- Creates an isolated virtualenv at the expected path (
examples/living-app/.venv), which is properly resolved by the tywrap.config.ts relative path when the generate script changes into that directory- Installs both
tywrap_irand the living-app requirements- Uses appropriate pip flags to speed up installation
The single Python 3.12 / Node 22 combination is reasonable for a smoke test (versus the matrix testing in other jobs).
examples/living-app/tywrap.config.ts (1)
1-20: LGTM!The configuration is well-structured with appropriate settings for the living example:
- Type presets correctly match the Python dependencies (numpy, pandas, pydantic)
- The 30-second timeout is reasonable for the profiling operations
- ESM format aligns with the project's module system
package.json (1)
60-63: LGTM!The script chain is well-organized:
smokescript correctly sequences all dependencies (build → generate → build example → run)- Using
cdfor the generate script ensures the config is resolved relative to the example directory- Consistent with existing script patterns in the project
examples/living-app/src/index.ts (2)
24-68: Well-structured main function with proper resource cleanup.Good practices observed:
- Conditional venv detection (line 30)
- Bridge disposal in
finallyblock ensures cleanup even on errors- Config objects correctly use camelCase keys matching Python's alias generator
70-74: LGTM!Using
process.exitCode = 1instead ofprocess.exit(1)is the correct pattern here—it allows thefinallyblock to complete cleanup before the process terminates.examples/living-app/README.md (1)
1-48: Documentation is clear and comprehensive.The README provides complete setup instructions that align with the npm scripts and CI workflow. The JSON codec fallback note correctly matches the
enableJsonFallback: trueconfiguration inindex.ts.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@examples/living-app/requirements-arrow.txt`:
- Around line 1-2: Update the pinned pyarrow version in requirements-arrow.txt
from "pyarrow==15.0.2" to "pyarrow==22.0.0"; locate the exact dependency line
(pyarrow==15.0.2) and replace it with the new version string so the project
installs pyarrow 22.0.0.
In `@examples/living-app/src/index.ts`:
- Around line 42-51: The require call in enableArrowDecoder can throw if the
apache-arrow package is missing; wrap the
createRequire(...).require('apache-arrow') invocation in a try/catch inside
enableArrowDecoder, catch the module-not-found error and rethrow or throw a new
Error with a clear, actionable message (mention the missing package and
suggestion to install e.g. "npm install apache-arrow"), then proceed to validate
arrow.tableFromIPC as currently done; keep references to createRequire,
require('apache-arrow'), and registerArrowDecoder in your fix.
In `@package.json`:
- Line 118: Update the apache-arrow dependency in package.json from
"apache-arrow": "^14.0.2" to the current major version (e.g. "apache-arrow":
"^21.1.0"), then regenerate lockfiles by running your package manager (npm
install or yarn install) so package-lock.json / yarn.lock is updated; afterwards
run the test suite and build to surface any breaking changes and fix any API
incompatibilities introduced by the upgrade in code that imports/uses
apache-arrow.
♻️ Duplicate comments (2)
runtime/python_bridge.py (1)
13-20: Broad exception handling already flagged in past review.This defensive shim for adding cwd to sys.path was previously reviewed. The suggestion to either narrow the exception types or add minimal logging remains valid.
examples/living-app/src/index.ts (1)
108-110: Temp directory cleanup already flagged in past review.The
mkdtempSyncdirectory persists after script exit. The previous suggestion to clean up in thefinallyblock remains applicable.
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (10)
.github/workflows/ci.ymldocs/runtimes/nodejs.mdexamples/living-app/README.mdexamples/living-app/requirements-arrow.txtexamples/living-app/src/index.tsexamples/living-app/tywrap.config.tspackage.jsonruntime/python_bridge.pysrc/config/index.tstest/config_loading.test.ts
🧰 Additional context used
🧬 Code graph analysis (3)
examples/living-app/tywrap.config.ts (1)
src/config/index.ts (1)
defineConfig(312-314)
test/config_loading.test.ts (1)
src/config/index.ts (1)
loadConfigFile(226-302)
src/config/index.ts (1)
src/utils/runtime.ts (2)
resolve(302-320)writeFile(386-416)
🪛 GitHub Check: lint
src/config/index.ts
[warning] 292-292:
Found writeFile from package "node:fs/promises" with non literal argument at index 0
🪛 markdownlint-cli2 (0.18.1)
examples/living-app/README.md
23-23: Ordered list item prefix
Expected: 1; Actual: 2; Style: 1/1/1
(MD029, ol-prefix)
32-32: Ordered list item prefix
Expected: 1; Actual: 3; Style: 1/1/1
(MD029, ol-prefix)
38-38: Ordered list item prefix
Expected: 1; Actual: 4; Style: 1/1/1
(MD029, ol-prefix)
🪛 Ruff (0.14.11)
runtime/python_bridge.py
19-20: try-except-pass detected, consider logging the exception
(S110)
19-19: Do not catch blind exception: Exception
(BLE001)
102-104: Avoid specifying long messages outside the exception class
(TRY003)
145-147: Avoid specifying long messages outside the exception class
(TRY003)
185-187: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (16)
examples/living-app/README.md (1)
1-52: LGTM!The README provides clear, comprehensive setup instructions for the living-app example. The documentation covers both JSON fallback and Arrow mode paths, making it easy for users to get started from a fresh checkout.
Note: The static analysis tool flags the ordered list numbering style (using 1, 2, 3, 4 vs. 1, 1, 1, 1), but incrementing numbers are a valid and commonly used convention that improves readability.
src/config/index.ts (2)
287-298: Temp file approach for ES module loading looks correct.The implementation correctly handles ES module TypeScript configs by:
- Writing transpiled output to a unique temp file
- Dynamically importing via
pathToFileURL- Cleaning up in a
finallyblockThe unique filename pattern (
Date.now()+ random hex) provides sufficient collision resistance for this use case.
271-285: CTS in-memory compilation uses Node.js internals.The use of
Module._compileand_nodeModulePathsrelies on private Node.js APIs. While this is a known pattern for in-memory CommonJS execution, be aware these APIs aren't part of Node's public contract and could change in future Node versions.This is acceptable for tooling but worth documenting or adding a comment noting the Node.js version compatibility expectations.
docs/runtimes/nodejs.md (1)
180-186: LGTM!The updated Arrow decoder registration example is cleaner and more straightforward. Using
createRequire(import.meta.url)is the idiomatic pattern for requiring CommonJS modules from ESM contexts, and the synchronous approach is appropriate for documentation examples.test/config_loading.test.ts (1)
8-30: Test effectively validates ES module config loading and cleanup behavior.Good test coverage for:
- Loading a
.tsconfig with ES module syntax (import { defineConfig })- Verifying temp file cleanup (line 26) — important regression test for transpilation workflow
- Proper cleanup in
finallyblockThe test relies on standard Node module resolution: the dynamically generated config file imports
tywrapfrom a temp directory, which resolves throughnode_modulesafter the package is built. This works correctly in typical workflows (npm install triggers thepreparescript to build, beforenpm testruns) and does not indicate a test flaw.package.json (1)
60-65: LGTM! Well-structured script sequencing.The smoke test scripts properly chain build → generate → compile → run, ensuring a clean end-to-end validation. The separation of
:smokeand:smoke:arrowvariants allows testing both codec paths..github/workflows/ci.yml (1)
111-141: LGTM! CI job is well-structured.The job correctly:
- Isolates Python dependencies in a dedicated virtualenv
- Runs the full smoke test first (which builds everything), then only the Arrow-mode run step
- Follows the existing workflow patterns with consistent tooling versions
runtime/python_bridge.py (3)
97-122: LGTM! Clean JSON fallback and improved Arrow serialization for ndarray.The early return for
FALLBACK_JSONavoids unnecessary pyarrow imports. Wrapping the array in aTableand usingpa.ipc.new_streamensures the output is compatible with JS Arrow decoders that expect IPC stream format rather than raw array buffers.
138-164: LGTM! Uncompressed Feather is the right choice for JS interop.Using
compression='uncompressed'ensures theapache-arrowJS library can decode without optional compression dependencies (lz4, zstd). The inline comment documents this rationale clearly.
179-205: LGTM! Series serialization mirrors the ndarray approach.Consistent use of
Table.from_arrays+ IPC stream writer for Series ensures uniform decoding behavior on the JS side.examples/living-app/tywrap.config.ts (1)
1-22: LGTM! Configuration is well-aligned with the example setup.The config correctly:
- Uses
defineConfigfor type safety- Sets timeout consistent with the NodeBridge instantiation in
index.ts- Includes all relevant type presets for the Python libraries in use
- Points to the
.venvlocation matching CI setupexamples/living-app/src/index.ts (5)
21-24: LGTM! Path resolution is correct.When executed from
dist/src/index.js, resolving..twice correctly lands atexamples/living-app.
74-83: LGTM! BigInt serialization handles edge cases correctly.Converting to
Numberonly whenisSafeIntegeris true, and falling back to string representation otherwise, prevents silent precision loss.
149-152: LGTM! Proper cleanup in finally block.Disposing the bridge and clearing the Arrow decoder ensures resources are released regardless of success or failure.
155-159: LGTM! Top-level error handling is appropriate for a CLI entrypoint.Using
process.exitCode = 1(rather thanprocess.exit(1)) allows pending I/O to flush before the process terminates.
112-113: The return value ofwriteSyntheticEventsCsvis correct. The function has a return type annotation of-> str:and returnsstr(out_path), providing the file path string expected byprofileCsv,driftReport, andtopUsersBySpend.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
runtime/python_bridge.py (3)
126-129: Dead code:FALLBACK_JSONcheck in exception handler is unreachable.If
FALLBACK_JSONisTrue, the function returns early at line 105. Therefore, when the exception handler at line 126 executes,FALLBACK_JSONis guaranteed to beFalse, making lines 127-128 unreachable.🧹 Proposed fix
except Exception as exc: - if FALLBACK_JSON: - return serialize_ndarray_json(obj) raise RuntimeError('Arrow encoding failed for ndarray') from exc
168-184: Good fix for JS decoder compatibility; same dead code issue in exception handler.The explicit
compression='uncompressed'on line 173 correctly addresses the JS apache-arrow limitation. However, lines 182-183 have the same unreachable code issue: ifFALLBACK_JSONisTrue, the function returns at line 160.🧹 Proposed fix for dead code
except Exception as exc: - if FALLBACK_JSON: - return serialize_dataframe_json(obj) raise RuntimeError('Arrow encoding failed for pandas.DataFrame') from exc
220-237: Table-wrapped IPC approach is correct; same dead code issue.The change to wrap the Series as a single-column Table (lines 222-225) correctly addresses the pyarrow IPC writer's Schema requirement and aligns with the JS decoder contract. Same unreachable code issue at lines 235-236.
🧹 Proposed fix for dead code
except Exception as exc: - if FALLBACK_JSON: - return serialize_series_json(obj) raise RuntimeError('Arrow encoding failed for pandas.Series') from exc
♻️ Duplicate comments (3)
runtime/python_bridge.py (1)
13-20: Consider narrowing the exception type or logging suppressed failures.This was flagged in a previous review. The broad
except Exception: passpattern silently swallows all errors, which can make debugging difficult ifos.getcwd()fails unexpectedly (e.g., deleted working directory, permission issues). Consider narrowing toexcept (OSError, ValueError)or adding minimal logging.examples/living-app/src/index.ts (2)
51-60: Previous review noted: consider explicit error handling for missingapache-arrow.The current implementation throws a generic module-not-found error if
apache-arrowisn't installed. A wrapped try-catch with a descriptive message would improve developer experience, but this is optional for a smoke test example.
131-136: Previous review noted: temp directory is not cleaned up.The temp directory created by
mkdtempSyncpersists after the script exits. For a smoke test this is minor, but repeated runs will accumulate orphan directories. Consider adding cleanup in the finally block.
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (8)
.github/workflows/ci.ymlexamples/living-app/src/index.tsexamples/living-app/tywrap.config.tsruntime/python_bridge.pysrc/config/index.tssrc/core/generator.tssrc/core/mapper.tstest/config_loading.test.ts
🧰 Additional context used
🧬 Code graph analysis (4)
examples/living-app/src/index.ts (1)
examples/living-app/living_app/app.py (2)
ProfileConfig(20-27)DriftConfig(75-80)
test/config_loading.test.ts (1)
src/config/index.ts (1)
loadConfigFile(226-308)
examples/living-app/tywrap.config.ts (1)
src/config/index.ts (1)
defineConfig(318-320)
src/config/index.ts (1)
src/utils/runtime.ts (2)
resolve(302-320)writeFile(386-416)
🪛 GitHub Check: lint
src/config/index.ts
[warning] 298-298:
Found writeFile from package "node:fs/promises" with non literal argument at index 0
🪛 Ruff (0.14.11)
runtime/python_bridge.py
19-20: try-except-pass detected, consider logging the exception
(S110)
19-19: Do not catch blind exception: Exception
(BLE001)
109-111: Avoid specifying long messages outside the exception class
(TRY003)
165-167: Avoid specifying long messages outside the exception class
(TRY003)
217-219: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (15)
runtime/python_bridge.py (3)
132-148: LGTM!The JSON fallback implementation is clean and the docstring clearly explains the trade-offs (larger payloads, potential dtype loss).
187-202: LGTM!Clean JSON fallback with appropriate docstring explaining the trade-offs.
240-259: LGTM!The fallback chain from
to_list()toto_dict()handles edge cases gracefully, and the docstring explains the dtype/NA trade-offs..github/workflows/ci.yml (1)
111-142: LGTM! Well-structured CI job for the living-app example.The job correctly sets up both Node and Python environments, creates an isolated virtualenv, and runs smoke tests for both JSON and Arrow codec modes. The comment on line 137 helpfully explains why pyarrow is installed.
One minor observation: unlike the
testjob which uses a matrix strategy for multiple Node/Python versions, this job pins to a single configuration (Node 22, Python 3.12). This is reasonable for a smoke test, but consider if broader coverage is desired later.test/config_loading.test.ts (1)
1-35: LGTM! Solid test coverage for the ESM config loading path.The test effectively validates:
- Loading a
.tsconfig that imports from the ESMtywrappackage- Correct evaluation of the config object
- Cleanup of the temporary
.mjsfile after loadingThe
Whycomments clearly explain the rationale for each assertion.examples/living-app/tywrap.config.ts (1)
1-24: LGTM! Clean and well-documented configuration.The config correctly specifies:
- ESM output format matching the TypeScript runner
- Virtual environment path consistent with CI setup
- Appropriate type presets for the Python dependencies (numpy, pandas, pydantic)
- 30-second timeout matching the runtime bridge configuration in
index.tssrc/config/index.ts (2)
261-288: LGTM! Correct handling of CommonJS vs ESM module evaluation.The logic correctly differentiates:
.cts→ CommonJS (usesModule._compile).ts/.mts→ ES2020 modules (temp file approach)The
Whycomment clearly explains the rationale: ESM configs importing from the ESMtywrappackage would fail if transpiled to CommonJS.Note: The
Module._compileand_nodeModulePathsAPIs are internal Node.js APIs. While they work reliably, they're not part of the public API contract.
290-304: LGTM! Safe temporary file approach for ESM evaluation.The implementation correctly:
- Places the temp file next to the source so relative imports resolve properly
- Uses timestamp + random suffix to avoid collisions
- Cleans up in a
finallyblock ensuring no orphaned filesRegarding the static analysis warning on line 298: this is a false positive. The
tmpPathis constructed fromdirname(resolved)(the config file's directory) and a sanitized filename pattern. There's no user-controlled path injection risk here.examples/living-app/src/index.ts (4)
21-42: LGTM! Clean codec mode resolution with sensible defaults.The precedence order (CLI args → env var → default JSON) is well-designed:
--arrow/--jsonflags allow explicit controlTYWRAP_LIVING_APP_CODECenv var enables CI/scripted configuration- JSON default ensures the example works without pyarrow installed
68-104: LGTM! Robust helpers for Arrow-safe JSON output.Good defensive patterns:
toJsonSafe: Uses duck typing for Arrow table detection, truncates output to 20 rowsjsonReplacer: Safely handles BigInt by downcasting when withinNumber.MAX_SAFE_INTEGER, otherwise stringifies
106-129: LGTM! Well-structured bridge initialization with fail-fast Arrow validation.The implementation correctly:
- Conditionally enables JSON fallback based on codec mode
- Validates pyarrow availability in Arrow mode before proceeding
- Provides a clear error message directing users to either install pyarrow or use
--json
172-184: LGTM! Proper cleanup and error handling.Good practices:
finallyblock ensuresbridge.dispose()runs even on errorclearArrowDecoder()prevents global state leaking to other testsprocess.exitCode = 1(vsprocess.exit(1)) allows cleanup to completesrc/core/mapper.ts (2)
270-271: LGTM! Correctly normalizes dotted type names before constructing the custom type.This properly handles qualified Python type names (e.g.,
"pandas.DataFrame") that arrive as a single dotted string, splitting them into a leafnameandmodulepath to produce valid TypeScript identifiers.
352-355: Good documentation of the normalization rationale.The comment clearly explains why normalization is needed (dotted names invalid in TS identifiers, stable test expectations).
src/core/generator.ts (1)
144-153: LGTM! Correctly adds the full-positional-signature overload.The change from
i < positional.lengthtoi <= positional.lengthensures an overload is generated for calls that supply all positional parameters. Without this, whenvarArgsorkwArgsare present, the full-arity call wouldn't match any overload signature.Example with
fn(a, b?, c?, ...rest):
- Before: overloads for
(a, ...rest),(a, b, ...rest)— missing(a, b, c, ...rest)- After: includes
(a, b, c, ...rest)as well
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
- Upgrade apache-arrow/pyarrow + pandas pins\n- Improve living-app robustness (optional deps, tmp cleanup)\n- Serialize Pydantic models in python bridge\n- Harden custom type normalization edge cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/core/mapper.ts (1)
360-371: Redundant nullish coalescing on line 366.After filtering empty strings and checking
parts.length === 0, the array is guaranteed to have at least one non-empty element, making?? rawNameunreachable. Consider removing it for clarity.Suggested simplification
return { - name: parts[parts.length - 1] ?? rawName, + name: parts[parts.length - 1]!, module: parts.length > 1 ? parts.slice(0, -1).join('.') : undefined, rawName, };
🤖 Fix all issues with AI agents
In `@src/config/index.ts`:
- Around line 274-292: The current in-memory CommonJS evaluation path inside the
emitCommonJs branch uses private Node internals (moduleCtor.Module,
moduleCtor._nodeModulePaths, mod._compile) without guarding for their absence;
add a defensive check that verifies moduleCtor.Module,
moduleCtor._nodeModulePaths, and mod._compile are functions before using them,
and if any are missing throw a clear, descriptive error mentioning emitCommonJs
and the missing private API (or fall back to writing a temp .cjs file and
requiring it), then ensure the error is logged/propagated so ensureConfigObject
receives a meaningful failure reason for resolved.
♻️ Duplicate comments (2)
examples/living-app/src/index.ts (2)
50-71: LGTM — previous concern addressed.The error handling for the missing
apache-arrowpackage has been properly implemented with a descriptive error message guiding users to install the dependency.
184-197: LGTM — previous concern addressed.The temp directory cleanup has been properly implemented in the
finallyblock, and the silent catch for cleanup errors is acceptable for best-effort resource cleanup. The cleanup sequence (temp files → bridge → decoder state) is appropriate.
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (8)
examples/living-app/living_app/app.pyexamples/living-app/requirements-arrow.txtexamples/living-app/requirements.txtexamples/living-app/src/index.tspackage.jsonruntime/python_bridge.pysrc/config/index.tssrc/core/mapper.ts
🧰 Additional context used
🧬 Code graph analysis (2)
src/config/index.ts (1)
src/utils/runtime.ts (2)
resolve(302-320)writeFile(386-416)
examples/living-app/src/index.ts (2)
src/utils/runtime.ts (2)
resolve(302-320)join(265-273)examples/living-app/living_app/app.py (2)
ProfileConfig(20-27)DriftConfig(75-80)
🪛 GitHub Check: lint
src/config/index.ts
[warning] 302-302:
Found writeFile from package "node:fs/promises" with non literal argument at index 0
🪛 Ruff (0.14.11)
runtime/python_bridge.py
113-115: Avoid specifying long messages outside the exception class
(TRY003)
169-171: Avoid specifying long messages outside the exception class
(TRY003)
221-223: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (22)
src/core/mapper.ts (2)
270-271: LGTM!The integration correctly applies normalization before returning the custom type, ensuring dotted Python names like
"pandas.DataFrame"are properly split into{ name: "DataFrame", module: "pandas" }for valid TypeScript identifiers.
352-355: Clear and helpful documentation.The comment accurately explains the edge case this normalization addresses (dotted names in Python IR) and the motivation (TypeScript identifier validity, stable test expectations).
src/config/index.ts (2)
261-272: LGTM on the ESM/CommonJS detection logic.The extension-based module kind selection correctly aligns with Node.js conventions (
.cts→ CommonJS,.ts/.mts→ ESM). The inline comments clearly explain the rationale for this approach—avoidingrequire('tywrap')failures when the package is ESM-only.
297-308: Static analysis warning is a false positive here.The warning about
writeFilewith a non-literal path (line 302) is expected. The path is intentionally constructed at runtime to place the temp file adjacent to the source config, ensuring relative imports resolve correctly. The filename component (.tywrap.config.{timestamp}.{random}.mjs) is fully controlled by this code, and the content is the transpiled config—not user-supplied data written to an arbitrary location.The cleanup in
finallywith{ force: true }correctly handles all exit paths.One minor observation: if the process is killed between
writeFileandrm, an orphan temp file may remain. This is unlikely to cause issues in practice, and the test suite already checks for stray.tywrap.config.*files.examples/living-app/requirements.txt (1)
1-3: Security vulnerability addressed — dependencies look good.The pandas version has been upgraded from 2.2.2 to 2.3.3, which addresses the CVE-2024-42992 vulnerability flagged in the previous review. numpy==1.26.4 and pydantic==2.7.4 remain secure with no known CVEs.
examples/living-app/requirements-arrow.txt (1)
1-2: LGTM — pyarrow upgraded to current version.The pyarrow dependency has been updated from 15.0.2 to 22.0.0 as recommended in the previous review. The
-r requirements.txtdirective correctly chains the base dependencies.package.json (2)
60-65: Well-structured living-app scripts.The new npm scripts follow a logical progression:
generate→build→run, with thesmokescripts correctly chaining the full workflow. The--arrowvariant for Arrow mode testing is a nice addition.
118-118: apache-arrow upgraded to current version.The dependency has been updated from ^14.0.2 to ^21.1.0 as recommended in the previous review.
runtime/python_bridge.py (5)
13-24: Exception handling improved as suggested.The exception types have been narrowed from bare
Exceptionto specific types (OSError, ValueError, TypeError, AttributeError), and the warning is now logged to stderr. The nested try/except for the stderr write itself is a nice defensive touch.
100-133: Arrow IPC serialization for ndarray looks correct.The implementation properly wraps the array in a single-column Table for IPC stream compatibility. The early
FALLBACK_JSONcheck at line 108 makes the fallback check at line 131 unreachable, but this is harmless defensive code.
155-188: Uncompressed Feather output is correctly documented.The explicit
compression='uncompressed'at line 177 ensures JS decoders (apache-arrow) can read the payload without optional compression dependencies. Good documentation in the docstring explaining the rationale.
367-388: Clean Pydantic serialization without hard dependency.The duck-typing approach using
getattr(obj, 'model_dump', None)avoids importing Pydantic at runtime. Theby_alias=Trueparameter correctly aligns with the camelCase aliasing used in the living-app models, and theTypeErrorfallback handles older Pydantic versions gracefully.
390-409: Correct dispatch order for Pydantic serialization.The Pydantic check is placed after specialized ML types but before stdlib types, which is the correct precedence. The identity comparison
is not _NO_PYDANTICproperly checks the sentinel value.examples/living-app/living_app/app.py (5)
168-260: Return type annotation now matches — returns Pydantic model directly.The function now correctly returns the
DatasetProfileinstance (line 260) rather than callingmodel_dump(). The docstring at lines 172-173 correctly explains that the bridge handles serialization viamodel_dump. This resolves the previous review concern about type mismatch.
275-342: Return type annotation matches implementation.Similar to
profile_csv, this function now returns theDriftReportinstance directly (line 342), correctly matching the-> DriftReporttype annotation. The bridge will handle serialization.
345-367: Defensive handling for missing sort column implemented.The function now checks if
spend_usd_last_7dexists before sorting (lines 365-366), returning an unsorted head if the column is missing. This addresses the previous review suggestion and makes the function robust for custom CSVs.
11-13: Camel case converter handles edge cases correctly.For single-word inputs (no underscores),
split("_")returns a single-element list, makingtailempty and the function returnsheadunchanged. This is the correct behavior for field names likepathorrows.
106-165: Well-designed synthetic data generator.The function uses a seeded RNG for reproducibility, properly creates parent directories, and the
driftparameter meaningfully affects multiple features for realistic drift detection testing. Country probabilities correctly sum to 1.0.examples/living-app/src/index.ts (4)
79-98: LGTM!The duck-typing approach for detecting Arrow tables and truncating output to 20 rows is pragmatic for demo output. The implementation correctly handles edge cases.
106-115: LGTM!The BigInt handling correctly uses
Number.isSafeIntegerto prevent precision loss, falling back to string representation for values outside the safe integer range.
142-154: LGTM!Configuration values are within the valid constraints defined in the Python models (e.g.,
topKin [1, 25],sampleRows≥ 50,categoricalL1Thresholdin [0, 1]).
200-204: LGTM!Using
process.exitCode = 1instead ofprocess.exit(1)is the correct approach—it allows thefinallyblock to complete before the process terminates.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
- Guard Node private Module APIs for .cts in-memory eval\n- Simplify normalizeCustomType without non-null assertion
- Default runner to Arrow mode; JSON only when requested\n- Update docs/scripts/CI to match (add :run:json and :smoke:json)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/core/mapper.ts (1)
347-376: LGTM — solid edge-case handling.The normalization logic correctly handles:
- Already-qualified types (module provided) — returned as-is
- Dotted names like
"pandas.DataFrame"— split into leaf name + module- Malformed inputs (
"...",".foo","foo.") — filtered empty parts with safe fallbacksThe type guard at lines 366-368 satisfies TypeScript's
string | undefinedarray indexing without resorting to non-null assertions.Minor:
rawNameis computed but never consumed by callers (mapCustomTypeandmapPresetTypeonly usenameandmodule). Consider removing it unless it's retained for debugging or future logging.♻️ Optional: remove unused rawName
private normalizeCustomType(type: { name: string; module?: string }): { name: string; module?: string; - rawName: string; } { - const rawName = type.name; if (type.module) { - return { name: type.name, module: type.module, rawName }; + return { name: type.name, module: type.module }; } - if (rawName.includes('.')) { - const parts = rawName.split('.').filter(part => part.length > 0); + if (type.name.includes('.')) { + const parts = type.name.split('.').filter(part => part.length > 0); if (parts.length === 0) { - return { name: rawName, module: undefined, rawName }; + return { name: type.name, module: undefined }; } const name = parts[parts.length - 1]; if (!name) { - return { name: rawName, module: undefined, rawName }; + return { name: type.name, module: undefined }; } return { name, module: parts.length > 1 ? parts.slice(0, -1).join('.') : undefined, - rawName, }; } - return { name: rawName, module: undefined, rawName }; + return { name: type.name, module: undefined }; }
🤖 Fix all issues with AI agents
In `@src/config/index.ts`:
- Around line 314-325: Replace the fragile Date.now()+Math.random suffix used
when building tmpPath with a crypto.randomUUID()-based suffix to reduce
collision risk: update the tmpPath construction (where resolve(...) creates
`.tywrap.config.${Date.now()}.${Math.random().toString(16).slice(2)}.mjs`) to
use crypto.randomUUID() instead, and add the appropriate import of crypto (or
use global crypto) so writeFile(tmpPath,...), the dynamic import via
import(pathToFileURL(tmpPath).href), and the cleanup rm(tmpPath,{ force: true })
continue to work unchanged.
♻️ Duplicate comments (3)
package.json (1)
120-120: Apache-arrow dependency upgrade addressed.The upgrade to
^21.1.0addresses the previous review feedback about the outdated version.examples/living-app/src/index.ts (2)
58-79: Arrow decoder registration with proper error handling.The try/catch with descriptive error message for
MODULE_NOT_FOUNDaddresses the previous review feedback. ThetableFromIPCvalidation ensures the expected API is available before registration.
170-211: Proper resource cleanup in finally block.The cleanup addresses previous review feedback:
- Temp directory is removed with
rmSyncin a try/catch to handle edge cases- Bridge is disposed
- Arrow decoder is cleared to avoid global state leakage
The empty catch block on lines 203-204 is acceptable for cleanup code where failure shouldn't prevent other cleanup steps.
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (6)
.github/workflows/ci.ymlexamples/living-app/README.mdexamples/living-app/src/index.tspackage.jsonsrc/config/index.tssrc/core/mapper.ts
🧰 Additional context used
🧬 Code graph analysis (2)
src/config/index.ts (1)
src/utils/runtime.ts (2)
resolve(302-320)writeFile(386-416)
examples/living-app/src/index.ts (1)
examples/living-app/living_app/app.py (2)
ProfileConfig(20-27)DriftConfig(75-80)
🪛 GitHub Check: lint
src/config/index.ts
[warning] 319-319:
Found writeFile from package "node:fs/promises" with non literal argument at index 0
🪛 markdownlint-cli2 (0.18.1)
examples/living-app/README.md
23-23: Ordered list item prefix
Expected: 1; Actual: 2; Style: 1/1/1
(MD029, ol-prefix)
32-32: Ordered list item prefix
Expected: 1; Actual: 3; Style: 1/1/1
(MD029, ol-prefix)
38-38: Ordered list item prefix
Expected: 1; Actual: 4; Style: 1/1/1
(MD029, ol-prefix)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: os (windows-latest)
🔇 Additional comments (11)
src/core/mapper.ts (1)
270-271: LGTM!Clean refactoring that delegates name/module derivation to the new
normalizeCustomTypehelper, ensuring consistent handling of dotted qualified names across bothmapCustomTypeandmapPresetType.examples/living-app/README.md (1)
1-52: LGTM! Clear and comprehensive setup documentation.The README effectively documents the living-app example's purpose, setup steps, and usage modes. The instructions are accurate and the Arrow vs JSON mode distinction is well-explained.
Regarding the markdownlint warnings about ordered list prefixes (MD029): the explicit
1. 2. 3. 4.numbering style is valid and often preferred for readability in setup guides where step order matters. This is a stylistic choice and can be safely ignored..github/workflows/ci.yml (1)
111-142: LGTM! Well-structured CI job for living-app validation.The job correctly:
- Isolates Python dependencies in a dedicated virtualenv
- Installs Arrow dependencies to exercise the primary transport path
- Runs both Arrow mode (via
smoke) and JSON mode (viarun:json)The use of
run:jsoninstead ofsmoke:jsonfor the second test is efficient since it reuses the build artifacts from the first smoke test.package.json (1)
60-67: LGTM! Comprehensive script coverage for living-app lifecycle.The scripts provide good ergonomics for development and CI:
- Separate
generate,build,runsteps for granular control- Explicit
--arrowand--jsonvariants for testing both codec modessmokevariants that chain the full workflowThe
smoke:arrowaliasing tosmokeis appropriate since Arrow mode is now the default.examples/living-app/src/index.ts (6)
1-23: LGTM! Clean imports and helper setup.The imports are well-organized, separating Node built-ins, tywrap modules, and generated code. The
resolveExampleRoothelper correctly navigates from the compiled output location back to the example root.
27-49: LGTM! Robust codec mode resolution.The codec selection logic properly:
- Enforces mutual exclusion of
--arrowand--jsonflags- Supports environment variable override via
TYWRAP_LIVING_APP_CODEC- Defaults to Arrow mode as intended for exercising the primary transport path
87-123: LGTM! Safe JSON serialization helpers.Good handling of Arrow-specific types:
toJsonSafeconverts Arrow tables to a bounded representation (max 20 rows)jsonReplacersafely handles BigInt withNumber.isSafeIntegercheck before downcast
125-154: LGTM! Defensive bridge initialization.The setup correctly:
- Detects virtualenv presence before configuring the bridge
- Forces
TYWRAP_CODEC_FALLBACKbased on codec mode for deterministic behavior- Validates
arrowAvailablefrom bridge info before enabling Arrow decoder, preventing confusing decode errors
156-168: Config objects align with Python models.The
ProfileConfigandDriftConfigobjects match the Pydantic model definitions inliving_app/app.py, with appropriate defaults within the defined constraints.
214-218: LGTM! Proper top-level error handling.Using
process.exitCode = 1instead ofprocess.exit(1)allows pending I/O to complete before the process exits, which is the recommended pattern.src/config/index.ts (1)
274-309: LGTM! Defensive guards for Node private APIs properly implemented.The comprehensive checks for
moduleCtor,_nodeModulePaths, and_compileaddress the fragility concern with Node internals. The error messages are clear and provide actionable context for debugging.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Suggested by CodeRabbit: avoid Date.now+Math.random collisions when writing transient .mjs configs.
Fixes #22