
Conversation

@sroussey (Collaborator) commented Dec 20, 2025

Introduce ModelConfig and ModelRecord schemas, allowing AiTask to accept full model configurations directly and enabling job workers to run without a model repository.

The original AiTask input schema suggested it could take a ModelConfig object, but the implementation only supported a string identifier, requiring a model repository lookup at runtime. This change aligns the implementation with the schema and decouples job execution from the model repository by embedding the necessary ModelConfig directly into job payloads. ModelConfig is a lighter schema for task inputs, while ModelRecord is used for repository persistence.
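For orientation, here is a minimal sketch of how the two shapes might differ. The field names below are assumptions drawn from this summary, not the exact schemas in packages/ai/src/model/ModelSchema.ts:

// Illustrative sketch only; the real schemas are defined in ModelSchema.ts and may differ.
type ModelConfig = {
  provider: string; // e.g. "hf-transformers"; the default job queue is derived from this
  providerConfig: Record<string, unknown>; // provider-specific settings such as modelPath, pipeline
  tasks?: string[]; // optional list of task types this model supports
};

type ModelRecord = ModelConfig & {
  model_id: string; // stable identifier required only for repository persistence
};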




Note

Allow passing full ModelConfig in task/job inputs (no repo lookup), introduce ModelConfigSchema vs ModelRecordSchema, update provider run fns and repositories accordingly.

  • Models:
    • Introduce ModelConfigSchema (for task/job inputs) and ModelRecordSchema (for persistence); export ModelConfig/ModelRecord.
    • Repositories (InMemoryModelRepository, ModelRepository, test bindings) now use ModelRecordSchema.
  • Task/Job Execution:
    • AiTask/AiVisionTask accept string | ModelConfig, resolve inline configs without repository, and embed model config into AiJobInput (see the usage sketch after this list).
    • Validation/narrowing updated to handle inline configs and task compatibility; default queue derives from model.provider when config given.
    • AiJob passes ModelConfig directly to run funcs; no repository lookup.
    • AiProviderRegistry types updated to use ModelConfig; worker wrapper forwards [input, model].
  • Providers:
    • HF Transformers and TF MediaPipe job run fns now typed with HfTransformersOnnxModelConfig/TFMPModelConfig.
    • HF pipeline cache key/unload use model.providerConfig.modelPath:pipeline (not model_id).
  • Schemas/Utilities:
    • AiTaskSchemas.TypeModel* now build from ModelConfigSchema.
  • Tests:
    • Update tests to pass inline model objects and remove repository dependency in AiJob execution.
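A usage-level sketch of the change summarized above. Constructor and option names, and the example modelPath, are assumptions based on this summary, not verified against the codebase:

// Sketch only: exported names and option shapes are assumed for illustration.
import { AiTask, type ModelConfig } from "@workglow/ai";

// Before: the model had to be a string id, resolved through the ModelRepository at runtime.
const byId = new AiTask({ model: "hypothetical-model-id" });

// After: an inline ModelConfig can be embedded directly in the job payload, so a worker
// can run without a repository; the default queue is derived from model.provider.
const inlineConfig: ModelConfig = {
  provider: "hf-transformers",
  providerConfig: { modelPath: "Xenova/gpt2", pipeline: "text-generation" },
};
const byConfig = new AiTask({ model: inlineConfig });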

Written by Cursor Bugbot for commit a29226e.

- Replaced instances of `ModelRecord as ModelRecordSchema` with `ModelRecordSchema` for consistency across multiple files.
- Updated the model schema definitions to use `ModelConfigSchema` and `ModelRecordSchema` in relevant areas.
- Enhanced code formatting for better readability in various task and model repository files.
Copilot AI (Contributor) left a comment


Pull request overview

This PR introduces a clearer separation between ModelConfig (for task/job inputs) and ModelRecord (for repository persistence), enabling job workers to execute AI tasks without requiring access to a model repository at runtime. The key architectural change allows AiTask to accept and pass full model configurations directly in job payloads, aligning the implementation with the original schema design intent.

  • Introduces ModelConfigSchema with minimal required fields (provider, providerConfig) and ModelRecordSchema with full requirements for persistence
  • Updates AiTask to accept both string model identifiers and ModelConfig objects, resolving strings via repository while passing configs directly to jobs
  • Modifies job execution to use embedded ModelConfig instead of performing runtime repository lookups

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.

Summary per file:

packages/ai/src/model/ModelSchema.ts: Introduces ModelConfigSchema (lightweight) and ModelRecordSchema (persistence) with distinct required fields
packages/ai/src/task/base/AiTask.ts: Adds getModelConfigForInput() method, updates validation to handle both string and object models, modifies job input generation to embed configs
packages/ai/src/task/base/AiTaskSchemas.ts: Updates model type schema reference from ModelSchema to ModelConfigSchema
packages/ai/src/task/base/AiVisionTask.ts: Updates type signatures to accept ModelConfig in addition to strings
packages/ai/src/job/AiJob.ts: Removes model repository lookup, directly uses ModelConfig from job input
packages/ai/src/provider/AiProviderRegistry.ts: Updates run function type signature to accept ModelConfig instead of ModelRecord
packages/ai/src/model/ModelRepository.ts: Updates repository to use ModelRecordSchema for persistence
packages/ai/src/model/InMemoryModelRepository.ts: Updates constructor to use ModelRecordSchema
packages/test/src/binding/SqliteModelRepository.ts: Updates tabular repository instantiation with ModelRecordSchema
packages/test/src/binding/PostgresModelRepository.ts: Updates tabular repository instantiation with ModelRecordSchema
packages/test/src/binding/IndexedDbModelRepository.ts: Updates tabular repository instantiation with ModelRecordSchema
packages/ai-provider/src/tf-mediapipe/common/TFMP_ModelSchema.ts: Adds TFMPModelConfig type extending the base model config schema
packages/ai-provider/src/tf-mediapipe/common/TFMP_JobRunFns.ts: Updates all run function signatures to accept TFMPModelConfig
packages/ai-provider/src/hf-transformers/common/HFT_ModelSchema.ts: Adds HfTransformersOnnxModelConfig type with proper schema separation
packages/ai-provider/src/hf-transformers/common/HFT_JobRunFns.ts: Updates type signatures and cache key logic to use modelPath:pipeline instead of model_id
packages/test/src/test/ai-provider/AiProviderRegistry.test.ts: Updates test to pass full model config object instead of string identifier


Comment on lines +283 to +286
+ const combined: (string | ModelConfig)[] = [...requestedInline, ...usingStrings];

  // we alter input to be the models that were found for this kind of input
- usingModels = usingModels.length > 1 ? usingModels : usingModels[0];
- (input as any)[key] = usingModels;
+ (input as any)[key] = combined.length > 1 ? combined : combined[0];

Copilot AI Dec 20, 2025


If both requestedInline and usingStrings are empty, the combined array will be empty, and combined[0] will be undefined. This could lead to unexpected behavior when a single model is expected but all requested models were filtered out. Consider adding a check to handle this case explicitly or maintaining the original value when no compatible models are found.
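A guard along these lines would make the empty case explicit (sketch only; variable names follow the snippet above):

// Sketch: keep the original value (or fail fast) when nothing survives filtering,
// so a single-model input never silently becomes undefined.
if (combined.length > 0) {
  (input as any)[key] = combined.length > 1 ? combined : combined[0];
}
// else: leave (input as any)[key] unchanged and let validation report the mismatch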

- import { ModelSchema } from "@workglow/ai";
+ import { ModelRecordSchema } from "@workglow/ai";

Copilot AI Dec 20, 2025


The import statement is missing ModelConfigSchema which should be imported from "@workglow/ai" alongside ModelRecordSchema. This is needed to properly extend the ModelConfig schema for the TFMPModelConfig type.
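The fix would be an import along these lines (sketch; assuming both schemas are exported from the package root):

import { ModelConfigSchema, ModelRecordSchema } from "@workglow/ai";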

Comment on lines 57 to 63
const ExtendedModelConfigSchema = {
type: "object",
properties: {
...ModelRecordSchema.properties,
...TFMPModelSchema.properties,
},
required: [...ModelRecordSchema.required, ...TFMPModelSchema.required],

Copilot AI Dec 20, 2025


The ExtendedModelConfigSchema should extend ModelConfigSchema instead of ModelRecordSchema. This is inconsistent with the HFT implementation and defeats the purpose of having a lighter ModelConfig schema for task inputs. The required fields should be [...ModelConfigSchema.required, ...TFMPModelSchema.required] instead of [...ModelRecordSchema.required, ...].
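Concretely, the suggestion amounts to composing from the lighter schema, which is what the follow-up commit below applies (sketch mirroring the snippet above):

// Sketch of the suggested fix: build the extension on ModelConfigSchema, not ModelRecordSchema.
const ExtendedModelConfigSchema = {
  type: "object",
  properties: {
    ...ModelConfigSchema.properties, // light task-input schema, not the persistence schema
    ...TFMPModelSchema.properties,
  },
  required: [...ModelConfigSchema.required, ...TFMPModelSchema.required],
} as const;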

sroussey and others added 3 commits December 21, 2025 01:10
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Replaced `ModelRecordSchema` with `ModelConfigSchema` in the ExtendedModelConfigSchema definition.
- Adjusted required properties to align with the new schema structure.
- Improved code consistency and readability in the model schema file.
@sroussey marked this pull request as ready for review December 21, 2025 01:15
cursor bot left a comment


This PR is being reviewed by Cursor Bugbot


taskModels?.find((m) => m.model_id === model)
);

const combined: (string | ModelConfig)[] = [...requestedInline, ...usingStrings];

Inline models not filtered for task compatibility in narrowInput

The narrowInput method filters string models based on task compatibility but keeps all inline ModelConfig objects unfiltered. However, validateInput rejects inline configs with a tasks array that doesn't include the current task type. This inconsistency means incompatible inline configs pass through narrowInput but fail validation later. The purpose of narrowInput is to strip incompatible models in dataflows so validation succeeds when at least some models are compatible. Inline configs with explicit incompatible tasks arrays should be filtered out in narrowInput to maintain consistency with validateInput.
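A sketch of the filter being asked for (variable names follow the snippet above; the task-type accessor is an assumption):

// Sketch: exclude inline configs whose explicit `tasks` list omits the current task type,
// mirroring how string ids are filtered against the repository's task models.
const compatibleInline = requestedInline.filter(
  (m) => !m.tasks || m.tasks.includes(currentTaskType) // `currentTaskType` is assumed
);
const combined: (string | ModelConfig)[] = [...compatibleInline, ...usingStrings];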


  progressScaleMax: number = 10
) => {
- const cacheKey = `${model.model_id}:${model.providerConfig.pipeline}`;
+ const cacheKey = `${model.providerConfig.modelPath}:${model.providerConfig.pipeline}`;

Pipeline cache key ignores device and dtype configuration

The pipeline cache key changed from model.model_id:pipeline to model.providerConfig.modelPath:pipeline. However, the pipeline is created with additional configuration options including dType (dtype) and device (lines 372, 376). Since these options aren't included in the cache key, two model configs with the same modelPath and pipeline but different dType or device settings will incorrectly share the same cached pipeline. This could cause models to run on the wrong device (e.g., CPU instead of WebGPU) or with the wrong precision, leading to incorrect behavior or errors.
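One way to widen the key (sketch; the dtype/device property names follow the comment above, everything else is assumed):

// Sketch: include device and dtype so pipelines with different runtime settings
// do not collide in the cache.
const cacheKey = [
  model.providerConfig.modelPath,
  model.providerConfig.pipeline,
  model.providerConfig.device ?? "default",
  model.providerConfig.dtype ?? "default",
].join(":");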


@sroussey merged commit b3091ce into main Dec 21, 2025
2 checks passed
@sroussey (Collaborator, Author) commented:
Closes #149

@sroussey deleted the cursor/aitask-model-config-handling-bdcf branch December 21, 2025 01:27