AiTask model config handling #151
Conversation
Co-authored-by: sroussey <sroussey@gmail.com>
- Replaced instances of `ModelRecord as ModelRecordSchema` with `ModelRecordSchema` for consistency across multiple files.
- Updated the model schema definitions to use `ModelConfigSchema` and `ModelRecordSchema` in relevant areas.
- Enhanced code formatting for better readability in various task and model repository files.
Pull request overview
This PR introduces a clearer separation between ModelConfig (for task/job inputs) and ModelRecord (for repository persistence), enabling job workers to execute AI tasks without requiring access to a model repository at runtime. The key architectural change allows AiTask to accept and pass full model configurations directly in job payloads, aligning the implementation with the original schema design intent.
- Introduces `ModelConfigSchema` with minimal required fields (`provider`, `providerConfig`) and `ModelRecordSchema` with full requirements for persistence (see the sketch after this list)
- Updates `AiTask` to accept both string model identifiers and `ModelConfig` objects, resolving strings via the repository while passing configs directly to jobs
- Modifies job execution to use the embedded `ModelConfig` instead of performing runtime repository lookups
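For orientation, here is a minimal sketch of how the two schemas might differ, assuming the JSON-Schema-style objects that appear in the diffs below; the property lists are illustrative, not the PR's actual definitions.

```ts
// Illustrative sketch only — not the PR's actual definitions.

// ModelConfigSchema: just enough for a job worker to run the model.
const ModelConfigSchema = {
  type: "object",
  properties: {
    provider: { type: "string" }, // provider key in the AI provider registry
    providerConfig: { type: "object" }, // provider-specific options (modelPath, pipeline, ...)
    tasks: { type: "array", items: { type: "string" } }, // optional task-compatibility list
  },
  required: ["provider", "providerConfig"],
} as const;

// ModelRecordSchema: the persisted form, adding what repositories need to look a model up.
const ModelRecordSchema = {
  type: "object",
  properties: {
    ...ModelConfigSchema.properties,
    model_id: { type: "string" }, // repository primary key
  },
  required: [...ModelConfigSchema.required, "model_id"],
} as const;
```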
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| packages/ai/src/model/ModelSchema.ts | Introduces ModelConfigSchema (lightweight) and ModelRecordSchema (persistence) with distinct required fields |
| packages/ai/src/task/base/AiTask.ts | Adds getModelConfigForInput() method, updates validation to handle both string and object models, modifies job input generation to embed configs |
| packages/ai/src/task/base/AiTaskSchemas.ts | Updates model type schema reference from ModelSchema to ModelConfigSchema |
| packages/ai/src/task/base/AiVisionTask.ts | Updates type signatures to accept ModelConfig in addition to strings |
| packages/ai/src/job/AiJob.ts | Removes model repository lookup, directly uses ModelConfig from job input |
| packages/ai/src/provider/AiProviderRegistry.ts | Updates run function type signature to accept ModelConfig instead of ModelRecord |
| packages/ai/src/model/ModelRepository.ts | Updates repository to use ModelRecordSchema for persistence |
| packages/ai/src/model/InMemoryModelRepository.ts | Updates constructor to use ModelRecordSchema |
| packages/test/src/binding/SqliteModelRepository.ts | Updates tabular repository instantiation with ModelRecordSchema |
| packages/test/src/binding/PostgresModelRepository.ts | Updates tabular repository instantiation with ModelRecordSchema |
| packages/test/src/binding/IndexedDbModelRepository.ts | Updates tabular repository instantiation with ModelRecordSchema |
| packages/ai-provider/src/tf-mediapipe/common/TFMP_ModelSchema.ts | Adds TFMPModelConfig type extending the base model config schema |
| packages/ai-provider/src/tf-mediapipe/common/TFMP_JobRunFns.ts | Updates all run function signatures to accept TFMPModelConfig |
| packages/ai-provider/src/hf-transformers/common/HFT_ModelSchema.ts | Adds HfTransformersOnnxModelConfig type with proper schema separation |
| packages/ai-provider/src/hf-transformers/common/HFT_JobRunFns.ts | Updates type signatures and cache key logic to use modelPath:pipeline instead of model_id |
| packages/test/src/test/ai-provider/AiProviderRegistry.test.ts | Updates test to pass full model config object instead of string identifier |
```diff
+ const combined: (string | ModelConfig)[] = [...requestedInline, ...usingStrings];

  // we alter input to be the models that were found for this kind of input
- usingModels = usingModels.length > 1 ? usingModels : usingModels[0];
- (input as any)[key] = usingModels;
+ (input as any)[key] = combined.length > 1 ? combined : combined[0];
```
Copilot (AI) · Dec 20, 2025
If both requestedInline and usingStrings are empty, the combined array will be empty, and combined[0] will be undefined. This could lead to unexpected behavior when a single model is expected but all requested models were filtered out. Consider adding a check to handle this case explicitly or maintaining the original value when no compatible models are found.
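A minimal sketch of the guard the comment asks for, reusing the names from the hunk above; the rest of the surrounding method is assumed.

```ts
// `combined`, `input`, and `key` are the variables from the diff above; the
// surrounding method body is assumed.
if (combined.length > 0) {
  (input as any)[key] = combined.length > 1 ? combined : combined[0];
} else {
  // No compatible models remained after filtering: keep the original value
  // (or throw a descriptive error) rather than writing `undefined` into the input.
}
```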
```diff
  */

- import { ModelSchema } from "@workglow/ai";
+ import { ModelRecordSchema } from "@workglow/ai";
```
Copilot (AI) · Dec 20, 2025
The import statement is missing ModelConfigSchema which should be imported from "@workglow/ai" alongside ModelRecordSchema. This is needed to properly extend the ModelConfig schema for the TFMPModelConfig type.
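A one-line fix along the lines the comment suggests; the module path is taken from the hunk above, and whether this file ultimately needs both schemas is an assumption.

```ts
import { ModelConfigSchema, ModelRecordSchema } from "@workglow/ai";
```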
```ts
const ExtendedModelConfigSchema = {
  type: "object",
  properties: {
    ...ModelRecordSchema.properties,
    ...TFMPModelSchema.properties,
  },
  required: [...ModelRecordSchema.required, ...TFMPModelSchema.required],
```
Copilot (AI) · Dec 20, 2025
The ExtendedModelConfigSchema should extend ModelConfigSchema instead of ModelRecordSchema. This is inconsistent with the HFT implementation and defeats the purpose of having a lighter ModelConfig schema for task inputs. The required fields should be [...ModelConfigSchema.required, ...TFMPModelSchema.required] instead of [...ModelRecordSchema.required, ...].
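A hedged sketch of the correction, mirroring the HFT file the reviewer points to; it assumes `TFMPModelSchema` is defined in this file and exposes `properties`/`required` the same way the snippet above does.

```ts
import { ModelConfigSchema } from "@workglow/ai";

// Extend the lighter task-input schema rather than the persistence schema.
const ExtendedModelConfigSchema = {
  type: "object",
  properties: {
    ...ModelConfigSchema.properties,
    ...TFMPModelSchema.properties,
  },
  required: [...ModelConfigSchema.required, ...TFMPModelSchema.required],
} as const;
```

The follow-up commit below applies exactly this kind of swap from `ModelRecordSchema` to `ModelConfigSchema`.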
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Replaced `ModelRecordSchema` with `ModelConfigSchema` in the `ExtendedModelConfigSchema` definition.
- Adjusted required properties to align with the new schema structure.
- Improved code consistency and readability in the model schema file.
This PR is being reviewed by Cursor Bugbot
```diff
      taskModels?.find((m) => m.model_id === model)
    );

+   const combined: (string | ModelConfig)[] = [...requestedInline, ...usingStrings];
```
Inline models not filtered for task compatibility in narrowInput
The narrowInput method filters string models based on task compatibility but keeps all inline ModelConfig objects unfiltered. However, validateInput rejects inline configs with a tasks array that doesn't include the current task type. This inconsistency means incompatible inline configs pass through narrowInput but fail validation later. The purpose of narrowInput is to strip incompatible models in dataflows so validation succeeds when at least some models are compatible. Inline configs with explicit incompatible tasks arrays should be filtered out in narrowInput to maintain consistency with validateInput.
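A minimal sketch of the filtering Bugbot describes; `requestedInline`, `usingStrings`, and `ModelConfig` come from the hunk above, while `taskType` and the `tasks` field shape are assumptions based on the comment.

```ts
// Drop inline configs whose explicit `tasks` array excludes the current task type;
// configs without a `tasks` array are kept.
const compatibleInline = requestedInline.filter(
  (cfg: ModelConfig) => !cfg.tasks || cfg.tasks.includes(taskType)
);
const combined: (string | ModelConfig)[] = [...compatibleInline, ...usingStrings];
```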
```diff
    progressScaleMax: number = 10
  ) => {
-   const cacheKey = `${model.model_id}:${model.providerConfig.pipeline}`;
+   const cacheKey = `${model.providerConfig.modelPath}:${model.providerConfig.pipeline}`;
```
Pipeline cache key ignores device and dtype configuration
The pipeline cache key changed from model.model_id:pipeline to model.providerConfig.modelPath:pipeline. However, the pipeline is created with additional configuration options including dType (dtype) and device (lines 372, 376). Since these options aren't included in the cache key, two model configs with the same modelPath and pipeline but different dType or device settings will incorrectly share the same cached pipeline. This could cause models to run on the wrong device (e.g., CPU instead of WebGPU) or with the wrong precision, leading to incorrect behavior or errors.
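One possible shape for a cache key that also distinguishes device and precision; the exact `providerConfig` property names (`device`, `dtype`) are assumptions based on the comment.

```ts
// Include device and dtype so configs that differ only in those options do not
// share a cached pipeline.
const { modelPath, pipeline, device, dtype } = model.providerConfig;
const cacheKey = `${modelPath}:${pipeline}:${device ?? "default"}:${dtype ?? "default"}`;
```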
Closes #149
Introduce `ModelConfig` and `ModelRecord` schemas, allowing `AiTask` to accept full model configurations directly and enabling job workers to run without a model repository.

The original `AiTask` input schema suggested it could take a `ModelConfig` object, but the implementation only supported a string identifier, requiring a model repository lookup at runtime. This change aligns the implementation with the schema and decouples job execution from the model repository by embedding the necessary `ModelConfig` directly into job payloads. `ModelConfig` is a lighter schema for task inputs, while `ModelRecord` is used for repository persistence.
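For illustration, here are the two forms a task's `model` input might take after this change; the exact `AiTask` input shape, provider key, and model path are assumptions.

```ts
// Still supported: a string identifier, resolved through the model repository.
const inputById = { model: "some-registered-model-id" }; // hypothetical identifier

// New: a full ModelConfig embedded in the input, so the job worker needs no repository.
const inputByConfig = {
  model: {
    provider: "hf-transformers", // assumed provider key
    providerConfig: {
      modelPath: "Xenova/all-MiniLM-L6-v2", // illustrative model path
      pipeline: "feature-extraction",
    },
  },
};
```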
Note

Allow passing full `ModelConfig` in task/job inputs (no repo lookup), introduce `ModelConfigSchema` vs `ModelRecordSchema`, and update provider run functions and repositories accordingly.

- Introduces `ModelConfigSchema` (for task/job inputs) and `ModelRecordSchema` (for persistence); exports `ModelConfig`/`ModelRecord`.
- Repositories (`InMemoryModelRepository`, `ModelRepository`, test bindings) now use `ModelRecordSchema`.
- `AiTask`/`AiVisionTask` accept `string | ModelConfig`, resolve inline configs without the repository, and embed the `model` config into `AiJobInput` (using `model.provider` when a config is given).
- `AiJob` passes `ModelConfig` directly to run functions; no repository lookup.
- `AiProviderRegistry` types updated to use `ModelConfig`; worker wrapper forwards `[input, model]`.
- Adds `HfTransformersOnnxModelConfig`/`TFMPModelConfig`; the pipeline cache key uses `model.providerConfig.modelPath:pipeline` (not `model_id`).
- In `AiTaskSchemas`, `TypeModel*` now build from `ModelConfigSchema`.
- Tests pass full `model` objects and remove the repository dependency in `AiJob` execution.

Written by Cursor Bugbot for commit a29226e.