Add Cache Miss/Hit Test #97
Open
alex-jw-brooks wants to merge 8 commits into foundation-model-stack:main
Conversation
alex-jw-brooks force-pushed from 1be911b to ad3073c

This was referenced Jul 30, 2025 (Merged)

alex-jw-brooks force-pushed from ad3073c to fa5bd38
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
alex-jw-brooks force-pushed from fa5bd38 to 1566583
Contributor (Author) commented:

bot:test
JRosenkranz reviewed Oct 20, 2025
```python
model_kwargs = _get_common_model_kwargs(is_gptq, model_path)

# Get the AIU model w/ the persistent model fixture
model = persistent_model.get_or_create(
```
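The `get_or_create` call above suggests a fixture that caches models across tests so they are built at most once. A minimal sketch of that pattern, assuming a keyed cache over construction kwargs (class and method names here are illustrative, not the repo's actual implementation):

```python
class PersistentModel:
    """Caches created models keyed by their construction kwargs,
    so repeated tests reuse a model instead of rebuilding it."""

    def __init__(self):
        self._cache = {}

    def get_or_create(self, factory, **model_kwargs):
        # Build a hashable key from the kwargs; create only on first use.
        key = tuple(sorted(model_kwargs.items()))
        if key not in self._cache:
            self._cache[key] = factory(**model_kwargs)
        return self._cache[key]
```

The same instance can then be shared as a session-scoped pytest fixture, which is what makes reuse across the cache miss and cache hit checks possible.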
Contributor commented:

It looks like we are re-creating the model and validation_model when they're already being created in the fixture. Is this required?
Contributor (Author) replied:

Nope! Good point. I returned them both out of the fixture and deleted the re-creation from the cache hit check so that they'll be reused.
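The fix described here, returning both models out of the fixture so the cache hit check reuses them, could look roughly like this (the loader functions and fixture wiring are illustrative stand-ins, not the repo's actual code):

```python
import pytest

# Stand-in loaders for the AIU model and the CPU validation model
# (illustrative only; the real code builds these from model_kwargs).
def load_aiu_model():
    return {"name": "aiu"}

def load_validation_model():
    return {"name": "validation"}

@pytest.fixture(scope="module")
def models():
    # Create both models once per module; tests unpack and reuse them,
    # so the cache hit check no longer re-creates either model.
    return load_aiu_model(), load_validation_model()

def test_cache_hit(models):
    model, validation_model = models
    # ... exercise the cache hit path with the reused models ...
    assert model is not None and validation_model is not None
```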
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Contributor (Author) commented:

bot:test
JRosenkranz reviewed Oct 31, 2025

JRosenkranz approved these changes Oct 31, 2025
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Collaborator commented:

@JRosenkranz Is this good to go?
This PR builds on top of #20 and #93 to add a cache miss/hit test, using the refactored version of the test to allow some code reuse. #93 should probably be merged first (this is split out for readability).
Summary of changes (w.r.t. the original cache test PR)

- Makes sure gptq kwargs are passed through to the AIU model
- Makes sure `options={"sendnn.dynamic": COMPILE_DYNAMIC_SENDNN}` is passed consistently
- Clears the torch sendnn `.cache` object. The current PR can break if the cache test runs second, since the cache paths aren't actually reset in torch sendnn: we reset the compiler settings and clear the directory, but don't clear the spyre cache object, which causes alignment issues if the cache test doesn't run first
- The current PR runs the check as two tests (cache miss -> cache hit); this moves the cache miss step into a fixture that sets things up, so that cache hit can run as the test
Note that there is still some weirdness around how micro models are handled, mostly due to the way we configure common model paths / micro model usage, and also because we check thresholds based on whether micro models exist.
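The miss-then-hit structure from the summary above can be sketched as follows, with the cache miss run performed by setup and the test asserting that the hit path reuses (rather than rewrites) the cache. Directory handling and artifact names are purely illustrative:

```python
import os
import tempfile

def populate_cache(cache_dir):
    # Stand-in for the "cache miss" compile: the first run writes
    # compiled artifacts into the cache directory.
    with open(os.path.join(cache_dir, "entry.bin"), "wb") as f:
        f.write(b"compiled artifact")

def run_with_cache(cache_dir):
    # Stand-in for the "cache hit" run: it should reuse existing
    # artifacts, so the directory contents stay unchanged.
    return sorted(os.listdir(cache_dir))

# Setup (the fixture's job): cache miss populates the cache.
cache_dir = tempfile.mkdtemp()
populate_cache(cache_dir)
before = sorted(os.listdir(cache_dir))

# Test (cache hit): no new entries should appear.
after = run_with_cache(cache_dir)
assert before == after
```

Keeping the miss step in a fixture also means a stale spyre cache object from an earlier run can't leak into the hit assertion, which matches the ordering problem the summary describes.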