Add Cache Miss/Hit Test #97
Open
alex-jw-brooks wants to merge 8 commits into foundation-model-stack:main
Conversation
alex-jw-brooks force-pushed from 1be911b to ad3073c

This was referenced Jul 30, 2025 (Merged)

alex-jw-brooks force-pushed from ad3073c to fa5bd38
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
alex-jw-brooks force-pushed from fa5bd38 to 1566583
Contributor (Author) commented:

bot:test
JRosenkranz reviewed Oct 20, 2025
```python
model_kwargs = _get_common_model_kwargs(is_gptq, model_path)

# Get the AIU model w/ the persistent model fixture
model = persistent_model.get_or_create(
```
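The `get_or_create` call above suggests a fixture that caches models across tests so they are built at most once. A minimal sketch of that pattern, assuming a keyed cache over construction kwargs (class and method names here are illustrative, not the repo's actual implementation):

```python
class PersistentModel:
    """Caches created models keyed by their construction kwargs,
    so repeated tests reuse a model instead of rebuilding it."""

    def __init__(self):
        self._cache = {}

    def get_or_create(self, factory, **model_kwargs):
        # Build a hashable key from the kwargs; create only on first use.
        key = tuple(sorted(model_kwargs.items()))
        if key not in self._cache:
            self._cache[key] = factory(**model_kwargs)
        return self._cache[key]
```

The same instance can then be shared as a session-scoped pytest fixture, which is what makes reuse across the cache miss and cache hit checks possible.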
Contributor commented:

It looks like we are re-creating the model and validation_model when they're already being created in the fixture. Is this required?
Contributor (Author) replied:

Nope! Good point. I returned them both out of the fixture and deleted the re-creation from the cache hit check so that they'll be reused.
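The fix described here, returning both models out of the fixture so the cache hit check reuses them, could look roughly like this (the loader functions and fixture wiring are illustrative stand-ins, not the repo's actual code):

```python
import pytest

# Stand-in loaders for the AIU model and the CPU validation model
# (illustrative only; the real code builds these from model_kwargs).
def load_aiu_model():
    return {"name": "aiu"}

def load_validation_model():
    return {"name": "validation"}

@pytest.fixture(scope="module")
def models():
    # Create both models once per module; tests unpack and reuse them,
    # so the cache hit check no longer re-creates either model.
    return load_aiu_model(), load_validation_model()

def test_cache_hit(models):
    model, validation_model = models
    # ... exercise the cache hit path with the reused models ...
    assert model is not None and validation_model is not None
```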
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Contributor (Author) commented:

bot:test
JRosenkranz reviewed Oct 31, 2025

JRosenkranz approved these changes Oct 31, 2025
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Collaborator commented:

@JRosenkranz Is this good to go?
This PR builds on top of #20 and #93 to add a cache miss/hit test, using the refactored version of the test to allow some code reuse. #93 should probably be merged first (this is split out for readability).
Summary of changes (w.r.t. the original cache test PR)

- Makes sure gptq kwargs are passed through to the AIU model
- Makes sure `options={"sendnn.dynamic": COMPILE_DYNAMIC_SENDNN}` is passed consistently
- Clears the torch sendnn `.cache` object. The current PR can break if the cache test runs second, since the cache paths aren't actually reset in torch sendnn: we reset the compiler settings and clear the directory, but don't clear the spyre cache object, which causes alignment issues if the cache test doesn't run first
- The current PR runs the check as two tests (cache miss -> cache hit); this moves the cache miss step into a fixture that sets things up, so that cache hit can run as the test
Note that there is still some weirdness around how micro models are handled, mostly due to the way we configure common model paths / micro model usage, and also because we check thresholds based on whether micro models exist.
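The miss-then-hit structure from the summary above can be sketched as follows, with the cache miss run performed by setup and the test asserting that the hit path reuses (rather than rewrites) the cache. Directory handling and artifact names are purely illustrative:

```python
import os
import tempfile

def populate_cache(cache_dir):
    # Stand-in for the "cache miss" compile: the first run writes
    # compiled artifacts into the cache directory.
    with open(os.path.join(cache_dir, "entry.bin"), "wb") as f:
        f.write(b"compiled artifact")

def run_with_cache(cache_dir):
    # Stand-in for the "cache hit" run: it should reuse existing
    # artifacts, so the directory contents stay unchanged.
    return sorted(os.listdir(cache_dir))

# Setup (the fixture's job): cache miss populates the cache.
cache_dir = tempfile.mkdtemp()
populate_cache(cache_dir)
before = sorted(os.listdir(cache_dir))

# Test (cache hit): no new entries should appear.
after = run_with_cache(cache_dir)
assert before == after
```

Keeping the miss step in a fixture also means a stale spyre cache object from an earlier run can't leak into the hit assertion, which matches the ordering problem the summary describes.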