Merged
114 commits
f197700
fix
gmlwns2000 Aug 11, 2025
478ae42
fix self extend delta
gmlwns2000 Aug 12, 2025
4e43315
fit gpt oss
gmlwns2000 Aug 12, 2025
2d30fc6
disable delta glm
gmlwns2000 Aug 12, 2025
7eab17a
added block sums and bsa index to query sparse attention
jeffwillette Jul 10, 2025
9e43cb1
refactor
gmlwns2000 Jul 14, 2025
76041ee
refactor
gmlwns2000 Jul 15, 2025
9ef3f6b
fixed bug in test code
jeffwillette Jul 19, 2025
bea3860
changed bsa_block_size_q default arg to 1
jeffwillette Jul 19, 2025
db385e2
commit before incorporating changes into upstream branch
jeffwillette Aug 8, 2025
c6f870b
fix
gmlwns2000 Aug 12, 2025
9efdef4
wow
gmlwns2000 Aug 12, 2025
46540e2
add visualization
gmlwns2000 Aug 12, 2025
a743eba
fix
gmlwns2000 Aug 12, 2025
c42b580
update
gmlwns2000 Aug 12, 2025
fa68b31
fix bug
gmlwns2000 Aug 12, 2025
ff96a50
fix
gmlwns2000 Aug 13, 2025
40e3232
handle qsa bsa
gmlwns2000 Aug 13, 2025
e1ab25d
wip
daniel-geon-park Aug 13, 2025
0514b44
wip
daniel-geon-park Aug 18, 2025
8b4b63c
Merge branch 'feat/delta-bsa-hip' of github.com:DeepAuto-AI/hip-atten…
daniel-geon-park Aug 18, 2025
e2fdcf6
added implementation of winner min-heap inside qsa
jeffwillette Aug 14, 2025
8ac0b5e
debugging using_exp_sum bug
jeffwillette Aug 14, 2025
ff84c21
qsa winner heap working here
jeffwillette Aug 16, 2025
92b0de7
minor cleanup on qsa code.
jeffwillette Aug 16, 2025
3f6a4d4
heap verified to return same indices as plain online top-k
jeffwillette Aug 16, 2025
6e5d8b3
reordered code for minor cleanup
jeffwillette Aug 17, 2025
bb0fe68
reordered the qsa top-k loop
jeffwillette Aug 17, 2025
75fd0d4
cleaned up qsa kernel and added tests
jeffwillette Aug 18, 2025
5f4cf9d
added back in full hip/bsa tests
jeffwillette Aug 18, 2025
d7ba5fe
fmt
gmlwns2000 Aug 19, 2025
8dc169d
what is happend here :(
gmlwns2000 Aug 19, 2025
05c728a
fix to use int32 in QSA masking
gmlwns2000 Aug 19, 2025
d0b090a
fix autotune
gmlwns2000 Aug 19, 2025
95e3828
chnage score to fp32
gmlwns2000 Aug 19, 2025
3047eb3
wip
daniel-geon-park Aug 19, 2025
ceffd32
wip
daniel-geon-park Aug 20, 2025
e67d29e
Merge branch 'feat/delta-bsa-hip-rebase' into feat/delta-bsa-geon
daniel-geon-park Aug 20, 2025
876933c
fix
gmlwns2000 Aug 23, 2025
99adaf7
added implementation of winner min-heap inside qsa
jeffwillette Aug 14, 2025
7deefe8
debugging using_exp_sum bug
jeffwillette Aug 14, 2025
46df21d
qsa winner heap working here
jeffwillette Aug 16, 2025
c01c9a9
heap verified to return same indices as plain online top-k
jeffwillette Aug 16, 2025
c536d63
reordered code for minor cleanup
jeffwillette Aug 17, 2025
902ccfb
reordered the qsa top-k loop
jeffwillette Aug 17, 2025
43e1bfb
cleaned up qsa kernel and added tests
jeffwillette Aug 18, 2025
72aeaa0
added back in full hip/bsa tests
jeffwillette Aug 18, 2025
5400fa5
fixed bugs in qsa top-k's
jeffwillette Aug 20, 2025
f142e70
removed block pointers from qsa kernel
jeffwillette Aug 21, 2025
c1bb572
converted pointers to all int64
jeffwillette Aug 22, 2025
edaf477
fix
gmlwns2000 Aug 23, 2025
22bcb61
removed functions which snuck in via rebase
jeffwillette Aug 23, 2025
ac71de0
wip
daniel-geon-park Aug 23, 2025
73371d3
add EXACT_K
daniel-geon-park Aug 23, 2025
22a1c48
wip
daniel-geon-park Aug 23, 2025
a422c9d
Merge remote-tracking branch 'origin/research/delta-qsa' into feat/de…
daniel-geon-park Aug 23, 2025
fbed768
added os environ args for heap,top_k,reverse
jeffwillette Aug 24, 2025
3282b30
wip
daniel-geon-park Aug 24, 2025
2dc2aa8
wip
daniel-geon-park Aug 25, 2025
be23e44
added logsumexp trick for qsa kernel top-k
jeffwillette Aug 27, 2025
7c2cedc
add override envvars
daniel-geon-park Aug 27, 2025
235230e
Merge remote-tracking branch 'origin/research/delta-qsa' into feat/de…
daniel-geon-park Aug 27, 2025
d254a9e
fix tests
daniel-geon-park Aug 27, 2025
c54b3e9
fix type hint
daniel-geon-park Aug 27, 2025
25c63ac
Merge pull request #81 from DeepAuto-AI/feat/delta-bsa-geon
jeffwillette Aug 27, 2025
9d87c24
fix est test
daniel-geon-park Aug 27, 2025
624c751
change query_sparse_attention interface
daniel-geon-park Aug 27, 2025
89b80d3
make estimate compatible with tree
daniel-geon-park Aug 27, 2025
54d6c58
fix constexpr error
daniel-geon-park Aug 27, 2025
44f40c9
Merge pull request #82 from DeepAuto-AI/feat/delta-bsa-geon
daniel-geon-park Aug 27, 2025
06c4ae8
fix
gmlwns2000 Sep 2, 2025
7bc6bef
Merge pull request #83 from DeepAuto-AI/feat/delta-qsa-ain
gmlwns2000 Sep 2, 2025
34054e4
add row_sums, row_sums_bsa
daniel-geon-park Aug 29, 2025
89eaf5a
fix estimation
daniel-geon-park Aug 29, 2025
fe3a3d6
add overestimate_factor
daniel-geon-park Aug 29, 2025
3d73b9f
testing
daniel-geon-park Aug 29, 2025
50444bd
fix bug
gmlwns2000 Sep 2, 2025
4b25beb
fix
gmlwns2000 Sep 2, 2025
173ca1a
fmt
gmlwns2000 Sep 2, 2025
e69ad6b
Merge pull request #84 from DeepAuto-AI/feat/delta-qsa-geon
gmlwns2000 Sep 2, 2025
451e06b
fix reverse_iter environment variable bug
daniel-geon-park Sep 2, 2025
905d7be
fix bugs
gmlwns2000 Sep 5, 2025
eb04b72
fmt
gmlwns2000 Sep 5, 2025
f2260de
Merge pull request #85 from DeepAuto-AI/feat/delta-qsa-geon
gmlwns2000 Sep 5, 2025
0252bb2
fix sliding_window estimate bug
daniel-geon-park Sep 5, 2025
0f3ab2b
added latency test
jeffwillette Sep 5, 2025
72bae69
updated latency test
jeffwillette Sep 5, 2025
dfbdd04
removed union topk test
jeffwillette Sep 5, 2025
7a804f5
added e2e latency update
jeffwillette Sep 8, 2025
21b905a
fixed autotune dirty init
jeffwillette Sep 10, 2025
7ab0e59
fix
gmlwns2000 Sep 5, 2025
3a26074
fix
gmlwns2000 Sep 8, 2025
c08c96a
update
gmlwns2000 Sep 11, 2025
8a36a8f
fix display
gmlwns2000 Sep 11, 2025
e7bc542
fmt
gmlwns2000 Sep 11, 2025
1843ab0
Merge pull request #86 from DeepAuto-AI/feat/delta-qsa-geon
gmlwns2000 Sep 11, 2025
593e55e
mark as exec
gmlwns2000 Sep 11, 2025
f31773c
hotfix
gmlwns2000 Sep 11, 2025
42eb92e
quick fix
gmlwns2000 Sep 11, 2025
757abc1
fmt
gmlwns2000 Sep 11, 2025
f94d7c0
Merge pull request #88 from DeepAuto-AI/feat/delta-qsa-fix
gmlwns2000 Sep 11, 2025
7753f25
added bsa meanpool attention to paged hip
jeffwillette Sep 12, 2025
05ccb0a
bsa_meanpool bugfix
jeffwillette Sep 12, 2025
da2cfc3
minor bugfix to bsa_meanpool
jeffwillette Sep 12, 2025
6fb29d1
removed mask_n in forward_bsa_meanpool
jeffwillette Sep 12, 2025
55c62a1
removed addition of block_size_q to active_mask in forward_bsa_meanpool
jeffwillette Sep 12, 2025
3bd1793
fix
gmlwns2000 Sep 11, 2025
c8e0c09
fmt
gmlwns2000 Sep 11, 2025
8e9bdfd
fix
gmlwns2000 Sep 13, 2025
2497286
added padding to bsa_meanpool attn
jeffwillette Sep 13, 2025
d098bee
fix
gmlwns2000 Sep 13, 2025
3b1bccd
fix
gmlwns2000 Sep 15, 2025
f209261
fix
gmlwns2000 Sep 25, 2025
930f558
fmt
gmlwns2000 Oct 1, 2025
7 changes: 4 additions & 3 deletions configs/mixed_landmark_0814_no_extend_qsa.json
@@ -2,10 +2,11 @@
 "__head_reduce": 0,
 "__last_dense": -1,
 "__using_landmark": false,
-"__seq_thresh_fa3": 131072,
-"__delta_attention_args": "window_0-diff_1-w_8-dense_decode-smooth",
+"__using_dense_prefill": true,
+"__seq_thresh_fa3": 0,
+"__delta_attention_args": "window_0-diff_1-w_16-dense_decode-smooth",
 "using_extend": false,
-"dense_layers": [0],
+"dense_layers": [0, 1, 2, 47, 46, 45],
 "mask_refresh_interval": [96],
 "layers": [
 {
45 changes: 45 additions & 0 deletions configs/qwen3_30b_a3b_config_1m.json

Large diffs are not rendered by default.

65 changes: 65 additions & 0 deletions configs/qwen3_30b_a3b_qsa.json
@@ -0,0 +1,65 @@
{
"__delta_attention_args": "window_0-diff_1-w_16-dense_decode-smooth",
"__using_dense_prefill": true,
"__head_reduce": 2,
"using_extend": false,
"dense_layers": [
0,
1,
2,
3,
4,
5,
8,
11,
14,
17,
20,
23,
26,
29,
30,
33,
36,
39,
41,
42,
43,
44,
45,
46,
47
],
"mask_refresh_interval": [64, 24, 8],
"prefill_layers": [
{},
{
"second_stage_k": 2048,
"sliding_window_size": 512,
"sink_token_size": 64,
"stages": [
{
"stage_block_size_q": 64,
"stage_block_stride_q": 2,
"stage_chunk_size": 64,
"stage_k": null,
"stage_stride": 1
},
{
"stage_block_size_q": 64,
"stage_block_stride_q": 2,
"stage_chunk_size": 32,
"stage_k": 16384,
"stage_stride": 1
},
{
"stage_block_size_q": 64,
"stage_block_stride_q": 1,
"stage_chunk_size": 16,
"stage_k": 4096,
"stage_stride": 1
}
]
}
]
}
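The `qwen3_30b_a3b_qsa.json` file above encodes a coarse-to-fine prefill schedule: each successive stage halves `stage_chunk_size` (64 → 32 → 16) while `stage_k` tightens from `null` (keep everything) to 16384 to 4096. A small sketch of a sanity check for that invariant, assuming only the key names visible in the config (`prefill_layers`, `stages`, `stage_chunk_size`, `stage_k`); the checker itself is illustrative, not part of the project:

```python
import json

# A trimmed version of the qwen3_30b_a3b_qsa.json schedule above;
# key names match the config, the helper itself is illustrative.
CONFIG = json.loads("""
{
  "prefill_layers": [
    {},
    {
      "second_stage_k": 2048,
      "stages": [
        {"stage_chunk_size": 64, "stage_k": null},
        {"stage_chunk_size": 32, "stage_k": 16384},
        {"stage_chunk_size": 16, "stage_k": 4096}
      ]
    }
  ]
}
""")

def check_stage_schedule(config: dict) -> bool:
    """Check that every layer's stages refine coarse-to-fine:
    chunk sizes shrink and stage_k, once set, only decreases."""
    for layer in config.get("prefill_layers", []):
        stages = layer.get("stages", [])
        for prev, cur in zip(stages, stages[1:]):
            # Chunks must shrink so candidate selection gets more precise.
            if cur["stage_chunk_size"] > prev["stage_chunk_size"]:
                return False
            # Once a stage prunes to stage_k candidates, later stages
            # must keep pruning to at most that many.
            if prev["stage_k"] is not None and (
                cur["stage_k"] is None or cur["stage_k"] > prev["stage_k"]
            ):
                return False
    return True

print(check_stage_schedule(CONFIG))  # → True
```

A check like this catches the easy-to-make editing mistake of swapping two stage entries, which would silently make the later stage coarser than the earlier one.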
2 changes: 1 addition & 1 deletion configs/rebuttal/llama31_extend.json
@@ -5,7 +5,7 @@
 "__seq_thresh_fa3": 0,
 "using_extend": true,
 "self_extend_scale": 36,
-"dense_layers": [0,1,2],
+"dense_layers": [0, 1, 2],
 "mask_refresh_interval": [32, 16, 8],
 "layers": [
 {
66 changes: 66 additions & 0 deletions configs/rebuttal/llama31_noextend_dense_decode.json
@@ -0,0 +1,66 @@
{
"__head_reduce": 0,
"__using_dense_prefill": true,
"__last_dense": -1,
"__using_landmark": false,
"__seq_thresh_fa3": 0,
"__delta_attention_args": "window_0-diff_2-w_256-dense_decode",
"using_extend": false,
"self_extend_scale": 2,
"dense_layers": [0,1,2],
"mask_refresh_interval": [32, 16, 8],
"layers": [
{
"sliding_window_size": 131072,
"second_stage_k": 0,
"sink_token_size": 0,
"sa_extend_backend": "self_extend"
},
{
"sliding_window_size": 131072,
"second_stage_k": 0,
"sink_token_size": 0,
"sa_extend_backend": "self_extend"
}
],
"prefill_layers": [
{
"sliding_window_size": 131072,
"second_stage_k": 0,
"sink_token_size": 0,
"sa_extend_backend": "self_extend"
},
{
"sliding_window_size": 1024,
"second_stage_k": 2048,
"sink_token_size": 256,
"sa_extend_backend": "self_extend",
"scan_extend_backend": "none",
"stages": [
{
"stage_block_size_q":64,
"stage_block_stride_q":2,
"stage_chunk_size":128,
"stage_k":null,
"stage_stride":1
},
{
"stage_block_size_q":64,
"stage_block_stride_q":2,
"stage_chunk_size":32,
"stage_k":32768,
"stage_stride":1,
"using_landmark":true
},
{
"stage_block_size_q":64,
"stage_block_stride_q":1,
"stage_chunk_size":8,
"stage_k":8192,
"stage_stride":1,
"using_landmark":true
}
]
}
]
}
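The `__delta_attention_args` values in these configs ("window_0-diff_2-w_256-dense_decode", "window_0-diff_1-w_16-dense_decode-smooth") pack options into a hyphen-separated token list: `key_value` pairs plus bare flags. A minimal sketch of how such a string could be parsed, with the format inferred from the values above; the project's actual parser may differ:

```python
def parse_delta_args(spec: str) -> dict:
    """Parse a hyphen-separated option string such as
    "window_0-diff_2-w_256-dense_decode" into a dict.
    Numeric suffixes become int values; other tokens become flags."""
    opts: dict = {}
    for token in spec.split("-"):
        key, sep, value = token.rpartition("_")
        if sep and value.isdigit():
            opts[key] = int(value)   # e.g. "w_256" -> {"w": 256}
        else:
            opts[token] = True       # bare flag, e.g. "smooth"
    return opts

print(parse_delta_args("window_0-diff_2-w_256-dense_decode"))
# → {'window': 0, 'diff': 2, 'w': 256, 'dense_decode': True}
```

Note that "dense_decode" contains an underscore but no numeric suffix, so `rpartition` plus the `isdigit` check keeps it intact as a flag rather than splitting it into `{"dense": "decode"}`.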
1 change: 1 addition & 0 deletions hip-research/src/hip_research/main/chat.py
@@ -152,6 +152,7 @@
 "model": "anything",
 "messages": [{"role": "system", "content": sys_prompt}] + chat_log,
 "stream": True,
+"max_tokens": 32768,
 "temperature": 0.7,
 "top_p": 0.8,
 "top_k": 20,
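The one-line change to `chat.py` adds an explicit `max_tokens` cap to the streaming request payload; without it, many OpenAI-compatible servers fall back to a short default completion length and truncate long generations. A sketch of the payload in context, with `sys_prompt` and `chat_log` as stand-in values (the surrounding request code is not shown in this diff):

```python
# Stand-ins for the variables chat.py builds elsewhere.
sys_prompt = "You are a helpful assistant."
chat_log = [{"role": "user", "content": "Hello"}]

# Payload for an OpenAI-compatible chat-completions endpoint,
# mirroring the diff above.
payload = {
    "model": "anything",  # placeholder model id, as in chat.py
    "messages": [{"role": "system", "content": sys_prompt}] + chat_log,
    "stream": True,
    # Explicit cap added by this PR: allows generations up to 32768
    # tokens instead of the server's (often small) default limit.
    "max_tokens": 32768,
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
}
print(payload["max_tokens"])  # → 32768
```

`top_k` is not part of the upstream OpenAI schema, but servers such as vLLM accept it as an extra sampling parameter, which fits the local-serving setup this repo targets.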