Zb/illico fixes #4102
Conversation
zboldyga
commented
May 5, 2026
- Closes #
- Tests included or not required because:
- Release notes not necessary because:
@ilan-gold this is a work in progress, but I'm opening a draft to keep you in the loop. This PR is scoped only to fixes to the current illico integration; the three commits each correspond to a distinct fix. As for the groupby change, I want to double-check it and refine it a bit more, but the prior version depends on illico's internal API/decisions and is slower than the proposed approach (by a few seconds of runtime on some datasets I tested, potentially more for bigger ones).
Nice, this looks like a great start (and a big cleanup over what I had!)
@ilan-gold OK, I reviewed carefully and I feel this is sufficient/complete for the scope of this PR. Mostly a note to self, but overall, if/when illico is more tightly integrated, the whole flow from X to the final stats can involve fewer operations. I suspect the unstack step won't be needed (the _to_iter function will be removed later, and its tests as well).
@zboldyga This looks good overall. Just to understand: was this responsible for the performance difference?
@ilan-gold I'm glad you asked, haha. I ran a wider benchmark, then added some new alternatives and found a better approach. But yes, the intention here is twofold: some minor test adjustments, and a performance improvement so that scanpy's illico integration is closer to the speed and memory overhead of using illico directly. Here's benchmarking across 14 real perturb-seq datasets. Note that groupby_dict was the former approach, unstack_reindex was what I had prior to the latest commit (it was not helping, tbh), and streaming_loc is what I settled on (very simple: it just walks the data as it exists in memory, with little overhead). Also note that this latest approach is streaming-friendly, e.g. it will work with backed files.

[Figure: Total time (ms), datasets ordered by n_cells (ascending)]

[Figure: phys_footprint delta (MB), peak memory sampled at 5 ms]
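For readers following along, here's a minimal sketch of the streaming_loc idea (not the actual PR code; the function name, shapes, and the absence of any illico-specific API are all assumptions): instead of materializing a dict of per-group row indices or doing an unstack/reindex round-trip, a single pass accumulates per-group sums and counts in storage order, which is why it also works chunk-by-chunk on backed files.

```python
import numpy as np

def streaming_group_means(X, labels, n_groups):
    """Single-pass per-group mean over a (n_cells, n_features) matrix.

    Walks the rows in the order they sit in memory, accumulating sums
    and counts per group. No intermediate index dict and no unstack
    step, so the same loop works on chunks read from a backed file.
    """
    sums = np.zeros((n_groups, X.shape[1]), dtype=np.float64)
    counts = np.zeros(n_groups, dtype=np.int64)
    for row, g in zip(X, labels):
        sums[g] += row
        counts[g] += 1
    return sums / counts[:, None]

# Tiny usage example with two groups:
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
labels = np.array([0, 1, 0])
means = streaming_group_means(X, labels, n_groups=2)
```

(The explicit Python loop is only for clarity; in practice the inner accumulation would be vectorized, e.g. via `np.add.at`, or run per chunk.)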
So we're saving time and memory, and the code is pretty easy to read.