Before | After |
|
Oh very nice! |
|
I implemented one optimisation here, which reduces the number of memory accesses. Before the patch, the
→ we now perform one memory access per recursive call |
We achieve this by using a representation of the form (beginning index, ending index) for the slices rather than (beginning index, length). This mostly reduces the number of arithmetic operations in merge_hi, while merge_lo remains more or less untouched. This has a noticeable impact in the benchmark!
|
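To make the (beginning index, ending index) idea concrete, here is a minimal sketch of what a merge_hi in that style could look like. It is not the code from this patch: the argument order follows the diff fragments, but the body, the separate `dest` array and the `merge_arrays` wrapper are assumptions made for illustration. Bounds are inclusive, and `x0` and `x1` cache the last element of each run, so each recursive call reads exactly one array cell.

```ocaml
(* Hypothetical sketch, not the patch itself. Merges the sorted runs
   src0.(beg0..end0) and src1.(beg1..end1) (inclusive bounds) into dest,
   filling dest downwards from index end_.
   x0 caches src0.(end0) and x1 caches src1.(end1). *)
let rec merge_hi cmp dest end_ src0 beg0 end0 x0 src1 beg1 end1 x1 =
  assert (x0 = src0.(end0));
  assert (x1 = src1.(end1));
  if cmp x0 x1 > 0 then begin
    (* x0 is strictly greater, so it goes last. *)
    dest.(end_) <- x0;
    if end0 = beg0 then
      (* run0 is exhausted: copy what is left of run1 just below end_. *)
      Array.blit src1 beg1 dest (end_ - 1 - (end1 - beg1)) (end1 - beg1 + 1)
    else
      (* Only the end bound moves; with (beg, len) the last index would
         have to be recomputed from the offset and the length. *)
      merge_hi cmp dest (end_ - 1)
        src0 beg0 (end0 - 1) src0.(end0 - 1)
        src1 beg1 end1 x1
  end else begin
    (* On ties the element of run1 goes last, which keeps the merge stable. *)
    dest.(end_) <- x1;
    if end1 = beg1 then
      Array.blit src0 beg0 dest (end_ - 1 - (end0 - beg0)) (end0 - beg0 + 1)
    else
      merge_hi cmp dest (end_ - 1)
        src0 beg0 end0 x0
        src1 beg1 (end1 - 1) src1.(end1 - 1)
  end

(* Hypothetical wrapper, for the example only: merges two sorted arrays
   into a fresh one. *)
let merge_arrays cmp a b =
  let la = Array.length a and lb = Array.length b in
  if la = 0 then Array.copy b
  else if lb = 0 then Array.copy a
  else begin
    let dest = Array.make (la + lb) a.(0) in
    merge_hi cmp dest (la + lb - 1)
      a 0 (la - 1) a.(la - 1)
      b 0 (lb - 1) b.(lb - 1);
    dest
  end
```

For instance, `merge_arrays compare [|1; 3; 5|] [|2; 3; 4; 6|]` evaluates to `[|1; 2; 3; 3; 4; 5; 6|]`.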
And with the last commit: |
Niols
left a comment
A few comments but I approve the changes! Don't we want to update merge as well to use two offsets? Like where is it clever to go from beg/len to beg/end?
| src1 ofs1 len1 | ||
| dest beg | ||
| src0 beg0 end0 x0 | ||
| src1 beg1 end1 x1 |
I think it would be worth having comments like (* x0 = src0.(beg0) *) already at this point. It's readable in the assertions below but I think it's better to have it here!
| assert (end0 >= beg0); | ||
| assert (end1 >= beg1); | ||
|
|
||
| (* This is used to optimise the case len0 = 1 below. *) |
This comment does not make sense to me. I think it would deserve more text!
| src1 ofs1 len1 | ||
| dest end_ | ||
| src0 beg0 end0 x0 (* run0 *) | ||
| src1 beg1 end1 x1 (* run1 *) |
| assert (x1 = src1.(end1)); | ||
| assert (end0 >= beg0); | ||
| assert (end1 >= beg1); | ||
| (* This is used to optimise the case len1 = 1 below. *) |
| merge_hi | ||
| cmp | ||
| dest (end_ - 1) | ||
| src0 beg0 (end0 - 1) src0.(end0 - 1) |
We use end0 - 1 twice here, is it worth having an intermediary value? (Same in other branch; same in merge_lo.)
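As an illustration of that suggestion, here is a small sketch with the decremented bound bound to a name. `drain_hi` is a hypothetical simplification of one branch of merge_hi (only run0 is consumed, so the comparison is dropped); it is not code from the patch. The native compiler may already share the repeated subexpression, so this is probably more about readability than speed.

```ocaml
(* Hypothetical simplification of one merge_hi branch: copies
   src0.(beg0..end0) (inclusive bounds) into dest, filling downwards from
   index end_. x0 caches src0.(end0). *)
let rec drain_hi dest end_ src0 beg0 end0 x0 =
  assert (x0 = src0.(end0));
  dest.(end_) <- x0;
  if end0 > beg0 then begin
    (* Name the new bound once instead of writing end0 - 1 twice. *)
    let end0' = end0 - 1 in
    drain_hi dest (end_ - 1) src0 beg0 end0' src0.(end0')
  end
```

The same shape would apply to the other branch and to merge_lo, with `beg0 + 1` in place of `end0 - 1`.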
Some optimisations on the merge_{hi,lo} functions