Optimize Eq and Ord for LazyByteString using pointer equality #404
This is inspired by a discussion in Haskell-Cafe: https://mail.haskell.org/pipermail/haskell-cafe/2021-June/134073.html
TODO for myself:
  | otherwise = case compare al bl of
      LT -> a == S.BS bp al && eq as (Chunk (S.BS (S.plusForeignPtr bp al) (bl - al)) bs)
      EQ -> a == b && eq as bs
      GT -> S.BS ap bl == b && eq (Chunk (S.BS (S.plusForeignPtr ap bl) (al - bl)) as) bs
In the LT and GT cases, we'd recursively run the pointer equality checks on freshly allocated Chunks, which is totally wasteful: a Chunk that was just allocated can never be pointer-equal to anything on the other side.
It might be better to call an "inner" function here which doesn't perform the pointer equality check.
cmp has the same issue.
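A minimal sketch of the "inner" function idea, not the code from this patch. The names `ptrEq`, `eqLazy` and `eqInner` are hypothetical, and the sketch splits chunks with `unsafeTake`/`unsafeDrop` rather than the `S.BS`/`S.plusForeignPtr` construction used in the diff. Handing control back to the checking wrapper once chunk boundaries realign is an assumption of this sketch, not something proposed in the review.

```haskell
{-# LANGUAGE MagicHash #-}

import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Internal as LI
import qualified Data.ByteString.Unsafe as U
import GHC.Exts (isTrue#, reallyUnsafePtrEquality#)

-- Hypothetical helper: True only if both arguments are the same heap object.
-- A False result says nothing; the values may still be equal.
ptrEq :: a -> a -> Bool
ptrEq x y = isTrue# (reallyUnsafePtrEquality# x y)

-- Outer function: try the sharing shortcut, then fall back to the worker.
eqLazy :: L.ByteString -> L.ByteString -> Bool
eqLazy xs ys
  | ptrEq xs ys = True
  | otherwise   = eqInner xs ys

-- Inner function: plain structural comparison without a pointer check.
eqInner :: L.ByteString -> L.ByteString -> Bool
eqInner LI.Empty LI.Empty = True
eqInner LI.Empty _        = False
eqInner _        LI.Empty = False
eqInner (LI.Chunk a as) (LI.Chunk b bs) =
  case compare la lb of
    -- One side is a freshly allocated Chunk here, so a pointer check
    -- could never succeed: stay in the inner function.
    LT -> a == U.unsafeTake la b
            && eqInner as (LI.Chunk (U.unsafeDrop la b) bs)
    -- Chunk boundaries line up, so both tails are original values again
    -- and the pointer check is worth retrying.
    EQ -> a == b && eqLazy as bs
    GT -> U.unsafeTake lb a == b
            && eqInner (LI.Chunk (U.unsafeDrop lb a) as) bs
  where
    la = S.length a
    lb = S.length b
```

The same split into a checking wrapper and a non-checking worker would apply to cmp.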
I suspect that sharing-based equality checks on lazy data structures are inherently at odds with referential transparency, because of infinite and partially defined values. So it's my opinion that these comparison functions should only be offered from an Unsafe module, if at all. I will also point out that since the Eq instance for strict ByteString already performs a sharing-based equality check, the existing Eq instance for lazy ByteString should already be pretty fast in most cases where there is a long shared tail. The same does not appear to be true for the Ord instance.
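To make the concern about infinite and partially defined values concrete, here is a small sketch. `sharedEq`, `ones` and `partial` are hypothetical names, and `sharedEq` is a simplified stand-in for a sharing-based comparison, not the instance from this patch.

```haskell
{-# LANGUAGE MagicHash #-}

import qualified Data.ByteString.Lazy as L
import GHC.Exts (isTrue#, reallyUnsafePtrEquality#)

-- Hypothetical comparison: a sharing check in front of the ordinary (==).
sharedEq :: L.ByteString -> L.ByteString -> Bool
sharedEq xs ys = isTrue# (reallyUnsafePtrEquality# xs ys) || xs == ys

-- An infinite value: `ones == ones` never terminates.
ones :: L.ByteString
ones = L.cycle (L.pack [1])

-- A partial value: `partial == partial` hits `undefined` after the
-- first chunk has been compared.
partial :: L.ByteString
partial = L.append (L.pack [1, 2, 3]) undefined

-- `sharedEq ones ones` and `sharedEq partial partial` can return True
-- when the pointer check happens to succeed. Whether it succeeds depends
-- on how the program was compiled and evaluated, so the result is no
-- longer determined by the values of the arguments alone.
```

On finite, total arguments `sharedEq` always agrees with `(==)`; the difference only shows up for values like `ones` and `partial`.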
Thanks for your comments, @clyring!
Could you clarify where exactly you see the problem? My intention was that the changed instances would behave just like the old ones. But maybe this won't work out?!
I'm also not convinced yet that this patch will pay off performance-wise.
These comparators give the same result as those in the existing instances, as long as at least one argument is a finite, total ByteString. But for infinite and partial ByteStrings, they will (sometimes) produce |