https://github.com/uxlfoundation/oneDPL/blob/95bedeff1630c9a00b7db99ac90781c91698a8a1/include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl.h#L770 I think storing 4 bytes per element just for a usage flag (a mask) is very memory-expensive. @mmichel11 @danhoeflinger Could you please take a look?
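
For a rough sense of scale, here is a minimal sketch comparing the mask footprint for a few flag widths. The helper names `mask_bytes` and `packed_mask_bytes` are hypothetical and not part of oneDPL. I am also only assuming that a narrower flag type would still work with the code that consumes the mask; that would need to be checked.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration only; not oneDPL code.

// Bytes needed when each element gets its own flag of type FlagT.
template <typename FlagT>
constexpr std::size_t mask_bytes(std::size_t n)
{
    return n * sizeof(FlagT);
}

// Bytes needed when flags are packed one bit per element into
// 32-bit words (the word count is rounded up).
constexpr std::size_t packed_mask_bytes(std::size_t n)
{
    return ((n + 31) / 32) * sizeof(std::uint32_t);
}

// For 2^20 elements: 4 MiB with a 4-byte flag, 1 MiB with a 1-byte flag,
// and 128 KiB with a bit-packed mask.
static_assert(mask_bytes<std::uint32_t>(std::size_t{1} << 20) == 4194304, "4-byte flag");
static_assert(mask_bytes<std::uint8_t>(std::size_t{1} << 20) == 1048576, "1-byte flag");
static_assert(packed_mask_bytes(std::size_t{1} << 20) == 131072, "bit-packed");
```

A bit-packed mask saves the most memory, but several work-items would then write to the same word, so it would likely need atomics or a per-word write scheme. A 1-byte flag is probably the simpler change to evaluate first.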