Description
Hello,
I'm trying to build some code that I wrote, and that already runs successfully on SSE4.2, AVX2 and ARM NEON, for RISC-V as well (riscv64 on an Orange Pi RV2). As a minimum viable example, I started with the example from https://xsimd.readthedocs.io/en/latest/api/dispatching.html
In my actual code, I'd like to use the same basic principle. Currently, I have a dispatching file where I use #if defined ... / #elif ... etc. to define my architecture list based on what I build for. I then have a simple impl file for each of the archs, which I compile with the arch-specific compiler flags as suggested in the example. Here I just copied the example from the docs into a subfolder of the xsimd checkout (tag 14.0.0) and adapted it to target rvv.
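To make the dispatching file described above more concrete, here is a minimal sketch of that kind of #if defined / #elif selection. It deliberately does not use xsimd: the tag structs, `arch_list`, `selected_archs` and `preferred_arch_name` are hypothetical stand-ins for my real types, and only the predefined compiler macros (`__riscv_vector`, `__AVX2__`, `__SSE4_2__`, `__ARM_NEON`) are real.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical architecture tags; these are NOT xsimd types.
struct rvv_tag    { static constexpr const char* name() { return "rvv"; } };
struct avx2_tag   { static constexpr const char* name() { return "avx2"; } };
struct sse4_2_tag { static constexpr const char* name() { return "sse4.2"; } };
struct neon_tag   { static constexpr const char* name() { return "neon"; } };
struct scalar_tag { static constexpr const char* name() { return "scalar"; } };

// Hypothetical ordered list of candidates, highest priority first.
template <class... Archs>
struct arch_list
{
    static constexpr std::size_t size = sizeof...(Archs);
};

// Pick the candidate list from what the compiler was told to target.
#if defined(__riscv_vector)
using selected_archs = arch_list<rvv_tag, scalar_tag>;
#elif defined(__AVX2__)
using selected_archs = arch_list<avx2_tag, sse4_2_tag, scalar_tag>;
#elif defined(__SSE4_2__)
using selected_archs = arch_list<sse4_2_tag, scalar_tag>;
#elif defined(__ARM_NEON)
using selected_archs = arch_list<neon_tag, scalar_tag>;
#else
using selected_archs = arch_list<scalar_tag>;
#endif

// Name of the first (highest-priority) entry of a list.
template <class First, class... Rest>
constexpr const char* first_name(arch_list<First, Rest...>)
{
    return First::name();
}

// Convenience wrapper around the selected list.
inline const char* preferred_arch_name()
{
    const char* name = first_name(selected_archs {});
    assert(name != nullptr);
    return name;
}
```

Every branch ends with a scalar entry, so the list is never empty regardless of the flags used for the build.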
Source code
What I have now is based on the mentioned example, plus a few attempts at debugging output.
sum.hpp:
#ifndef _SUM_HPP
#define _SUM_HPP
#include "xsimd/xsimd.hpp"
#include <array>
#include <iostream>
// functor with a call method that depends on `Arch`
struct sum
{
// It's critical not to use an in-class definition here.
// In-class and inline definition bypass extern template mechanism.
template <class Arch, class T>
T operator()(Arch, T const* data, unsigned size);
};
template <class Arch, class T>
T sum::operator()(Arch, T const* data, unsigned size)
{
using batch = xsimd::batch<T, Arch>;
std::cout << "Vector size: " << batch::size << '\n';
batch acc(static_cast<T>(0));
const unsigned n = size / batch::size * batch::size;
for (unsigned i = 0; i != n; i += batch::size)
acc += batch::load_unaligned(data + i);
//T star_acc = xsimd::reduce_add(acc);
std::array<T, 128> out_values;
acc.store_unaligned(out_values.data());
T star_acc = 0;
for (unsigned i = 0; i < batch::size; ++i)
star_acc += out_values[i];
for (unsigned i = n; i < size; ++i)
star_acc += data[i];
return star_acc;
}
// Inform the compiler that the rvv<128> implementation is to be found in another compilation unit.
extern template float sum::operator()<xsimd::rvv<128>, float>(xsimd::rvv<128>, float const*, unsigned);
//extern template float sum::operator()<xsimd::sse2, float>(xsimd::sse2, float const*, unsigned);
#endif
sum_rvv128.cpp (normally I would compile this as a separate object, but to my understanding that shouldn't make a difference in this use case). If my assumption is correct, I'd probably provide both an rvv128 and an rvv256 variant and dispatch with priority for 256, then 128.
#include "sum.hpp"
template float sum::operator()<xsimd::rvv<128>, float>(xsimd::rvv<128>, float const*, unsigned);
Is my understanding correct that for RISC-V the length of the target vector is a template parameter and has to be given as a number of bits? I didn't see a clear example for it, but that is what I gathered from looking at the includes.
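To spell out the assumption behind that question: if the template argument really is the register width in bits, the lane count and the split between the vector loop and the scalar tail in `sum::operator()` follow from plain arithmetic. The sketch below does not touch xsimd; `lanes_for`, `vectorized_prefix` and `tail_length` are hypothetical helpers, and whether `xsimd::rvv<128>` actually means 128 bits is exactly the part I am unsure about.

```cpp
#include <cassert>
#include <cstddef>

// Lanes in one register, ASSUMING `width_bits` is the register width in bits.
template <class T>
constexpr std::size_t lanes_for(std::size_t width_bits)
{
    return width_bits / (8 * sizeof(T));
}

// Number of leading elements covered by whole-register iterations
// (same expression as `size / batch::size * batch::size` in sum.hpp).
constexpr std::size_t vectorized_prefix(std::size_t size, std::size_t lanes)
{
    return size / lanes * lanes;
}

// Number of trailing elements left for the scalar loop.
inline std::size_t tail_length(std::size_t size, std::size_t lanes)
{
    assert(lanes != 0);
    return size - vectorized_prefix(size, lanes);
}

// Worked numbers for the test program: 17 floats, assumed 128-bit registers.
static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");
static_assert(lanes_for<float>(128) == 4, "4 float lanes in 128 bits");
static_assert(vectorized_prefix(17, 4) == 16, "16 elements in the vector loop");
```

Under that assumption, the 17-element input would take four vector iterations plus one scalar tail element, and the debug line in sum.hpp should report a vector size of 4.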
sum.cpp (The dispatch, for simplicity with a main to quickly test once it is compiling):
#include "sum.hpp"
#include <iostream>
#include <vector>
// Create the dispatching function, specifying the architecture we want to
// target.
static auto dispatched = xsimd::dispatch<xsimd::arch_list<xsimd::rvv<128>>>(sum{});
// Call the appropriate implementation based on runtime information.
int main(int argc, char* argv[])
{
std::vector<float> data;
size_t num_entries = 17;
for (size_t idx = 0; idx < num_entries; ++idx)
{
data.push_back(idx * 2.3456f);
}
float res = dispatched(data.data(), num_entries);
std::cout << "Sum of vector: " << res << '\n';
return 0;
}
Build attempt
For simplicity I'm just compiling it manually; normally I'd use CMake.
g++-14 -o sum.o -std=c++20 -I../include sum.cpp sum_rvv128.cpp -march=rv64gcv_zvl128b_zba_zbb_zbs -mrvv-vector-bits=zvl
The build error
Here is an excerpt of the first errors; it goes on like this for a while, as is usual for templates. The full log is huge, but I can provide it if it helps.
In file included from ../include/xsimd/memory/../config/../types/xsimd_all_registers.hpp:47,
from ../include/xsimd/memory/../config/xsimd_arch.hpp:19,
from ../include/xsimd/memory/xsimd_aligned_allocator.hpp:27,
from ../include/xsimd/xsimd.hpp:63,
from sum.hpp:3,
from sum.cpp:1:
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:238:27: error: explicit specialization in non-namespace scope ‘struct xsimd::types::detail::rvv_semiblob<T, divisor>’
238 | template <>
| ^
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:239:24: error: too few template-parameter-lists
239 | struct semitype<2>
| ^~~~~~~~~~~
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:243:27: error: explicit specialization in non-namespace scope ‘struct xsimd::types::detail::rvv_semiblob<T, divisor>’
243 | template <>
| ^
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:244:24: error: too few template-parameter-lists
244 | struct semitype<4>
| ^~~~~~~~~~~
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:248:27: error: explicit specialization in non-namespace scope ‘struct xsimd::types::detail::rvv_semiblob<T, divisor>’
248 | template <>
| ^
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:249:24: error: too few template-parameter-lists
249 | struct semitype<8>
| ^~~~~~~~~~~
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:260:27: error: explicit specialization in non-namespace scope ‘struct xsimd::types::detail::rvv_semiblob<T, divisor>’
260 | template <>
| ^
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:261:28: error: template-id ‘get_bytes<2>’ in declaration of primary template
261 | vuint8m1_t get_bytes<2>() const { return __riscv_vlmul_ext_v_u8mf2_u8m1(value); }
| ^~~~~~~~~~~~
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:262:27: error: explicit specialization in non-namespace scope ‘struct xsimd::types::detail::rvv_semiblob<T, divisor>’
262 | template <>
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:263:28: error: template-id ‘get_bytes<4>’ in declaration of primary template
263 | vuint8m1_t get_bytes<4>() const { return __riscv_vlmul_ext_v_u8mf4_u8m1(value); }
| ^~~~~~~~~~~~
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:263:28: error: ‘vuint8m1_t xsimd::types::detail::rvv_semiblob<T, divisor>::get_bytes() const’ cannot be overloaded with ‘vuint8m1_t xsimd::types::detail::rvv_semiblob<T, divisor>::get_bytes() const’
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:261:28: note: previous declaration ‘vuint8m1_t xsimd::types::detail::rvv_semiblob<T, divisor>::get_bytes() const’
261 | vuint8m1_t get_bytes<2>() const { return __riscv_vlmul_ext_v_u8mf2_u8m1(value); }
| ^~~~~~~~~~~~
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:264:27: error: explicit specialization in non-namespace scope ‘struct xsimd::types::detail::rvv_semiblob<T, divisor>’
264 | template <>
| ^
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:265:28: error: template-id ‘get_bytes<8>’ in declaration of primary template
265 | vuint8m1_t get_bytes<8>() const { return __riscv_vlmul_ext_v_u8mf8_u8m1(value); }
| ^~~~~~~~~~~~
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:265:28: error: ‘vuint8m1_t xsimd::types::detail::rvv_semiblob<T, divisor>::get_bytes() const’ cannot be overloaded with ‘vuint8m1_t xsimd::types::detail::rvv_semiblob<T, divisor>::get_bytes() const’
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:261:28: note: previous declaration ‘vuint8m1_t xsimd::types::detail::rvv_semiblob<T, divisor>::get_bytes() const’
261 | vuint8m1_t get_bytes<2>() const { return __riscv_vlmul_ext_v_u8mf2_u8m1(value); }
| ^~~~~~~~~~~~
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:273:27: error: explicit specialization in non-namespace scope ‘struct xsimd::types::detail::rvv_semiblob<T, divisor>’
273 | template <>
| ^
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:274:22: error: template-id ‘set_bytes<2>’ in declaration of primary template
274 | void set_bytes<2>(vuint8m1_t v) { value = __riscv_vlmul_trunc_v_u8m1_u8mf2(v); }
| ^~~~~~~~~~~~
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:275:27: error: explicit specialization in non-namespace scope ‘struct xsimd::types::detail::rvv_semiblob<T, divisor>’
275 | template <>
| ^
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:276:22: error: template-id ‘set_bytes<4>’ in declaration of primary template
276 | void set_bytes<4>(vuint8m1_t v) { value = __riscv_vlmul_trunc_v_u8m1_u8mf4(v); }
| ^~~~~~~~~~~~
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:276:22: error: ‘void xsimd::types::detail::rvv_semiblob<T, divisor>::set_bytes(vuint8m1_t)’ cannot be overloaded with ‘void xsimd::types::detail::rvv_semiblob<T, divisor>::set_bytes(vuint8m1_t)’
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:274:22: note: previous declaration ‘void xsimd::types::detail::rvv_semiblob<T, divisor>::set_bytes(vuint8m1_t)’
274 | void set_bytes<2>(vuint8m1_t v) { value = __riscv_vlmul_trunc_v_u8m1_u8mf2(v); }
| ^~~~~~~~~~~~
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:277:27: error: explicit specialization in non-namespace scope ‘struct xsimd::types::detail::rvv_semiblob<T, divisor>’
277 | template <>
| ^
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:278:22: error: template-id ‘set_bytes<8>’ in declaration of primary template
278 | void set_bytes<8>(vuint8m1_t v) { value = __riscv_vlmul_trunc_v_u8m1_u8mf8(v); }
| ^~~~~~~~~~~~
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:278:22: error: ‘void xsimd::types::detail::rvv_semiblob<T, divisor>::set_bytes(vuint8m1_t)’ cannot be overloaded with ‘void xsimd::types::detail::rvv_semiblob<T, divisor>::set_bytes(vuint8m1_t)’
../include/xsimd/memory/../config/../types/xsimd_rvv_register.hpp:274:22: note: previous declaration ‘void xsimd::types::detail::rvv_semiblob<T, divisor>::set_bytes(vuint8m1_t)’
274 | void set_bytes<2>(vuint8m1_t v) { value = __riscv_vlmul_trunc_v_u8m1_u8mf2(v); }
| ^~~~~~~~~~~~
In file included from ../include/xsimd/types/../arch/xsimd_isa.hpp:128,
from ../include/xsimd/types/xsimd_batch.hpp:520,
from ../include/xsimd/xsimd.hpp:77:
../include/xsimd/types/../arch/./xsimd_rvv.hpp:157:15: warning: declaration of ‘using xsimd::kernel::detail::rvvid_cruft::impl<signed char, Ret(First, Args ...)>::ctx = struct xsimd::kernel::detail::rvvid_cruft::ctx<signed char>’ changes meaning of ‘ctx’ [-Wchanges-meaning]
157 | using ctx = ctx<KEY>; \
| ^~~
../include/xsimd/types/../arch/./xsimd_rvv.hpp:221:5: note: in expansion of macro ‘XSIMD_RVV_WRAPPER_DROP_1ST’
221 | XSIMD_RVV_WRAPPER##variant(int8_t, XSIMD_RVV_IDENTIFIER(i, 8, name), __VA_ARGS__) \
| ^~~~~~~~~~~~~~~~~
../include/xsimd/types/../arch/./xsimd_rvv.hpp:260:5: note: in expansion of macro ‘XSIMD_RVV_OVERLOAD_i’
260 | XSIMD_RVV_OVERLOAD_i(name_i, variant, __VA_ARGS__) \
| ^~~~~~~~~~~~~~~~~~~~
../include/xsimd/types/../arch/./xsimd_rvv.hpp:265:48: note: in expansion of macro ‘XSIMD_RVV_OVERLOAD3’
So what do I need to do for RVV targets? What am I getting wrong? Do I need to provide something special here?
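For context, the first group of errors complains about `template <>` appearing inside the body of a class template. The sketch below is not the xsimd source: `blob`, `semitype`, `Dummy` and `selected_size` are hypothetical names. It only illustrates a commonly used shape that avoids this diagnostic, where the nested specializations take an extra defaulted parameter and so become partial specializations.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

template <class T, std::size_t divisor>
struct blob
{
    // Primary nested template. The extra `Dummy` parameter exists only so
    // that the specializations below are partial rather than explicit.
    template <std::size_t D, class Dummy = void>
    struct semitype
    {
        using type = T;
    };

    // Shape that the diagnostic in the log rejects (left commented out):
    //   template <>
    //   struct semitype<2> { using type = unsigned short; };

    // Partial specializations written at class scope.
    template <class Dummy>
    struct semitype<2, Dummy>
    {
        using type = unsigned short;
    };

    template <class Dummy>
    struct semitype<4, Dummy>
    {
        using type = unsigned char;
    };

    // Type selected for this instantiation's divisor.
    using selected = typename semitype<divisor>::type;
};

// Size of the type selected for a given element type and divisor.
template <class T, std::size_t divisor>
std::size_t selected_size()
{
    return sizeof(typename blob<T, divisor>::selected);
}

static_assert(std::is_same<blob<int, 2>::selected, unsigned short>::value,
              "divisor 2 picks the first specialization");
static_assert(std::is_same<blob<int, 1>::selected, int>::value,
              "other divisors fall back to the primary template");
```

I have not checked whether the same change would be appropriate for the member function specializations (`get_bytes`, `set_bytes`) reported further down in the log.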