With #80814, we achieved functional parity of Vector512<T> with Vector128<T> and Vector256<T>. However, there are some new instructions available on AVX-512 capable hardware that will allow additional hardware acceleration opportunities for all three types.

This includes:
- vcvtqq2pd & vcvtuqq2pd
- vcvtpd2qq
- vcvtps2udq
- vcvtpd2uqq
- vpternlog
- vpermi2*, vpermt2*, etc

We should also ensure that all APIs are accelerated as intrinsics, where applicable. In particular, the following are still managed fallbacks (but accelerated):

There may be others as well, so a general audit to validate would be good.
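
To illustrate why vpternlog is an attractive target, here is a minimal scalar sketch of how the instruction evaluates an arbitrary three-input bitwise function from an 8-bit control value. It is a model for reasoning, not the runtime's implementation. The names `TernaryLogicSketch` and `TernaryLogicScalar` are hypothetical, and the sketch assumes the usual operand convention where the first operand supplies the most significant bit of the lookup index. Under that convention, control byte 0xCA expresses the same bitwise select that `Vector128.ConditionalSelect` computes.

```csharp
using System;
using System.Runtime.Intrinsics;

static class TernaryLogicSketch
{
    // Hypothetical helper: scalar model of vpternlog on one 32-bit lane.
    // For each bit position, the bits of a, b and c form a 3-bit index
    // (a is the most significant bit), and that index selects one bit
    // of the control byte as the result bit.
    public static uint TernaryLogicScalar(uint a, uint b, uint c, byte control)
    {
        uint result = 0;
        for (int bit = 0; bit < 32; bit++)
        {
            uint index = (((a >> bit) & 1u) << 2)
                       | (((b >> bit) & 1u) << 1)
                       | ((c >> bit) & 1u);
            result |= (((uint)control >> (int)index) & 1u) << bit;
        }
        return result;
    }

    public static void Main()
    {
        const uint Mask = 0xFFFF0000u;
        const uint Left = 0x12345678u;
        const uint Right = 0x9ABCDEF0u;

        // Portable API: (left & mask) | (right & ~mask) in every lane.
        Vector128<uint> selected = Vector128.ConditionalSelect(
            Vector128.Create(Mask), Vector128.Create(Left), Vector128.Create(Right));

        // The same select expressed as a ternary-logic truth table (0xCA).
        uint modeled = TernaryLogicScalar(Mask, Left, Right, 0xCA);

        Console.WriteLine(selected.GetElement(0) == modeled); // prints True
    }
}
```

Other control bytes cover other patterns in the same single operation, for example 0x96 for a three-way XOR, which is why one instruction can stand in for sequences of and/or/xor/andnot on all three vector sizes.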