At @quantstack.bsky.social we designed novel bit-unpacking SIMD optimizations for @arrow.apache.org and #ApacheParquet, and implemented them entirely using C++ metaprogramming instead of Python-based code generation.
We'll publish a deep dive blog post soon.
github.com/apache/arrow...
Rationale for this change
The current bit-unpacking algorithm (implemented as a Python script that generates C++ code) does not fully leverage SIMD operations: all loads and some bitshifts u...
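
To make the metaprogramming idea concrete, here is a minimal scalar sketch (C++17) of what template-driven bit-unpacking can look like. It is not the code from the Arrow pull request and uses no SIMD. The names `UnpackOne`, `UnpackImpl`, and `Unpack` are hypothetical, and the layout is assumed to be LSB-first, little-endian packing. The point is only that the bit width becomes a template parameter, so the compiler computes every offset, shift, and mask at compile time, which a generator script would otherwise have to write out for each width.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical helper (not an Arrow API): extract the I-th kBitWidth-bit
// value from an LSB-first, little-endian packed buffer.
template <unsigned kBitWidth, std::size_t I>
constexpr std::uint32_t UnpackOne(const std::uint8_t* in) {
  static_assert(kBitWidth >= 1 && kBitWidth <= 32, "unsupported bit width");
  // All positions below are compile-time constants.
  constexpr std::size_t bit_pos = I * kBitWidth;
  constexpr std::size_t byte_pos = bit_pos / 8;
  constexpr unsigned shift = bit_pos % 8;
  constexpr std::size_t n_bytes = (shift + kBitWidth + 7) / 8;  // at most 5
  constexpr std::uint64_t mask = (std::uint64_t{1} << kBitWidth) - 1;

  // Gather only the bytes that hold this value, then shift and mask.
  std::uint64_t word = 0;
  for (std::size_t b = 0; b < n_bytes; ++b) {
    word |= static_cast<std::uint64_t>(in[byte_pos + b]) << (8 * b);
  }
  return static_cast<std::uint32_t>((word >> shift) & mask);
}

// Expand one UnpackOne call per output index with a fold expression.
template <unsigned kBitWidth, std::size_t... Is>
constexpr void UnpackImpl(const std::uint8_t* in, std::uint32_t* out,
                          std::index_sequence<Is...>) {
  ((out[Is] = UnpackOne<kBitWidth, Is>(in)), ...);
}

// Unpack kCount values of kBitWidth bits each into 32-bit integers.
template <unsigned kBitWidth, std::size_t kCount = 8>
constexpr void Unpack(const std::uint8_t* in, std::uint32_t* out) {
  UnpackImpl<kBitWidth>(in, out, std::make_index_sequence<kCount>{});
}
```

For example, calling `Unpack<3>` on the three bytes `0x88, 0xC6, 0xFA` writes the values 0 through 7 into `out`.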