We propose a parallel algorithm that exploits GPU parallelism to parametrize orthogonal Householder matrices used for the SVD. Our approach substantially speeds up several expensive matrix operations.
In this paper:
- We introduce a novel algorithm, FastH, which increases core utilization, leaving fewer cores idle.
- FastH retains the same time complexity as the previous sequential algorithm while reducing the number of sequential operations.
- FastH is faster than all previous algorithms, and fast enough to speed up several matrix operations. For example, for matrix inversion in neural networks, FastH is up to 27x faster.
- We publish CUDA code that can be readily used for experiments.
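
To make the phrase "sequential operations" concrete, the following is a minimal sketch of the sequential baseline, not of FastH itself. It multiplies a matrix X by a product of Householder reflections H_i = I - 2 v_i v_i^T / ||v_i||^2, one reflection at a time. The function names `householder_apply` and `sequential_product` are hypothetical and chosen only for illustration, and plain Python lists stand in for GPU tensors.

```python
def householder_apply(v, X):
    """Return (I - 2 v v^T / ||v||^2) X without forming the d x d matrix."""
    d, m = len(v), len(X[0])
    norm_sq = sum(a * a for a in v)
    # w = v^T X is a row vector of length m.
    w = [sum(v[i] * X[i][j] for i in range(d)) for j in range(m)]
    # Rank-one update: X - (2 / ||v||^2) v w.
    return [[X[i][j] - 2.0 * v[i] * w[j] / norm_sq for j in range(m)]
            for i in range(d)]


def sequential_product(vs, X):
    """Compute H_1 H_2 ... H_k X by applying the reflections right to left.

    Each iteration needs the result of the previous one, so k reflections
    cost k dependent steps no matter how many cores are available.
    """
    for v in reversed(vs):
        X = householder_apply(v, X)
    return X


# Usage: three reflections applied to the 3 x 3 identity give an orthogonal matrix.
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
vectors = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0], [2.0, 0.0, 1.0]]
U = sequential_product(vectors, identity)
```

In this sketch each reflection costs O(dm) work for a d x m matrix, so d reflections cost O(d^2 m) in total, but the loop in `sequential_product` forms a chain of d dependent steps. That dependency chain is what leaves most GPU cores idle.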