Abstract
Matrix factorization is arguably the workhorse of modern predictive big data analytics. It is an important technique widely used in statistics, including in multiple regression. Parallelizing matrix factorization can speed up analytics, and in the era of big data the speedup is often magnified further as the dataset grows. This paper discusses several common matrix factorization algorithms that have been developed over the years, as well as their applications in multiple regression. The paper also presents a novel parallel Householder QR factorization algorithm. In experiments conducted in two different testing environments, substantial speedups were obtained.