# SCM Repository

[matrix] Annotation of /pkg/inst/doc/Comparisons.Rnw
 [matrix] / pkg / inst / doc / Comparisons.Rnw

# Annotation of /pkg/inst/doc/Comparisons.Rnw

 1 : bates 10 \documentclass{article} 2 : \usepackage{myVignette} 3 : \usepackage[authoryear,round]{natbib} 4 : \bibliographystyle{plainnat} 5 : %%\VignetteIndexEntry{Comparisons of Least Squares calculation speeds} 6 : %%\VignetteDepends{Matrix} 7 : \begin{document} 8 : \SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,width=5,height=3,strip.white=TRUE} 9 : \setkeys{Gin}{width=\textwidth} 10 : \title{Comparing Least Squares Calculations} 11 : \author{Douglas Bates\\R Development Core Team\\\email{Douglas.Bates@R-project.org}} 12 : \date{\today} 13 : \maketitle 14 : \begin{abstract} 15 : Many statistics methods require one or more least squares problems 16 : to be solved. There are several ways to perform this calculation, 17 : using objects from the base R system and using objects in the 18 : classes defined in the \code{Matrix} package. 19 : 20 : We compare the speed of some of these methods on a very small 21 : example and on a example for which the model matrix is large and 22 : sparse. 23 : \end{abstract} 24 : <>= 25 : options(width=75) 26 : @ 27 : 28 : \section{Linear least squares calculations} 29 : \label{sec:LeastSquares} 30 : 31 : Many statistical techniques require least squares solutions 32 : 33 : \label{eq:LeastSquares} 34 : \widehat{\bm{\beta}}= 35 : \arg\min_{\bm{\beta}}\left\|\bm{y}-\bX\bm{\beta}\right\|^2 36 : 37 : where $\bX$ is an $n\times p$ model matrix ($p\leq n$), $\bm{y}$ is 38 : $n$-dimensional and $\bm{\beta}$ is $p$ dimensional. Most statistics 39 : texts state that the solution to (\ref{eq:LeastSquares}) is 40 : 41 : \label{eq:XPX} 42 : \widehat{\bm{\beta}}=\left(\bX\trans\bX\right)^{-1}\bX\trans\bm{y} 43 : 44 : when $\bX$ has full column rank (i.e. the columns of $\bX$ are 45 : linearly independent) and all too frequently it is calculated in 46 : exactly this way. 47 : 48 : 49 : \subsection{A small example} 50 : \label{sec:smallLSQ} 51 : 52 : As an example, let's create a model matrix, \code{mm}, and corresponding 53 : response vector, \code{y}, for a simple linear regression model using 54 : the \code{Formaldehyde} data. 55 : <>= 56 : data(Formaldehyde) 57 : str(Formaldehyde) 58 : print(mm <- cbind(1, Formaldehyde$carb)) 59 : print(y <- Formaldehyde$optden) 60 : @ 61 : Using \code{t} to evaluate 62 : the transpose, \code{solve} to take an inverse, and the \code{\%*\%} 63 : operator for matrix multiplication, we can translate \ref{eq:XPX} into 64 : the \Slang{} as 65 : <>= 66 : solve(t(mm) %*% mm) %*% t(mm) %*% y 67 : @ 68 : 69 : On modern computers this calculation is performed so quickly that it cannot 70 : be timed accurately in \RR{} 71 : <>= 72 : bates 342 system.time(solve(t(mm) %*% mm) %*% t(mm) %*% y, gc = TRUE) 73 : bates 10 @ 74 : and it provides essentially the same results as the standard 75 : \code{lm.fit} function that is called by \code{lm}. 76 : <>= 77 : dput(c(solve(t(mm) %*% mm) %*% t(mm) %*% y)) 78 : dput(lm.fit(mm, y)$coefficients) 79 : @ 80 : bates 342 %$ 81 : bates 10 82 : \subsection{A large example} 83 : \label{sec:largeLSQ} 84 : 85 : For a large, ill-conditioned least squares problem, such as that 86 : described in \citet{koen:ng:2003}, the literal translation of 87 : (\ref{eq:XPX}) does not perform well. 88 : <>= 89 : library(Matrix) 90 : data(mm, package = "Matrix") 91 : data(y, package = "Matrix") 92 : bates 342 mm <- as(mm, "matrix") 93 : bates 10 dim(mm) 94 : bates 342 system.time(naive.sol <- solve(t(mm) %*% mm) %*% t(mm) %*% y, gc = TRUE) 95 : bates 10 @ 96 : 97 : Because the calculation of a cross-product'' matrix, such as 98 : $\bX\trans\bX$ or $\bX\trans\bm{y}$, is a common operation in 99 : statistics, the \code{crossprod} function has been provided to do 100 : this efficiently. In the single argument form \code{crossprod(mm)} 101 : calculates $\bX\trans\bX$, taking advantage of the symmetry of the 102 : product. That is, instead of calculating the $712^2=506944$ elements of 103 : $\bX\trans\bX$ separately, it only calculates the $(712\cdot 104 : 713)/2=253828$ elements in the upper triangle and replicates them in 105 : the lower triangle. Furthermore, there is no need to calculate the 106 : inverse of a matrix explicitly when solving a 107 : linear system of equations. When the two argument form of the \code{solve} 108 : function is used the linear system 109 : 110 : \label{eq:LSQsol} 111 : \left(\bX\trans\bX\right) \widehat{\bm{\beta}} = \bX\trans\by 112 : 113 : is solved directly. 114 : 115 : Combining these optimizations we obtain 116 : <>= 117 : bates 342 system.time(cpod.sol <- solve(crossprod(mm), crossprod(mm,y)), gc = TRUE) 118 : bates 10 all.equal(naive.sol, cpod.sol) 119 : @ 120 : 121 : On this computer (2.0 GHz Pentium-4, 1 GB Memory, Goto's BLAS) the 122 : crossprod form of the calculation is about four times as fast as the 123 : naive calculation. In fact, the entire crossprod solution is 124 : faster than simply calculating $\bX\trans\bX$ the naive way. 125 : <>= 126 : bates 342 system.time(t(mm) %*% mm, gc = TRUE) 127 : bates 10 @ 128 : 129 : \subsection{Least squares calculations with Matrix classes} 130 : \label{sec:MatrixLSQ} 131 : 132 : The \code{crossprod} function applied to a single matrix takes 133 : advantage of symmetry when calculating the product but does not retain 134 : the information that the product is symmetric (and positive 135 : semidefinite). As a result the solution of (\ref{eq:LSQsol}) is 136 : performed using general linear system solver based on an LU 137 : decomposition when it would be faster, and more stable numerically, to 138 : use a Cholesky decomposition. The Cholesky decomposition could be used 139 : but it is rather awkward 140 : <>= 141 : bates 342 system.time(ch <- chol(crossprod(mm)), gc = TRUE) 142 : system.time(chol.sol <- backsolve(ch, 143 : forwardsolve(ch, crossprod(mm, y), upper = TRUE, trans = TRUE)), 144 : gc = TRUE) 145 : bates 10 all.equal(chol.sol, naive.sol) 146 : @ 147 : 148 : The \code{Matrix} package uses the S4 class system 149 : \citep{R:Chambers:1998} to retain information on the structure of 150 : matrices from the intermediate calculations. A general matrix in 151 : dense storage, created by the \code{Matrix} function, has class 152 : bates 492 \code{"dgeMatrix"} but its cross-product has class \code{"dpoMatrix"}. 153 : The \code{solve} methods for the \code{"dpoMatrix"} class use the 154 : bates 10 Cholesky decomposition. 155 : <>= 156 : data(mm, package = "Matrix") 157 : bates 492 mm <- as(mm, "dgeMatrix") 158 : bates 10 class(crossprod(mm)) 159 : bates 342 system.time(Mat.sol <- solve(crossprod(mm), crossprod(mm, y)), gc=TRUE) 160 : bates 10 all.equal(naive.sol, as(Mat.sol, "matrix")) 161 : @ 162 : 163 : Furthermore, any method that calculates a 164 : decomposition or factorization stores the resulting factorization with 165 : the original object so that it can be reused without recalculation. 166 : <>= 167 : bates 342 xpx <- crossprod(mm) 168 : xpy <- crossprod(mm, y) 169 : system.time(solve(xpx, xpy), gc=TRUE) 170 : system.time(solve(xpx, xpy), gc=TRUE) 171 : bates 10 @ 172 : 173 : The model matrix \code{mm} is sparse; that is, most of the elements of 174 : \code{mm} are zero. The \code{Matrix} package incorporates special 175 : methods for sparse matrices, which produce the fastest results of all. 176 : 177 : <>= 178 : data(mm, package = "Matrix") 179 : class(mm) 180 : bates 342 system.time(sparse.sol <- solve(crossprod(mm), crossprod(mm, y)), gc=TRUE) 181 : bates 10 all.equal(naive.sol, as(sparse.sol, "matrix")) 182 : @ 183 : 184 : As with other classes in the \code{Matrix} package, the 185 : \code{sscMatrix} retains any factorization that has been calculated 186 : although, in this case, the decomposition is so fast that it is 187 : difficult to determine the difference in the solution times. 188 : 189 : <>= 190 : bates 342 xpx <- crossprod(mm) 191 : xpy <- crossprod(mm, y) 192 : system.time(solve(xpx, xpy), gc=TRUE) 193 : system.time(solve(xpx, xpy), gc=TRUE) 194 : bates 10 @ 195 : 196 : \bibliography{Matrix} 197 : \end{document}