# SCM Repository

[matrix] Annotation of /pkg/inst/doc/Comparisons.Rnw
 [matrix] / pkg / inst / doc / Comparisons.Rnw

# Annotation of /pkg/inst/doc/Comparisons.Rnw

 1 : bates 10 \documentclass{article} 2 : \usepackage{myVignette} 3 : \usepackage[authoryear,round]{natbib} 4 : \bibliographystyle{plainnat} 5 : %%\VignetteIndexEntry{Comparisons of Least Squares calculation speeds} 6 : %%\VignetteDepends{Matrix} 7 : \begin{document} 8 : \SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,width=5,height=3,strip.white=TRUE} 9 : \setkeys{Gin}{width=\textwidth} 10 : \title{Comparing Least Squares Calculations} 11 : \author{Douglas Bates\\R Development Core Team\\\email{Douglas.Bates@R-project.org}} 12 : \date{\today} 13 : \maketitle 14 : \begin{abstract} 15 : Many statistics methods require one or more least squares problems 16 : to be solved. There are several ways to perform this calculation, 17 : using objects from the base R system and using objects in the 18 : classes defined in the \code{Matrix} package. 19 : 20 : We compare the speed of some of these methods on a very small 21 : example and on a example for which the model matrix is large and 22 : sparse. 23 : \end{abstract} 24 : <>= 25 : options(width=75) 26 : @ 27 : 28 : \section{Linear least squares calculations} 29 : \label{sec:LeastSquares} 30 : 31 : Many statistical techniques require least squares solutions 32 : 33 : \label{eq:LeastSquares} 34 : \widehat{\bm{\beta}}= 35 : \arg\min_{\bm{\beta}}\left\|\bm{y}-\bX\bm{\beta}\right\|^2 36 : 37 : where $\bX$ is an $n\times p$ model matrix ($p\leq n$), $\bm{y}$ is 38 : $n$-dimensional and $\bm{\beta}$ is $p$ dimensional. Most statistics 39 : texts state that the solution to (\ref{eq:LeastSquares}) is 40 : 41 : \label{eq:XPX} 42 : \widehat{\bm{\beta}}=\left(\bX\trans\bX\right)^{-1}\bX\trans\bm{y} 43 : 44 : when $\bX$ has full column rank (i.e. the columns of $\bX$ are 45 : linearly independent) and all too frequently it is calculated in 46 : exactly this way. 47 : 48 : 49 : \subsection{A small example} 50 : \label{sec:smallLSQ} 51 : 52 : As an example, let's create a model matrix, \code{mm}, and corresponding 53 : response vector, \code{y}, for a simple linear regression model using 54 : the \code{Formaldehyde} data. 55 : <>= 56 : data(Formaldehyde) 57 : str(Formaldehyde) 58 : print(mm <- cbind(1, Formaldehyde$carb)) 59 : print(y <- Formaldehyde$optden) 60 : @ 61 : Using \code{t} to evaluate 62 : the transpose, \code{solve} to take an inverse, and the \code{\%*\%} 63 : operator for matrix multiplication, we can translate \ref{eq:XPX} into 64 : the \Slang{} as 65 : <>= 66 : solve(t(mm) %*% mm) %*% t(mm) %*% y 67 : @ 68 : 69 : On modern computers this calculation is performed so quickly that it cannot 70 : be timed accurately in \RR{} 71 : <>= 72 : system.time(solve(t(mm) %*% mm) %*% t(mm) %*% y) 73 : @ 74 : and it provides essentially the same results as the standard 75 : \code{lm.fit} function that is called by \code{lm}. 76 : <>= 77 : dput(c(solve(t(mm) %*% mm) %*% t(mm) %*% y)) 78 : dput(lm.fit(mm, y)$coefficients) 79 : @ 80 : 81 : 82 : \subsection{A large example} 83 : \label{sec:largeLSQ} 84 : 85 : For a large, ill-conditioned least squares problem, such as that 86 : described in \citet{koen:ng:2003}, the literal translation of 87 : (\ref{eq:XPX}) does not perform well. 88 : <>= 89 : library(Matrix) 90 : data(mm, package = "Matrix") 91 : data(y, package = "Matrix") 92 : mm = as(mm, "matrix") 93 : dim(mm) 94 : bates 185 invisible(gc()) 95 : system.time(naive.sol <- solve(t(mm) %*% mm) %*% t(mm) %*% y) 96 : bates 10 @ 97 : 98 : Because the calculation of a cross-product'' matrix, such as 99 :$\bX\trans\bX$or$\bX\trans\bm{y}$, is a common operation in 100 : statistics, the \code{crossprod} function has been provided to do 101 : this efficiently. In the single argument form \code{crossprod(mm)} 102 : calculates$\bX\trans\bX$, taking advantage of the symmetry of the 103 : product. That is, instead of calculating the$712^2=506944$elements of 104 :$\bX\trans\bX$separately, it only calculates the$(712\cdot 105 : 713)/2=253828$elements in the upper triangle and replicates them in 106 : the lower triangle. Furthermore, there is no need to calculate the 107 : inverse of a matrix explicitly when solving a 108 : linear system of equations. When the two argument form of the \code{solve} 109 : function is used the linear system 110 : 111 : \label{eq:LSQsol} 112 : \left(\bX\trans\bX\right) \widehat{\bm{\beta}} = \bX\trans\by 113 : 114 : is solved directly. 115 : 116 : Combining these optimizations we obtain 117 : <>= 118 : bates 185 invisible(gc()) 119 : system.time(cpod.sol <- solve(crossprod(mm), crossprod(mm,y))) 120 : bates 10 all.equal(naive.sol, cpod.sol) 121 : @ 122 : 123 : On this computer (2.0 GHz Pentium-4, 1 GB Memory, Goto's BLAS) the 124 : crossprod form of the calculation is about four times as fast as the 125 : naive calculation. In fact, the entire crossprod solution is 126 : faster than simply calculating$\bX\trans\bX\$ the naive way. 127 : <>= 128 : bates 185 invisible(gc()) 129 : bates 10 system.time(t(mm) %*% mm) 130 : @ 131 : 132 : \subsection{Least squares calculations with Matrix classes} 133 : \label{sec:MatrixLSQ} 134 : 135 : The \code{crossprod} function applied to a single matrix takes 136 : advantage of symmetry when calculating the product but does not retain 137 : the information that the product is symmetric (and positive 138 : semidefinite). As a result the solution of (\ref{eq:LSQsol}) is 139 : performed using general linear system solver based on an LU 140 : decomposition when it would be faster, and more stable numerically, to 141 : use a Cholesky decomposition. The Cholesky decomposition could be used 142 : but it is rather awkward 143 : <>= 144 : ch = chol(crossprod(mm)) 145 : bates 185 invisible(gc()) 146 : bates 10 system.time(chol(crossprod(mm))) 147 : bates 185 invisible(gc()) 148 : system.time(chol.sol <- backsolve(ch, 149 : forwardsolve(ch, crossprod(mm, y), upper = TRUE, trans = TRUE))) 150 : bates 10 all.equal(chol.sol, naive.sol) 151 : @ 152 : 153 : The \code{Matrix} package uses the S4 class system 154 : \citep{R:Chambers:1998} to retain information on the structure of 155 : matrices from the intermediate calculations. A general matrix in 156 : dense storage, created by the \code{Matrix} function, has class 157 : \code{"geMatrix"} but its cross-product has class \code{"poMatrix"}. 158 : The \code{solve} methods for the \code{"poMatrix"} class use the 159 : Cholesky decomposition. 160 : <>= 161 : data(mm, package = "Matrix") 162 : mm = as(mm, "geMatrix") 163 : class(crossprod(mm)) 164 : bates 185 invisible(gc()) 165 : system.time(Mat.sol <- solve(crossprod(mm), crossprod(mm, y))) 166 : bates 10 all.equal(naive.sol, as(Mat.sol, "matrix")) 167 : @ 168 : 169 : Furthermore, any method that calculates a 170 : decomposition or factorization stores the resulting factorization with 171 : the original object so that it can be reused without recalculation. 172 : <>= 173 : xpx = crossprod(mm) 174 : xpy = crossprod(mm, y) 175 : bates 185 invisible(gc()) 176 : bates 10 system.time(solve(xpx, xpy)) 177 : bates 185 invisible(gc()) 178 : bates 10 system.time(solve(xpx, xpy)) 179 : @ 180 : 181 : The model matrix \code{mm} is sparse; that is, most of the elements of 182 : \code{mm} are zero. The \code{Matrix} package incorporates special 183 : methods for sparse matrices, which produce the fastest results of all. 184 : 185 : <>= 186 : data(mm, package = "Matrix") 187 : class(mm) 188 : bates 185 invisible(gc()) 189 : system.time(sparse.sol <- solve(crossprod(mm), crossprod(mm, y))) 190 : bates 10 all.equal(naive.sol, as(sparse.sol, "matrix")) 191 : @ 192 : 193 : As with other classes in the \code{Matrix} package, the 194 : \code{sscMatrix} retains any factorization that has been calculated 195 : although, in this case, the decomposition is so fast that it is 196 : difficult to determine the difference in the solution times. 197 : 198 : <>= 199 : xpx = crossprod(mm) 200 : xpy = crossprod(mm, y) 201 : bates 185 invisible(gc()) 202 : bates 10 system.time(solve(xpx, xpy)) 203 : bates 185 invisible(gc()) 204 : bates 10 system.time(solve(xpx, xpy)) 205 : @ 206 : 207 : \bibliography{Matrix} 208 : \end{document}