\documentclass{article}
\usepackage{myVignette}
\usepackage[authoryear,round]{natbib}
\bibliographystyle{plainnat}
%%\VignetteIndexEntry{Comparisons of Least Squares calculation speeds}
%%\VignetteDepends{Matrix}
\begin{document}
\SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,width=5,height=3,strip.white=TRUE}
\setkeys{Gin}{width=\textwidth}
\title{Comparing Least Squares Calculations}
\author{Douglas Bates\\R Development Core Team\\\email{Douglas.Bates@R-project.org}}
\date{\today}
\maketitle
\begin{abstract}
Many statistical methods require one or more least squares problems
to be solved.  There are several ways to perform this calculation,
using objects from the base R system and using objects in the
classes defined in the \code{Matrix} package.

We compare the speed of some of these methods on a very small
example and on an example for which the model matrix is large and
sparse.
\end{abstract}
<<preliminaries, echo=FALSE>>=
options(width=75)
@

\section{Linear least squares calculations}
\label{sec:LeastSquares}

Many statistical techniques require least squares solutions
\begin{equation}
\label{eq:LeastSquares}
\widehat{\bm{\beta}}=
\arg\min_{\bm{\beta}}\left\|\bm{y}-\bX\bm{\beta}\right\|^2
\end{equation}
where $\bX$ is an $n\times p$ model matrix ($p\leq n$), $\bm{y}$ is
an $n$-dimensional response vector, and $\bm{\beta}$ is a
$p$-dimensional coefficient vector.  Most statistics texts state that
the solution to (\ref{eq:LeastSquares}) is
\begin{equation}
\label{eq:XPX}
\widehat{\bm{\beta}}=\left(\bX\trans\bX\right)^{-1}\bX\trans\bm{y}
\end{equation}
when $\bX$ has full column rank (i.e.\ the columns of $\bX$ are
linearly independent), and all too frequently it is calculated in
exactly this way.
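For reference, (\ref{eq:XPX}) follows from setting the gradient of the
residual sum of squares in (\ref{eq:LeastSquares}),
$-2\bX\trans\left(\bm{y}-\bX\bm{\beta}\right)$, to zero, which yields
the normal equations
\begin{displaymath}
\bX\trans\bX\,\bm{\beta}=\bX\trans\bm{y} .
\end{displaymath}
The matrix $\bX\trans\bX$ is invertible exactly when $\bX$ has full
column rank, giving the closed form above.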


\subsection{A small example}
\label{sec:smallLSQ}

As an example, let's create a model matrix, \code{mm}, and corresponding
response vector, \code{y}, for a simple linear regression model using
the \code{Formaldehyde} data.
<<modelMatrix>>=
data(Formaldehyde)
str(Formaldehyde)
print(mm <- cbind(1, Formaldehyde$carb))
print(y <- Formaldehyde$optden)
@
Using \code{t} to evaluate the transpose, \code{solve} to take an
inverse, and the \code{\%*\%} operator for matrix multiplication, we
can translate (\ref{eq:XPX}) into the \Slang{} as
<<naiveCalc>>=
solve(t(mm) %*% mm) %*% t(mm) %*% y
@

On modern computers this calculation is performed so quickly that it
cannot be timed accurately in \RR{}
<<timedNaive>>=
system.time(solve(t(mm) %*% mm) %*% t(mm) %*% y)
@
and it provides essentially the same results as the standard
\code{lm.fit} function that is called by \code{lm}.
<<catNaive>>=
dput(c(solve(t(mm) %*% mm) %*% t(mm) %*% y))
dput(lm.fit(mm, y)$coefficients)
@
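Internally, \code{lm.fit} uses a QR decomposition of $\bX$ rather than
forming $\bX\trans\bX$.  As a minimal illustration using the \code{qr}
and \code{qr.coef} functions from base \RR{}, the QR approach
reproduces the same coefficients:
<<qrSmall>>=
dput(qr.coef(qr(mm), y))  # coefficients via QR, without forming X'X
@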


\subsection{A large example}
\label{sec:largeLSQ}

For a large, ill-conditioned least squares problem, such as that
described in \citet{koen:ng:2003}, the literal translation of
(\ref{eq:XPX}) does not perform well.
<<KoenNg>>=
library(Matrix)
data(mm, package = "Matrix")
data(y, package = "Matrix")
mm <- as(mm, "matrix")
dim(mm)
invisible(gc())
system.time(naive.sol <- solve(t(mm) %*% mm) %*% t(mm) %*% y)
@

Because the calculation of a ``cross-product'' matrix, such as
$\bX\trans\bX$ or $\bX\trans\bm{y}$, is a common operation in
statistics, the \code{crossprod} function has been provided to do
this efficiently.  In its single-argument form, \code{crossprod(mm)}
calculates $\bX\trans\bX$, taking advantage of the symmetry of the
product.  That is, instead of calculating all $712^2=506944$ elements
of $\bX\trans\bX$ separately, it calculates only the $(712\cdot
713)/2=253828$ elements in the upper triangle and replicates them in
the lower triangle.  Furthermore, there is no need to calculate the
inverse of a matrix explicitly when solving a linear system of
equations.  When the two-argument form of the \code{solve} function
is used, the linear system
\begin{equation}
\label{eq:LSQsol}
\left(\bX\trans\bX\right) \widehat{\bm{\beta}} = \bX\trans\bm{y}
\end{equation}
is solved directly.
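To see the correspondence between the two forms of \code{crossprod}
and the explicit products, here is a minimal check on a small
throwaway matrix (\code{A} is introduced purely for illustration):
<<crossprodForms>>=
A <- matrix(1:6, nrow = 3)                  # small 3 x 2 example matrix
all.equal(crossprod(A), t(A) %*% A)         # single-argument form: A'A
all.equal(crossprod(A, 1:3), t(A) %*% 1:3)  # two-argument form: A'y
@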

Combining these optimizations we obtain
<<crossKoenNg>>=
invisible(gc())
system.time(cpod.sol <- solve(crossprod(mm), crossprod(mm, y)))
all.equal(naive.sol, cpod.sol)
@

On this computer (2.0 GHz Pentium-4, 1 GB Memory, Goto's BLAS) the
\code{crossprod} form of the calculation is about four times as fast
as the naive calculation.  In fact, the entire \code{crossprod}
solution is faster than simply calculating $\bX\trans\bX$ the naive
way.
<<xpxKoenNg>>=
invisible(gc())
system.time(t(mm) %*% mm)
@

\subsection{Least squares calculations with Matrix classes}
\label{sec:MatrixLSQ}

The \code{crossprod} function applied to a single matrix takes
advantage of symmetry when calculating the product but does not retain
the information that the product is symmetric (and positive
semidefinite).  As a result, the solution of (\ref{eq:LSQsol}) is
performed using a general linear system solver based on an LU
decomposition when it would be faster, and more stable numerically, to
use a Cholesky decomposition.  The Cholesky decomposition could be
used, but the code is rather awkward.
<<naiveChol>>=
ch <- chol(crossprod(mm))  # upper triangular factor R with R'R = X'X
invisible(gc())
system.time(chol(crossprod(mm)))
invisible(gc())
system.time(chol.sol <- backsolve(ch,  # solve R beta = z, where ...
    forwardsolve(ch, crossprod(mm, y), # ... z solves R'z = X'y
                 upper = TRUE, trans = TRUE)))
all.equal(chol.sol, naive.sol)
@
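These two triangular solves implement (\ref{eq:LSQsol}) because, with
the upper triangular Cholesky factor $\bm{R}$ satisfying
$\bX\trans\bX=\bm{R}\trans\bm{R}$, the system splits into
\begin{displaymath}
\bm{R}\trans\bm{z}=\bX\trans\bm{y}, \qquad
\bm{R}\,\widehat{\bm{\beta}}=\bm{z} .
\end{displaymath}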

The \code{Matrix} package uses the S4 class system
\citep{R:Chambers:1998} to retain information on the structure of
matrices from the intermediate calculations.  A general matrix in
dense storage, created by the \code{Matrix} function, has class
\code{"geMatrix"}, but its cross-product has class \code{"poMatrix"}.
The \code{solve} methods for the \code{"poMatrix"} class use the
Cholesky decomposition.
<<MatrixKoenNg>>=
data(mm, package = "Matrix")
mm <- as(mm, "geMatrix")
class(crossprod(mm))
invisible(gc())
system.time(Mat.sol <- solve(crossprod(mm), crossprod(mm, y)))
all.equal(naive.sol, as(Mat.sol, "matrix"))
@

Furthermore, any method that calculates a decomposition or
factorization stores the resulting factorization with the original
object so that it can be reused without recalculation.
<<saveFactor>>=
xpx <- crossprod(mm)
xpy <- crossprod(mm, y)
invisible(gc())
system.time(solve(xpx, xpy))  # first solve: factors xpx
invisible(gc())
system.time(solve(xpx, xpy))  # second solve: reuses the stored factorization
@

The model matrix \code{mm} is sparse; that is, most of the elements of
\code{mm} are zero.  The \code{Matrix} package incorporates special
methods for sparse matrices, which produce the fastest results of all.

<<SparseKoenNg>>=
data(mm, package = "Matrix")  # reload the original, sparse form of mm
class(mm)
invisible(gc())
system.time(sparse.sol <- solve(crossprod(mm), crossprod(mm, y)))
all.equal(naive.sol, as(sparse.sol, "matrix"))
@

As with other classes in the \code{Matrix} package, the
\code{"sscMatrix"} class retains any factorization that has been
calculated, although, in this case, the decomposition is so fast that
it is difficult to determine the difference in the solution times.

<<SparseSaveFactor>>=
xpx <- crossprod(mm)
xpy <- crossprod(mm, y)
invisible(gc())
system.time(solve(xpx, xpy))
invisible(gc())
system.time(solve(xpx, xpy))
@

\bibliography{Matrix}
\end{document}
