%% Source: /pkg/inst/doc/Comparisons.Rnw (revision 10)
\documentclass{article}
\usepackage{myVignette}
\usepackage[authoryear,round]{natbib}
\bibliographystyle{plainnat}
%%\VignetteIndexEntry{Comparisons of Least Squares calculation speeds}
%%\VignetteDepends{Matrix}
\begin{document}
\SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,width=5,height=3,strip.white=TRUE}
\setkeys{Gin}{width=\textwidth}
\title{Comparing Least Squares Calculations}
\author{Douglas Bates\\R Development Core Team\\\email{Douglas.Bates@R-project.org}}
\date{\today}
\maketitle
\begin{abstract}
Many statistical methods require one or more least squares problems
to be solved. There are several ways to perform this calculation,
using objects from the base R system or objects from the classes
defined in the \code{Matrix} package.

We compare the speed of some of these methods on a very small
example and on an example for which the model matrix is large and
sparse.
\end{abstract}
<<preliminaries, echo=FALSE>>=
options(width=75)
@

\section{Linear least squares calculations}
\label{sec:LeastSquares}

Many statistical techniques require least squares solutions
\begin{equation}
\label{eq:LeastSquares}
\widehat{\bm{\beta}}=
\arg\min_{\bm{\beta}}\left\|\bm{y}-\bX\bm{\beta}\right\|^2
\end{equation}
where $\bX$ is an $n\times p$ model matrix ($p\leq n$), $\bm{y}$ is
$n$-dimensional and $\bm{\beta}$ is $p$-dimensional. Most statistics
texts state that the solution to (\ref{eq:LeastSquares}) is
\begin{equation}
\label{eq:XPX}
\widehat{\bm{\beta}}=\left(\bX\trans\bX\right)^{-1}\bX\trans\bm{y}
\end{equation}
when $\bX$ has full column rank (i.e.\ the columns of $\bX$ are
linearly independent), and all too frequently it is calculated in
exactly this way.
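The form (\ref{eq:XPX}) arises from the normal equations: setting the
gradient of the criterion in (\ref{eq:LeastSquares}) with respect to
$\bm{\beta}$ to zero gives
\begin{displaymath}
-2\,\bX\trans\left(\bm{y}-\bX\bm{\beta}\right)=\bm{0}
\quad\Longrightarrow\quad
\bX\trans\bX\,\widehat{\bm{\beta}}=\bX\trans\bm{y},
\end{displaymath}
and when $\bX\trans\bX$ is invertible this system has the unique
solution shown in (\ref{eq:XPX}).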

\subsection{A small example}
\label{sec:smallLSQ}

As an example, let's create a model matrix, \code{mm}, and corresponding
response vector, \code{y}, for a simple linear regression model using
the \code{Formaldehyde} data.
<<modelMatrix>>=
data(Formaldehyde)
str(Formaldehyde)
print(mm <- cbind(1, Formaldehyde$carb))
print(y <- Formaldehyde$optden)
@
Using \code{t} to evaluate the transpose, \code{solve} to take an
inverse, and the \code{\%*\%} operator for matrix multiplication, we
can translate (\ref{eq:XPX}) into the \Slang{} as
<<naiveCalc>>=
solve(t(mm) %*% mm) %*% t(mm) %*% y
@

On modern computers this calculation is performed so quickly that it
cannot be timed accurately in \RR{}
<<timedNaive>>=
system.time(solve(t(mm) %*% mm) %*% t(mm) %*% y)
@
and it provides essentially the same results as the standard
\code{lm.fit} function that is called by \code{lm}.
<<catNaive>>=
dput(c(solve(t(mm) %*% mm) %*% t(mm) %*% y))
dput(lm.fit(mm, y)$coefficients)
@

\subsection{A large example}
\label{sec:largeLSQ}

For a large, ill-conditioned least squares problem, such as that
described in \citet{koen:ng:2003}, the literal translation of
(\ref{eq:XPX}) does not perform well.
<<KoenNg>>=
library(Matrix)
data(mm, package = "Matrix")
data(y, package = "Matrix")
mm = as(mm, "matrix")
dim(mm)
naive.sol = solve(t(mm) %*% mm) %*% t(mm) %*% y
system.time(solve(t(mm) %*% mm) %*% t(mm) %*% y)
system.time(solve(t(mm) %*% mm) %*% t(mm) %*% y)
system.time(solve(t(mm) %*% mm) %*% t(mm) %*% y)
@
(Here and in what follows we repeat the timings four times to obtain
values that are close to the steady state.)

Because the calculation of a ``cross-product'' matrix, such as
$\bX\trans\bX$ or $\bX\trans\bm{y}$, is a common operation in
statistics, the \code{crossprod} function has been provided to do
this efficiently. In the single-argument form \code{crossprod(mm)}
calculates $\bX\trans\bX$, taking advantage of the symmetry of the
product. That is, instead of calculating all $712^2=506944$ elements of
$\bX\trans\bX$ separately, it calculates only the $(712\cdot
713)/2=253828$ elements in the upper triangle and replicates them in
the lower triangle. Furthermore, there is no need to calculate the
inverse of a matrix explicitly when solving a linear system of
equations. When the two-argument form of the \code{solve} function is
used, the linear system
\begin{equation}
\label{eq:LSQsol}
\left(\bX\trans\bX\right) \widehat{\bm{\beta}} = \bX\trans\by
\end{equation}
is solved directly.
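In effect, the two-argument call factors the matrix once and then
performs two inexpensive triangular solves; for a general square
matrix the factorization is an LU decomposition, so, schematically,
\begin{displaymath}
\bm{P}\left(\bX\trans\bX\right)=\bm{L}\bm{U},\qquad
\bm{L}\bm{z}=\bm{P}\,\bX\trans\bm{y},\qquad
\bm{U}\widehat{\bm{\beta}}=\bm{z},
\end{displaymath}
whereas forming $\left(\bX\trans\bX\right)^{-1}$ explicitly amounts to
solving one such system for each of the $p$ columns of the identity
matrix before a further matrix multiplication.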

Combining these optimizations we obtain
<<crossKoenNg>>=
cpod.sol = solve(crossprod(mm), crossprod(mm,y))
system.time(solve(crossprod(mm), crossprod(mm,y)))
system.time(solve(crossprod(mm), crossprod(mm,y)))
system.time(solve(crossprod(mm), crossprod(mm,y)))
all.equal(naive.sol, cpod.sol)
@

On this computer (2.0 GHz Pentium 4, 1 GB memory, Goto's BLAS) the
\code{crossprod} form of the calculation is about four times as fast
as the naive calculation. In fact, the entire \code{crossprod}
solution is faster than simply calculating $\bX\trans\bX$ the naive
way.
<<xpxKoenNg>>=
system.time(t(mm) %*% mm)
system.time(t(mm) %*% mm)
system.time(t(mm) %*% mm)
system.time(t(mm) %*% mm)
@

\subsection{Least squares calculations with Matrix classes}
\label{sec:MatrixLSQ}

The \code{crossprod} function applied to a single matrix takes
advantage of symmetry when calculating the product but does not retain
the information that the product is symmetric (and positive
semidefinite). As a result, the solution of (\ref{eq:LSQsol}) is
performed using a general linear system solver based on an LU
decomposition when it would be faster, and more stable numerically, to
use a Cholesky decomposition. The Cholesky decomposition could be
used directly, but the calls are rather awkward.
<<naiveChol>>=
ch = chol(crossprod(mm))
system.time(chol(crossprod(mm)))
system.time(chol(crossprod(mm)))
system.time(chol(crossprod(mm)))
chol.sol = backsolve(ch,
    forwardsolve(ch, crossprod(mm, y), upper = TRUE, trans = TRUE))
system.time(backsolve(ch,
    forwardsolve(ch, crossprod(mm, y), upper = TRUE, trans = TRUE)))
system.time(backsolve(ch,
    forwardsolve(ch, crossprod(mm, y), upper = TRUE, trans = TRUE)))
system.time(backsolve(ch,
    forwardsolve(ch, crossprod(mm, y), upper = TRUE, trans = TRUE)))
all.equal(chol.sol, naive.sol)
@
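The nested calls above implement the standard two-stage triangular
solve: writing the upper-triangular Cholesky factor as $\bm{R}$, so
that $\bX\trans\bX=\bm{R}\trans\bm{R}$, we solve
\begin{displaymath}
\bm{R}\trans\bm{z}=\bX\trans\bm{y}
\quad\mbox{(via \code{forwardsolve})},\qquad
\bm{R}\,\widehat{\bm{\beta}}=\bm{z}
\quad\mbox{(via \code{backsolve})}.
\end{displaymath}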

The \code{Matrix} package uses the S4 class system
\citep{R:Chambers:1998} to retain information on the structure of
matrices from the intermediate calculations. A general matrix in
dense storage, created by the \code{Matrix} function, has class
\code{"geMatrix"}, but its cross-product has class \code{"poMatrix"}.
The \code{solve} methods for the \code{"poMatrix"} class use the
Cholesky decomposition.
<<MatrixKoenNg>>=
data(mm, package = "Matrix")
mm = as(mm, "geMatrix")
class(crossprod(mm))
Mat.sol = solve(crossprod(mm), crossprod(mm, y))
system.time(solve(crossprod(mm), crossprod(mm, y)))
system.time(solve(crossprod(mm), crossprod(mm, y)))
system.time(solve(crossprod(mm), crossprod(mm, y)))
all.equal(naive.sol, as(Mat.sol, "matrix"))
@

Furthermore, any method that calculates a decomposition or
factorization stores the resulting factorization with the original
object so that it can be reused without recalculation.
<<saveFactor>>=
xpx = crossprod(mm)
xpy = crossprod(mm, y)
system.time(solve(xpx, xpy))
system.time(solve(xpx, xpy))
system.time(solve(xpx, xpy))
@

The model matrix \code{mm} is sparse; that is, most of the elements of
\code{mm} are zero. The \code{Matrix} package incorporates special
methods for sparse matrices, which produce the fastest results of all.
<<SparseKoenNg>>=
data(mm, package = "Matrix")
class(mm)
sparse.sol = solve(crossprod(mm), crossprod(mm, y))
system.time(solve(crossprod(mm), crossprod(mm, y)))
system.time(solve(crossprod(mm), crossprod(mm, y)))
system.time(solve(crossprod(mm), crossprod(mm, y)))
all.equal(naive.sol, as(sparse.sol, "matrix"))
@

As with other classes in the \code{Matrix} package, the
\code{sscMatrix} class retains any factorization that has been
calculated, although, in this case, the decomposition is so fast that
it is difficult to discern a difference in the solution times.
<<SparseSaveFactor>>=
xpx = crossprod(mm)
xpy = crossprod(mm, y)
system.time(solve(xpx, xpy))
system.time(solve(xpx, xpy))
system.time(solve(xpx, xpy))
@

\bibliography{Matrix}
\end{document}
