%% /pkg/inst/doc/Comparisons.Rnw, revision 342 (bates)
\documentclass{article}
\usepackage{myVignette}
\usepackage[authoryear,round]{natbib}
\bibliographystyle{plainnat}
%%\VignetteIndexEntry{Comparisons of Least Squares calculation speeds}
%%\VignetteDepends{Matrix}
\begin{document}
\SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,width=5,height=3,strip.white=TRUE}
\setkeys{Gin}{width=\textwidth}
\title{Comparing Least Squares Calculations}
\author{Douglas Bates\\R Development Core Team\\\email{Douglas.Bates@R-project.org}}
\date{\today}
\maketitle
\begin{abstract}
Many statistical methods require one or more least squares problems
to be solved.  There are several ways to perform this calculation,
using objects from the base R system or objects from the classes
defined in the \code{Matrix} package.

We compare the speed of some of these methods on a very small
example and on an example for which the model matrix is large and
sparse.
\end{abstract}
<<preliminaries, echo=FALSE>>=
options(width=75)
@

\section{Linear least squares calculations}
\label{sec:LeastSquares}

Many statistical techniques require least squares solutions
\begin{equation}
  \label{eq:LeastSquares}
  \widehat{\bm{\beta}}=
  \arg\min_{\bm{\beta}}\left\|\bm{y}-\bX\bm{\beta}\right\|^2
\end{equation}
where $\bX$ is an $n\times p$ model matrix ($p\leq n$), $\bm{y}$ is
an $n$-dimensional response vector, and $\bm{\beta}$ is a
$p$-dimensional coefficient vector.  Most statistics texts state that
the solution to (\ref{eq:LeastSquares}) is
\begin{equation}
  \label{eq:XPX}
  \widehat{\bm{\beta}}=\left(\bX\trans\bX\right)^{-1}\bX\trans\bm{y}
\end{equation}
when $\bX$ has full column rank (i.e.\ the columns of $\bX$ are
linearly independent), and all too frequently it is calculated in
exactly this way.
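
The formula (\ref{eq:XPX}) arises from the normal equations, obtained
by setting the gradient of the residual sum of squares in
(\ref{eq:LeastSquares}) to zero:
\[
  \nabla_{\bm{\beta}}\left\|\bm{y}-\bX\bm{\beta}\right\|^2
  = -2\bX\trans\left(\bm{y}-\bX\bm{\beta}\right) = \bm{0}
  \quad\Longrightarrow\quad
  \bX\trans\bX\,\widehat{\bm{\beta}} = \bX\trans\bm{y},
\]
which has the unique solution (\ref{eq:XPX}) when $\bX\trans\bX$ is
nonsingular.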

\subsection{A small example}
\label{sec:smallLSQ}

As an example, let's create a model matrix, \code{mm}, and corresponding
response vector, \code{y}, for a simple linear regression model using
the \code{Formaldehyde} data.
<<modelMatrix>>=
data(Formaldehyde)
str(Formaldehyde)
print(mm <- cbind(1, Formaldehyde$carb))
print(y <- Formaldehyde$optden)
@
Using \code{t} to evaluate the transpose, \code{solve} to take an
inverse, and the \code{\%*\%} operator for matrix multiplication, we
can translate (\ref{eq:XPX}) into the \Slang{} as
<<naiveCalc>>=
solve(t(mm) %*% mm) %*% t(mm) %*% y
@

On modern computers this calculation is performed so quickly that it
cannot be timed accurately in \RR{}
<<timedNaive>>=
system.time(solve(t(mm) %*% mm) %*% t(mm) %*% y, gc = TRUE)
@
and it provides essentially the same results as the standard
\code{lm.fit} function that is called by \code{lm}.
<<catNaive>>=
dput(c(solve(t(mm) %*% mm) %*% t(mm) %*% y))
dput(lm.fit(mm, y)$coefficients)
@
%$

\subsection{A large example}
\label{sec:largeLSQ}

For a large, ill-conditioned least squares problem, such as that
described in \citet{koen:ng:2003}, the literal translation of
(\ref{eq:XPX}) does not perform well.
<<KoenNg>>=
library(Matrix)
data(mm, package = "Matrix")
data(y, package = "Matrix")
mm <- as(mm, "matrix")
dim(mm)
system.time(naive.sol <- solve(t(mm) %*% mm) %*% t(mm) %*% y, gc = TRUE)
@

Because the calculation of a ``cross-product'' matrix, such as
$\bX\trans\bX$ or $\bX\trans\bm{y}$, is a common operation in
statistics, the \code{crossprod} function has been provided to do
this efficiently.  In the single-argument form, \code{crossprod(mm)}
calculates $\bX\trans\bX$, taking advantage of the symmetry of the
product.  That is, instead of calculating all $712^2=506944$ elements
of $\bX\trans\bX$ separately, it calculates only the $(712\cdot
713)/2=253828$ elements in the upper triangle and replicates them in
the lower triangle.  Furthermore, there is no need to calculate the
inverse of a matrix explicitly when solving a linear system of
equations.  When the two-argument form of the \code{solve} function
is used, the linear system
\begin{equation}
  \label{eq:LSQsol}
  \left(\bX\trans\bX\right) \widehat{\bm{\beta}} = \bX\trans\by
\end{equation}
is solved directly.

Combining these optimizations, we obtain
<<crossKoenNg>>=
system.time(cpod.sol <- solve(crossprod(mm), crossprod(mm, y)), gc = TRUE)
all.equal(naive.sol, cpod.sol)
@

On this computer (2.0 GHz Pentium-4, 1 GB Memory, Goto's BLAS) the
\code{crossprod} form of the calculation is about four times as fast
as the naive calculation.  In fact, the entire \code{crossprod}
solution is faster than simply calculating $\bX\trans\bX$ the naive
way.
<<xpxKoenNg>>=
system.time(t(mm) %*% mm, gc = TRUE)
@

\subsection{Least squares calculations with Matrix classes}
\label{sec:MatrixLSQ}

The \code{crossprod} function applied to a single matrix takes
advantage of symmetry when calculating the product but does not retain
the information that the product is symmetric (and positive
semidefinite).  As a result, the solution of (\ref{eq:LSQsol}) is
performed using a general linear system solver based on an LU
decomposition when it would be faster, and more stable numerically, to
use a Cholesky decomposition.  The Cholesky decomposition could be
used but it is rather awkward
<<naiveChol>>=
system.time(ch <- chol(crossprod(mm)), gc = TRUE)
system.time(chol.sol <- backsolve(ch,
    forwardsolve(ch, crossprod(mm, y), upper = TRUE, trans = TRUE)),
    gc = TRUE)
all.equal(chol.sol, naive.sol)
@
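
Written out, the chunk above factors $\bX\trans\bX=\bm{R}\trans\bm{R}$
with $\bm{R}$ upper triangular and then solves (\ref{eq:LSQsol}) as a
pair of triangular systems, with $\bm{z}$ denoting the intermediate
solution:
\[
  \bm{R}\trans\bm{z} = \bX\trans\bm{y}
  \quad\mbox{(\code{forwardsolve})}, \qquad
  \bm{R}\,\widehat{\bm{\beta}} = \bm{z}
  \quad\mbox{(\code{backsolve})}.
\]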

The \code{Matrix} package uses the S4 class system
\citep{R:Chambers:1998} to retain information on the structure of
matrices from the intermediate calculations.  A general matrix in
dense storage, created by the \code{Matrix} function, has class
\code{"geMatrix"} but its cross-product has class \code{"poMatrix"}.
The \code{solve} methods for the \code{"poMatrix"} class use the
Cholesky decomposition.
<<MatrixKoenNg>>=
data(mm, package = "Matrix")
mm <- as(mm, "geMatrix")
class(crossprod(mm))
system.time(Mat.sol <- solve(crossprod(mm), crossprod(mm, y)), gc = TRUE)
all.equal(naive.sol, as(Mat.sol, "matrix"))
@

Furthermore, any method that calculates a decomposition or
factorization stores the resulting factorization with the original
object so that it can be reused without recalculation.
<<saveFactor>>=
xpx <- crossprod(mm)
xpy <- crossprod(mm, y)
system.time(solve(xpx, xpy), gc = TRUE)
system.time(solve(xpx, xpy), gc = TRUE)
@

The model matrix \code{mm} is sparse; that is, most of the elements of
\code{mm} are zero.  The \code{Matrix} package incorporates special
methods for sparse matrices, which produce the fastest results of all.

<<SparseKoenNg>>=
data(mm, package = "Matrix")
class(mm)
system.time(sparse.sol <- solve(crossprod(mm), crossprod(mm, y)), gc = TRUE)
all.equal(naive.sol, as(sparse.sol, "matrix"))
@

As with other classes in the \code{Matrix} package, the
\code{sscMatrix} class retains any factorization that has been
calculated, although in this case the decomposition is so fast that it
is difficult to detect a difference in the solution times.

<<SparseSaveFactor>>=
xpx <- crossprod(mm)
xpy <- crossprod(mm, y)
system.time(solve(xpx, xpy), gc = TRUE)
system.time(solve(xpx, xpy), gc = TRUE)
@

\bibliography{Matrix}
\end{document}
