% Annotation of /pkg/inst/doc/Comparisons.Rnw, revision 188
\documentclass{article}
\usepackage{myVignette}
\usepackage[authoryear,round]{natbib}
\bibliographystyle{plainnat}
%%\VignetteIndexEntry{Comparisons of Least Squares calculation speeds}
%%\VignetteDepends{Matrix}
\begin{document}
\SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,width=5,height=3,strip.white=TRUE}
\setkeys{Gin}{width=\textwidth}
\title{Comparing Least Squares Calculations}
\author{Douglas Bates\\R Development Core Team\\\email{Douglas.Bates@R-project.org}}
\date{\today}
\maketitle
\begin{abstract}
  Many statistical methods require one or more least squares problems
  to be solved.  There are several ways to perform this calculation,
  using objects from the base R system and using objects in the
  classes defined in the \code{Matrix} package.

  We compare the speed of some of these methods on a very small
  example and on an example for which the model matrix is large and
  sparse.
\end{abstract}
<<preliminaries, echo=FALSE>>=
options(width=75)
## Time an expression, optionally garbage-collecting first so that
## the timing is not distorted by collections triggered earlier.
sysgc.time <- function(expr, gcFirst = TRUE)
{
    if (!exists("proc.time"))
        return(rep(as.numeric(NA), 5))
    loc.frame <- parent.frame()
    if (gcFirst)
        gc(FALSE)
    on.exit(cat("Timing stopped at:", proc.time() - time, "\n"))
    expr <- substitute(expr)
    time <- proc.time()
    eval(expr, envir = loc.frame)
    new.time <- proc.time()
    on.exit()
    if (length(new.time) == 3)
        new.time <- c(new.time, 0, 0)
    if (length(time) == 3)
        time <- c(time, 0, 0)
    new.time - time
}
@

\section{Linear least squares calculations}
\label{sec:LeastSquares}

Many statistical techniques require least squares solutions
\begin{equation}
  \label{eq:LeastSquares}
  \widehat{\bm{\beta}}=
  \arg\min_{\bm{\beta}}\left\|\bm{y}-\bX\bm{\beta}\right\|^2
\end{equation}
where $\bX$ is an $n\times p$ model matrix ($p\leq n$), $\bm{y}$ is
$n$-dimensional and $\bm{\beta}$ is $p$-dimensional.  Most statistics
texts state that the solution to (\ref{eq:LeastSquares}) is
\begin{equation}
  \label{eq:XPX}
  \widehat{\bm{\beta}}=\left(\bX\trans\bX\right)^{-1}\bX\trans\bm{y}
\end{equation}
when $\bX$ has full column rank (i.e., the columns of $\bX$ are
linearly independent), and all too frequently it is calculated in
exactly this way.
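To see why (\ref{eq:XPX}) solves (\ref{eq:LeastSquares}) when $\bX$ has
full column rank, set the gradient of the residual sum of squares to
zero, which yields the normal equations
\begin{displaymath}
  \frac{\partial}{\partial\bm{\beta}}
  \left\|\bm{y}-\bX\bm{\beta}\right\|^2 =
  -2\bX\trans\left(\bm{y}-\bX\bm{\beta}\right) = \bm{0}
  \quad\Longrightarrow\quad
  \bX\trans\bX\,\widehat{\bm{\beta}}=\bX\trans\bm{y} ,
\end{displaymath}
and inverting $\bX\trans\bX$, which is nonsingular under the full
column rank assumption, gives (\ref{eq:XPX}).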

\subsection{A small example}
\label{sec:smallLSQ}

As an example, let's create a model matrix, \code{mm}, and corresponding
response vector, \code{y}, for a simple linear regression model using
the \code{Formaldehyde} data.
<<modelMatrix>>=
data(Formaldehyde)
str(Formaldehyde)
print(mm <- cbind(1, Formaldehyde$carb))
print(y <- Formaldehyde$optden)
@
Using \code{t} to evaluate the transpose, \code{solve} to take an
inverse, and the \code{\%*\%} operator for matrix multiplication, we
can translate (\ref{eq:XPX}) into the \Slang{} as
<<naiveCalc>>=
solve(t(mm) %*% mm) %*% t(mm) %*% y
@

On modern computers this calculation is performed so quickly that it
cannot be timed accurately in \RR{},
<<timedNaive>>=
sysgc.time(solve(t(mm) %*% mm) %*% t(mm) %*% y)
@
and it provides essentially the same results as the standard
\code{lm.fit} function that is called by \code{lm}.
<<catNaive>>=
dput(c(solve(t(mm) %*% mm) %*% t(mm) %*% y))
dput(lm.fit(mm, y)$coefficients)
@


\subsection{A large example}
\label{sec:largeLSQ}

For a large, ill-conditioned least squares problem, such as that
described in \citet{koen:ng:2003}, the literal translation of
(\ref{eq:XPX}) does not perform well.
<<KoenNg>>=
library(Matrix)
data(mm, package = "Matrix")
data(y, package = "Matrix")
mm <- as(mm, "matrix")
dim(mm)
sysgc.time(naive.sol <- solve(t(mm) %*% mm) %*% t(mm) %*% y)
@

Because the calculation of a ``cross-product'' matrix, such as
$\bX\trans\bX$ or $\bX\trans\bm{y}$, is a common operation in
statistics, the \code{crossprod} function has been provided to do
this efficiently.  In the single-argument form \code{crossprod(mm)}
calculates $\bX\trans\bX$, taking advantage of the symmetry of the
product.  That is, instead of calculating the $712^2=506944$ elements of
$\bX\trans\bX$ separately, it only calculates the $(712\cdot
713)/2=253828$ elements in the upper triangle and replicates them in
the lower triangle.  Furthermore, there is no need to calculate the
inverse of a matrix explicitly when solving a linear system of
equations.  When the two-argument form of the \code{solve} function is
used, the linear system
\begin{equation}
  \label{eq:LSQsol}
  \left(\bX\trans\bX\right) \widehat{\bm{\beta}} = \bX\trans\by
\end{equation}
is solved directly.

Combining these optimizations we obtain
<<crossKoenNg>>=
sysgc.time(cpod.sol <- solve(crossprod(mm), crossprod(mm, y)))
all.equal(naive.sol, cpod.sol)
@

On this computer (2.0 GHz Pentium 4, 1 GB memory, Goto's BLAS) the
crossprod form of the calculation is about four times as fast as the
naive calculation.  In fact, the entire crossprod solution is faster
than simply calculating $\bX\trans\bX$ the naive way.
<<xpxKoenNg>>=
sysgc.time(t(mm) %*% mm)
@

\subsection{Least squares calculations with Matrix classes}
\label{sec:MatrixLSQ}

The \code{crossprod} function applied to a single matrix takes
advantage of symmetry when calculating the product but does not retain
the information that the product is symmetric (and positive
semidefinite).  As a result, the solution of (\ref{eq:LSQsol}) is
performed using a general linear system solver based on an LU
decomposition when it would be faster, and more stable numerically, to
use a Cholesky decomposition.  The Cholesky decomposition could be
used directly, but the calling sequence is rather awkward.
<<naiveChol>>=
sysgc.time(ch <- chol(crossprod(mm)))
sysgc.time(chol.sol <- backsolve(ch,
    forwardsolve(ch, crossprod(mm, y), upper = TRUE, trans = TRUE)))
all.equal(chol.sol, naive.sol)
@
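
In matrix terms, the two triangular solves in the chunk above can be
sketched as follows: with $R$ the upper triangular Cholesky factor
satisfying $\bX\trans\bX=R\trans R$, the system (\ref{eq:LSQsol}) is
solved in two stages,
\begin{displaymath}
  R\trans\bm{z}=\bX\trans\bm{y}
  \quad\mbox{(forwardsolve)},\qquad
  R\,\widehat{\bm{\beta}}=\bm{z}
  \quad\mbox{(backsolve)},
\end{displaymath}
each a triangular system requiring only $O(p^2)$ operations once the
factor $R$ is available.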

The \code{Matrix} package uses the S4 class system
\citep{R:Chambers:1998} to retain information on the structure of
matrices from the intermediate calculations.  A general matrix in
dense storage, created by the \code{Matrix} function, has class
\code{"geMatrix"} but its cross-product has class \code{"poMatrix"}.
The \code{solve} methods for the \code{"poMatrix"} class use the
Cholesky decomposition.
<<MatrixKoenNg>>=
data(mm, package = "Matrix")
mm <- as(mm, "geMatrix")
class(crossprod(mm))
sysgc.time(Mat.sol <- solve(crossprod(mm), crossprod(mm, y)))
all.equal(naive.sol, as(Mat.sol, "matrix"))
@

Furthermore, any method that calculates a decomposition or
factorization stores the resulting factorization with the original
object so that it can be reused without recalculation.
<<saveFactor>>=
xpx <- crossprod(mm)
xpy <- crossprod(mm, y)
sysgc.time(solve(xpx, xpy))
sysgc.time(solve(xpx, xpy))
@

The model matrix \code{mm} is sparse; that is, most of the elements of
\code{mm} are zero.  The \code{Matrix} package incorporates special
methods for sparse matrices, which produce the fastest results of all.
<<SparseKoenNg>>=
data(mm, package = "Matrix")
class(mm)
sysgc.time(sparse.sol <- solve(crossprod(mm), crossprod(mm, y)))
all.equal(naive.sol, as(sparse.sol, "matrix"))
@

As with other classes in the \code{Matrix} package, the
\code{sscMatrix} class retains any factorization that has been
calculated, although, in this case, the decomposition is so fast that
it is difficult to detect the difference in the solution times.
<<SparseSaveFactor>>=
xpx <- crossprod(mm)
xpy <- crossprod(mm, y)
sysgc.time(solve(xpx, xpy))
sysgc.time(solve(xpx, xpy))
@

\bibliography{Matrix}
\end{document}
