SCM

SCM Repository

[matrix] Annotation of /www/slides/2010-07-21-Gaithersburg/sparse-model-matrices.Rnw
ViewVC logotype

Annotation of /www/slides/2010-07-21-Gaithersburg/sparse-model-matrices.Rnw

Parent Directory Parent Directory | Revision Log Revision Log


Revision 2575 - (view) (download)

1 : mmaechler 2573 \SweaveOpts{engine=R, keep.source=TRUE}
2 :     \SweaveOpts{eps=FALSE, pdf=TRUE, width=9, height=6, strip.white=TRUE}
3 :     \setkeys{Gin}{width=\textwidth}
4 :    
5 :     <<preliminaries,echo=FALSE,results=hide>>=
6 :     options(width=75)
7 :     library(Matrix)
8 :     @
9 :    
10 :     \begin{frame}[fragile]\frametitle{Sparse \alert{Model} Matrices}
11 :     New model matrix classes,
12 :     %The \pkg{Matrix} package now has model matrix classes,
13 :     generalizing \Rp's standard \Rfun{model.matrix}:% function:
14 :     <<model.matrix>>=
15 :     str(dd <- data.frame(a = gl(3,4), b = gl(4,1,12)))# balanced 2-way
16 :     model.matrix(~ 0+ a + b, dd)
17 :     @
18 :     \end{frame}
19 :    
20 :     \begin{frame}[fragile]\frametitle{Sparse \alert{Model} Matrices}
21 :     The model matrix above
22 :     \begin{itemize}
23 :     \item \dots\dots has many zeros, and
24 :     \item \emph{ratio} ((zeros) : (non-zeros)) increases dramatically
25 :     with many-level factors
26 :     \item even more zeros for factor \emph{interactions}:
27 :     \end{itemize}
28 :     \smallskip
29 :    
30 :     \begin{footnotesize}
31 :     <<model.matrix>>=
32 :     model.matrix(~ 0+ a * b, dd)
33 :     @
34 :     \end{footnotesize}
35 :     \end{frame}
36 :    
37 :     \begin{frame}[fragile]\frametitle{Sparse \alert{Model} Matrices in 'Matrix'}
38 : mmaechler 2575 \begin{itemize}
39 :     \item
40 :     These matrices can become very large: Both many rows (large $n$), \emph{and}
41 :     many columns, large $p$.
42 : mmaechler 2573
43 : mmaechler 2575
44 :     \item
45 : mmaechler 2573 Eg., in Linear Mixed Effects Models,
46 :     \begin{equation*}
47 :     \Ew{\bc Y|\bc B=\bm b} = \gX \bm\beta+\rZ\bm b,
48 :     \end{equation*}
49 : mmaechler 2575 \begin{itemize}
50 :     \item
51 :     the $\rZ$ matrix is often large and very sparse, and in \pkg{lme4} has
52 :     always been stored as \code{"sparseMatrix"} (\code{"dgCMatrix"}, specifically).
53 : mmaechler 2573
54 : mmaechler 2575 \item
55 :     Sometimes, $\gX$, (fixed effect matrix) is large, too.
56 :     \nlQ $\to$ optionally also \code{"sparseMatrix"} in
57 :     \pkg{lme4}\footnote{the development version of \pkg{lme4}, currently called \pkg{lme4a}.}.
58 :     \end{itemize}
59 : mmaechler 2573
60 : mmaechler 2575 \item
61 : mmaechler 2573 We've extended \Rfun{model.\Ul{m}atrix} to \Rfun{model.\Ul{M}atrix} in
62 :     package \pkg{Matrix} with optional argument \code{sparse = TRUE}.
63 : mmaechler 2575 \end{itemize}
64 : mmaechler 2573 \end{frame}
65 :    
66 :     \begin{frame}[fragile]
67 :     \frametitle{Sparse Model Matrix \alert{Classes} in 'Matrix'}
68 :     \begin{footnotesize}
69 :     \begin{Schunk}
70 :     \begin{Sinput}
71 :     setClass("modelMatrix",
72 :     representation(assign = "integer",
73 :     contrasts = "list", "VIRTUAL"),
74 :     contains = "Matrix",
75 :     validity = function(object) { ........... })
76 :    
77 :     setClass("sparseModelMatrix", representation("VIRTUAL"),
78 :     contains = c("CsparseMatrix", "modelMatrix"))
79 :     setClass("denseModelMatrix", representation("VIRTUAL"),
80 :     contains = c("denseMatrix", "modelMatrix"))
81 :     ## The ``actual'' *ModelMatrix classes:
82 :     setClass("dsparseModelMatrix",
83 :     contains = c("dgCMatrix", "sparseModelMatrix"))
84 :     setClass("ddenseModelMatrix", contains =
85 :     c("dgeMatrix", "ddenseMatrix", "denseModelMatrix"))
86 :     \end{Sinput}
87 :     \end{Schunk}
88 :     \end{footnotesize}
89 : mmaechler 2575 \medskip
90 :    
91 :     % (adding \code{"ddenseMatrix"} above does \emph{not} influence slots, but yields
92 :     % consistent superclass ordering.)
93 :     (\code{"ddenseMatrix"}: \emph{not} for slots, but
94 :     consistent superclass ordering)
95 : mmaechler 2573 \end{frame}
96 :    
97 :     \begin{frame}[fragile]\frametitle{model.\alert{M}atrix(*, sparse=TRUE)}
98 :     Constructing \alert{sparse} models matrices (\pkg{Matrix} package):
99 :     \begin{footnotesize}
100 :     <<modelMat-ex>>=
101 :     model.Matrix(~ 0+ a * b, dd, sparse=TRUE)
102 :     @
103 :     \end{footnotesize}
104 :     % the above is cut off, i.e. not visible here
105 :     % identical syntax, just \code{model.\alert{M}atrix(..)}.
106 :     \end{frame}
107 :    
108 :     \section[modelMatrix \Also "glpModel"s]{%
109 :     modelMatrix \Also General Linear Prediction Models}
110 :     \begin{frame}[fragile]
111 :     \frametitle{"modelMatrix" \Also General Linear Prediction Models}
112 :     Idea: Very \emph{general} setup for
113 :     \begin{block}{Statistical models based on linear predictors}
114 :     Class \code{"glpModel"} := General Linear Prediction Models
115 :     \end{block}
116 :     \begin{Schunk}
117 :     \begin{Sinput}
118 :     setClass("Model", representation(call = "call", fitProps = "list",
119 :     "VIRTUAL"))
120 :     setClass("glpModel", representation(resp = "respModule",
121 :     pred = "predModule"),
122 :     contains = "Model")
123 :     \end{Sinput}
124 :     \end{Schunk}
125 :     \medskip
126 :    
127 :     Two main ingredients:
128 :     \begin{enumerate}
129 :     \item Response module \code{"respModule"}
130 :     \item (Linear) Prediction module \code{"predModule"}
131 :     \end{enumerate}
132 :     \end{frame}
133 :    
134 :     \begin{frame}[fragile]\frametitle{(1) Response Module}
135 :     \code{"respModule"}:
136 :     Response modules for models with a linear predictor, which can
137 :     include linear models (\code{lm}), generalized linear models (\code{glm}),
138 :     nonlinear models (\code{nls}) and generalized nonlinear models (\code{nglm}):
139 :     \begin{footnotesize}\begin{Schunk}
140 :     \begin{Sinput}
141 :     setClass("respModule",
142 :     representation(mu = "numeric", # of length n
143 :     offset = "numeric", # of length n * k
144 :     sqrtXwt = "matrix", # of dim(.) == dim(X) == (n, k)
145 :     sqrtrwt = "numeric", # sqrt(residual weights)
146 :     weights = "numeric", # prior weights
147 :     wtres = "numeric",
148 :     y = "numeric"),
149 :     validity = function(object) { ....... })
150 :     setClass("glmRespMod",
151 :     representation(family = "family",
152 :     eta = "numeric",
153 :     n = "numeric"), # for evaluation of the aic
154 :     contains = "respModule", validity = function(object) { ..... })
155 :     setClass("nlsRespMod",
156 :     representation(nlenv = "environment", ......), ............)
157 :     setClass("nglmRespMod", contains = c("glmRespMod", "nlsRespMod"))
158 :     \end{Sinput}
159 :     \end{Schunk}\end{footnotesize}
160 :     \end{frame}
161 :    
162 :     \begin{frame}[fragile]\frametitle{(2) Prediction Module}
163 :     %\item
164 : mmaechler 2575 \code{"predModule"}: Linear predictor module consists of
165 :     \begin{itemize}
166 :     \item the model matrix \code{X},
167 :     \item the coefficient vector \code{coef},
168 :     \item a triangular factor of the weighted model matrix \code{fac},
169 :     \item (\code{Vtr} $= \bm {V\trans r}$, where $\bm r =$ residuals (typically)
170 :     \end{itemize}
171 :     currently in \textcolor{Orange}{dense} and
172 : mmaechler 2573 \textcolor{Mulberry}{sparse} flavor:
173 : mmaechler 2575
174 :     \medskip
175 :    
176 : mmaechler 2573 \begin{footnotesize}\begin{Schunk}
177 :     \begin{Sinput}
178 :     setClass("predModule",
179 :     representation(X = "modelMatrix", coef = "numeric", Vtr = "numeric",
180 :     fac = "CholeskyFactorization",
181 :     "VIRTUAL"))
182 :     ## the sub classes specify more specific classes for the two non-trivial slots:
183 :     setClass("dPredModule", contains = "predModule",
184 :     representation(X = "ddenseModelMatrix", fac = "Cholesky"))
185 :     setClass("sPredModule", contains = "predModule",
186 :     representation(X = "dsparseModelMatrix", fac = "CHMfactor"))
187 :     \end{Sinput}
188 :     \end{Schunk}\end{footnotesize}
189 :    
190 :     \end{frame}
191 :    
192 :     \begin{frame}
193 :     \frametitle{Fitting all ``glpModel''s with One IRLS algorithm}
194 : mmaechler 2575 Fitting via IRLS (Iteratively Reweighted Least Squares), where
195 :     the prediction and response module parts each update
196 : mmaechler 2573 ``themselves''.
197 : mmaechler 2575
198 :     \bigskip
199 :    
200 :     These 3 Steps are iterated till convergence:\\
201 : mmaechler 2573 \begin{enumerate}
202 : mmaechler 2575 \item prediction module (\textsc{pm}) only passes \code{X \%*\% coef}$ = \bm X \beta$
203 :     to the response module (\textsc{rm})
204 :     \item from that, the \textsc{rm}
205 :     \begin{itemize}
206 :     \item updates its $\bm \mu$,
207 :     \item then its weighted residuals and ``X weights''
208 :     \end{itemize}
209 :     \item these two are in turn passed to \textsc{pm} which
210 :     \begin{itemize}
211 :     \item reweights itself and
212 :     \item \code{solve()}s for
213 :     $\Delta\bm\beta$, the \emph{increment} of $\bm\beta$.
214 :     \end{itemize}
215 : mmaechler 2573 \end{enumerate}
216 : mmaechler 2575 Convergence only if Bates-Watts orthgonality criterion is fulfilled.
217 : mmaechler 2573 \end{frame}
218 : mmaechler 2575
219 :    
220 : mmaechler 2573 %%% Local Variables:
221 :     %%% TeX-command-default: "LaTeX PDF"
222 :     %%% TeX-master: "MaechlerBates.tex"
223 :     %%% End:
224 :    

root@r-forge.r-project.org
ViewVC Help
Powered by ViewVC 1.0.0  
Thanks to:
Vienna University of Economics and Business Powered By FusionForge