SCM

SCM Repository

[matrix] Annotation of /pkg/Matrix/TODO
ViewVC logotype

Annotation of /pkg/Matrix/TODO

Parent Directory Parent Directory | Revision Log Revision Log


Revision 2813 - (view) (download)

1 : mmaechler 2712 ##-*- mode: org -*-
2 : mmaechler 2677
3 : mmaechler 2784 * *Urgent* in some sense ------------------------------------------------------
4 :    
5 : mmaechler 2712 - Look at Paul Bailey's problem -- CHOLMOD error (even seg.fault for him)
6 :     --> ~/R/MM/Pkg-ex/Matrix/sparseOrderedLogit.R
7 :    
8 :     - BunchKaufman()'s result is not really useful yet {but it is used on C
9 :     level e.g. for solve(<dsyMatrix>). Should define expand() method or
10 :     similar, see man/BunchKaufman-methods.Rd and R/dsyMatrix.R (at end).
11 :    
12 : mmaechler 2784 * New smallish ideas, relatively urgent for MM --------------------------------
13 : mmaechler 2712
14 :     - Investigate the "band changing (and getting) ideas 'band<-' etc,
15 : mmaechler 2739 from Jeremy D Silver, per posts to R-devel on Aug.26,2011
16 :     {MM: ~/R/MM/Pkg-ex/Matrix/bands-Jeremy_Silver-ex.R }
17 : mmaechler 2712
18 : mmaechler 2725 - cbind2() / rbind2() for sparseMatrices: dimnames propagation should
19 :     happen in C, see R/bind2.R and src/Csparse.c (Csparse_horzcat etc).
20 :    
21 : maechler 2112 - Check for DimNames propagation in coercion and other operations.
22 : mmaechler 2788 Done for (%*%, crossprod, tcrossprod).
23 : maechler 2112
24 : bates 344 - Report the problem in the Linux ldexp manual page. The second and
25 :     third calls in the Synopsis should be to ldexpf and ldexpl.
26 : maechler 472
27 : maechler 634 - provide methods for "dspMatrix" and "dppMatrix"!
28 : mmaechler 2813 2012-07: mostly(?) DONE, with Ops, etc, also pack() / unpack()
29 : maechler 634
30 : bates 645 - combine the C functions for multiplication by special forms and
31 :     solution wrt special forms by using a 'right' argument and a
32 :     'classed' argument.
33 : maechler 675 [done with dgeMatrix_matrix_mm(); not yet for other classes;
34 :     and for _crossprod()]
35 :    
36 : mmaechler 2784 * Generalization of Existing Classes and Methods --------------------------
37 : bates 712
38 : maechler 935 - "Math2" , "Math", "Arith":
39 :     keep triangular and symmetric Matrices when appropriate:
40 :     particularly desirable for "Math2": round(), signif()
41 : maechler 956
42 : mmaechler 2582 - For triangular matrices, make sure the four rules of
43 :     "triangular matrix algebra" (Golub+Van Loan 1996, 3.1.8, p.93) are fulfilled.
44 :     - since 2008-03-06 ok for Csparse
45 :     - since 2010-07-23 ok for <dtr> %*% <dtr>
46 :     TODO: e.g. for <ltr> %*% <dtC>
47 : maechler 2120
48 : maechler 956 - "d" <-> "l" coercion for all "[TCR]" sparse matrices is really trivial:
49 :     "d" -> "l" : drops the 'x' slot
50 :     "l" -> "d" : construct an 'x' slot of all '1'
51 :     We currently have many of these conversions explicitly, e.g.
52 :     setAs("dsTMatrix", "lsTMatrix",
53 :     function(from) new("lsTMatrix", i = from@i, j = from@j, uplo = from@uplo,
54 :     Dim = from@Dim, Dimnames = from@Dimnames))
55 :     but I would rather want to automatically construct all these coercion
56 :     methods at once by a ``method constructor'', i.e.,
57 :     for all "dsparse*" -> "lsparse*" and vice versa.
58 :     How can one do this {in a documented way} ?
59 : maechler 1087
60 : maechler 2048 - Think of constructing setAs(...) calls automatically in order to
61 :     basically enable all ``sensible'' as(fromMatrix, toMatrix) calls,
62 :     possibly using canCoerce(.)
63 :    
64 :     - When we have a packed matrix, it's a waste to go through "full" to "sparse":
65 :     ==> implement
66 :     setAs("dspMatrix", "sparseMatrix")
67 :     setAs("dppMatrix", "sparseMatrix")
68 :     setAs("dtpMatrix", "sparseMatrix")
69 :     and the same for "lsp" , "ltp" and "nsp" , "ntp" !
70 :    
71 : maechler 1097 - tcrossprod(x, y) : do provide methods for y != NULL
72 :     calling Lapack's DGEMM for "dense"
73 : maechler 1109 [2005-12-xx: done for dgeMatrix at least]
74 :    
75 : maechler 1592 - Factorizations: LU done; also Schur() for *sparse* Matrices.
76 : maechler 1253
77 : maechler 1619 - use .Call(Csparse_drop, M, tol) in more places,
78 :     both with 'tol = 0.' to drop "values that happen to be 0" and for
79 :     zapsmall() methods for Csparse*
80 : maechler 1654
81 :     - implement .Call(Csparse_scale, ....) interfacing to cholmod_scale()
82 :     in src/CHOLMOD/Include/cholmod_matrixops.h : for another function
83 :     specifically for multiplying a cholmod_sparse object by a diagonal matrix.
84 :     Use it in %*% and [t]crossprod methods.
85 :    
86 : maechler 1714 - make sure *all* group methods have (maybe "bail-out") setMethod for "Matrix".
87 :     e.g. zapsmall(<pMatrix>) fails "badly"
88 : maechler 1725
89 : maechler 2112 - rbind2(<sparse>, <dense>) does not work (e.g. <dgC>, <dge>)
90 : maechler 1799
91 : maechler 1833 - <sparse> %*% <dense> {also in crossprod/tcrossprod} currently always
92 :     returns <dense>, since --> Csparse_dense_prod --> cholmod_sdmult
93 :     and that does only return dense.
94 :     When the sparse matrix is very sparse, i.e. has many rows with only zero
95 :     entries, it would make much sense to return sparse.
96 :    
97 :     - ! <symmetricMatrix> loses symmetry, both for dense and sparse matrices.
98 :     !M where M is "sparseMatrix", currently always gives dense. This only
99 :     makes sense when M is ``really sparse''.
100 : maechler 1855
101 : maechler 2005 - diag(m) <- val currently automatically works via m[cbind(i,i)] <- val
102 : maechler 2115 This (`[<-` method) is now "smart" for diagonalMatrix, but needs also to
103 :     be for triangularMatrix, and probably also "dense*general*Matrix" since the
104 : maechler 2005 above currently goes via "matrix" and back instead of using the 'x' slot
105 : maechler 2115 directly; in particular, the triangular* "class property" is lost!
106 : mmaechler 2688 [current ??]
107 : maechler 2043
108 : mmaechler 2688 - The "[<-" now uses src/t_Csparse_subassign.c and no longer explodes
109 :     memory. *However* it is still too slow when the replacment region is large.
110 : mmaechler 2207
111 : mmaechler 2784 * Cholesky(), chol() etc ------------------
112 : mmaechler 2688
113 : mmaechler 2784 - chol() should ``work'': proper result or "good" error message.
114 :     (mostly done ?)
115 :    
116 :     - example(Cholesky, echo=FALSE) ; cm <- chol(mtm); str(cm); str(mtm)
117 :    
118 :     shows that chol() does not seem to make use of an already
119 :     present factorization and rather uses one with more '0' in x slot.
120 :    
121 : maechler 2043 - examples for solve( Cholesky(.), b, system = c("A", "LDLt"....))
122 :     probably rather in man/CHMfactor-class.Rd than man/Cholesky.Rd
123 : maechler 2072
124 : maechler 2137 - LDL(<CHMsimpl>) looks relatively easy; via "tCsparse_diag()"
125 :     {diagonal entries of *triangular* Csparse}
126 :     --> see comment in determinant(<dsC>) in R/dsCMatrix.R, will give
127 :     faster determinant
128 :    
129 : mmaechler 2784 - Allow Cholesky(A,..) when A is not symmetric *AND*
130 :     we really _mean_ to factorize AA' ( + beta * I)
131 :    
132 :     - update(Cholesky(..), *): make *also* use of the possibility to update
133 :     with non-symmetric A and then AA' + mult * I is really meant.
134 :     .updateCHMfactor() ## allows that already(?)
135 :    
136 :     - add examples (and tests!) for update(<CHMfactor>, ..) and
137 :     Cholesky(......, Imult), also tests for hidden {hence no examples}
138 :     ldetL2up() { R/CHMfactor.R }. MM: See e.g. ~/R/MM/Pkg-ex/Matrix/CholUpdate.R
139 :    
140 :     - implement fast diag(<triangularCsparse>) via calling new
141 :     src/Csparse.c's diag_tC_ptr() .
142 :     -- diag_tC_ptr() functionality now exported via
143 :     R/dsCMatrix.R .diag.dsC() -- the name is silly, but
144 :     functionality nice. See (hidden) example in man/Cholesky.Rd
145 :    
146 :     - chol(<nsCMatrix>) gives "temporarily disabled"
147 :     but should give the *symbolic* factorization;
148 :     similarly Cholesky(.) is not enabled
149 :    
150 :     * "Basic" new functionality -- "nice to have" (non-urgent)
151 :    
152 : maechler 2072 - tr(A %*% B) {and even tr(A %*% B %*% C) ...} are also needed
153 :     frequently in some computations {conditional normal distr. ...}.
154 :     Since this can be done faster than by
155 :     sum(diag(A %*% B)) even for traditional matrices, e.g.
156 :     sum(A * t(B)) or {even faster for "full" mat}
157 :     crossprod(as.vector(A), as.vector(B))
158 :     and even more so for, e.g. <sparse> %*% <dense>
159 :     {used in Soeren's 'gR' computations},
160 :     we should also provide a generic and methods.
161 : mmaechler 2618 - diag(A %*% B) might look like a "generalization" of tr(A %*% B),
162 :     but as the above tricks show, is not really.
163 : mmaechler 2684 Still, it's well worth to provide diag.prod(A, B):
164 : maechler 2103
165 : mmaechler 2684 Well, if A %*% B is square, diag(A %*% B) === colSums(t(A) * B)
166 :     and we should probably teach people about that !
167 :    
168 : maechler 2106 - eigen() should become generic, and get a method at least for diagonal,
169 :     but also for symmetric -> dsyMatrix [LAPACK dsyev() uses UPLO !],
170 :     but also simply for dgeMatrix (without going via tradition matrices).
171 :     What about Sparse? There's fill-in, but it may still be sensible, e.g.
172 :     mlist <- list(1, 2:3, diag(x=5:3), 27, cbind(1,3:6), 100:101)
173 :     ee <- eigen(tcrossprod(bdiag(lapply(mlist, as.matrix))))
174 :     Matrix( signif(ee$vectors, 3) )
175 :    
176 : mmaechler 2784 * Everything else aka "Miscellaneous" ----------------------------------------
177 :    
178 :     - qr.R(qr(x)) may differ for the "same" matrix, depending on it being
179 :     sparse or dense:
180 :     "qr.R(<sparse>) may differ from qr.R(<dense>) because of permutations"
181 :    
182 :     This is not really acceptable and currently influences rcond() as well.
183 :    
184 : maechler 2106 - facmul() has no single method defined; it looks like a good idea though
185 :     (instead of the infamous qr.qy, qr.qty,.... functions)
186 : maechler 2112
187 :     - symmpart() and skewpart() for *sparse* matrices still use (x +/- t(x))/2
188 :     and could be made more efficient.
189 :     Consider going via asTuniq() or something very close to
190 :     .Arith.Csparse() in R/Ops.R
191 : mmaechler 2486 For a traditional "matrix" object, we should speedup, using C code ..
192 : maechler 2112
193 : maechler 2115 - many setAs(*, "[dl]..Matrix") are still needed, as long as e.g.
194 :     replCmat() uses as_CspClass() and drop0(.) which itself call
195 :     as_CspClass() quite a bit. --> try to replace these by
196 :     as(*, "CsparseMatrix"); forceSymmetric, etc.
197 : maechler 2151
198 : mmaechler 2175 - writeMM(obj, file=stdout()) creates file "1" since file is silently
199 :     assumed to be a string, i.e. cannot be a connection.
200 :     An R (instead of C) version should be pretty simple, and would work with
201 :     connections automatically ["lsparse" become either "real" or
202 :     "pattern", "depending if they have NAs or not].
203 : mmaechler 2196
204 : mmaechler 2269 - <diagMatrix> o <ddenseMatrix> still works via sparse in some cases, but
205 :     could return <diagMatrix> in the same cases where <diagMatrix> o <numeric> does.
206 : mmaechler 2207
207 :     - look at solve.QP.compact() in \pkg{quadprog} and how to do that using
208 :     our sparse matrices. Maybe this needs to be re-implemented using CHOLMOD
209 :     routines.
210 :    
211 : mmaechler 2236 - We allow "over-allocated" (i,x)-slots for CsparseMatrix objects,
212 :     as per Csparse_validate() and the tests in tests/validObj.R. This is as
213 :     in CHOLMOD/CSparse, where nzmax (>= .@p[n]) corresponds to length(.@i),
214 :     and makes sense e.g. for M[.,.] <- v assignments which could allocate in
215 :     chunks and would not need to re-allocate anything in many cases.
216 :     HOWEVER, replCmat() in R/Csparse.R is still far from making use of that.
217 :    
218 : mmaechler 2271 - advertize rbind2() / cbind2() and (rather?) rBind() / cBind()
219 : mmaechler 2239 ------ -----
220 :     in all vignettes / talks / ... !!
221 :     People erronously try rbind/cbind see that they don't work and then
222 :     reinvent the wheel!
223 : dmbates 2277
224 : mmaechler 2341 --> Consider using the new 'dotMethods' functionality to define
225 :     cbind() and rbind() versions that work with Matrix.
226 : mmaechler 2500 The "Rmpfr" package does that now.
227 : mmaechler 2341
228 : mmaechler 2345 - In all(M1 == M2) for sparse large matrices M1, M2 (e.g. M2 <- M1 !),
229 :     the intermediate 'M1 == M2' typically is dense, hence potentially using
230 :     humongous amount of memory.
231 :     We should/could devise something like allCompare(M1, M2, `==`)
232 :     which would remain sparse in all its computations.
233 :    
234 : dmbates 2277 --------
235 :    
236 :     - Reconsider the linkages in the include files for the SuiteSparse
237 :     packages. It may be better simply to add all the src/<nm>/Include
238 :     directories to the include path for all compilations. I don't think
239 :     there is a big overhead. Right now we need to modify the include
240 :     file src/SPQR/Include/SuiteSparseQR_C.h so that it does not expect
241 :     to have src/UFsparse and src/CHOLMOD/Include on the include path.
242 :     Maybe just those two should be added to the include path.
243 : mmaechler 2341
244 : mmaechler 2490 - (systematically check that LAPACK-calling functions check for
245 :     0-dimensional input themselves; LAPACK gives an integer error code)
246 : mmaechler 2497
247 : mmaechler 2688 - the f[,5762] <- thisCol now go via Csparse_subassign() call ...
248 :     [ in tests/indexing.R ].
249 :     Still would be nice to be able to use abIndex (see replTmat in R/Tsparse.R)
250 : mmaechler 2538
251 :     - {IS THIS CURRENT?}
252 :     Sept. 2009:
253 :     Subject: chol2inv() |-> solve(<CHMfactor>)
254 :    
255 :     when testing and documenting chol2inv(),
256 :     I found that it's pretty simple to also define a method for
257 :     "CHMfactor" objects, namely simply the solve(*, Diagonal(.) "A")
258 :     method.
259 :     This is not particularly exciting, and also does *not*, I think
260 :     help for defining a chol2inv() method for *sparse* (upper)
261 :     triangular matrices.
262 : mmaechler 2553
263 : mmaechler 2592 - sort(<sparseVector>, partial=..), needed, for mean(*, trim = .) or median().
264 : mmaechler 2553 Note that defining xtfrm() does not "help" (as sort() then goes via dense
265 :     index). See "mean" in R/Matrix.R
266 : mmaechler 2572
267 :     - rcond(<sparseMatrix>) - for square currently goes via *dense* -- BAD --
268 :     can we go via qr() in any case?
269 :     In some cases, e.g. lmer()'s "Lambda" (block triangular, small blocks)
270 :     rcond(L) := 1 / (norm(L) * norm(solve(L)))
271 :     is simple {and remains sparse, as solve(L) is still block triangular}
272 : mmaechler 2661
273 :     - How can we ensure that inst/include/cholmod.h remains
274 : mmaechler 2666 correct and equivalent to src/CHOLMOD/Include/cholmod_core.h and siblings ???
275 :     {currently need to do this manually (Emacs M-x compare-windows) for the typedefs}
276 :    
277 : mmaechler 2688 - finalize and activate the new *unused* code in src/t_sparseVector.c
278 : mmaechler 2773
279 :     - check all uses of alloca()/Alloca() in src/*.[ch]
280 :     ensuring that the *size* allocated cannot grow with the
281 :     vector/matrix/nnzero sizes of the input.
282 :     [see the change needed in svn r2770 in src/dtCMatrix.c !]
283 :    

root@r-forge.r-project.org
ViewVC Help
Powered by ViewVC 1.0.0  
Thanks to:
Vienna University of Economics and Business Powered By FusionForge