SCM

SCM Repository

[matrix] Annotation of /pkg/Matrix/TODO
ViewVC logotype

Annotation of /pkg/Matrix/TODO

Parent Directory Parent Directory | Revision Log Revision Log


Revision 2990 - (view) (download)

1 : mmaechler 2712 ##-*- mode: org -*-
2 : mmaechler 2677
3 : mmaechler 2897 * *Urgent* in some sense ---------------------------------------------------
4 : mmaechler 2946 ** TODO M[<sparse 2-column Matrix>] indexing should work (but with a warning: use *dense*!)
5 : mmaechler 2885 ** TODO src/CHOLMOD/MatrixOps/cholmod_symmetry.c is "cool" and fast;
6 : mmaechler 2889 Definitely should use it for solve(<dgCMatrix>) {it seems MATLAB does};
7 :     alternatively also is_sym() [in src/cs_utils.c], see below.
8 : mmaechler 2897 ** TODO diagonalMatrix inherits from sparseMatrix, *BUT*
9 :     "ddiMatrix" does not inherit from "dsparseMatrix", nor does
10 :     "ldiMatrix" from "lparseMatrix". Seems an undesirable inconsistency.
11 :     Try changing
12 :     setClass("ddiMatrix", contains = c("diagonalMatrix", "dMatrix"))
13 :     to
14 :     setClass("ddiMatrix", contains = c("diagonalMatrix", "dsparseMatrix"))
15 :    
16 :     ** TODO Look at Paul Bailey's problem -- CHOLMOD error (even seg.fault for him)
17 : mmaechler 2712 --> ~/R/MM/Pkg-ex/Matrix/sparseOrderedLogit.R
18 : mmaechler 2897 ** TODO BunchKaufman()'s result is not really useful yet {but it is used on C
19 : mmaechler 2885 level e.g. for solve(<dsyMatrix>). Should define expand() method or
20 :     similar, see man/BunchKaufman-methods.Rd and R/dsyMatrix.R (at end).
21 : mmaechler 2897 ** TODO src/cs_utils.c : I think is_sym() [only used in Matrix_cs_to_SEXP()] can be made sped up:
22 : mmaechler 2889 leave the for loops, as soon as is_lower == is_upper == 0.
23 : mmaechler 2712
24 : mmaechler 2897 * New smallish ideas, relatively urgent for MM -----------------------------
25 : mmaechler 2900 ** DONE generalize new "indMatrix" class, to allow 0 repetitions
26 :     of some samples, i.e., columns of all 0 s. It's mathematically more
27 :     natural --> typically will be useful.
28 : mmaechler 2966 ** DONE R/products.Rout -- [t]crossprod() could/should become more lenient with *vector*s
29 : mmaechler 2990 *** TODO consider analagous changes to base-R [t]crossprod()
30 : mmaechler 2930 ** TODO cor(<sparseMatrix>) and cov(<sparseMatrix>) at least for y=NULL ("no y").
31 :     -> ~/R/MM/Pkg-ex/Matrix/cor_sparse-propos.R <- http://stackoverflow.com/questions/5888287/
32 : mmaechler 2897 ** TODO Investigate the "band changing (and getting) ideas 'band<-' etc,
33 : mmaechler 2739 from Jeremy D Silver, per posts to R-devel on Aug.26,2011
34 :     {MM: ~/R/MM/Pkg-ex/Matrix/bands-Jeremy_Silver-ex.R }
35 : mmaechler 2712
36 : mmaechler 2897 ** TODO FIXME(2) and (3) in R/products.R: t(.Call(Csparse_dense_*))
37 :     ** TODO cbind2() / rbind2() for sparseMatrices: dimnames propagation should
38 : mmaechler 2725 happen in C, see R/bind2.R and src/Csparse.c (Csparse_horzcat etc).
39 :    
40 : mmaechler 2946 ** TODO use getOption("Matrix.quiet") in more places [--> less messages/warnings]
41 : mmaechler 2897 ** TODO Check for DimNames propagation in coercion and other operations.
42 : mmaechler 2990 Done for (%*%, crossprod, tcrossprod) -- but not systematically checked (tests/matprod.R)
43 :     *** TODO For colSums(), rowSums()
44 : mmaechler 2897 ** TODO Report the problem in the Linux ldexp manual page. The second and
45 : bates 344 third calls in the Synopsis should be to ldexpf and ldexpl.
46 : maechler 472
47 : mmaechler 2897 ** TODO provide methods for "dspMatrix" and "dppMatrix"!
48 : mmaechler 2813 2012-07: mostly(?) DONE, with Ops, etc, also pack() / unpack()
49 : maechler 634
50 : mmaechler 2897 ** TODO combine the C functions for multiplication by special forms and
51 : bates 645 solution wrt special forms by using a 'right' argument and a
52 :     'classed' argument.
53 : maechler 675 [done with dgeMatrix_matrix_mm(); not yet for other classes;
54 :     and for _crossprod()]
55 :    
56 : mmaechler 2897 ** DONE Cache '@factors' components also from R, e.g., for "Tsparse.."
57 :     via .set.factors()
58 :     ** TODO chol() and Cholesky() caching unfinished: the *name* [Ss][Pp][Dd]Cholesky
59 :     depends on (perm, LDL, super) arguments:
60 :     *** DONE .chkName.CHM(name, perm, LDL, super) and .CHM.factor.name()
61 :     *** TODO use the above
62 : mmaechler 2924 ** TODO partly DONE; new arg 'cache=FALSE': allow cache=FALSE to disable the caching
63 : mmaechler 2987 ** TODO 0-based vs 1-based indexing: grep -nHE -e '[01]-(orig|ind|base)' *.R
64 :     Can I find a *uniform* language '1-based indexing' or '0-origin indexing' ?
65 : mmaechler 2990 *** More systemtic possible via new argumnet 'orig_1' in m_encodeInd(), m_encodeInd2()
66 :     -> src/Mutils.c
67 : mmaechler 2897 * Generalization of Existing Classes and Methods ---------------------------
68 :     ** TODO "Math2" , "Math", "Arith": keep triangular and symmetric
69 :     Matrices when appropriate:
70 :     particularly desirable for "Math2": round(), signif()
71 :     ** TODO For triangular matrices, ensure the four rules of "triangular matrix algebra"
72 :     (Golub+Van Loan 1996, 3.1.8, p.93)"
73 :     *** DONE since 2008-03-06 for Csparse
74 :     *** DONE since 2010-07-23 for <dtr> %*% <dtr>
75 :     *** TODO e.g. for <ltr> %*% <dtC>
76 :     ** TODO "d" <-> "l" coercion for all "[TCR]" sparse matrices is really trivial:
77 : maechler 956 "d" -> "l" : drops the 'x' slot
78 :     "l" -> "d" : construct an 'x' slot of all '1'
79 :     We currently have many of these conversions explicitly, e.g.
80 :     setAs("dsTMatrix", "lsTMatrix",
81 :     function(from) new("lsTMatrix", i = from@i, j = from@j, uplo = from@uplo,
82 :     Dim = from@Dim, Dimnames = from@Dimnames))
83 :     but I would rather want to automatically construct all these coercion
84 :     methods at once by a ``method constructor'', i.e.,
85 :     for all "dsparse*" -> "lsparse*" and vice versa.
86 :     How can one do this {in a documented way} ?
87 : maechler 1087
88 : mmaechler 2897 ** TODO Think of constructing setAs(...) calls automatically in order to
89 : maechler 2048 basically enable all ``sensible'' as(fromMatrix, toMatrix) calls,
90 :     possibly using canCoerce(.)
91 :    
92 : mmaechler 2897 ** TODO When we have a packed matrix, it's a waste to go through "full" to "sparse":
93 : maechler 2048 ==> implement
94 :     setAs("dspMatrix", "sparseMatrix")
95 :     setAs("dppMatrix", "sparseMatrix")
96 :     setAs("dtpMatrix", "sparseMatrix")
97 :     and the same for "lsp" , "ltp" and "nsp" , "ntp" !
98 :    
99 : mmaechler 2897 ** TODO tcrossprod(x, y) : do provide methods for y != NULL
100 : maechler 1097 calling Lapack's DGEMM for "dense"
101 : maechler 1109 [2005-12-xx: done for dgeMatrix at least]
102 :    
103 : mmaechler 2897 ** TODO Factorizations: LU done; also Schur() for *sparse* Matrices.
104 : maechler 1253
105 : mmaechler 2897 ** TODO use .Call(Csparse_drop, M, tol) in more places,
106 : maechler 1619 both with 'tol = 0.' to drop "values that happen to be 0" and for
107 :     zapsmall() methods for Csparse*
108 : maechler 1654
109 : mmaechler 2897 ** TODO implement .Call(Csparse_scale, ....) interfacing to cholmod_scale()
110 : maechler 1654 in src/CHOLMOD/Include/cholmod_matrixops.h : for another function
111 :     specifically for multiplying a cholmod_sparse object by a diagonal matrix.
112 :     Use it in %*% and [t]crossprod methods.
113 :    
114 : mmaechler 2897 ** TODO make sure *all* group methods have (maybe "bail-out") setMethod for "Matrix".
115 : maechler 1714 e.g. zapsmall(<pMatrix>) fails "badly"
116 : maechler 1725
117 : mmaechler 2897 ** TODO rbind2(<sparse>, <dense>) does not work (e.g. <dgC>, <dge>)
118 : maechler 1799
119 : mmaechler 2897 ** TODO <sparse> %*% <dense> {also in crossprod/tcrossprod} currently always
120 : maechler 1833 returns <dense>, since --> Csparse_dense_prod --> cholmod_sdmult
121 :     and that does only return dense.
122 :     When the sparse matrix is very sparse, i.e. has many rows with only zero
123 :     entries, it would make much sense to return sparse.
124 :    
125 : mmaechler 2897 ** TODO ! <symmetricMatrix> loses symmetry, both for dense and sparse matrices.
126 : maechler 1833 !M where M is "sparseMatrix", currently always gives dense. This only
127 :     makes sense when M is ``really sparse''.
128 : maechler 1855
129 : mmaechler 2897 ** TODO diag(m) <- val currently automatically works via m[cbind(i,i)] <- val
130 : maechler 2115 This (`[<-` method) is now "smart" for diagonalMatrix, but needs also to
131 :     be for triangularMatrix, and probably also "dense*general*Matrix" since the
132 : maechler 2005 above currently goes via "matrix" and back instead of using the 'x' slot
133 : maechler 2115 directly; in particular, the triangular* "class property" is lost!
134 : mmaechler 2688 [current ??]
135 : maechler 2043
136 : mmaechler 2897 ** TODO The "[<-" now uses src/t_Csparse_subassign.c and no longer explodes
137 : mmaechler 2688 memory. *However* it is still too slow when the replacment region is large.
138 : mmaechler 2207
139 : mmaechler 2897 * Cholesky(), chol() etc ---------------------------------------------------
140 :     ** chol() should ``work'': proper result or "good" error message.
141 : mmaechler 2784 (mostly done ?)
142 :    
143 : mmaechler 2897 ** example(Cholesky, echo=FALSE) ; cm <- chol(mtm); str(cm); str(mtm)
144 : mmaechler 2784
145 :     shows that chol() does not seem to make use of an already
146 :     present factorization and rather uses one with more '0' in x slot.
147 :    
148 : mmaechler 2897 ** examples for solve( Cholesky(.), b, system = c("A", "LDLt"....))
149 : maechler 2043 probably rather in man/CHMfactor-class.Rd than man/Cholesky.Rd
150 : maechler 2072
151 : mmaechler 2897 ** LDL(<CHMsimpl>) looks relatively easy; via "tCsparse_diag()"
152 : maechler 2137 {diagonal entries of *triangular* Csparse}
153 :     --> see comment in determinant(<dsC>) in R/dsCMatrix.R, will give
154 :     faster determinant
155 :    
156 : mmaechler 2897 ** Allow Cholesky(A,..) when A is not symmetric *AND*
157 : mmaechler 2784 we really _mean_ to factorize AA' ( + beta * I)
158 :    
159 : mmaechler 2897 ** update(Cholesky(..), *): make *also* use of the possibility to update
160 : mmaechler 2784 with non-symmetric A and then AA' + mult * I is really meant.
161 :     .updateCHMfactor() ## allows that already(?)
162 :    
163 : mmaechler 2897 ** TODO add examples (and tests!) for update(<CHMfactor>, ..) and
164 : mmaechler 2784 Cholesky(......, Imult), also tests for hidden {hence no examples}
165 : mmaechler 2847 ldetL2up() { R/CHMfactor.R }; see ex in man/wrld_1deg.Rd
166 :     MM: See e.g. ~/R/MM/Pkg-ex/Matrix/CholUpdate.R -- for solve(<CHM>, <type>)
167 : mmaechler 2784
168 : mmaechler 2897 ** TODO implement fast diag(<triangularCsparse>) via calling new
169 : mmaechler 2784 src/Csparse.c's diag_tC_ptr() .
170 : mmaechler 2897 - diag_tC_ptr() functionality now exported via
171 :     R/dsCMatrix.R .diag.dsC() : the name is silly, but
172 : mmaechler 2784 functionality nice. See (hidden) example in man/Cholesky.Rd
173 :    
174 : mmaechler 2897 ** TODO chol(<nsCMatrix>) gives "temporarily disabled"
175 : mmaechler 2784 but should give the *symbolic* factorization;
176 :     similarly Cholesky(.) is not enabled
177 :    
178 : mmaechler 2897 * "Basic" new functionality -- "nice to have" (non-urgent) -----------------
179 :     ** TODO tr(A %*% B) {and even tr(A %*% B %*% C) ...} are also needed
180 : maechler 2072 frequently in some computations {conditional normal distr. ...}.
181 :     Since this can be done faster than by
182 :     sum(diag(A %*% B)) even for traditional matrices, e.g.
183 :     sum(A * t(B)) or {even faster for "full" mat}
184 :     crossprod(as.vector(A), as.vector(B))
185 :     and even more so for, e.g. <sparse> %*% <dense>
186 :     {used in Soeren's 'gR' computations},
187 :     we should also provide a generic and methods.
188 : mmaechler 2897 ** TODO diag(A %*% B) might look like a "generalization" of tr(A %*% B),
189 : mmaechler 2618 but as the above tricks show, is not really.
190 : mmaechler 2684 Still, it's well worth to provide diag.prod(A, B):
191 : maechler 2103
192 : mmaechler 2684 Well, if A %*% B is square, diag(A %*% B) === colSums(t(A) * B)
193 :     and we should probably teach people about that !
194 :    
195 : mmaechler 2897 ** TODO eigen() should become generic, and get a method at least for diagonal,
196 : maechler 2106 but also for symmetric -> dsyMatrix [LAPACK dsyev() uses UPLO !],
197 :     but also simply for dgeMatrix (without going via tradition matrices).
198 :     What about Sparse? There's fill-in, but it may still be sensible, e.g.
199 :     mlist <- list(1, 2:3, diag(x=5:3), 27, cbind(1,3:6), 100:101)
200 :     ee <- eigen(tcrossprod(bdiag(lapply(mlist, as.matrix))))
201 :     Matrix( signif(ee$vectors, 3) )
202 :    
203 : mmaechler 2897 * Everything else aka "Miscellaneous" --------------------------------------
204 :     ** qr.R(qr(x)) may differ for the "same" matrix, depending on it being
205 : mmaechler 2784 sparse or dense:
206 :     "qr.R(<sparse>) may differ from qr.R(<dense>) because of permutations"
207 :    
208 :     This is not really acceptable and currently influences rcond() as well.
209 :    
210 : mmaechler 2897 ** facmul() has no single method defined; it looks like a good idea though
211 : maechler 2106 (instead of the infamous qr.qy, qr.qty,.... functions)
212 : maechler 2112
213 : mmaechler 2897 ** TODO symmpart() and skewpart() for *sparse* matrices still use (x +/- t(x))/2
214 : maechler 2112 and could be made more efficient.
215 :     Consider going via asTuniq() or something very close to
216 :     .Arith.Csparse() in R/Ops.R
217 : mmaechler 2486 For a traditional "matrix" object, we should speedup, using C code ..
218 : maechler 2112
219 : mmaechler 2897 ** TODO many setAs(*, "[dl]..Matrix") are still needed, as long as e.g.
220 : maechler 2115 replCmat() uses as_CspClass() and drop0(.) which itself call
221 :     as_CspClass() quite a bit. --> try to replace these by
222 :     as(*, "CsparseMatrix"); forceSymmetric, etc.
223 : mmaechler 2897 ** writeMM(obj, file=stdout()) creates file "1" since file is silently
224 : mmaechler 2175 assumed to be a string, i.e. cannot be a connection.
225 :     An R (instead of C) version should be pretty simple, and would work with
226 :     connections automatically ["lsparse" become either "real" or
227 :     "pattern", "depending if they have NAs or not].
228 : mmaechler 2897 ** <diagMatrix> o <ddenseMatrix> still works via sparse in some cases, but
229 : mmaechler 2269 could return <diagMatrix> in the same cases where <diagMatrix> o <numeric> does.
230 : mmaechler 2207
231 : mmaechler 2897 ** look at solve.QP.compact() in \pkg{quadprog} and how to do that using
232 : mmaechler 2207 our sparse matrices. Maybe this needs to be re-implemented using CHOLMOD
233 :     routines.
234 :    
235 : mmaechler 2897 ** We allow "over-allocated" (i,x)-slots for CsparseMatrix objects,
236 : mmaechler 2236 as per Csparse_validate() and the tests in tests/validObj.R. This is as
237 :     in CHOLMOD/CSparse, where nzmax (>= .@p[n]) corresponds to length(.@i),
238 :     and makes sense e.g. for M[.,.] <- v assignments which could allocate in
239 :     chunks and would not need to re-allocate anything in many cases.
240 :     HOWEVER, replCmat() in R/Csparse.R is still far from making use of that.
241 :    
242 : mmaechler 2897 ** TODO advertize rbind2() / cbind2() and (rather?) rBind() / cBind()
243 :     ------ -----
244 : mmaechler 2239 in all vignettes / talks / ... !!
245 :     People erronously try rbind/cbind see that they don't work and then
246 :     reinvent the wheel!
247 : dmbates 2277
248 : mmaechler 2341 --> Consider using the new 'dotMethods' functionality to define
249 :     cbind() and rbind() versions that work with Matrix.
250 : mmaechler 2500 The "Rmpfr" package does that now.
251 : mmaechler 2341
252 : mmaechler 2897 ** TODO In all(M1 == M2) for sparse large matrices M1, M2 (e.g. M2 <- M1 !),
253 : mmaechler 2345 the intermediate 'M1 == M2' typically is dense, hence potentially using
254 :     humongous amount of memory.
255 :     We should/could devise something like allCompare(M1, M2, `==`)
256 :     which would remain sparse in all its computations.
257 :    
258 : dmbates 2277 --------
259 :    
260 : mmaechler 2897 ** Reconsider the linkages in the include files for the SuiteSparse
261 : dmbates 2277 packages. It may be better simply to add all the src/<nm>/Include
262 :     directories to the include path for all compilations. I don't think
263 :     there is a big overhead. Right now we need to modify the include
264 :     file src/SPQR/Include/SuiteSparseQR_C.h so that it does not expect
265 :     to have src/UFsparse and src/CHOLMOD/Include on the include path.
266 :     Maybe just those two should be added to the include path.
267 : mmaechler 2341
268 : mmaechler 2897 ** (systematically check that LAPACK-calling functions check for
269 : mmaechler 2490 0-dimensional input themselves; LAPACK gives an integer error code)
270 : mmaechler 2497
271 : mmaechler 2897 ** the f[,5762] <- thisCol now go via Csparse_subassign() call ...
272 : mmaechler 2688 [ in tests/indexing.R ].
273 :     Still would be nice to be able to use abIndex (see replTmat in R/Tsparse.R)
274 : mmaechler 2538
275 : mmaechler 2897 ** {IS THIS CURRENT?}
276 : mmaechler 2538 Sept. 2009:
277 :     Subject: chol2inv() |-> solve(<CHMfactor>)
278 :    
279 :     when testing and documenting chol2inv(),
280 :     I found that it's pretty simple to also define a method for
281 :     "CHMfactor" objects, namely simply the solve(*, Diagonal(.) "A")
282 :     method.
283 :     This is not particularly exciting, and also does *not*, I think
284 :     help for defining a chol2inv() method for *sparse* (upper)
285 :     triangular matrices.
286 : mmaechler 2553
287 : mmaechler 2897 ** sort(<sparseVector>, partial=..), needed, for mean(*, trim = .) or median().
288 : mmaechler 2553 Note that defining xtfrm() does not "help" (as sort() then goes via dense
289 :     index). See "mean" in R/Matrix.R
290 : mmaechler 2572
291 : mmaechler 2897 ** TODO rcond(<sparseMatrix>) for square currently goes via *dense* -- BAD --
292 : mmaechler 2572 can we go via qr() in any case?
293 :     In some cases, e.g. lmer()'s "Lambda" (block triangular, small blocks)
294 :     rcond(L) := 1 / (norm(L) * norm(solve(L)))
295 :     is simple {and remains sparse, as solve(L) is still block triangular}
296 : mmaechler 2661
297 : mmaechler 2897 ** TODO How can we ensure that inst/include/cholmod.h remains
298 : mmaechler 2666 correct and equivalent to src/CHOLMOD/Include/cholmod_core.h and siblings ???
299 :     {currently need to do this manually (Emacs M-x compare-windows) for the typedefs}
300 :    
301 : mmaechler 2897 ** finalize and activate the new *unused* code in src/t_sparseVector.c
302 : mmaechler 2773
303 : mmaechler 2897 ** check all uses of alloca()/Alloca() in src/*.[ch]
304 : mmaechler 2773 ensuring that the *size* allocated cannot grow with the
305 :     vector/matrix/nnzero sizes of the input.
306 :     [see the change needed in svn r2770 in src/dtCMatrix.c !]
307 :    

root@r-forge.r-project.org
ViewVC Help
Powered by ViewVC 1.0.0  
Thanks to:
Vienna University of Economics and Business Powered By FusionForge