# SCM Repository

# View of /pkg/Matrix/TODO

Parent Directory | Revision Log

Revision

File size: 11268 byte(s)

**2677**- (**download**) (**annotate**)*Sat Jun 25 19:18:12 2011 UTC*(7 years, 8 months ago) by*mmaechler*File size: 11268 byte(s)

finally a working C level Csparse_subassign() - to solve the memory-explode issue .. *BUT* it`s still too slow, even if I never remove but use drop0(.) in the end ... commit now before South America ... in case I get hit by a truck :-)

- Arm twisting ... and other convincing arguments by R core fellows: ==> R/zzz.R --- do NOT redefine base, but define an S3 method ------- for as.array() etc - Check for DimNames propagation in coercion and other operations. - Report the problem in the Linux ldexp manual page. The second and third calls in the Synopsis should be to ldexpf and ldexpl. - provide methods for "dspMatrix" and "dppMatrix"! - implement (more) methods for supporting "packed" (symmetric / triangular) matrices; particularly something like pack() and unpack() [to/from our classes from/to "numeric"] --- have already man/unpack.Rd but no method yet! (have some dtr* <-> dtp*) - combine the C functions for multiplication by special forms and solution wrt special forms by using a 'right' argument and a 'classed' argument. [done with dgeMatrix_matrix_mm(); not yet for other classes; and for _crossprod()] ----- - "Math2" , "Math", "Arith": keep triangular and symmetric Matrices when appropriate: particularly desirable for "Math2": round(), signif() - For triangular matrices, make sure the four rules of "triangular matrix algebra" (Golub+Van Loan 1996, 3.1.8, p.93) are fulfilled. - since 2008-03-06 ok for Csparse - since 2010-07-23 ok for <dtr> %*% <dtr> TODO: e.g. for <ltr> %*% <dtC> - "d" <-> "l" coercion for all "[TCR]" sparse matrices is really trivial: "d" -> "l" : drops the 'x' slot "l" -> "d" : construct an 'x' slot of all '1' We currently have many of these conversions explicitly, e.g. setAs("dsTMatrix", "lsTMatrix", function(from) new("lsTMatrix", i = from@i, j = from@j, uplo = from@uplo, Dim = from@Dim, Dimnames = from@Dimnames)) but I would rather want to automatically construct all these coercion methods at once by a ``method constructor'', i.e., for all "dsparse*" -> "lsparse*" and vice versa. How can one do this {in a documented way} ? - Think of constructing setAs(...) calls automatically in order to basically enable all ``sensible'' as(fromMatrix, toMatrix) calls, possibly using canCoerce(.) - When we have a packed matrix, it's a waste to go through "full" to "sparse": ==> implement setAs("dspMatrix", "sparseMatrix") setAs("dppMatrix", "sparseMatrix") setAs("dtpMatrix", "sparseMatrix") and the same for "lsp" , "ltp" and "nsp" , "ntp" ! - tcrossprod(x, y) : do provide methods for y != NULL calling Lapack's DGEMM for "dense" [2005-12-xx: done for dgeMatrix at least] - Factorizations: LU done; also Schur() for *sparse* Matrices. - use .Call(Csparse_drop, M, tol) in more places, both with 'tol = 0.' to drop "values that happen to be 0" and for zapsmall() methods for Csparse* - implement .Call(Csparse_scale, ....) interfacing to cholmod_scale() in src/CHOLMOD/Include/cholmod_matrixops.h : for another function specifically for multiplying a cholmod_sparse object by a diagonal matrix. Use it in %*% and [t]crossprod methods. - chol() should ``work'': proper result or "good" error message. - make sure *all* group methods have (maybe "bail-out") setMethod for "Matrix". e.g. zapsmall(<pMatrix>) fails "badly" - rbind2(<sparse>, <dense>) does not work (e.g. <dgC>, <dge>) - <sparse> %*% <dense> {also in crossprod/tcrossprod} currently always returns <dense>, since --> Csparse_dense_prod --> cholmod_sdmult and that does only return dense. When the sparse matrix is very sparse, i.e. has many rows with only zero entries, it would make much sense to return sparse. - ! <symmetricMatrix> loses symmetry, both for dense and sparse matrices. !M where M is "sparseMatrix", currently always gives dense. This only makes sense when M is ``really sparse''. - example(Cholesky, echo=FALSE) ; cm <- chol(mtm); str(cm); str(mtm) shows that chol() does not seem to make use of an already present factorization and rather uses one with more '0' in x slot. - diag(m) <- val currently automatically works via m[cbind(i,i)] <- val This (`[<-` method) is now "smart" for diagonalMatrix, but needs also to be for triangularMatrix, and probably also "dense*general*Matrix" since the above currently goes via "matrix" and back instead of using the 'x' slot directly; in particular, the triangular* "class property" is lost! Note that 'diag(M[,-1]) <- val' is deadly slow (*) for large sparse M, but that's because of the 2nd line assignment in the equivalent tmpM <- `diag<-`(M[,-1], val) M[,-1] <- tmpM (*): gives *error* {about negative integer} when prod(dim(M)) > .Machine$integer.max, e.g. for square (n x n) M when n >= 46341 == ceiling(2^15.5) This is "the same" as Ashley Ford's report (25 Feb 2010), MM @ ~/R/MM/Pkg-ex/Matrix/nsp-2col-index-bug.R - examples for solve( Cholesky(.), b, system = c("A", "LDLt"....)) probably rather in man/CHMfactor-class.Rd than man/Cholesky.Rd - LDL(<CHMsimpl>) looks relatively easy; via "tCsparse_diag()" {diagonal entries of *triangular* Csparse} --> see comment in determinant(<dsC>) in R/dsCMatrix.R, will give faster determinant - tr(A %*% B) {and even tr(A %*% B %*% C) ...} are also needed frequently in some computations {conditional normal distr. ...}. Since this can be done faster than by sum(diag(A %*% B)) even for traditional matrices, e.g. sum(A * t(B)) or {even faster for "full" mat} crossprod(as.vector(A), as.vector(B)) and even more so for, e.g. <sparse> %*% <dense> {used in Soeren's 'gR' computations}, we should also provide a generic and methods. - diag(A %*% B) might look like a "generalization" of tr(A %*% B), but as the above tricks show, is not really. Still, it's well worth to provide diag.prod(A, B) - qr.R(qr(x)) may differ for the "same" matrix, depending on it being sparse or dense: "qr.R(<sparse>) may differ from qr.R(<dense>) because of permutations" This is not really acceptable and currently influences rcond() as well. - eigen() should become generic, and get a method at least for diagonal, but also for symmetric -> dsyMatrix [LAPACK dsyev() uses UPLO !], but also simply for dgeMatrix (without going via tradition matrices). What about Sparse? There's fill-in, but it may still be sensible, e.g. mlist <- list(1, 2:3, diag(x=5:3), 27, cbind(1,3:6), 100:101) ee <- eigen(tcrossprod(bdiag(lapply(mlist, as.matrix)))) Matrix( signif(ee$vectors, 3) ) - facmul() has no single method defined; it looks like a good idea though (instead of the infamous qr.qy, qr.qty,.... functions) - symmpart() and skewpart() for *sparse* matrices still use (x +/- t(x))/2 and could be made more efficient. Consider going via asTuniq() or something very close to .Arith.Csparse() in R/Ops.R For a traditional "matrix" object, we should speedup, using C code .. - many setAs(*, "[dl]..Matrix") are still needed, as long as e.g. replCmat() uses as_CspClass() and drop0(.) which itself call as_CspClass() quite a bit. --> try to replace these by as(*, "CsparseMatrix"); forceSymmetric, etc. - implement fast diag(<triangularCsparse>) via calling new src/Csparse.c's diag_tC_ptr() - add examples (and tests!) for update(<CHMfactor>, ..) and Cholesky(......, Imult), also tests for hidden {hence no examples} ldetL2up() { R/CHMfactor.R } - chol(<nsCMatrix>) gives "temporarily disabled" but should give the *symbolic* factorization; similarly Cholesky(.) is not enabled - writeMM(obj, file=stdout()) creates file "1" since file is silently assumed to be a string, i.e. cannot be a connection. An R (instead of C) version should be pretty simple, and would work with connections automatically ["lsparse" become either "real" or "pattern", "depending if they have NAs or not]. - <diagMatrix> o <ddenseMatrix> still works via sparse in some cases, but could return <diagMatrix> in the same cases where <diagMatrix> o <numeric> does. - look at solve.QP.compact() in \pkg{quadprog} and how to do that using our sparse matrices. Maybe this needs to be re-implemented using CHOLMOD routines. - We allow "over-allocated" (i,x)-slots for CsparseMatrix objects, as per Csparse_validate() and the tests in tests/validObj.R. This is as in CHOLMOD/CSparse, where nzmax (>= .@p[n]) corresponds to length(.@i), and makes sense e.g. for M[.,.] <- v assignments which could allocate in chunks and would not need to re-allocate anything in many cases. HOWEVER, replCmat() in R/Csparse.R is still far from making use of that. - advertize rbind2() / cbind2() and (rather?) rBind() / cBind() ------ ----- in all vignettes / talks / ... !! People erronously try rbind/cbind see that they don't work and then reinvent the wheel! --> Consider using the new 'dotMethods' functionality to define cbind() and rbind() versions that work with Matrix. The "Rmpfr" package does that now. - In all(M1 == M2) for sparse large matrices M1, M2 (e.g. M2 <- M1 !), the intermediate 'M1 == M2' typically is dense, hence potentially using humongous amount of memory. We should/could devise something like allCompare(M1, M2, `==`) which would remain sparse in all its computations. -------- - Reconsider the linkages in the include files for the SuiteSparse packages. It may be better simply to add all the src/<nm>/Include directories to the include path for all compilations. I don't think there is a big overhead. Right now we need to modify the include file src/SPQR/Include/SuiteSparseQR_C.h so that it does not expect to have src/UFsparse and src/CHOLMOD/Include on the include path. Maybe just those two should be added to the include path. - (systematically check that LAPACK-calling functions check for 0-dimensional input themselves; LAPACK gives an integer error code) - the f[,5762] <- thisCol # now ... line in tests/indexing.R uses very large objects unnecessarily; Improve replTmat() in R/Tsparse.R, making use of new "abIndex" vectors. - {IS THIS CURRENT?} Sept. 2009: Subject: chol2inv() |-> solve(<CHMfactor>) when testing and documenting chol2inv(), I found that it's pretty simple to also define a method for "CHMfactor" objects, namely simply the solve(*, Diagonal(.) "A") method. This is not particularly exciting, and also does *not*, I think help for defining a chol2inv() method for *sparse* (upper) triangular matrices. - sort(<sparseVector>, partial=..), needed, for mean(*, trim = .) or median(). Note that defining xtfrm() does not "help" (as sort() then goes via dense index). See "mean" in R/Matrix.R - rcond(<sparseMatrix>) - for square currently goes via *dense* -- BAD -- can we go via qr() in any case? In some cases, e.g. lmer()'s "Lambda" (block triangular, small blocks) rcond(L) := 1 / (norm(L) * norm(solve(L))) is simple {and remains sparse, as solve(L) is still block triangular} - How can we ensure that inst/include/cholmod.h remains correct and equivalent to src/CHOLMOD/Include/cholmod_core.h and siblings ??? {currently need to do this manually (Emacs M-x compare-windows) for the typedefs} - finalize and activate the new *unused* code in src/sparseVector.c

R-Forge@R-project.org | ViewVC Help |

Powered by ViewVC 1.0.0 |