[#6611] Long vector support for sparse matrix

Date: 2019-01-30 02:05
Priority: 3
State: Open
Submitted by: Josh Panfil (jmpanfil)
Assigned to: Nobody (None)
Product: None
Operating System: None
Component: None
Summary: Long vector support for sparse matrix

Detailed description
The Matrix package cannot handle long vectors (length greater than 2^31 - 1) as inputs for sparseMatrix(). In my case, a sparse matrix is necessary for running an XGBoost model because of memory and time constraints. The XGBoost xgb.DMatrix supports using a dgCMatrix object. However, due to the size of my data, trying to create a sparse matrix results in an error. Here's an example of the issue. (Warning: this uses 50-60 GB RAM.)

library(Matrix)

i <- rep(1, 2^31)
j <- i
j[(2^30):length(j)] <- 2
x <- i
s <- sparseMatrix(i = i, j = j, x = x)

Error in validityMethod(as(object, superClass)) : long vectors not supported yet: ../../src/include/Rinlinedfuns.h:137
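For readers without 50-60 GB of RAM, the same construction can be sketched at a much smaller scale. This is only an illustration: the size `n <- 8` is an assumed stand-in for 2^31, so the call succeeds here instead of raising the error above. It shows that the inputs collapse to a 1 x 2 matrix (repeated (i, j) pairs are summed by sparseMatrix()), and that the last line is the condition the real example trips.

```r
library(Matrix)

n <- 8                      # assumed stand-in for 2^31; small enough to run anywhere
i <- rep(1, n)
j <- i
j[(n / 2):length(j)] <- 2   # mirrors j[(2^30):length(j)] <- 2 in the report
x <- i
s <- sparseMatrix(i = i, j = j, x = x)

dim(s)                      # a single row and two columns
as.matrix(s)                # duplicated (i, j) pairs have been summed

# TRUE for the original example (length 2^31), FALSE at this scale.
is_long <- length(i) > .Machine$integer.max
```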

Comments:

Date: 2022-08-21 19:16
Sender: Mikael Jagan

The 'p' slot of every CsparseMatrix object, including dgCMatrix, is such that p[length(p)] is the number of nonzero entries. The slot is constrained to be of type "integer":

> getSlots("CsparseMatrix")
        i         p       Dim  Dimnames
"integer" "integer" "integer"    "list"

Since elements of integer vectors cannot exceed INT_MAX == 2^31-1, a dgCMatrix with more than 2^31-1 nonzero entries would be formally _invalid_. So it is partly a problem of formal _definition_, and not at all a trivial bug in the sparseMatrix() function.
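As a minimal sketch of the point above, the following builds a tiny dgCMatrix (the values are made up purely for illustration), inspects its 'p' slot, and shows that 2^31 cannot be stored in an R integer. It is not a statement about how the Matrix internals are implemented, only a view of the slot from the R side.

```r
library(Matrix)

# Illustrative 3 x 2 matrix with three stored entries (values are arbitrary).
m <- sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 2), x = c(10, 20, 30))

# 'p' holds cumulative per-column entry counts, starting at 0.
p <- m@p
p_type <- typeof(p)      # "integer"
nnz <- p[length(p)]      # last element: total number of stored entries

# Largest value an R integer can hold, i.e. 2^31 - 1.
int_max <- .Machine$integer.max

# 2^31 is not representable as an integer: coercion gives NA (with a warning).
overflow <- suppressWarnings(as.integer(2^31))
```

Because the last element of 'p' must itself be an integer, a count of 2^31 or more entries has no valid representation in the current class definition.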

That said, we _do_ want to eventually support "long" CsparseMatrix. As Martin has said elsewhere, collaborators are welcome, but it will require some proficiency with C programming and some familiarity (or ability to become familiar) with the SuiteSparse library.

Date: 2019-05-28 22:39
Sender: Laurens Geffert

Is anybody working on this?
If not, I am interested in picking this up, but I will need some pointers on how to proceed...

Date: 2019-04-11 16:32
Sender: Josh Panfil

Just reporting. Not qualified to work on this myself at the moment. Sorry if I posted this in the wrong area.

Date: 2019-04-11 16:29
Sender: Mike Richards

Josh, are you planning on working on this, or just reporting the issue?

