[#6657] sparse.model.matrix fails to convert to dgCMatrix

Date:
2020-02-13 14:44
Priority:
3
State:
Open
Submitted by:
Nick Hanewinckel (hanewinckel)
Assigned to:
Nobody (None)
Hardware:
PC
Product:
Software A
Operating System:
other
Component:
None
Version:
None
Severity:
major
Resolution:
Awaiting Response
URL:
Summary:
sparse.model.matrix fails to convert to dgCMatrix

Detailed description
This is a deep potential bug that is hard to reproduce, because it seems to happen only with some custom spline functions we have written. Luckily, I think I see the solution.

Problem:

Occasionally sparse.model.matrix will fail to create the matrix due to this error:
Error in model.spmatrix(t, data, transpose = transpose, drop.unused.levels = drop.unused.levels, :
no slot of name "i" for this object of class "dgeMatrix"

This seems to happen because model.spmatrix expects a sparse dgCMatrix but gets a dense dgeMatrix.

By debugging, I see that:
model.spmatrix:
- model.spmatrix executes a for loop over the interaction terms (line 100, "for (j in iTrm)")
- matrix rj is created by calling sparseInt.r with the argument forceSparse=TRUE

sparseInt.r:
- the problem happens if what is passed to this function is dense
- a dgeMatrix returns FALSE for is.matrix(m), hence the needed dense_to_Csparse call on line 6 is never made
- That would be okay, as long as sparse2int on line 19 did this conversion
- Unfortunately, forceSparse = TRUE is NOT passed to sparse2int

sparse2int:
- As mentioned above, the forceSparse parameter is not passed here. So regardless of the forceSparse argument given to sparseInt.r, this function always uses its default forceSparse=FALSE in this context.
- Since dense_to_Csparse never gets called, a dgeMatrix is returned where model.spmatrix expects a dgCMatrix
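The class behaviour behind this diagnosis can be checked in isolation. The following is a minimal sketch, assuming only that the Matrix package is installed; the small example matrix is made up for illustration and does not reproduce the spline case from this report.

```r
library(Matrix)

# A small dense Matrix-package object (illustrative data, not from the report).
m <- Matrix(c(1, 0, 0, 2, 0, 3), nrow = 2, sparse = FALSE)
class(m)

# S4 Matrix objects are not base R matrices, so this check is FALSE
# and any branch guarded by is.matrix() is skipped.
is.matrix(m)

# The object is still dense, which is(m, "denseMatrix") detects.
is(m, "denseMatrix")

# A dense Matrix object has no "i" (row index) slot; a column-compressed
# sparse matrix does, which is what the error message complains about.
.hasSlot(m, "i")
s <- as(m, "CsparseMatrix")
class(s)
.hasSlot(s, "i")
```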

Proposed Solution:
I believe the easiest way to fix it is simply to amend line 19 of sparseInt.r to pass the forceSparse=TRUE argument to sparse2int. As a non-developer of this package, I have no idea of the other implications of doing this. For what it's worth, in "debug" mode I simply ran dense_to_Csparse directly on the dgeMatrix and it successfully returned the dgCMatrix.

Another possibility is to amend sparseInt.r so that line 5 triggers .Call(dense_to_Csparse, m) whenever m is a dense matrix (not just when is.matrix(m) is TRUE, which it is not for a dgeMatrix).
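The second option amounts to widening the density check. Below is a rough sketch of such a guard written against the public API only; `ensure_Csparse` is a hypothetical helper name, not a function in the Matrix package, and this is not the package's internal code.

```r
library(Matrix)

# Hypothetical helper (not part of Matrix): coerce anything dense,
# whether a base R matrix or a Matrix-package denseMatrix, to
# column-compressed sparse form; leave other inputs untouched.
ensure_Csparse <- function(m) {
  if (is.matrix(m) || is(m, "denseMatrix")) {
    as(m, "CsparseMatrix")
  } else {
    m
  }
}

# Illustrative inputs: a dense Matrix object, a base matrix, a sparse matrix.
dense_S4   <- Matrix(c(1, 0, 0, 2, 0, 3), nrow = 2, sparse = FALSE)
dense_base <- matrix(c(1, 0, 0, 2, 0, 3), nrow = 2)
sparse_in  <- Matrix(c(1, 0, 0, 2, 0, 3), nrow = 2, sparse = TRUE)

out_S4     <- ensure_Csparse(dense_S4)
out_base   <- ensure_Csparse(dense_base)
out_sparse <- ensure_Csparse(sparse_in)
```

Whether a check like this belongs in sparseInt.r or in sparse2int is a question for the package maintainers.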

I hope this explanation was helpful, and it seems like a real bug, even if it's an extreme edge case.

My temporary workaround is to replace this line in my error-prone code:
X <- sparse.model.matrix(formula, data = getMountedData())
with this (inelegant) one:
X <- .Call(Matrix:::matrix_to_Csparse, model.matrix(formula, data = getMountedData()), 'dgCMatrix')
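The workaround above reaches an unexported C routine through `:::`, which may not be stable across Matrix releases. A sketch of the same idea using only exported functions follows; the data frame and formula are invented for illustration. Like the workaround, it builds the dense model matrix first, so it gives up the memory savings of sparse.model.matrix.

```r
library(Matrix)

# Illustrative data and formula (assumptions, not from the report).
dat <- data.frame(
  y = c(1, 2, 3, 4),
  f = factor(c("a", "b", "a", "b")),
  x = c(0, 1, 0, 2)
)
frm <- y ~ f * x

# Build the dense model matrix with base R, then coerce it to
# column-compressed sparse form through the exported as() method.
X_dense <- model.matrix(frm, data = dat)
X <- as(X_dense, "CsparseMatrix")
class(X)
```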

Comments:

Date: 2020-02-13 22:12
Sender: Nick Hanewinckel

I tried uploading the files, but I can't tell whether it succeeded. Please advise if you need me to resubmit.

I have also confirmed that the code below (referenced in the bug report) does work correctly to bypass the error:
X <- .Call(Matrix:::matrix_to_Csparse, model.matrix(formula, data = dat), 'dgCMatrix')

Attached Files:

Changes

No Changes Have Been Made to This Item
