
[#6643] Certain subsetting methods don't appear to be implemented for dgCMatrix

Date: 2019-11-06 23:26
Priority: 3
State: Closed
Submitted by: Ezra Tucker (czarlogic)
Assigned to: Martin Maechler (mmaechler)
Hardware: PC
Product: None
Operating System: Linux
Component: None
Version: None
Severity: normal
Resolution: Invalid
URL:
Summary: Certain subsetting methods don't appear to be implemented for dgCMatrix

Detailed description
I have an application that requires some creative subsetting of sparse matrices to return other sparse matrices.
Example:

```
> m <- new("dgCMatrix", i = c(2L, 0L, 1L, 2L, 0L, 1L), p = c(0L, 1L, 2L, 4L, 4L, 6L), Dim = c(3L, 5L), Dimnames = list(NULL, NULL), x = c(2, 1, 2, 1, 2, 1), factors = list())
> m

3 x 5 sparse Matrix of class "dgCMatrix"

[1,] . 1 . . 2
[2,] . . 2 . 1
[3,] 2 . 1 . .

> m[1, 1]
[1] 0
> m[1:2, 1:2]
Error in m[1:2, 1:2] : invalid or not-yet-implemented 'Matrix' subsetting
```

I have a few workarounds in place, one of which is to go through a dense base R matrix:

```
> Matrix(as.matrix(m)[1:2, 1:2])
2 x 2 sparse Matrix of class "dtCMatrix"

[1,] . 1
[2,] . .
```
However, some of my matrices are too big and have too many 0s to do this. My other workaround is to just create a new dgCMatrix with the proper dimensions, selecting the right elements from the i, p, and x slots.
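
A minimal sketch of the second workaround, assuming the requested rows and columns are strictly increasing 1-based positions without duplicates. The helper name `subset_dgC` is hypothetical and not part of Matrix; it only reads the `i`, `p`, and `x` slots and never calls `[` on the sparse matrix itself.

```r
library(Matrix)

# Hypothetical helper (not part of Matrix): build a new dgCMatrix holding
# rows `rows` and columns `cols` of `m` directly from the i, p, and x slots.
# Assumption: `rows` and `cols` are strictly increasing positive positions.
subset_dgC <- function(m, rows, cols) {
  stopifnot(is(m, "dgCMatrix"))
  row_map <- match(seq_len(nrow(m)), rows)  # old row -> new row, NA if dropped
  new_i <- integer(0)
  new_x <- numeric(0)
  new_p <- integer(length(cols) + 1L)       # column pointers, first entry is 0
  for (k in seq_along(cols)) {
    j <- cols[k]
    # Entries of column j sit at 1-based positions p[j] + 1 .. p[j + 1] of i and x.
    idx <- seq.int(m@p[j] + 1L, length.out = m@p[j + 1L] - m@p[j])
    new_row <- row_map[m@i[idx] + 1L]       # i is 0-based, row_map is 1-based
    keep <- !is.na(new_row)
    new_i <- c(new_i, new_row[keep] - 1L)
    new_x <- c(new_x, m@x[idx][keep])
    new_p[k + 1L] <- length(new_i)
  }
  new("dgCMatrix", i = new_i, p = new_p, x = new_x,
      Dim = c(length(rows), length(cols)))
}

m <- new("dgCMatrix", i = c(2L, 0L, 1L, 2L, 0L, 1L),
         p = c(0L, 1L, 2L, 4L, 4L, 6L), Dim = c(3L, 5L),
         x = c(2, 1, 2, 1, 2, 1))
s <- subset_dgC(m, 1:2, 1:2)
```

Growing `new_i` and `new_x` inside the loop is fine for a sketch but quadratic in the worst case; preallocating from the column counts would be the obvious refinement for large matrices.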

Furthermore, and I believe this to be the same bug, very large sparse matrices don't even attempt to print to the screen; I think this is because R is trying to truncate the object to the first options("max.print") elements.

System Info:
R 3.6.1
Matrix 1.2-17
Running on Ubuntu 18.04.03, Linux 5.0.0-32-generic, on Intel Core i5-2520M

Comments:

Date: 2022-12-02 22:29
Sender: Mikael Jagan

OK, I just added those 3 methods to my not-yet-activated subscript code, with r3802. I will make a note to activate at least those methods before the next release (and then eliminate the existing ones for x=packedMatrix, which would become redundant).

Date: 2022-12-02 22:11
Sender: Mikael Jagan

The trouble with the `[` operator is defining S4 methods that work for all possible signatures (x,i,j,drop) without exploding the method table. Supporting i=NULL and j=NULL likely was never a priority, because i=integer(0) and j=integer(0) are equivalent and seen much more in practice.

Unfortunately, with Matrix 1.4-1, I violated this principle by implementing 14 (!) such methods for x=packedMatrix; see the output of showMethods("[", classes = "NULL").

That is really too much, in retrospect ... I wonder if we can delete those and instead have just 3 methods:

signature(x="Matrix", i="NULL", j="NULL", drop="ANY")
signature(x="Matrix", i="NULL", j= "ANY", drop="ANY")
signature(x="Matrix", i= "ANY", j="NULL", drop="ANY")

reassigning NULL arguments to integer(0) and then dispatching. I will look at this ahead of the next release (probably in spring), notably as I am already working on simplifying our existing constellation of methods for `[`.
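
The idea of reassigning NULL subscripts to integer(0) before dispatching can be sketched at user level as follows. This is only an illustration under stated assumptions: `sub_null_ok` is a hypothetical wrapper function, not one of the S4 methods proposed above and not Matrix's actual implementation.

```r
library(Matrix)

# Hypothetical user-level wrapper (not Matrix code): replace NULL subscripts
# by integer(0), then hand over to the ordinary `[` methods.
sub_null_ok <- function(x, i, j, drop = TRUE) {
  if (!missing(i) && is.null(i)) i <- integer(0)
  if (!missing(j) && is.null(j)) j <- integer(0)
  if (missing(i) && missing(j)) return(x)   # nothing to subset
  if (missing(i)) return(x[, j, drop = drop])
  if (missing(j)) return(x[i, , drop = drop])
  x[i, j, drop = drop]
}

M <- Matrix(c(0, 0:2, 0), 3, 5)
r <- sub_null_ok(M, 1:2, NULL, drop = FALSE)
dim(r)
```

A wrapper like this avoids touching the method table at all, at the cost of not being reachable through the `[` operator itself.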

Date: 2022-12-02 20:37
Sender: Daniel Sabanés Bové

Hi Martin,

I just came across this and see the following (error message translated from German):

> Matrix(c(0,0:2,0), 3,5)[1:2, NULL]
Error in Matrix(c(0, 0:2, 0), 3, 5)[1:2, NULL] :
invalid or not-yet-implemented 'Matrix' subsetting

with vanilla R 4.2.1.

Interestingly, it works when negating all column indices:

> Matrix(c(0,0:2,0), 3,5)[1:2, -(1:5)]
2 x 0 sparse Matrix of class "dgCMatrix"

[1,]
[2,]

Is this something known?

Thanks,
cheers
Daniel

Date: 2019-11-09 07:25
Sender: Martin Maechler

Thank you, Ezra.
So it is almost surely the R code in your package that redefines some methods and/or classes which 'Matrix' already defines. That is not good, and it is really not an issue with 'Matrix', so I am closing this issue here.

However I strongly recommend asking about this on the 'R-package-devel' mailing list ... but only if you can provide a relatively simple reproducible way to get to the problem.
Ideally a way that only source()'s some of the R source files of your package, even without using devtools.

Alternatively, you could continue via e-mail (to maechler@r-project.org), but the following applies in any case :

If you don't give us a way to reproduce your problem, we can't help you.

Best,
Martin

Date: 2019-11-08 22:10
Sender: Ezra Tucker

You're of course right, this *is* a very basic subsetting operation, which is why I found this to be so surprising. I'll be honest, I'm having a great deal of difficulty in reproducing this exact issue in a clean environment so I can isolate the problem further; however, I'll continue attempting to do so.

I can offer some clues though.

(1) It's possible, yes, that it is some tidyverse related conflict. The package I'm developing has some tidyverse dependencies. I feel like it's more likely that it's my own package and not something from tidyverse that's causing the problem.

(2) When I start up R, it works perfectly: I can create m and subset it without problems. After I run part (or all) of my test suite (testthat), I can no longer subset the matrix with m[1:2, 1:2]. I'm pretty sure the reason is that running devtools::test() adds "package:mypackage" to search() (including functions that aren't exported).

(3) I traced some of the behavior to the "index" type in the Matrix package, noticing that a 1:2 sequence yields integers whereas "index" is a class union of numeric, logical, and character. I tried m[as.numeric(1:2), as.numeric(1:2)], and that seems to work even while m[1:2, 1:2] doesn't. Likewise, m[c(1, 2), ] works, while m[as.integer(c(1, 2)), ] does not.

(4) I confirmed this error on another computer, running the same version of R and Matrix, on Windows 10.
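
A small diagnostic sketch for clue (3), assuming the "index" class union is exported by the installed version of Matrix. It checks whether integer subscripts are still recognized as "index" and which package owns the class definition that is currently visible; running it before and after loading the package under test would show whether something has been redefined.

```r
library(Matrix)

# "integer" extends "numeric" in S4, so integer subscripts should be covered
# by the "index" class union in a clean session.
int_is_index <- is(1:2, "index")
num_is_index <- is(c(1, 2), "index")

# Package that owns the currently visible definition of "index".
# Anything other than "Matrix" here would point to a conflicting definition.
index_owner <- getClass("index")@package

print(c(integer = int_is_index, numeric = num_is_index))
print(index_owner)
```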

If you have any further thoughts, they would definitely be appreciated. Thanks!

Date: 2019-11-07 15:59
Sender: Martin Maechler

This is a very basic subsetting operation and works perfectly for me (and probably 99.99% of users).

Start R --vanilla
and then

library(Matrix, lib.loc = .Library)

and try m[1:2, 1:2] again or, more compactly, just try

```
> Matrix(c(0,0:2,0), 3,5)[1:2, 1:2]
2 x 2 sparse Matrix of class "dgCMatrix"

[1,] . 2
[2,] . .
>
```

It will work.

You must be accidentally loading another package that manages to conflict with the Matrix package, which is pretty amazing to me (is it a tidyverse hack?).

Changes

| Field | Old Value | Date | By |
|---|---|---|---|
| status_id | Open | 2019-11-09 07:25 | mmaechler |
| close_date | None | 2019-11-09 07:25 | mmaechler |
| Resolution | Works For Me | 2019-11-09 07:25 | mmaechler |
| assigned_to | none | 2019-11-07 15:59 | mmaechler |
| Severity | None | 2019-11-07 15:59 | mmaechler |
| Resolution | None | 2019-11-07 15:59 | mmaechler |