[#6320] segfault from crossprod(<sparse Matrix>) -- no longer but "problem too large"

Date:
2016-04-26 18:46
Priority:
4
State:
Open
Submitted by:
Benjamin Tyner (btyner)
Assigned to:
Martin Maechler (mmaechler)
Product:
None
Operating System:
All
Component:
None
Summary:
segfault from crossprod(<sparse Matrix>) -- no longer but "problem too large"

Detailed description
Here is my sessionInfo():

R version 3.2.5 (2016-04-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.7 (Santiago)

locale:
[1] LC_CTYPE=en_US LC_NUMERIC=C LC_TIME=en_US LC_COLLATE=en_US LC_MONETARY=en_US LC_MESSAGES=en_US LC_PAPER=en_US LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] Matrix_1.2-5

loaded via a namespace (and not attached):
[1] grid_3.2.5 lattice_0.20-33

Code to reproduce:

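library(Matrix)

n <- 1000000L
set.seed(6860)
w <- runif(n)                                   # positive weights
d <- sample(1:100, size = n, replace = TRUE)    # group label (1..100) for each column

# 100 x n sparse matrix: column j has one nonzero, in row d[j],
# holding w[j] normalised by the total weight of its group
M <- Matrix(0, nrow = 100L, ncol = n, sparse = TRUE)
for (i in 1:100) {
  bool <- d == i
  M[i, bool] <- w[bool] / sum(w[bool])
}

Q <- crossprod(M)   # t(M) %*% M, an n x n sparse matrix; this call segfaults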
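
A rough size check, not part of the original report: each column of M has a single nonzero, in row d[j], so entry (j, k) of crossprod(M) is structurally nonzero exactly when d[j] == d[k]. Under that assumption the number of nonzeros in the result is the sum of the squared group sizes, which can be computed without building M. The names counts and nnz_est are introduced here for illustration only.

```r
n <- 1000000L
set.seed(6860)
d <- sample(1:100, size = n, replace = TRUE)

# Group sizes as doubles, so that squaring them cannot overflow R integers
counts <- as.numeric(tabulate(d, nbins = 100L))

# Structural nonzeros of t(M) %*% M: one full block of size counts[i]^2 per group
nnz_est <- sum(counts^2)

format(nnz_est, big.mark = ",")
nnz_est > .Machine$integer.max   # TRUE: more entries than a 32-bit signed int can count
```

By the Cauchy-Schwarz inequality the sum is at least n^2 / 100 = 1e10 whatever the sample, which is well above .Machine$integer.max (2147483647).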

gdb backtrace:

#0 0x00007fffee214cb8 in cholmod_aat () from /home/btyner/R/x86_64-pc-linux-gnu-library/3.2/Matrix/libs/Matrix.so
#1 0x00007fffee1e6f9e in Csparse_crossprod () from /home/btyner/R/x86_64-pc-linux-gnu-library/3.2/Matrix/libs/Matrix.so
#2 0x00007ffff7a81bd0 in do_dotcall () from /home/btyner/R325/lib64/R/lib/libR.so
#3 0x00007ffff7ac09eb in Rf_eval () at eval.c:655
#4 0x00007ffff7ac1be7 in Rf_applyClosure () at eval.c:1046
#5 0x00007ffff7ab7e8a in bcEval () at eval.c:5465
#6 0x00007ffff7ac0450 in Rf_eval () at eval.c:558
#7 0x00007ffff7ac2091 in R_execClosure () at eval.c:1150
#8 0x00007ffff7ac24e3 in R_execMethod () at eval.c:1313
#9 0x00007fffefe7ed9a in R_dispatchGeneric () from /home/btyner/R325/lib64/R/library/methods/libs/methods.so
#10 0x00007ffff7af6503 in do_standardGeneric () from /home/btyner/R325/lib64/R/lib/libR.so
#11 0x00007ffff7ab4947 in bcEval () at eval.c:5493
#12 0x00007ffff7ac0450 in Rf_eval () at eval.c:558
#13 0x00007ffff7ac1be7 in Rf_applyClosure () at eval.c:1046
#14 0x00007ffff7ac05cf in Rf_eval () at eval.c:674
#15 0x00007ffff7ac3bbe in do_set () at eval.c:2108
#16 0x00007ffff7ac07f3 in Rf_eval () at eval.c:627
#17 0x00007ffff7ac50a9 in do_eval () at eval.c:2479
#18 0x00007ffff7ab4947 in bcEval () at eval.c:5493
#19 0x00007ffff7ac0450 in Rf_eval () at eval.c:558
#20 0x00007ffff7ac1be7 in Rf_applyClosure () at eval.c:1046
#21 0x00007ffff7ab7e8a in bcEval () at eval.c:5465
#22 0x00007ffff7ac0450 in Rf_eval () at eval.c:558
#23 0x00007ffff7ac0cc8 in forcePromise () at eval.c:457
#24 0x00007ffff7ac0a51 in Rf_eval () at eval.c:581
#25 0x00007ffff7ac52a1 in do_withVisible () at eval.c:2508
#26 0x00007ffff7af2599 in do_internal () from /home/btyner/R325/lib64/R/lib/libR.so
#27 0x00007ffff7ab3b86 in bcEval () at eval.c:5513
#28 0x00007ffff7ac0450 in Rf_eval () at eval.c:558
#29 0x00007ffff7ac1be7 in Rf_applyClosure () at eval.c:1046
#30 0x00007ffff7ab7e8a in bcEval () at eval.c:5465
#31 0x00007ffff7ac0450 in Rf_eval () at eval.c:558
#32 0x00007ffff7ac1be7 in Rf_applyClosure () at eval.c:1046
#33 0x00007ffff7ac05cf in Rf_eval () at eval.c:674
#34 0x00007ffff7ae78ea in Rf_ReplIteration () from /home/btyner/R325/lib64/R/lib/libR.so
#35 0x00007ffff7ae7c31 in R_ReplConsole () from /home/btyner/R325/lib64/R/lib/libR.so
#36 0x00007ffff7ae7cc4 in run_Rmainloop () from /home/btyner/R325/lib64/R/lib/libR.so
#37 0x000000000040084b in main () at Rmain.c:29

Comments:

Date: 2022-08-31 17:03
Sender: Mikael Jagan

Moved from Bugs to Feature Requests

Date: 2021-01-10 13:11
Sender: Benjamin Tyner

I also don't have a lot of free time for debugging these days, but for grins I fired up valgrind on my Ubuntu machine, and after the segfault it says:

Sorry, the program "memcheck-amd64-linux" closed unexpectedly

Your computer does not have enough free memory to automatically analyze the problem and send a report to the developers.

I will continue to look at this, time permitting.

Date: 2021-01-05 16:16
Sender: Martin Maechler

The "problem too large" is currently "as good as it gets".

I had tried in the past to use "cholmod_l_*()" routines everywhere but did not succeed, and gave up ... for the time being.

I'm glad for help -- you must be able to program in C (and more importantly, to understand the SuiteSparse C code and setup).

Date: 2017-08-15 18:45
Sender: Benjamin Tyner

As of Matrix 1.2-10, I no longer get a segfault, but instead:

Error in .local(x, y, ...) :
Cholmod error 'problem too large' at file ../Core/cholmod_aat.c, line 173
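
A possible workaround, offered as a sketch rather than as anything the maintainers recommend: if Q is only needed to multiply vectors, the product can be applied as t(M) %*% (M %*% v) without ever forming the n x n matrix. The example below uses a small n so that the explicit product still fits, and builds M with sparseMatrix() instead of the loop from the report; v, Qv_implicit and Qv_explicit are illustrative names.

```r
library(Matrix)

set.seed(1)
n <- 2000L                                   # small enough that crossprod(M) is representable
w <- runif(n)
d <- sample(1:100, size = n, replace = TRUE)

# Same structure as in the report: column j has one nonzero, in row d[j],
# equal to w[j] divided by the total weight of its group
M <- sparseMatrix(i = d, j = seq_len(n),
                  x = w / ave(w, d, FUN = sum),
                  dims = c(100L, n))

v <- rnorm(n)

# Implicit product: only a 100 x 1 intermediate is created
Qv_implicit <- crossprod(M, M %*% v)

# Explicit product, feasible here only because n is small
Qv_explicit <- crossprod(M) %*% v

all.equal(as.vector(Qv_implicit), as.vector(Qv_explicit))
```

With the original n = 1000000L only the implicit form would be usable, since the explicit crossprod(M) is the call that fails.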

Date: 2016-04-27 00:41
Sender: Benjamin Tyner

valgrind had this to say:

==22227== Warning: set address range perms: large range [0x3aeed040, 0x18b592320) (undefined)
==22227== Warning: set address range perms: large range [0x40ac87040, 0x6ab9d1600) (undefined)
==22227== Invalid write of size 4
==22227== at 0x128BAC48: cholmod_aat (cholmod_aat.c:231)
==22227== by 0x1288CF9D: Csparse_crossprod (Csparse.c:766)
==22227== by 0x4CE8BCF: do_dotcall (dotcode.c:1251)
==22227== by 0x4D279EA: Rf_eval (eval.c:655)
==22227== by 0x4D28BE6: Rf_applyClosure (eval.c:1046)
==22227== by 0x4D1EE89: bcEval (eval.c:5465)
==22227== by 0x4D2744F: Rf_eval (eval.c:558)
==22227== by 0x4D29090: R_execClosure (eval.c:1150)
==22227== by 0x4D294E2: R_execMethod (eval.c:1313)
==22227== by 0xD7B3D99: R_dispatchGeneric (methods_list_dispatch.c:1035)
==22227== by 0x4D5D502: do_standardGeneric (objects.c:1125)
==22227== by 0x4D1B946: bcEval (eval.c:5493)
==22227== Address 0x18b592320 is 0 bytes after a block of size 5,644,112,608 alloc'd
==22227== at 0x4A069EE: malloc (vg_replace_malloc.c:270)
==22227== by 0x128C594E: cholmod_malloc (cholmod_memory.c:143)
==22227== by 0x128C5D24: cholmod_realloc (cholmod_memory.c:323)
==22227== by 0x128C5E0D: cholmod_realloc_multiple (cholmod_memory.c:445)
==22227== by 0x128C6571: cholmod_allocate_sparse (cholmod_sparse.c:142)
==22227== by 0x128BA89C: cholmod_aat (cholmod_aat.c:183)
==22227== by 0x1288CF9D: Csparse_crossprod (Csparse.c:766)
==22227== by 0x4CE8BCF: do_dotcall (dotcode.c:1251)
==22227== by 0x4D279EA: Rf_eval (eval.c:655)
==22227== by 0x4D28BE6: Rf_applyClosure (eval.c:1046)
==22227== by 0x4D1EE89: bcEval (eval.c:5465)
==22227== by 0x4D2744F: Rf_eval (eval.c:558)
==22227==
==22227== Invalid read of size 4
==22227== at 0x128BACCF: cholmod_aat (cholmod_aat.c:238)
==22227== by 0x1288CF9D: Csparse_crossprod (Csparse.c:766)
==22227== by 0x4CE8BCF: do_dotcall (dotcode.c:1251)
==22227== by 0x4D279EA: Rf_eval (eval.c:655)
==22227== by 0x4D28BE6: Rf_applyClosure (eval.c:1046)
==22227== by 0x4D1EE89: bcEval (eval.c:5465)
==22227== by 0x4D2744F: Rf_eval (eval.c:558)
==22227== by 0x4D29090: R_execClosure (eval.c:1150)
==22227== by 0x4D294E2: R_execMethod (eval.c:1313)
==22227== by 0xD7B3D99: R_dispatchGeneric (methods_list_dispatch.c:1035)
==22227== by 0x4D5D502: do_standardGeneric (objects.c:1125)
==22227== by 0x4D1B946: bcEval (eval.c:5493)
==22227== Address 0x18b592320 is 0 bytes after a block of size 5,644,112,608 alloc'd
==22227== at 0x4A069EE: malloc (vg_replace_malloc.c:270)
==22227== by 0x128C594E: cholmod_malloc (cholmod_memory.c:143)
==22227== by 0x128C5D24: cholmod_realloc (cholmod_memory.c:323)
==22227== by 0x128C5E0D: cholmod_realloc_multiple (cholmod_memory.c:445)
==22227== by 0x128C6571: cholmod_allocate_sparse (cholmod_sparse.c:142)
==22227== by 0x128BA89C: cholmod_aat (cholmod_aat.c:183)
==22227== by 0x1288CF9D: Csparse_crossprod (Csparse.c:766)
==22227== by 0x4CE8BCF: do_dotcall (dotcode.c:1251)
==22227== by 0x4D279EA: Rf_eval (eval.c:655)
==22227== by 0x4D28BE6: Rf_applyClosure (eval.c:1046)
==22227== by 0x4D1EE89: bcEval (eval.c:5465)
==22227== by 0x4D2744F: Rf_eval (eval.c:558)
==22227==
==22227== Invalid write of size 8
==22227== at 0x128BACBC: cholmod_aat (cholmod_aat.c:241)
==22227== by 0x1288CF9D: Csparse_crossprod (Csparse.c:766)
==22227== by 0x4CE8BCF: do_dotcall (dotcode.c:1251)
==22227== by 0x4D279EA: Rf_eval (eval.c:655)
==22227== by 0x4D28BE6: Rf_applyClosure (eval.c:1046)
==22227== by 0x4D1EE89: bcEval (eval.c:5465)
==22227== by 0x4D2744F: Rf_eval (eval.c:558)
==22227== by 0x4D29090: R_execClosure (eval.c:1150)
==22227== by 0x4D294E2: R_execMethod (eval.c:1313)
==22227== by 0xD7B3D99: R_dispatchGeneric (methods_list_dispatch.c:1035)
==22227== by 0x4D5D502: do_standardGeneric (objects.c:1125)
==22227== by 0x4D1B946: bcEval (eval.c:5493)
==22227== Address 0x6ab9d1600 is 0 bytes after a block of size 11,288,225,216 alloc'd
==22227== at 0x4A069EE: malloc (vg_replace_malloc.c:270)
==22227== by 0x128C594E: cholmod_malloc (cholmod_memory.c:143)
==22227== by 0x128C5D24: cholmod_realloc (cholmod_memory.c:323)
==22227== by 0x128C5F38: cholmod_realloc_multiple (cholmod_memory.c:460)
==22227== by 0x128C6571: cholmod_allocate_sparse (cholmod_sparse.c:142)
==22227== by 0x128BA89C: cholmod_aat (cholmod_aat.c:183)
==22227== by 0x1288CF9D: Csparse_crossprod (Csparse.c:766)
==22227== by 0x4CE8BCF: do_dotcall (dotcode.c:1251)
==22227== by 0x4D279EA: Rf_eval (eval.c:655)
==22227== by 0x4D28BE6: Rf_applyClosure (eval.c:1046)
==22227== by 0x4D1EE89: bcEval (eval.c:5465)
==22227== by 0x4D2744F: Rf_eval (eval.c:558)
==22227==
==22227== Invalid read of size 4
==22227== at 0x128BACB0: cholmod_aat (cholmod_aat.c:241)
==22227== by 0x1288CF9D: Csparse_crossprod (Csparse.c:766)
==22227== by 0x4CE8BCF: do_dotcall (dotcode.c:1251)
==22227== by 0x4D279EA: Rf_eval (eval.c:655)
==22227== by 0x4D28BE6: Rf_applyClosure (eval.c:1046)
==22227== by 0x4D1EE89: bcEval (eval.c:5465)
==22227== by 0x4D2744F: Rf_eval (eval.c:558)
==22227== by 0x4D29090: R_execClosure (eval.c:1150)
==22227== by 0x4D294E2: R_execMethod (eval.c:1313)
==22227== by 0xD7B3D99: R_dispatchGeneric (methods_list_dispatch.c:1035)
==22227== by 0x4D5D502: do_standardGeneric (objects.c:1125)
==22227== by 0x4D1B946: bcEval (eval.c:5493)
==22227== Address 0x18b592324 is 4 bytes after a block of size 5,644,112,608 alloc'd
==22227== at 0x4A069EE: malloc (vg_replace_malloc.c:270)
==22227== by 0x128C594E: cholmod_malloc (cholmod_memory.c:143)
==22227== by 0x128C5D24: cholmod_realloc (cholmod_memory.c:323)
==22227== by 0x128C5E0D: cholmod_realloc_multiple (cholmod_memory.c:445)
==22227== by 0x128C6571: cholmod_allocate_sparse (cholmod_sparse.c:142)
==22227== by 0x128BA89C: cholmod_aat (cholmod_aat.c:183)
==22227== by 0x1288CF9D: Csparse_crossprod (Csparse.c:766)
==22227== by 0x4CE8BCF: do_dotcall (dotcode.c:1251)
==22227== by 0x4D279EA: Rf_eval (eval.c:655)
==22227== by 0x4D28BE6: Rf_applyClosure (eval.c:1046)
==22227== by 0x4D1EE89: bcEval (eval.c:5465)
==22227== by 0x4D2744F: Rf_eval (eval.c:558)
==22227==

*** caught segfault ***
address 0x18b593000, cause 'memory not mapped'
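
The block sizes in the log above can be cross-checked with a little arithmetic. This is one reading of the numbers, not something confirmed in the thread: the 4-byte writes run off the end of a 5,644,112,608-byte block and the 8-byte write off the end of an 11,288,225,216-byte block, so both blocks hold the same number of entries, and that number is what a count just above 1e10 would become if it wrapped around modulo 2^32. The variable names are illustrative.

```r
int_block <- 5644112608    # bytes; block overrun by the "size 4" accesses
dbl_block <- 11288225216   # bytes; block overrun by the "size 8" write

n_int <- int_block / 4     # entries if the elements are 4-byte ints
n_dbl <- dbl_block / 8     # entries if the elements are 8-byte doubles
stopifnot(n_int == n_dbl)  # both blocks were sized for the same entry count

format(n_int, big.mark = ",")      # "1,411,028,152"

# Hypothesis: the true count exceeded 32 bits and wrapped twice
candidate <- n_int + 2 * 2^32
format(candidate, big.mark = ",")  # "10,000,962,744"
```

The candidate value sits just above n^2 / 100 = 1e10, the minimum possible sum of squared group sizes for n = 1000000 columns spread over 100 groups, which is consistent with an overflowed nonzero count but does not prove it.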

Attached Files:

Attachments:
Size    Name                   Date              By      Download
24 KiB  not_enough_memory.png  2021-01-10 13:11  btyner  not_enough_memory.png

Changes

Field             Old Value                                 Date              By
type              Bugs                                      2022-08-31 17:03  jaganmn
File Added        5202: not_enough_memory.png               2021-01-10 13:11  btyner
summary           segfault from crossprod(<sparse Matrix>)  2021-01-05 16:16  mmaechler
Resolution        Accepted As Bug                           2021-01-05 16:16  mmaechler
priority          3                                         2017-08-15 10:54  mmaechler
assigned_to       none                                      2017-08-15 10:54  mmaechler
Hardware          PC                                        2016-04-27 12:54  mmaechler
Operating System  Linux                                     2016-04-27 12:54  mmaechler
Resolution        None                                      2016-04-27 12:54  mmaechler