[#6512] 32-bit limitation in seqdist (maximum of 46,341 cases)

Date: 2017-05-26 06:07
Priority: 3
State: Open
Submitted by: Pasi Haapakorva (pasipasi)
Assigned to: Nobody (None)
Resolution: None
Severity: None
TraMineR version: v2.0-10
E-Mail address: pasi.haapakorva@gmail.com
Summary: 32-bit limitation in seqdist (maximum of 46,341 cases)

Detailed description
I'm actually using TraMineR 2.0-5, but this behavior has been present since I started using TraMineR in 2015.

I believe (I'm not a programmer) that seqdist stores the distances in a 2-dimensional matrix, which R indexes internally with a single 1-dimensional, 32-bit integer. This is a limitation of R itself, but I think it could be circumvented.

library(TraMineR)

# sample data stored in a tibble: 46,342 cases, one position, three states
seqdata <- dplyr::tibble(t1 = sample(1:3, 46342, replace = TRUE))

seq <- seqdef(seqdata)

dist <- seqdist(seq, method = "HAM", full.matrix = FALSE)

...returns

Error in seqdist(seq, method = "HAM", full.matrix = FALSE) :
negative length vectors are not allowed

If you repeat the code with one fewer case (46,341 rows), it runs.
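
The threshold can be checked with a little arithmetic. The sketch below is an inference from the numbers in this report, not from the TraMineR source: it assumes that a count of n * (n - 1) is held somewhere in a signed 32-bit integer. `ordered_pairs` is a hypothetical helper introduced only for illustration.

```r
# Largest value a signed 32-bit integer (R's integer type) can hold
int_max <- .Machine$integer.max  # 2^31 - 1

# Hypothetical helper: n * (n - 1), computed in doubles so it cannot overflow
ordered_pairs <- function(n) as.numeric(n) * (as.numeric(n) - 1)

# 46,341 cases stay within the 32-bit range; 46,342 cases exceed it
fits_46341 <- ordered_pairs(46341) <= int_max
fits_46342 <- ordered_pairs(46342) <= int_max

# Assumption: a two's-complement 32-bit counter wraps around by 2^32,
# which would turn the too-large count into a negative length
wrapped <- ordered_pairs(46342) - 2^32

print(c(fits_46341 = fits_46341, fits_46342 = fits_46342))
print(wrapped < 0)
```

If that assumption holds, it would explain both the exact cutoff and the "negative length vectors" wording of the error.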

One option might be to look into long vectors https://stat.ethz.ch/R-manual/R-devel/library/base/html/LongVectors.html
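
For a sense of scale, here is a rough sizing sketch. It assumes distances are stored as 8-byte doubles in lower-triangle form, as in a base R `dist` object; `dist_elements` and `dist_gib` are hypothetical helpers, not TraMineR functions.

```r
# Hypothetical helpers: size of a lower-triangle distance vector for n cases
dist_elements <- function(n) n * (n - 1) / 2          # number of stored distances
dist_gib <- function(n) dist_elements(n) * 8 / 2^30   # memory in GiB at 8 bytes each

int_max <- .Machine$integer.max

# Lower triangle for 46,342 cases: still short enough for an ordinary vector
tri_fits <- dist_elements(46342) <= int_max

# The full square matrix for the same number of cases is not
square_fits <- 46342^2 <= int_max

# Memory needed for the lower triangle alone (roughly 8 GiB)
gib_needed <- dist_gib(46342)

# From 65,537 cases on, even the lower triangle exceeds the 32-bit limit
tri_needs_long <- dist_elements(65537) > int_max

print(c(tri_fits = tri_fits, square_fits = square_fits, tri_needs_long = tri_needs_long))
print(round(gib_needed, 2))
```

Under these assumptions the lower triangle for 46,342 cases would still fit in an ordinary vector, so long vectors would only become necessary for the full square matrix or beyond 65,536 cases. Either way, memory use is already substantial at this size.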

Comments:

Date: 2019-06-13 21:08
Sender: Janet Wang

Has there been an update on this? I keep running into what looks like the same problem, though with a different error message: too many unique sequences for seqdist to handle.
