Forum: help


RE: inbag indices [ Reply ] By: Achim Zeileis on 2021-09-10 20:21 | [forum:49120] |
Very nice, thanks for sharing, Markus! |
RE: inbag indices [ Reply ] By: Markus Loecher on 2021-09-10 20:04 | [forum:49119] |
Thanks, that is really helpful! Now the inbag indices make a lot of sense. In fact, cforest seems to be the only implementation I tested for which the forest's trees agree with a tree fitted on the bootstrapped data indexed manually, as shown here: https://markusloecher.github.io/Testing-Inbag-Indices/ |
RE: inbag indices [ Reply ] By: Achim Zeileis on 2021-09-10 18:14 | [forum:49118] |
I think you don't want fraction = 1 because replace = TRUE already implies n-out-of-n observations, see ?cforest. And then you can infer the indexes from the weights():

set.seed(1)
cf <- cforest(dist ~ speed, data = cars, ntree = 3, perturb = list(replace = TRUE))
weights(cf)
## [[1]]
## [1] 1 0 0 1 0 4 1 0 1 2 0 0 0 1 2 0 0 1 0 3 3 0 2 0 3 0 0 1 0 0 0 0 2 2 1 0 3 1
## [39] 2 1 1 4 1 3 0 2 1 0 0 0
##
## [[2]]
## [1] 1 1 0 0 0 2 0 0 1 0 1 0 2 2 0 0 2 1 1 1 1 2 3 1 2 1 0 1 2 0 1 1 1 0 0 1 0 2
## [39] 3 3 0 0 1 2 2 1 0 3 0 1
##
## [[3]]
## [1] 2 0 1 0 0 1 1 0 0 1 0 1 0 1 1 0 0 0 1 0 0 3 0 2 0 2 1 2 3 1 0 1 1 1 3 0 1 0
## [39] 2 1 2 2 2 1 0 2 1 1 3 2

sapply(cf$weights, sum)
## [1] 50 50 50 |
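For readers who want explicit indices rather than counts, here is a minimal base R sketch of the step from weights to inbag indices. It assumes each per-tree weight vector holds the number of times each observation was drawn, as the output above suggests. The helper name weights_to_inbag and the toy vector w are made up for illustration and are not part of partykit.

```r
# Toy weight vector standing in for one element of weights(cf);
# entry i is assumed to count how often observation i was drawn.
w <- c(1, 0, 0, 1, 0, 4, 1, 0, 1, 2)

# Hypothetical helper: repeat each row index as often as its weight says.
weights_to_inbag <- function(w) rep(seq_along(w), times = w)

inbag <- weights_to_inbag(w)  # row indices of the bootstrap sample (sorted)
oob   <- which(w == 0)        # observations never drawn for this tree

# The bootstrap sample has as many rows as the weights sum to.
stopifnot(length(inbag) == sum(w))
```

Under the same assumption, something like lapply(weights(cf), weights_to_inbag) should give one index vector per tree. Note that the indices come out sorted, so the original draw order is not recovered, which should not matter for fitting a tree or for OOB computations.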
inbag indices [ Reply ] By: Markus Loecher on 2021-09-10 15:19 | [forum:49117] |
I understand that cforest "avoids" the bootstrap and instead sub-samples by default. For experimentation purposes I am enabling the bootstrap by setting perturb = list(replace = TRUE, fraction = 1). In other RF implementations (such as ranger and randomForest) the returned object contains a list of the inbag indices, which I find very useful for many reasons, such as debugging and trying out new OOB correction schemes. Is this possible in partykit/cforest? Thx, Markus |