Forum: help

RE: inbag indices
By: Achim Zeileis on 2021-09-10 20:21
[forum:49120]
Very nice, thanks for sharing, Markus!

RE: inbag indices
By: Markus Loecher on 2021-09-10 20:04
[forum:49119]
Thanks, that is really helpful!
Now the inbag indices make a lot of sense. In fact, cforest seems to be the only package I tested for which the tree in the forest agrees with a tree fitted on the bootstrapped data indexed manually, as shown here:

https://markusloecher.github.io/Testing-Inbag-Indices/

RE: inbag indices
By: Achim Zeileis on 2021-09-10 18:14
[forum:49118]
I think you don't want fraction = 1 because replace = TRUE already implies n-out-of-n observations, see ?cforest.

And then you can infer the indices from the weights():

library(partykit)
set.seed(1)
cf <- cforest(dist ~ speed, data = cars, ntree = 3, perturb = list(replace = TRUE))
weights(cf)
## [[1]]
## [1] 1 0 0 1 0 4 1 0 1 2 0 0 0 1 2 0 0 1 0 3 3 0 2 0 3 0 0 1 0 0 0 0 2 2 1 0 3 1
## [39] 2 1 1 4 1 3 0 2 1 0 0 0
##
## [[2]]
## [1] 1 1 0 0 0 2 0 0 1 0 1 0 2 2 0 0 2 1 1 1 1 2 3 1 2 1 0 1 2 0 1 1 1 0 0 1 0 2
## [39] 3 3 0 0 1 2 2 1 0 3 0 1
##
## [[3]]
## [1] 2 0 1 0 0 1 1 0 0 1 0 1 0 1 1 0 0 0 1 0 0 3 0 2 0 2 1 2 3 1 0 1 1 1 3 0 1 0
## [39] 2 1 2 2 2 1 0 2 1 1 3 2

sapply(cf$weights, sum)
## [1] 50 50 50
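
To make the step from weights to indices concrete, here is a minimal base-R sketch. It assumes each weight counts how often that observation was drawn into the tree's bootstrap sample, which the sums of 50 above suggest. The helper name inbag_from_weights is made up for illustration and is not part of partykit; the weight vector is copied from the first tree in the output above so the snippet runs without fitting a forest.

```r
# Hypothetical helper (not a partykit function): expand per-observation
# weights into inbag row indices by repeating index i w[i] times.
inbag_from_weights <- function(w) rep(seq_along(w), times = w)

# Weights of the first tree, copied from the weights(cf) output above
w <- c(1, 0, 0, 1, 0, 4, 1, 0, 1, 2, 0, 0, 0, 1, 2, 0, 0, 1, 0, 3, 3, 0, 2, 0, 3,
       0, 0, 1, 0, 0, 0, 0, 2, 2, 1, 0, 3, 1, 2, 1, 1, 4, 1, 3, 0, 2, 1, 0, 0, 0)

inbag <- inbag_from_weights(w)  # row indices drawn into the bootstrap sample
oob   <- which(w == 0)          # rows never drawn (out-of-bag)

# With a fitted forest, the same idea would presumably apply to every tree:
# inbag_all <- lapply(cf$weights, inbag_from_weights)
```

Under that assumption, cars[inbag, ] would be the bootstrapped data for the first tree, and oob gives the rows available for out-of-bag evaluation.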

inbag indices
By: Markus Loecher on 2021-09-10 15:19
[forum:49117]
I understand that cforest "avoids" the bootstrap and instead sub-samples by default. For experimentation purposes I am enabling the bootstrap by setting perturb = list(replace = TRUE, fraction = 1).

In other RF implementations (such as ranger and randomForest) the returned object contains a list of the inbag indices, which I find very useful for many purposes, such as debugging and trying out new OOB correction schemes.

Is this possible in partykit/cforest?

Thx,
Markus
