Forum: help

RE: extracting surrogate features
By: Achim Zeileis on 2024-01-12 14:23
[forum:49830]
Note that the ... argument is passed to ctree_control() in cforest:

cforest(formula, data,
control = ctree_control(teststat = "quad", testtype = "Univ", mincriterion = 0, saveinfo = FALSE, ...),
...)

Thus, you can either set up the control argument explicitly:

cforest(Species ~ ., data = iris, control = ctree_control(maxsurrogate = 3, maxdepth = 4))

Or you can do so implicitly:

cforest(Species ~ ., data = iris, maxsurrogate = 3, maxdepth = 4)

But you cannot mix the two approaches as you tried. In your case, the explicitly supplied ctree_control() object replaces the default one, so maxsurrogate is never passed on to it.
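
As a minimal sketch of how the two valid call styles can be checked, the following fits one forest each way and counts the surrogate splits stored in the trees. The helper count_surrogates() is hypothetical (not part of partykit), and ntree = 20 is chosen only to keep the example fast. Because an explicit control argument replaces the default shown in the signature above, the explicit call restates teststat, testtype, mincriterion and saveinfo so that both fits use the same settings.

```r
library("partykit")

## Hypothetical helper (not part of partykit): number of surrogate
## splits stored in one tree, summed over all of its inner nodes.
count_surrogates <- function(tree) {
  inner <- setdiff(nodeids(tree), nodeids(tree, terminal = TRUE))
  if (length(inner) == 0L) return(0L)
  sum(unlist(nodeapply(tree, ids = inner,
    FUN = function(n) length(surrogates_node(n)))))
}

## Explicit: every setting goes through ctree_control(), including
## the cforest defaults that would otherwise be lost.
set.seed(0)
cf_explicit <- cforest(Species ~ ., data = iris, ntree = 20,
  control = ctree_control(teststat = "quad", testtype = "Univ",
    mincriterion = 0, saveinfo = FALSE,
    maxsurrogate = 3, maxdepth = 4))

## Implicit: ... is forwarded to the default ctree_control().
set.seed(0)
cf_implicit <- cforest(Species ~ ., data = iris, ntree = 20,
  maxsurrogate = 3, maxdepth = 4)

n_explicit <- sum(sapply(cf_explicit$nodes, count_surrogates))
n_implicit <- sum(sapply(cf_implicit$nodes, count_surrogates))
c(explicit = n_explicit, implicit = n_implicit)
```

If either count is zero, the maxsurrogate setting most likely did not reach ctree_control().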

RE: extracting surrogate features
By: Nazrath Nawaz on 2024-01-12 13:53
[forum:49829]
Thank you for the swift response. That is definitely much faster.

I notice that as soon as I pass parameters via ctree_control (such as minsplit or maxdepth), the surrogate information becomes NULL.
Is there something I am missing or doing wrong here?

Example:
cf <- partykit::cforest(Species ~ ., data = iris, maxsurrogate = 3,
control = partykit::ctree_control(maxdepth = 4))

nodeapply(cf$nodes[[10]], ids = 3, function(n) n$surrogates[[2]])


RE: extracting surrogate features
By: Achim Zeileis on 2024-01-11 23:44
[forum:49828]
In partykit cforest objects, the trees are stored in the $nodes element. They are not full ctree objects, but each one is a partynode structure that holds the essential information, including the surrogates:

library("partykit")
set.seed(0)
cf <- cforest(Species ~ ., data = iris, maxsurrogate = 3)
length(cf$nodes)
## [1] 500

Here, we have 500 trees (of class "partynode") in the forest. For each of these we can apply the strategy from the SO answer. For example, let's extract from tree number 10, in node number 3, the surrogate split 2:

nodeapply(cf$nodes[[10]], ids = 3, function(n) n$surrogates[[2]])
## [[1]]
## $varid
## [1] 2
##
## $breaks
## [1] 7.9
##
## $index
## [1] 1 2
##
## $right
## [1] TRUE
##
## $prob
## NULL
##
## $info
## NULL
##
## attr(,"class")
## [1] "partysplit"
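
To cover all trees and nodes rather than a single one, the same idea can be looped over the forest. The following is a sketch under assumptions, not a definitive implementation: surrogate_table() is a hypothetical helper, ntree = 20 is used only for speed, and varid is assumed to index the columns of the model frame stored in cf$data (which is consistent with varid 2 and break 7.9 above matching Sepal.Length).

```r
library("partykit")

set.seed(0)
cf <- cforest(Species ~ ., data = iris, ntree = 20, maxsurrogate = 3)

## Assumption: varid refers to the columns of the model frame in cf$data.
vars <- names(cf$data)

## Hypothetical helper: one row per inner node of a single tree, with
## the primary split variable and the surrogate variables in order.
surrogate_table <- function(tree, tree_id) {
  inner <- setdiff(nodeids(tree), nodeids(tree, terminal = TRUE))
  if (length(inner) == 0L) return(NULL)  # tree without any split
  rows <- nodeapply(tree, ids = inner, FUN = function(n) {
    sur_ids <- unlist(lapply(surrogates_node(n), varid_split))
    data.frame(
      tree = tree_id,
      node = id_node(n),
      primary = vars[varid_split(split_node(n))],
      surrogates = paste(vars[sur_ids], collapse = ", "),
      stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

## Stack the per-tree tables, dropping trees that have no splits.
tabs <- Map(surrogate_table, cf$nodes, seq_along(cf$nodes))
tab <- do.call(rbind, Filter(Negate(is.null), tabs))
head(tab)
```

Each row then shows, for one node of one tree, which variable is split on and which variables are available as surrogates when that variable is missing.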

RE: extracting surrogate features
By: Nazrath Nawaz on 2024-01-11 16:10
[forum:49827]
Thank you for the response.

The example works well for ctree, but I am looking to do this for all trees and nodes of a forest. I have managed to extract a single tree using GetCtree from moreparty, but this only works on forest objects created with party. Is it possible to extract a single tree from a partykit model and follow the route of the answer you provided, or is there another way to extract surrogate splits from a cforest?

RE: extracting surrogate features
By: Achim Zeileis on 2024-01-05 14:36
[forum:49824]
Thanks for your interest. In general, the surrogate feature depends not only on the missing feature but also on the node in the tree where it is needed.

To extract a specific surrogate split from a specific node, see:

https://stackoverflow.com/questions/39081673/how-to-get-the-surrogate-splits-in-a-ctree-model-party-package-in-r

extracting surrogate features
By: Nazrath Nawaz on 2024-01-05 11:43
[forum:49823]
Dear authors,

I have a specific question and wanted your advice and guidance on this. I have built a partykit model through tidymodels allowing for surrogate split points. My test dataset has missing values and I am interested to know which features the partykit model uses as surrogate features for each missing feature. I want to extract this information from the model or predictions. I appreciate your help.
