-
Notifications
You must be signed in to change notification settings - Fork 2
06 Step Two
As described in the method, we define
We define a new variable $Z^{}$, $Z^{}_i=1$ if
To compute a classifiction tree, vt.tree("class", ...)
is used. Internally, rpart::rpart()
is computed. It takes in argument:
-
tree.type
: it has to be set to"class"
-
vt.difft
:VT.difft
object (return ofvt.forest()
function) -
sens
:c(">", "<")
.sens
corresponds to the way$Z^{*}$ is defined.-
">"
(default) : $Z^{}$, $Z^{}_i=1$ if$Z_i > c$ and$Z^{*}_i=0$ otherwise. -
"<"
: $Z^{}$, $Z^{}_i=1$ if$Z_i < c$ and$Z^{*}_i=0$ otherwise.
-
-
threshold
: corresponds to$c$ , it can be a vector.$seq(.5, .8, .1)$ by default. -
screening
:NULL
is default value. IfTRUE
only covariables invarimp
vt.data
's field is used.
See ?VT.tree
for details.
# initialize classification tree
tr.class <- vt.tree("class",
vt.difft = vt.f.rf,
sens = ">",
threshold = quantile(vt.f.rf$difft, seq(.5, .8, .1)),
maxdepth = 3,
cp = 0,
maxcompete = 2)
# tr.class is a list if threshold is a vectoor
class(tr.class)
## [1] "list"
# acce trees with treeXX
class(tr.class$tree1)
## [1] "VT.tree.class"
## attr(,"package")
## [1] "aVirtualTwins"
Use regression tree to explain
The function to use is vt.tree("reg", ...)
. It takes same parameters than classification mehod.
# initialize regression tree
tr.reg <- vt.tree("reg",
vt.difft = vt.f.rf,
sens = ">",
threshold = quantile(vt.f.rf$difft, seq(.5, .8, .1)))
# tr.class is a list if threshold is a vectoor
class(tr.reg)
## [1] "list"
# access trees with treeXX
class(tr.reg$tree1)
## [1] "VT.tree.reg"
## attr(,"package")
## [1] "aVirtualTwins"
Once trees have been computed, you surely want to see what are the subgroups. This package provides a wrapper function of intern methods of VT.tree
class : vt.subgroups()
.
This function takes in argument :
-
vt.tree
: object or list of classVT.tree
. Return of thevt.tree()
function. -
only.leaf
: logical. SetTRUE
(default) to visualize only terminal nodes. -
only.fav
: logical. SetTRUE
(default) to visualize only class 1 nodes. ($\hat{A}$ ) -
tables
: logical. SetFALSE
(default) to prevent tables of incidences from being printed. -
verbose
: logical. SetFALSE
(default) to prevent detailed stuffs from being printed. -
compete
: logical. SetTRUE
to print competitors rules thanks to competitors split.FALSE
is default value.
If vt.tree
is a list, unique subgroups are printed.
# use tr.class computed previously
vt.sbgrps <- vt.subgroups(tr.class)
# print tables with knitr package
library(knitr)
knitr::kable(vt.sbgrps)
Subgroup Subgroup size Treatement event rate Control event rate Treatment sample size Control sample size RR (resub) RR (snd)
tree1.11 PRAPACHE< 26.5 & AGE>=51.74 & BLGCS< 11.5 49 0.485 0.375 33 16 1.293 1.043 tree1.3 PRAPACHE>=26.5 157 0.752 0.327 105 52 2.300 1.865 tree2.11 PRAPACHE< 26.5 & BLGCS< 10.5 & AGE>=64.68 14 0.7 0.5 10 4 1.400 1.200 tree2.13 PRAPACHE>=26.5 & AGE< 51.74 & BLLCREAT< 2.3 21 0.375 0.6 16 5 0.625 1.506 tree2.7 PRAPACHE>=26.5 & AGE>=51.74 120 0.897 0.31 78 42 2.894 2.007 tree3.13 PRAPACHE>=26.5 & AGE< 51.74 & BLLBILI>=2.95 13 0.333 0.25 9 4 1.332 1.564 tree4 PRAPACHE>=26.5 & AGE>=51.74 & PRAPACHE>=28.5 89 0.915 0.333 59 30 2.748 2.087
You can plot one tree with package rpart.plot
If you want to see competitors split :
tr.class$tree2$createCompetitors()
head(tr.class$tree2$competitors)
## count ncat improve index var path string
## 1 470 -1 119.987009 26.500 PRAPACHE 2 PRAPACHE < 26.5
## 2 470 1 37.018261 10.500 BLGCS 2 BLGCS >= 10.5
## 3 470 -1 30.491058 61.671 AGE 2 AGE < 61.671
## 9 313 1 6.349646 10.500 BLGCS 4 BLGCS >= 10.5
## 10 313 -1 5.075061 61.671 AGE 4 AGE < 61.671
## 11 313 -1 2.027870 19.500 PRAPACHE 4 PRAPACHE < 19.5
If you want to print incidence of a subgroup :
vt.o$getIncidences("PRAPACHE >= 26 & AGE >= 52")
## $table.selected
## $table.selected$table
## trt
## resp 0 1 sum
## 0 30 18 48
## 1 15 72 87
## sum 45 90 135
## Incidence 0.333 0.8 0.644
##
## $table.selected$rr
## [1] 2.402402
##
##
## $table.not.selected
## $table.not.selected$table
## trt
## resp 0 1 sum
## 0 71 170 241
## 1 37 57 94
## sum 108 227 335
## Incidence 0.343 0.251 0.281
##
## $table.not.selected$rr
## [1] 0.7317784
# or
# tr.class$tree2$getIncidences("PRAPACHE >= 26 & AGE >= 52")
If you want to get infos about the tree
tr.class$tree2$getInfos()
##
## Threshold = 0.0675
## Delta = 0.0671
## Sens : >
## Size of Ahat : 155
# access Ahat
# tr.class$tree2$Ahat
You can re-run rpart computation:
tr.class$tree2$run(maxdepth = 2)
Type ?VT.tree
for details.