7. Experimental features#
All features in this section are experimental and thus not yet fully documented and tested. Please open an issue in the package's GitHub repository if you encounter any problems or have any questions.
7.1. Balancing Tests#
Treatment effects may be subject to selection bias if the distribution of the confounding features differs across treatment arms. The class ModifiedCausalForest
provides the option to conduct balancing tests that assess whether the feature distributions are equal across treatment arms after adjustment by the Modified Causal Forest. The balancing tests are based on the estimation of average treatment effects (\(\text{ATEs}\)) with user-specified features as outcomes. If the features are balanced across treatment arms, the estimated \(\text{ATEs}\) should be close to zero.
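To build intuition for what such a test computes, the following is a minimal, self-contained sketch on simulated data. It is not the mcf implementation: plain inverse-probability weighting stands in for the weights produced by the Modified Causal Forest, and all variable names are hypothetical.

import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)                    # confounding feature
p = 1 / (1 + np.exp(-x))                  # treatment probability depends on x
d = rng.binomial(1, p)                    # binary treatment indicator

# Raw mean difference of x across arms: nonzero because of selection into treatment
raw_diff = x[d == 1].mean() - x[d == 0].mean()

# "ATE" with the feature x as the outcome, after inverse-probability weighting;
# with correct weights this is close to zero, i.e. the feature is balanced
balanced_diff = (np.average(x, weights=d / p)
                 - np.average(x, weights=(1 - d) / (1 - p)))

print(f"raw difference:      {raw_diff: .3f}")       # clearly nonzero
print(f"weighted difference: {balanced_diff: .3f}")  # close to zero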
The Modified Causal Forest runs balancing tests for the features specified in the parameters var_x_balance_name_ord
and var_x_balance_name_unord
if the parameter p_bt_yes
is set to True. See also the table below.
| Parameter | Description |
|---|---|
| `p_bt_yes` | If True, balancing tests are conducted for the features specified in `var_x_balance_name_ord` and `var_x_balance_name_unord`. |
| `var_x_balance_name_ord` | Only relevant if `p_bt_yes` is True. Ordered features used in the balancing tests. |
| `var_x_balance_name_unord` | Only relevant if `p_bt_yes` is True. Unordered features used in the balancing tests. |
Please consult the API
for more details.
The results of the balancing tests are included in the txt-file that the mcf package writes to its output folder. You can find the location of this folder by accessing the “outpath” entry of the gen_dict attribute of your Modified Causal Forest:
from mcf.example_data_functions import example_data
from mcf.mcf_functions import ModifiedCausalForest
# Generate example data using the built-in function `example_data()`
training_df, prediction_df, name_dict = example_data()
my_mcf = ModifiedCausalForest(
var_y_name="outcome",
var_d_name="treat",
var_x_name_ord="x_cont0"
)
# Print the location of the output folder
print(my_mcf.gen_dict["outpath"])
7.1.1. Example#
from mcf.example_data_functions import example_data
from mcf.mcf_functions import ModifiedCausalForest
# Generate example data using the built-in function `example_data()`
training_df, prediction_df, name_dict = example_data()
my_mcf = ModifiedCausalForest(
var_y_name="outcome",
var_d_name="treat",
var_x_name_ord=["x_cont0", "x_cont1", "x_ord1"],
var_x_name_unord=["x_unord0"],
# Parameters for balancing tests:
p_bt_yes=True,
var_x_balance_name_ord=["x_cont0", "x_cont1", "x_ord1"],
var_x_balance_name_unord=["x_unord0"]
)
my_mcf.train(training_df)
results, _ = my_mcf.predict(prediction_df)
7.2. Sensitivity checks#
The method sensitivity()
of the ModifiedCausalForest
class contains some simulation-based tools to check how well the Modified Causal Forest works in removing selection bias and how sensitive the results are with respect to potentially missing confounding covariates (i.e., those related to treatment and potential outcomes).
A paper by Armendariz-Pacheco, Lechner, and Mareckova (2024) will discuss and investigate the different methods in detail. For now, please note that all methods are simulation-based.
The sensitivity checks consist of the following steps (a minimal sketch follows the list):

1. Estimate all treatment probabilities.
2. Remove all observations from all treatment states but one (by default the largest treatment arm, or a user-determined one).
3. Use the estimated probabilities to simulate treated observations, respecting the original treatment shares (pseudo-treatments).
4. Estimate the effects of the pseudo-treatments. Since the true effects are known to be zero, the deviation from zero serves as a measure of the sensitivity of the results.

Steps 3 and 4 may be repeated, and the results averaged to reduce simulation noise.
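To make this logic concrete, here is a minimal, self-contained sketch of steps 1 to 4 on simulated data. It illustrates the principle only and is not the mcf implementation: logistic regression and inverse-probability weighting stand in for the package's internal estimators, and all variable names are hypothetical.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=(n, 3))                 # observed confounders
p_true = 1 / (1 + np.exp(-x[:, 0]))         # treatment depends on x[:, 0]
d = rng.binomial(1, p_true)                 # binary treatment
y = x[:, 0] + rng.normal(size=n)            # outcome also depends on x[:, 0]

# Step 1: estimate the treatment probabilities
p_hat = LogisticRegression().fit(x, d).predict_proba(x)[:, 1]

# Step 2: keep the observations of a single treatment state (here: the controls)
y0, p0 = y[d == 0], p_hat[d == 0]

# Steps 3 and 4, repeated to average out simulation noise
naive, adjusted = [], []
for _ in range(20):
    # Step 3: draw a pseudo-treatment from the estimated probabilities
    # (simplified here; the package additionally respects the original shares)
    d_ps = rng.binomial(1, p0)
    # Step 4: the true pseudo-effect is zero by construction. A naive contrast
    # is biased because the pseudo-treatment again depends on x[:, 0] ...
    naive.append(y0[d_ps == 1].mean() - y0[d_ps == 0].mean())
    # ... while an estimator that adjusts for the confounders (here simple
    # inverse-probability weighting as a stand-in for the mcf) should be ~0.
    adjusted.append(np.average(y0, weights=d_ps / p0)
                    - np.average(y0, weights=(1 - d_ps) / (1 - p0)))

print(f"naive pseudo-effect:    {np.mean(naive): .3f}")     # clearly nonzero
print(f"adjusted pseudo-effect: {np.mean(adjusted): .3f}")  # close to zero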
Please consult the API for details on how to use the sensitivity()
method.