2. Data cleaning#
The class ModifiedCausalForest
has several data cleaning options to improve the estimation quality of your Modified Causal Forest. Below, you find a table with the relevant parameters and a brief description:
Parameter |
Description |
---|---|
|
If True, all observations with missing values are dropped. Variables not required in the analysis are also removed. Default: True. |
|
If True, covariates are screened and cleaned. Specifically, features without variation are dropped. Default: True. |
|
If |
|
If |
Please consult the API
for more details.
2.1. Example#
from mcf.example_data_functions import example_data
from mcf.mcf_functions import ModifiedCausalForest
# Generate example data using the built-in function `example_data()`
training_df, prediction_df, name_dict = example_data()
my_mcf = ModifiedCausalForest(
var_y_name="outcome",
var_d_name="treat",
var_x_name_ord=["x_cont0", "x_cont1", "x_ord1"],
# Parameters for data cleaning:
dc_clean_data=True,
dc_screen_covariates=True,
dc_check_perfectcorr=False,
dc_min_dummy_obs=100
)