Ad

How To Implement Feature Selection For Categorical Variables?

I'm having a problem selecting the important feature. The features for the dataset are categorical and numerical. The target variable is False or True. The features for the dataset are about 100, so I need to drop some of the features that are not related to the target variable. Which method can be used other than Random Forest feature importance? I'm using Python. In R I can use Boruta package to select the important features. but I do not know how to do this in Python.

Ad

Answer

Selecting relevant features can be done by calculating the P-value of the feature relating to the hypothesis, check https://towardsdatascience.com/feature-selection-correlation-and-p-value-da8921bfb3cf.

Ad
source: stackoverflow.com
Ad