In machine learning, feature selection is the process of selecting a subset of relevant features (variables, predictors) for use in model construction. Feature selection techniques are used for several reasons:
simplification of models to make them easier to interpret,[1]
shorter training times,
avoidance of the curse of dimensionality,
improved generalization by reducing overfitting.
The central premise when using feature selection is that data sometimes contains features that are redundant or irrelevant, and that can thus be removed without incurring much loss of information.[9] Redundancy and irrelevance are two distinct notions: a relevant feature may be redundant in the presence of another relevant feature with which it is strongly correlated.[10]
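The distinction can be made concrete with a small sketch (illustrative only, not drawn from the cited sources). A simple correlation-based filter detects and removes a redundant feature, but an irrelevant feature passes through it untouched, since the filter tests redundancy rather than relevance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Three synthetic features: x0 is informative, x1 is a near-copy of x0
# (redundant), and x2 is pure noise (irrelevant).
x0 = rng.normal(size=200)
x1 = x0 + rng.normal(scale=0.05, size=200)   # redundant: strongly correlated with x0
x2 = rng.normal(size=200)                    # irrelevant: unrelated noise
X = np.column_stack([x0, x1, x2])

# Greedily drop any feature whose absolute correlation with an
# already-kept feature exceeds the threshold.
threshold = 0.95
corr = np.abs(np.corrcoef(X, rowvar=False))
kept = []
for j in range(X.shape[1]):
    if all(corr[j, k] < threshold for k in kept):
        kept.append(j)

print(kept)  # [0, 2]: x1 is removed as redundant; the irrelevant x2
             # survives, because this filter only tests redundancy.
```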
Feature extraction creates new features from functions of the original features, whereas feature selection finds a subset of the features. Feature selection techniques are often used in domains where there are many features and comparatively few samples (data points).
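To make the contrast concrete, a minimal scikit-learn sketch (illustrative, not part of the article's sources): PCA performs feature extraction by constructing new features as functions of all the original ones, while SelectKBest performs feature selection by keeping a subset of the originals unchanged.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)   # 150 samples, 4 original features

# Feature extraction: PCA builds 2 *new* features, each a linear
# combination of all 4 original features.
X_extracted = PCA(n_components=2).fit_transform(X)

# Feature selection: SelectKBest keeps a *subset* of 2 of the
# original features, scored here by the ANOVA F-statistic.
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

print(X_extracted.shape, X_selected.shape)  # (150, 2) (150, 2)
print(selector.get_support(indices=True))   # indices of the kept original features
```

Both results have the same shape, but only the selected columns retain their original meaning, which is one reason feature selection is preferred when interpretability matters.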