The goal of feature-selection methods is to reduce the dimensionality of the dataset by removing features that are considered irrelevant to the classification task. Feature selection is an essential part of text classification: document collections commonly contain 10,000 to 100,000 or more unique words, many of which are not useful for classification. Restricting the set of words used for classification makes the classifier more efficient and can reduce generalization error.
The feature-selection procedure has been shown to offer a number of advantages, including a smaller dataset, lower computational requirements for the text classification algorithms (especially those that do not scale well with the size of the feature set), and a considerably smaller search space.
Another benefit of feature selection is its tendency to reduce overfitting, i.e. the phenomenon by which a classifier is tuned to the contingent characteristics of the training data rather than the constitutive characteristics of the categories, and therefore to improve generalization.
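As a concrete illustration of the idea, here is a minimal sketch of one common feature-selection scheme for text: scoring each term with the chi-square statistic of its 2x2 term/class contingency table and keeping only the top-scoring terms. The toy corpus, function names, and the choice of chi-square as the scoring function are assumptions made for this example (libraries such as scikit-learn provide comparable utilities, e.g. `chi2` with `SelectKBest`), not a method prescribed by the text.

```python
from collections import defaultdict

def chi2_score(A, B, C, D):
    """Chi-square statistic for a 2x2 term/class contingency table.
    A: docs in the class containing the term, B: docs outside the class
    containing the term, C: docs in the class without the term,
    D: docs outside the class without the term."""
    N = A + B + C + D
    denom = (A + B) * (A + C) * (B + D) * (C + D)
    if denom == 0:
        return 0.0  # term present (or absent) everywhere: uninformative
    return N * (A * D - B * C) ** 2 / denom

def select_features(labeled_docs, k):
    """Keep the k terms whose distribution across classes is most skewed."""
    classes = {label for _, label in labeled_docs}
    n_docs = len(labeled_docs)
    df = defaultdict(int)            # overall document frequency per term
    df_in_class = defaultdict(int)   # document frequency per (term, class)
    class_size = defaultdict(int)
    for text, label in labeled_docs:
        class_size[label] += 1
        for term in set(text.split()):
            df[term] += 1
            df_in_class[(term, label)] += 1
    scores = {}
    for term in df:
        best = 0.0
        for c in classes:
            A = df_in_class.get((term, c), 0)
            B = df[term] - A
            C = class_size[c] - A
            D = n_docs - A - B - C
            best = max(best, chi2_score(A, B, C, D))
        scores[term] = best
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical toy corpus for demonstration only.
docs = [
    ("the ball game was great", "sports"),
    ("they won the match", "sports"),
    ("election results and the vote", "politics"),
    ("the senate passed the bill", "politics"),
]
top = select_features(docs, 5)
```

On this toy corpus a ubiquitous word such as "the" receives a score of zero and is discarded, while class-specific words survive, which is exactly the pruning of irrelevant features described above.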