Feature selection

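In machine learning, feature selection is the process of selecting a subset of relevant features (variables, predictors) for use in model construction. Feature selection techniques are used for several reasons, such as simplifying models so that they are easier to interpret, shortening training times, and mitigating the curse of dimensionality.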

The central premise of feature selection is that data sometimes contain features that are redundant or irrelevant and can therefore be removed without much loss of information.[9] Redundancy and irrelevance are distinct notions: a relevant feature may be redundant in the presence of another relevant feature with which it is strongly correlated.[10]
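The distinction between irrelevant and redundant features can be illustrated with a minimal sketch. It assumes, purely for illustration, that relevance is measured by the absolute Pearson correlation with the target and redundancy by the absolute Pearson correlation with a feature that has already been kept. The toy data, the feature names, the function names and both thresholds are hypothetical and are not taken from the cited sources; many other criteria are used in practice.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def filter_features(features, target, relevance_min=0.3, redundancy_max=0.95):
    """Greedy filter; both thresholds are arbitrary illustrative choices."""
    kept, irrelevant, redundant = [], [], []
    for name, values in features.items():
        if abs(pearson(values, target)) < relevance_min:
            # Barely related to the target: irrelevant.
            irrelevant.append(name)
        elif any(abs(pearson(values, features[k])) > redundancy_max for k in kept):
            # Related to the target, but nearly duplicates a kept feature: redundant.
            redundant.append(name)
        else:
            kept.append(name)
    return kept, irrelevant, redundant


# Hypothetical toy data: the same length recorded in two units, plus noise.
target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
features = {
    "length_cm": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    "length_in": [3.9, 7.9, 11.8, 15.7, 19.7, 23.6],
    "noise": [1.0, -1.0, -1.0, 1.0, 1.0, -1.0],
}

kept, irrelevant, redundant = filter_features(features, target)
print("kept:", kept)
print("irrelevant:", irrelevant)
print("redundant:", redundant)
```

In this sketch, length_in is just as relevant to the target as length_cm, yet it is dropped because it carries almost no information beyond what length_cm already provides. Which of two mutually redundant features is kept depends only on the iteration order.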

Feature extraction creates new features from functions of the original features, whereas feature selection finds a subset of the features. Feature selection techniques are often used in domains where there are many features and comparatively few samples (data points).
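The difference between the two approaches can be shown with a small sketch. The data, the function names select_features and extract_features, and the particular combining functions are hypothetical, chosen only to make the contrast visible.

```python
# Each row is one sample; each column is one original feature.
rows = [
    [2.0, 4.0, 1.0],
    [3.0, 5.0, 0.0],
    [5.0, 9.0, 1.0],
]


def select_features(rows, indices):
    """Feature selection: keep a subset of the original columns unchanged."""
    return [[row[i] for i in indices] for row in rows]


def extract_features(rows, functions):
    """Feature extraction: build new columns as functions of the original ones."""
    return [[f(row) for f in functions] for row in rows]


# Selection keeps columns 0 and 2 exactly as they were.
selected = select_features(rows, [0, 2])

# Extraction derives two new features: a sum and a difference of columns 0 and 1.
extracted = extract_features(
    rows,
    [lambda r: r[0] + r[1], lambda r: r[1] - r[0]],
)

print(selected)
print(extracted)
```

Every value in the selected data also appears in the original data, so the retained features keep their original meaning. The extracted values are new quantities that need not appear in the original data.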

  1. James, Gareth; Witten, Daniela; Hastie, Trevor; Tibshirani, Robert (2013). An Introduction to Statistical Learning. Springer. p. 204.
  2. Brank, Janez; Mladenić, Dunja; Grobelnik, Marko; Liu, Huan; Flach, Peter A.; Garriga, Gemma C.; Toivonen, Hannu (2011), "Feature Selection", in Sammut, Claude; Webb, Geoffrey I. (eds.), Encyclopedia of Machine Learning, Boston, MA: Springer US, pp. 402–406, doi:10.1007/978-0-387-30164-8_306, ISBN 978-0-387-30768-8, retrieved 2021-07-13
  3. Kramer, Mark A. (1991). "Nonlinear principal component analysis using autoassociative neural networks". AIChE Journal. 37 (2): 233–243. Bibcode:1991AIChE..37..233K. doi:10.1002/aic.690370209. ISSN 1547-5905.
  4. Kratsios, Anastasis; Hyndman, Cody (2021). "NEU: A Meta-Algorithm for Universal UAP-Invariant Feature Representation". Journal of Machine Learning Research. 22 (92): 1–51. ISSN 1533-7928.
  5. Persello, Claudio; Bruzzone, Lorenzo (July 2014). "Relevant and invariant feature selection of hyperspectral images for domain generalization". 2014 IEEE Geoscience and Remote Sensing Symposium (PDF). IEEE. pp. 3562–3565. doi:10.1109/igarss.2014.6947252. ISBN 978-1-4799-5775-0. S2CID 8368258.
  6. Hinkle, Jacob; Muralidharan, Prasanna; Fletcher, P. Thomas; Joshi, Sarang (2012). "Polynomial Regression on Riemannian Manifolds". In Fitzgibbon, Andrew; Lazebnik, Svetlana; Perona, Pietro; Sato, Yoichi; Schmid, Cordelia (eds.). Computer Vision – ECCV 2012. Lecture Notes in Computer Science. Vol. 7574. Berlin, Heidelberg: Springer. pp. 1–14. arXiv:1201.2395. doi:10.1007/978-3-642-33712-3_1. ISBN 978-3-642-33712-3. S2CID 8849753.
  7. Yarotsky, Dmitry (2021-04-30). "Universal Approximations of Invariant Maps by Neural Networks". Constructive Approximation. 55: 407–474. arXiv:1804.10306. doi:10.1007/s00365-021-09546-1. ISSN 1432-0940. S2CID 13745401.
  8. Hauberg, Søren; Lauze, François; Pedersen, Kim Steenstrup (2013-05-01). "Unscented Kalman Filtering on Riemannian Manifolds". Journal of Mathematical Imaging and Vision. 46 (1): 103–120. Bibcode:2013JMIV...46..103H. doi:10.1007/s10851-012-0372-9. ISSN 1573-7683. S2CID 8501814.
  9. Kratsios, Anastasis; Hyndman, Cody (June 8, 2021). "NEU: A Meta-Algorithm for Universal UAP-Invariant Feature Representation". Journal of Machine Learning Research. 22: 10312. Bibcode:2015NatSR...510312B. doi:10.1038/srep10312. PMC 4437376. PMID 25988841.
  10. Guyon, Isabelle; Elisseeff, André (2003). "An Introduction to Variable and Feature Selection". Journal of Machine Learning Research. 3: 1157–1182.