Hyperparameter (machine learning)

In machine learning, a hyperparameter is a parameter that can be set in order to define any configurable part of a model's learning process. Hyperparameters can be classified as either model hyperparameters (such as the topology and size of a neural network) or algorithm hyperparameters (such as the learning rate and the batch size of an optimizer). These are named hyperparameters in contrast to parameters, which are characteristics that the model learns from the data.

Hyperparameters are not required by every model or algorithm. Some simple algorithms such as ordinary least squares regression require none. However, the LASSO algorithm, for example, adds a regularization hyperparameter to ordinary least squares which must be set before training.^[1] Even models and algorithms without a strict requirement to define hyperparameters may not produce meaningful results if these are not carefully chosen. However, optimal values for hyperparameters are not always easy to predict. Some hyperparameters may have no meaningful effect, or one important variable may be conditional upon the value of another. Often a separate process of hyperparameter tuning is needed to find a suitable combination for the data and task.

As well was improving model performance, hyperparameters can be used to by researchers introduce robustness and reproducibility into their work, especially if it uses models that incorporate random number generation.

^ Yang, Li; Shami, Abdallah (2020-11-20). "On hyperparameter optimization of machine learning algorithms: Theory and practice". Neurocomputing. 415: 295–316. arXiv:2007.15745. doi:10.1016/j.neucom.2020.07.061. ISSN 0925-2312. S2CID 220919678.

[1] Yang, Li; Shami, Abdallah (2020-11-20). "On hyperparameter optimization of machine learning algorithms: Theory and practice". Neurocomputing. 415: 295–316. arXiv:2007.15745. doi:10.1016/j.neucom.2020.07.061. ISSN 0925-2312. S2CID 220919678.

[1]