Automatic problem-specific hyperparameter optimization and model selection for supervised machine learning: Technical Report
OPEN ACCESS
Date
2015
Publication Type
Report
ETH Bibliography
yes
Abstract
The use of machine learning techniques has become increasingly widespread in commercial applications and academic research. Machine learning algorithms learn a model from data that allows computers to make and improve predictions or behaviors. Despite their popularity and usefulness, most machine learning techniques require expert knowledge to guide the decisions about the most appropriate model and settings for a particular problem. In many cases, expert knowledge is not readily available. When it is, the complexity of the problem and subjectivity of the expert can often lead to sub-optimal choices in the machine learning strategy.
Since different machine learning techniques suit different problems, choosing the right technique and fine-tuning its settings are crucial tasks that directly affect the quality of the predictions. However, deciding which machine learning technique is best suited to a specific data set is not easy, as the number of choices is usually very large.
In this work, we present a method that automatically selects the best machine learning algorithm for a particular set of data, and optimizes its parameter settings. Our approach is flexible and customizable, enabling the user to specify their needs in terms of predictive power, sensitivity, specificity, consistency of the predictions, and speed, among other criteria. The results obtained show that using the machine learning technique and configuration suggested by our automated approach yields predictions of a much higher quality than selecting the technique with the best results under its default settings. We also present a method to efficiently guide the search for optimal parameter settings by identifying ranges of values for each setting that produce good results for most problems. By transferring this knowledge to new problems, it is possible to find the optimal configuration of the algorithm more quickly.
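The general idea of searching jointly over algorithms and their settings, scored by user-weighted criteria, can be illustrated with a minimal sketch. It is not the method of the report: plain random search is swapped in for whatever optimizer the report uses, and the two toy classifiers, the `SEARCH_SPACE` dictionary, the synthetic data, and the weights `w_sens` and `w_spec` are all hypothetical names chosen for illustration.

```python
import random

def make_data(n, rng):
    """Toy 1-D binary problem (assumed): class 1 tends to have larger x."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(1.0 if label else -1.0, 1.0)
        data.append((x, label))
    return data

def threshold_model(train, params):
    """Predict 1 when x exceeds a fixed cut-off (the hyperparameter)."""
    cut = params["cut"]
    return lambda x: 1 if x > cut else 0

def knn_model(train, params):
    """Majority vote among the k nearest training points (k is odd)."""
    k = params["k"]
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return 1 if 2 * sum(lbl for _, lbl in nearest) > k else 0
    return predict

def score(predict, valid, w_sens=0.5, w_spec=0.5):
    """User-weighted mix of sensitivity and specificity on held-out data."""
    tp = sum(1 for x, y in valid if y == 1 and predict(x) == 1)
    tn = sum(1 for x, y in valid if y == 0 and predict(x) == 0)
    pos = sum(1 for _, y in valid if y == 1)
    neg = len(valid) - pos
    sens = tp / pos if pos else 0.0
    spec = tn / neg if neg else 0.0
    return w_sens * sens + w_spec * spec

# Hypothetical search space: model builder plus a sampler for its settings.
SEARCH_SPACE = {
    "threshold": (threshold_model, lambda rng: {"cut": rng.uniform(-2.0, 2.0)}),
    "knn": (knn_model, lambda rng: {"k": rng.choice([1, 3, 5, 7, 9])}),
}

def select_model(train, valid, n_trials=20, seed=0, **weights):
    """Random search over algorithms and settings; return the best found."""
    rng = random.Random(seed)
    best = None
    for name, (build, sample) in sorted(SEARCH_SPACE.items()):
        for _ in range(n_trials):
            params = sample(rng)
            s = score(build(train, params), valid, **weights)
            if best is None or s > best[0]:
                best = (s, name, params)
    return best

rng = random.Random(42)
train, valid = make_data(200, rng), make_data(100, rng)
best_score, best_name, best_params = select_model(train, valid)
print(best_name, best_params, round(best_score, 3))
```

Passing different weights, for example `select_model(train, valid, w_sens=0.8, w_spec=0.2)`, changes which configuration wins, which mirrors the customizable criteria described above. Narrowing the sampler ranges to values that worked well on earlier problems is one simple way to picture the knowledge transfer the abstract mentions.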
Publication status
published
Publisher
ETH-Zürich
Subject
DYNAMIC PROGRAMMING (OPERATIONS RESEARCH); MACHINE LEARNING (ARTIFICIAL INTELLIGENCE); DYNAMISCHE OPTIMIERUNG (OPERATIONS RESEARCH); MASCHINELLES LERNEN (KÜNSTLICHE INTELLIGENZ); SUPERVISED LEARNING (ARTIFICIAL INTELLIGENCE); ÜBERWACHTES LERNEN (KÜNSTLICHE INTELLIGENZ); Hyperparameter optimization; Supervised Machine Learning; Model selection
Organisational unit
02150 - Dep. Informatik / Dep. of Computer Science