Hypertuning Machine Learning Models for Better Performance
The ultimate goal of a machine learning model is to generalize well to unseen data, and hence make better predictions. Hyperparameters are values that are set manually before training rather than learned from the data; the "C" penalty parameter in a Support Vector Machine (SVM) model is one example. Hypertuning is the search for the combination of hyperparameter values that works best for a model, and different values can significantly change the model's performance. The big picture: by systematically exploring candidate values and selecting the best ones, we can improve the model's accuracy, precision, recall, or other evaluation metrics. Hypertuning can also help reduce overfitting to a specific training set, so that the model performs better on new, unseen data. In this post we expand on the code we wrote for Training an SVM Classification Model to hypertune its parameters and evaluate the performance differences between parameter settings.
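To make the distinction between hyperparameters and learned parameters concrete, here is a minimal sketch using scikit-learn's SVC. The synthetic dataset from make_classification is an assumption standing in for real price data, and the specific values of C, kernel, and gamma are illustrative choices rather than recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Placeholder data (assumption): 200 samples, 4 features, binary label.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Hyperparameters: chosen by us BEFORE training.
model = SVC(C=1.0, kernel="rbf", gamma="scale")
print("hyperparameter C:", model.get_params()["C"])

# Learned parameters: produced BY training on the data.
model.fit(X, y)
print("support vectors shape:", model.support_vectors_.shape)
print("intercept:", model.intercept_)

# Training never changes the hyperparameters we set.
print("C after fit:", model.C)
```

Changing C or the kernel and refitting produces a different set of support vectors, which is why each hyperparameter combination has to be trained and scored separately.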
As explained above, we modify the previous code for building a support vector machine classification model and add a new function to hypertune the parameters. Essentially, we loop through every model variation in the search grid, record the scoring metrics for each, and then stack and rank the results. For an SVM model, this means a range of values for the penalty parameter "C" and all of the available kernels (i.e. Linear, RBF, Polynomial, Sigmoid). Certain kernels have additional parameters that may need tuning, so we start by picking a kernel and then adjusting all of the parameters underneath it. Another difference in the new code is that we break the steps of model creation (splitting the data, standardizing/scaling it, and setting parameters) into their own functions to provide more flexibility. The script saves a results file called "model_tuning_results.csv" in the same folder the script was executed in, with the results sorted by F1 Score in descending order. As always, all of our code is commented line by line so that you can follow the logic exactly as we wrote it and modify it to fit your purposes.
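The workflow described above can be sketched roughly as follows. This is not the article's actual script: the function names (split_data, scale_data, tune_svm, save_results), the list of C values, and the synthetic dataset are all assumptions made for illustration. Only the kernels, the F1 ranking, and the output file name "model_tuning_results.csv" come from the text.

```python
import csv
from itertools import product

from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def split_data(X, y, test_size=0.25, seed=42):
    """Hold out a test set; stratify keeps the class balance in both parts."""
    return train_test_split(X, y, test_size=test_size, random_state=seed, stratify=y)


def scale_data(X_train, X_test):
    """Fit the scaler on the training data only, then apply it to both sets."""
    scaler = StandardScaler().fit(X_train)
    return scaler.transform(X_train), scaler.transform(X_test)


def tune_svm(X_train, X_test, y_train, y_test, c_values, kernels):
    """Train one SVC per (kernel, C) pair and collect its scores."""
    rows = []
    for kernel, c in product(kernels, c_values):
        model = SVC(C=c, kernel=kernel).fit(X_train, y_train)
        pred = model.predict(X_test)
        rows.append({
            "kernel": kernel,
            "C": c,
            "f1": f1_score(y_test, pred, zero_division=0),
            "recall": recall_score(y_test, pred, zero_division=0),
            "accuracy": accuracy_score(y_test, pred),
        })
    # Rank the candidates: highest F1 score first.
    return sorted(rows, key=lambda r: r["f1"], reverse=True)


def save_results(rows, path="model_tuning_results.csv"):
    """Write the ranked results to a CSV file in the working directory."""
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


# Placeholder data (assumption): 400 samples, 6 features, binary label.
X, y = make_classification(n_samples=400, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = split_data(X, y)
X_train, X_test = scale_data(X_train, X_test)

results = tune_svm(
    X_train, X_test, y_train, y_test,
    c_values=[0.1, 1, 10],  # illustrative grid, not the article's values
    kernels=["linear", "rbf", "poly", "sigmoid"],
)
save_results(results)
print(results[0])  # best (kernel, C) combination by F1 score
```

Two design notes on this sketch. Kernel-specific parameters such as gamma or degree could be added as further loops once a kernel has been chosen. Also, ranking candidates on the same held-out set used for the final evaluation tends to give optimistic scores, so a separate validation split or cross-validation (for example scikit-learn's GridSearchCV) is a common alternative to a hand-written loop.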
Results for the SVM model hyperparameter tuning show that the Polynomial kernel produced both the highest F1 Scores and the highest Recall scores, while the RBF (Radial Basis Function) kernel produced the best accuracy scores (although only slightly better than a coin flip). Perhaps the most interesting thing to note is that although the overall accuracy is not high, the high recall score suggests that the model does a great job of predicting positive days and struggles to predict down days.
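To see how high recall and near-coin-flip accuracy can coexist, here is a small worked example. The confusion-matrix counts are hypothetical numbers invented for illustration and are not the article's results; "positive" is assumed to mean an up day.

```python
# Hypothetical confusion-matrix counts for 100 trading days (illustrative only,
# not the article's results). "Positive" means an up day.
tp, fn = 50, 5   # actual up days: predicted up / predicted down
fp, tn = 40, 5   # actual down days: predicted up / predicted down

accuracy = (tp + tn) / (tp + tn + fp + fn)   # share of all days called correctly
recall = tp / (tp + fn)                      # share of up days that were caught
precision = tp / (tp + fp)                   # share of "up" calls that were right
f1 = 2 * precision * recall / (precision + recall)
specificity = tn / (tn + fp)                 # share of down days that were caught

print(f"accuracy={accuracy:.2f} recall={recall:.2f} precision={precision:.2f} "
      f"f1={f1:.2f} specificity={specificity:.2f}")
# accuracy=0.55 recall=0.91 precision=0.56 f1=0.69 specificity=0.11
```

In this made-up case the model catches almost every up day but misses most down days, which is the pattern described above. A high recall alongside weak accuracy can also mean the model simply labels most days as positive, so it is worth checking precision and specificity next to recall before drawing conclusions.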
Notice: Information contained herein is not and should not be construed as an offer, solicitation, or recommendation to buy or sell securities. The information has been obtained from sources we
believe to be reliable; however, no guarantee is made or implied with respect to its accuracy, timeliness, or completeness. Author does not own any cryptocurrency discussed. The information
and content are subject to change without notice. CryptoDataDownload and its affiliates do not provide investment, tax, legal or accounting advice.
This material has been prepared for informational purposes only and is the opinion of the author, and is not intended to provide, and should not be relied on for, investment, tax, legal, or
accounting advice. You should consult your own investment, tax, legal and accounting advisors before engaging in any transaction. All content published by CryptoDataDownload is not an