[PLUS] Hypertuning Machine Learning Models for Better Performance


Overview
The ultimate goal of a machine learning model is to generalize well to unseen data and therefore make better predictions. Hyperparameters are the values that are set manually before training, as opposed to the parameters the model learns from the data. For example, the penalty parameter "C" in a Support Vector Machine (SVM) model is a hyperparameter. Hypertuning is the search for the best combination of these values, and different choices can significantly change the model's performance. The big picture: by systematically exploring candidate values and selecting the best ones, we can improve the model's accuracy, precision, recall, or other evaluation metrics. We will expand on the code we wrote for Training an SVM Classification Model to hypertune its parameters and evaluate the performance differences between parameter settings. Another benefit of hypertuning is that it can help prevent overfitting to a specific training set, so the model performs better on new, unseen data.
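To make the distinction concrete, here is a minimal sketch of a hyperparameter versus a learned parameter. It assumes scikit-learn's SVC class and a synthetic dataset from make_classification; neither is confirmed as what the original article's code uses, and the value C=0.1 is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic two-class data, used only for illustration (an assumption,
# not the article's dataset).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# C is a hyperparameter: we choose it ourselves before any training happens.
model = SVC(kernel="linear", C=0.1)
print("C chosen before training:", model.get_params()["C"])

# The weights and intercept are parameters: the model learns them in fit().
model.fit(X, y)
print("learned weights shape:", model.coef_.shape)
print("learned intercept:", model.intercept_)
```

Changing C and refitting produces different learned weights, which is why the choice of C can change the evaluation metrics.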

The Code
As explained above, we will modify the previous code for building a support vector machine classification model and add a new function to hypertune its parameters. Essentially, we need to loop through every model variation we want to test, record the scoring metrics for each one, and then stack and rank the results. For an SVM model, this means a range of values for the penalty parameter "C" and all of the available kernels (i.e., Linear, RBF, Polynomial, and Sigmoid). Certain kernels have additional parameters that may need tuning, so we start by picking a kernel and then adjust all of the parameters underneath it. Another difference in the new code is that we break the steps of model creation (splitting the data, standardizing/scaling it, and setting the parameters) into their own functions to provide more flexibility. The script saves a results file called "model_tuning_results.csv" in the folder the script was executed from, with rows sorted by F1 Score in descending order. As always, all of our code is commented line by line so that you can follow the logic exactly as we wrote it and modify it to fit your purposes.
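The loop described above could be sketched roughly as follows. This is not the article's own code: it assumes scikit-learn, uses synthetic data in place of the price features, and the function names (split_data, scale_data, build_param_grid, tune_svm) and the candidate values for C, gamma, and degree are all hypothetical choices made for illustration.

```python
import csv
from itertools import product

from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def split_data(X, y, test_size=0.25, seed=42):
    """Hold out a test set; stratify keeps the class balance in both splits."""
    return train_test_split(X, y, test_size=test_size, random_state=seed, stratify=y)


def scale_data(X_train, X_test):
    """Fit the scaler on the training data only, then apply it to both splits."""
    scaler = StandardScaler().fit(X_train)
    return scaler.transform(X_train), scaler.transform(X_test)


def build_param_grid():
    """List every combination, kernel by kernel (illustrative values only)."""
    c_values = [0.1, 1, 10]
    grids = {
        "linear": {"C": c_values},
        "rbf": {"C": c_values, "gamma": ["scale", 0.01, 0.1]},
        "poly": {"C": c_values, "gamma": ["scale"], "degree": [2, 3]},
        "sigmoid": {"C": c_values, "gamma": ["scale", 0.01]},
    }
    combos = []
    for kernel, grid in grids.items():
        names = list(grid)
        for values in product(*(grid[name] for name in names)):
            combos.append({"kernel": kernel, **dict(zip(names, values))})
    return combos


def tune_svm(X, y, out_path="model_tuning_results.csv"):
    """Train one SVC per combination, score it, and save rows ranked by F1."""
    X_train, X_test, y_train, y_test = split_data(X, y)
    X_train, X_test = scale_data(X_train, X_test)
    rows = []
    for params in build_param_grid():
        model = SVC(**params).fit(X_train, y_train)
        pred = model.predict(X_test)
        rows.append({
            "kernel": params["kernel"],
            "C": params["C"],
            "gamma": params.get("gamma", ""),
            "degree": params.get("degree", ""),
            "accuracy": accuracy_score(y_test, pred),
            "precision": precision_score(y_test, pred, zero_division=0),
            "recall": recall_score(y_test, pred, zero_division=0),
            "f1": f1_score(y_test, pred, zero_division=0),
        })
    rows.sort(key=lambda row: row["f1"], reverse=True)  # highest F1 first
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    return rows


# Synthetic stand-in for the article's features and up/down labels.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
results = tune_svm(X, y)
print(results[0])  # best-ranked combination on this synthetic data
```

One design caveat: ranking models on the same held-out split that is later used to report performance tends to give optimistic scores. A separate validation split or cross-validation (for example scikit-learn's GridSearchCV) is a common way to reduce that effect.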

The Results
Results for the SVM model hyperparameter tuning show that the Polynomial kernel produced both the highest F1 Scores and the highest Recall scores, while the RBF (Radial Basis Function) kernel produced the best accuracy scores (although only slightly better than a coin flip). Perhaps the most interesting thing to note is that although the overall accuracy is not high, the high recall score suggests that the model identifies most positive (up) days but struggles to predict down days.
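To see how near-coin-flip accuracy and high recall can occur together, the short sketch below computes the metrics from a confusion matrix. The counts are invented for illustration and are not the article's results: they describe a hypothetical model that predicts "up" on 95 of 100 days.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from confusion-matrix counts (up = positive)."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on down days
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts: 52 up days and 48 down days, with "up" predicted 95 times.
metrics = classification_metrics(tp=50, fp=45, fn=2, tn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy is 0.53 while recall is about 0.96 and specificity is about 0.06. A pattern like this can arise when a model labels most days as "up", so it is worth checking the share of positive predictions alongside recall.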

Notice: Information contained herein is not and should not be construed as an offer, solicitation, or recommendation to buy or sell securities. The information has been obtained from sources we believe to be reliable; however, no guarantee is made or implied with respect to its accuracy, timeliness, or completeness. The author does not own any cryptocurrency discussed. The information and content are subject to change without notice. CryptoDataDownload and its affiliates do not provide investment, tax, legal, or accounting advice.

This material has been prepared for informational purposes only and is the opinion of the author, and is not intended to provide, and should not be relied on for, investment, tax, legal, or accounting advice. You should consult your own investment, tax, legal, and accounting advisors before engaging in any transaction. No content published by CryptoDataDownload is an endorsement of any kind. CryptoDataDownload was not compensated to submit this article. Please also visit our privacy policy, disclaimer, and terms and conditions pages for further information.

THE PERFORMANCE OF TRADING SYSTEMS IS BASED ON THE USE OF COMPUTERIZED SYSTEM LOGIC. IT IS HYPOTHETICAL. PLEASE NOTE THE FOLLOWING DISCLAIMER. CFTC RULE 4.41: HYPOTHETICAL OR SIMULATED PERFORMANCE RESULTS HAVE CERTAIN LIMITATIONS. UNLIKE AN ACTUAL PERFORMANCE RECORD, SIMULATED RESULTS DO NOT REPRESENT ACTUAL TRADING. ALSO, SINCE THE TRADES HAVE NOT BEEN EXECUTED, THE RESULTS MAY HAVE UNDER-OR-OVER COMPENSATED FOR THE IMPACT, IF ANY, OF CERTAIN MARKET FACTORS, SUCH AS LACK OF LIQUIDITY. SIMULATED TRADING PROGRAMS IN GENERAL ARE ALSO SUBJECT TO THE FACT THAT THEY ARE DESIGNED WITH THE BENEFIT OF HINDSIGHT. NO REPRESENTATION IS BEING MADE THAT ANY ACCOUNT WILL OR IS LIKELY TO ACHIEVE PROFIT OR LOSSES SIMILAR TO THOSE SHOWN. U.S. GOVERNMENT REQUIRED DISCLAIMER: COMMODITY FUTURES TRADING COMMISSION. FUTURES AND OPTIONS TRADING HAS LARGE POTENTIAL REWARDS, BUT ALSO LARGE POTENTIAL RISK. YOU MUST BE AWARE OF THE RISKS AND BE WILLING TO ACCEPT THEM IN ORDER TO INVEST IN THE FUTURES AND OPTIONS MARKETS. DON’T TRADE WITH MONEY YOU CAN’T AFFORD TO LOSE. THIS IS NEITHER A SOLICITATION NOR AN OFFER TO BUY/SELL FUTURES OR OPTIONS. NO REPRESENTATION IS BEING MADE THAT ANY ACCOUNT WILL OR IS LIKELY TO ACHIEVE PROFITS OR LOSSES SIMILAR TO THOSE DISCUSSED ON THIS WEBSITE. THE PAST PERFORMANCE OF ANY TRADING SYSTEM OR METHODOLOGY IS NOT NECESSARILY INDICATIVE OF FUTURE RESULTS.
