Hyperparameter Tuning with GridSearchCV

In almost any Machine Learning project, we train different models on the dataset and select the one with the best performance. However, there is room for improvement as we cannot say for sure that this particular model is best for the problem at hand. Hence, our aim is to improve the model in any way possible. One important factor in the performances of these models are their hyperparameters, once we set appropriate values for these hyperparameters, the performance of a model can improve significantly. In this article, we will find out how we can find optimal values for the hyperparameters of a model by using GridSearchCV.

What is GridSearchCV?

GridSearchCV is the process of performing hyperparameter tuning in order to determine the optimal values for a given model. As mentioned above, the performance of a model significantly depends on the value of hyperparameters. Note that there is no way to know in advance the best values for hyperparameters so ideally, we need to try all possible values to know the optimal values. Doing this manually could take a considerable amount of time and resources and thus we use GridSearchCV to automate the tuning of hyperparameters.

GridSearchCV is a function that comes in Scikit-learn’s(or SK-learn) model_selection package.So an important point here to note is that we need to have the Scikit learn library installed on the computer. This function helps to loop through predefined hyperparameters and fit your estimator (model) on your training set. So, in the end, we can select the best parameters from the listed hyperparameters.

How does GridSearchCV work?

As mentioned above, we pass predefined values for hyperparameters to the GridSearchCV function. We do this by defining a dictionary in which we mention a particular hyperparameter along with the values it can take. Here is an example of it

 { 'C': [0.1, 1, 10, 100, 1000],  
   'gamma': [1, 0.1, 0.01, 0.001, 0.0001], 
   'kernel': ['rbf',’linear’,'sigmoid']  }

Here C, gamma and kernels are some of the hyperparameters of an SVM model. Note that the rest of the hyperparameters will be set to their default values

GridSearchCV tries all the combinations of the values passed in the dictionary and evaluates the model for each combination using the Cross-Validation method. Hence after using this function we get accuracy/loss for every combination of hyperparameters and we can choose the one with the best performance.

How to use GridSearchCV?

In this section, we shall see how to use GridSearchCV and also find out how it improves the performance of the model.

First, let us see what are the various arguments that are taken by GridSearchCV function:

sklearn.model_selection.GridSearchCV(estimator, param_grid,scoring=None,
          n_jobs=None, iid='deprecated', refit=True, cv=None, verbose=0, 
          pre_dispatch='2*n_jobs', error_score=nan, return_train_score=False)

We are going to briefly describe a few of these parameters and the rest you can see on the original documentation:

1.estimator: Pass the model instance for which you want to check the hyperparameters.
2.params_grid: the dictionary object that holds the hyperparameters you want to try
3.scoring: evaluation metric that you want to use, you can simply pass a valid string/ object of evaluation metric
4.cv: number of cross-validation you have to try for each selected set of hyperparameters
5.verbose: you can set it to 1 to get the detailed print out while you fit the data to GridSearchCV
6.n_jobs: number of processes you wish to run in parallel for this task if it -1 it will use all available processors.

Now, let us see how to use GridSearchCV to improve the accuracy of our model. Here I am going to train the model twice, once without using GridsearchCV(using the default hyperparameters) and the other time we will use GridSearchCV to find the optimal values of hyperparameters for the dataset at hand. I am using the famous Breast Cancer Wisconsin (Diagnostic) Data Set which I am directly importing from the Scikit-learn library here.

#import all necessary libraries
import sklearn
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report, confusion_matrix 
from sklearn.datasets import load_breast_cancer 
from sklearn.svm import SVC 
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split 

#load the dataset and split it into training and testing sets
dataset = load_breast_cancer()
X=dataset.data
Y=dataset.target
X_train, X_test, y_train, y_test = train_test_split( 
                        X,Y,test_size = 0.30, random_state = 101) 
# train the model on train set without using GridSearchCV 
model = SVC() 
model.fit(X_train, y_train) 
  
# print prediction results 
predictions = model.predict(X_test) 
print(classification_report(y_test, predictions))

OUTPUT:
 precision    recall  f1-score   support

           0       0.95      0.85      0.90        66
           1       0.91      0.97      0.94       105

    accuracy                           0.92       171
   macro avg       0.93      0.91      0.92       171
weighted avg       0.93      0.92      0.92       171

# defining parameter range 
param_grid = {'C': [0.1, 1, 10, 100],  
              'gamma': [1, 0.1, 0.01, 0.001, 0.0001], 
              'gamma':['scale', 'auto'],
              'kernel': ['linear']}  
  
grid = GridSearchCV(SVC(), param_grid, refit = True, verbose = 3,n_jobs=-1) 
  
# fitting the model for grid search 
grid.fit(X_train, y_train) 

# print best parameter after tuning 
print(grid.best_params_) 
grid_predictions = grid.predict(X_test) 
  
# print classification report 
print(classification_report(y_test, grid_predictions))

Output:
 {'C': 100, 'gamma': 'scale', 'kernel': 'linear'}
              precision    recall  f1-score   support

           0       0.97      0.91      0.94        66
           1       0.94      0.98      0.96       105

    accuracy                           0.95       171
   macro avg       0.96      0.95      0.95       171
weighted avg       0.95      0.95      0.95       171

A lot of you might think that {‘C’: 100, ‘gamma’: ‘scale’, ‘kernel’: ‘linear’} are the best values for hyperparameters for an SVM model. This is not the case, the above-mentioned hyperparameters may be the best for the dataset we are working on. But for any other dataset, the SVM model can have different optimal values for hyperparameters that may improve its performance.

Difference between parameter and hypermeter

Parameter	Hyperparameter
The configuration model’s parameters are internal to the model.	Hyperparameters are parameters that are explicitly specified and control the training process.
Predictions require the use of parameters.	Model optimization necessitates the use of hyperparameters.
These are specified or guessed while the model is being trained.	These are established prior to the start of the model’s training.
This is internal to the model.	This is external to the model.
These are learned & set by the model by itself.	These are set manually by a machine learning engineer/practitioner.

When you utilise cross-validation, you set aside a portion of your data to use in assessing your model. Cross-validation can be done in a variety of ways. The easiest notion is to utilise 70% (I’m making up a number here; it doesn’t have to be 70%) of the data for training and the remaining 30% for evaluating the model’s performance. To avoid overfitting, you’ll need distinct data for training and assessing the model. Other (somewhat more difficult) cross-validation approaches, such as k-fold cross-validation, are also commonly employed in practice.

Grid search is a method for performing hyper-parameter optimisation, that is, with a given model (e.g. a CNN) and test dataset, it is a method for finding the optimal combination of hyper-parameters (an example of a hyper-parameter is the learning rate of the optimiser). You have numerous models in this case, each with a different set of hyper-parameters. Each of these parameter combinations that correspond to a single model is said to lie on a “grid” point. The purpose is to train and evaluate each of these models using cross-validation, for example. Then you choose the one that performed the best.

This brings us to the end of this article where we learned how to find optimal hyperparameters of our model to get the best performance out of it.

To learn more about this domain, check out Great Learning’s PG Program in Artificial Intelligence and Machine Learning to upskill. This Artificial Intelligence course will help you learn a comprehensive curriculum from a top-ranking global school and to build job-ready Artificial Intelligence skills. The program offers a hands-on learning experience with top faculty and dedicated mentor support. On completion, you will receive a Certificate from The University of Texas at Austin.

To fully understand how to optimize your models using GridSearchCV, building a strong foundation in machine learning is key. Fortunately, there are free online courses available, which allow you to learn at your own pace and level up your skills.

GridSearchCV FAQs

What is GridSearchCV used for?

GridSearchCV is a technique for finding the optimal parameter values from a given set of parameters in a grid. It’s essentially a cross-validation technique. The model as well as the parameters must be entered. After extracting the best parameter values, predictions are made.

How do you define GridSearchCV?

GridSearchCV is the process of performing hyperparameter tuning in order to determine the optimal values for a given model.

What does cv in GridSearchCV stand for?

GridSearchCV is also known as GridSearch cross-validation: an internal cross-validation technique is used to calculate the score for each combination of parameters on the grid.

How do you use GridSearchCV in regression?

GirdserachCV in regression can be used by following the below steps
Import the library – GridSearchCv.
Set up the Data.
Model and its Parameter.
Using GridSearchCV and Printing Results.

Does GridSearchCV use cross-validation?

GridSearchCV does, in fact, do cross-validation. If I understand the notion correctly, you want to hide a portion of your data set from the model so that it may be tested. As a result, you train your models on training data and then test them on testing data.