Hyperparameter Tuning

This notebook demonstrates how to use boax for parallel hyperparameter tuning of a scikit-learn Support Vector Classifier on the iris benchmark dataset.

from jax import config

config.update("jax_enable_x64", True)

import matplotlib.pyplot as plt
from jax import numpy as jnp
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

plt.style.use('bmh')

from boax.experiments import optimization

First we load the iris dataset, split it into training and test sets, and standardize the features using StandardScaler. The scaler is fitted on the training set only and then applied to both sets.

iris = load_iris()
X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
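To make the fit/transform split concrete, here is a minimal pure-Python sketch of the same idea for a single feature. The values and the `standardize` helper are hypothetical and only illustrate the arithmetic; StandardScaler performs the equivalent computation per feature column.

```python
from statistics import fmean, pstdev

# Hypothetical single-feature data, chosen so the statistics come out round
train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
test = [5.0, 9.0]

# "Fit": compute the statistics from the training values only
mean = fmean(train)
std = pstdev(train)  # population standard deviation, as StandardScaler uses


def standardize(values, mean, std):
    """Shift and scale values using statistics estimated elsewhere (hypothetical helper)."""
    return [(value - mean) / std for value in values]


# "Transform": reuse the training statistics for both sets
scaled_train = standardize(train, mean, std)
scaled_test = standardize(test, mean, std)
print(scaled_test)  # [0.0, 2.0]
```

Reusing the training statistics for the test set keeps information about the test data out of the preprocessing step.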

Next we define our objective function, which fits a Support Vector Classifier on the training data with the given hyperparameters and returns its accuracy on the test data.

def objective(C, gamma):
    svc = SVC(C=C, gamma=gamma, kernel='rbf')
    svc.fit(X_train, y_train)
    return svc.score(X_test, y_test)

Now we set up the hyperparameter optimisation experiment by defining the two parameters we want to optimise and the batch size. The batch size defines how many parameterizations we can test in parallel at each step.

experiment = optimization(
    parameters=[
        {
            'name': 'C',
            'type': 'log_range',
            'bounds': [1, 1_000],
        },
        {
            'name': 'gamma',
            'type': 'log_range',
            'bounds': [1e-4, 1e-3],
        },
    ],
    batch_size=4,
)
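The `log_range` type suggests that both parameters are searched on a logarithmic scale, so that each decade between the bounds receives equal weight. The sketch below illustrates that idea in plain Python. The `log_scale` helper is hypothetical and not part of boax, and how boax transforms the range internally is an assumption here.

```python
import math


def log_scale(u, lower, upper):
    """Map u in [0, 1] onto [lower, upper], uniformly in log space (hypothetical helper)."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    return math.exp(log_lower + u * (log_upper - log_lower))


# Evenly spaced points in the unit interval land one decade apart
# for C in [1, 1000]: approximately 1, 10, 100 and 1000.
c_values = [log_scale(i / 3, 1, 1_000) for i in range(4)]
print([round(c, 6) for c in c_values])

# The midpoint of a log-scaled range is the geometric mean of its bounds
gamma_mid = log_scale(0.5, 1e-4, 1e-3)
print(gamma_mid)
```

On a linear scale, most candidates for `C` would fall in the upper part of the range, which is why a log scale is a common choice for parameters spanning several orders of magnitude.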

Next we initialise the step and results values by setting them to None and an empty list, respectively.

step, results = None, []

Finally we run the experiment. For demonstration purposes we run it for 25 steps with a batch size of 4, which requires a total of 100 training and evaluation runs. To keep the implementation simple, we do not actually train and evaluate the Support Vector Classifier in parallel. In a more realistic scenario with a larger model and many parameters, we should parallelise the training for faster convergence.

for _ in range(25):
    # Print progress
    print('.', end='')

    # Retrieve next parameterizations to evaluate
    step, parameterizations = experiment.next(step, results)

    # Evaluate parameterizations
    evaluations = [
        objective(**parameterization)
        for parameterization in parameterizations
    ]
    
    # Pair each parameterization with its evaluation for the next step
    results = list(
        zip(parameterizations, evaluations)
    )

.........................
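The loop above evaluates each batch sequentially. As a rough sketch of how a batch could be evaluated concurrently, the following uses Python's standard `concurrent.futures` module. Both `toy_objective` and `evaluate_batch` are hypothetical names introduced for illustration; the toy objective is a cheap stand-in for the SVC objective so that the example runs without any dependencies.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_objective(C, gamma):
    """Hypothetical stand-in for the SVC objective; cheap and deterministic."""
    return 1.0 / (1.0 + abs(C - 10.0) + abs(gamma - 5e-4))


def evaluate_batch(objective_fn, parameterizations, max_workers=4):
    """Evaluate a batch concurrently; return (parameterization, score) pairs in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one task per parameterization, passing its entries as keyword arguments
        futures = [pool.submit(objective_fn, **p) for p in parameterizations]
        # Collect results in submission order so they line up with the inputs
        evaluations = [future.result() for future in futures]
    return list(zip(parameterizations, evaluations))


batch = [
    {'C': 1.0, 'gamma': 1e-4},
    {'C': 10.0, 'gamma': 5e-4},
    {'C': 100.0, 'gamma': 1e-3},
    {'C': 1000.0, 'gamma': 2e-4},
]
toy_results = evaluate_batch(toy_objective, batch)
print(toy_results[1])
```

In the loop, `results = evaluate_batch(objective, parameterizations)` could then stand in for the list comprehension and the `zip` call. Whether threads give a real speed-up depends on how much of the training releases Python's global interpreter lock; for CPU-bound pure-Python work, a process pool may be the better fit.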

# Predicted best
experiment.best(step)

({'C': 1.0, 'gamma': 0.00010000000000000009}, 0.11405385237628252)
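The value above is labelled as the predicted best. Because the loop overwrites `results` at every step, the measured scores of earlier batches are not retained for comparison. A minimal sketch of keeping them is shown below; `step_results` and `history` are hypothetical names, and the scores are placeholder values rather than outputs of the experiment.

```python
# Placeholder (parameterization, score) pairs standing in for two steps of results;
# the scores are illustrative values, not outputs of the experiment above.
step_results = [
    [({'C': 1.0, 'gamma': 1e-4}, 0.25), ({'C': 10.0, 'gamma': 5e-4}, 0.75)],
    [({'C': 100.0, 'gamma': 1e-3}, 0.5), ({'C': 1000.0, 'gamma': 2e-4}, 0.625)],
]

history = []
for batch_results in step_results:
    # Accumulate every evaluated pair instead of keeping only the latest batch
    history.extend(batch_results)

# Best observed pair, selected by its measured score
best_parameterization, best_score = max(history, key=lambda pair: pair[1])
print(best_parameterization, best_score)
```

In the experiment loop, the equivalent would be calling `history.extend(results)` after each batch has been evaluated.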