# Model Selection and Validation

Do not use the MNIST digit dataset. Use datasets.load_digits() from sklearn.

Due: Sep 28, 2021 11:59 PM

For this assignment, you are to determine which model is best for prediction and report the right hyperparameters and the resulting accuracy for the Digit Recognition data set that was used in the previous assignment.

As before, create a PDF of your notebook showing your steps and include the table below as mentioned. Attach both a Jupyter notebook and the PDF version. Also, be sure to include your name at the start of the notebook and at the top of the PDF.

Specifically, you are to test the following models:

| Model | Hyperparameter | Testing range | Notes |
| --- | --- | --- | --- |
| Support Vector Machine | Gamma (size of the kernel); C (slack variable) | 10^-x for x = -5 to 5 (1e-5 through 1e5) | use the 'rbf' (radial basis function) kernel |
| K-nearest neighbors | k (number of neighbors) | 1, 3, 5, 7, 9 | use the sklearn function |
| Decision Trees | min_samples_split | 3, 5, 7, 9 (1 was removed) | use the defaults for the other hyperparameters |
| Logistic Regression | C (inverse of the regularization strength; smaller = more regularization) | 10^-x for x = -5 to 5 (1e-5 through 1e5) | with the L1 penalty (Lasso) |
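The table above can be written down as estimator and grid definitions before any training happens. The following is a minimal sketch, not a required layout: the dictionary names `models` and `param_grids` are hypothetical, and the choice of `solver="saga"` with `max_iter=5000` for logistic regression is an assumption (the default `lbfgs` solver does not support the L1 penalty, so some L1-capable solver has to be chosen).

```python
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# 10^-x for x = -5..5 covers the 11 values 1e-5 through 1e5.
powers_of_ten = [10.0 ** x for x in range(-5, 6)]

# Hypothetical names: one estimator per row of the table.
models = {
    "SVM": SVC(kernel="rbf"),
    "k-NN": KNeighborsClassifier(),
    "Decision Trees": DecisionTreeClassifier(),
    # Assumption: saga is used because it supports the L1 penalty.
    "Logistic Regression": LogisticRegression(
        penalty="l1", solver="saga", max_iter=5000
    ),
}

# Grid keys must match the estimator's constructor argument names.
param_grids = {
    "SVM": {"gamma": powers_of_ten, "C": powers_of_ten},
    "k-NN": {"n_neighbors": [1, 3, 5, 7, 9]},
    "Decision Trees": {"min_samples_split": [3, 5, 7, 9]},
    "Logistic Regression": {"C": powers_of_ten},
}

for name, grid in param_grids.items():
    n_combinations = 1
    for values in grid.values():
        n_combinations *= len(values)
    print(f"{name}: {n_combinations} hyperparameter combinations")
```

Because the SVM grid lists both `gamma` and `C`, an exhaustive search over it evaluates every pair, which is what searching both hyperparameters simultaneously requires.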

Steps are as follows:

1. Separate your data into training and testing sets. We will use cross-validation over the training set to select the right parameters.

   a. Use train_test_split to create a separate training and test set. Pass the labels to stratify so that both sets keep the same class proportions.

      X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, test_size=0.20)

   b. For the training set, you have two choices to perform hyperparameter selection.

      i. Use cross-validation to evaluate each model variant and select the best hyperparameters (standard practice, most recommended).

      ii. Create a hold-out validation set, train on one portion of the data, and use the accuracy on the hold-out validation set to pick the right hyperparameters (also valid).

2. Steps to turn in for the assignment

   a. Train the four models with their default parameters. Report the resulting accuracy of each model using the default parameters.

   b. For each of the four models, find the hyperparameters giving the highest accuracy on the validation set by performing an exhaustive grid search. Report the hyperparameter values and accuracy on the validation set.

      i. Consider using sklearn.model_selection.GridSearchCV.

      ii. For the models with two hyperparameters, you will need to search both simultaneously to find the optimum combination.

   c. Now apply the highest-accuracy trained models to the test set. Report the accuracy of each model.
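The steps above can be sketched end to end for a single model. This is one possible arrangement, shown for k-nearest neighbors only because it trains quickly. The helper name `tune_and_evaluate`, the 5-fold setting, and `random_state=0` are assumptions added for illustration and repeatability, not requirements of the assignment.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)

# Step 1a: stratified 80/20 split (random_state is an added assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.20, random_state=0
)


def tune_and_evaluate(estimator, param_grid, cv=5):
    """Hypothetical helper covering steps 2a, 2b, and 2c for one model."""
    # Step 2a: cross-validated accuracy with the estimator's own settings.
    default_cv_accuracy = cross_val_score(
        estimator, X_train, y_train, cv=cv, scoring="accuracy"
    ).mean()

    # Step 2b: exhaustive search over every combination in param_grid.
    search = GridSearchCV(estimator, param_grid, cv=cv, scoring="accuracy")
    search.fit(X_train, y_train)

    # Step 2c: GridSearchCV refits the best setting on the whole training
    # set by default, so score() evaluates that refit model on the test set.
    test_accuracy = search.score(X_test, y_test)

    return {
        "default_cv_accuracy": default_cv_accuracy,
        "tuned_cv_accuracy": search.best_score_,
        "best_params": search.best_params_,
        "test_accuracy": test_accuracy,
    }


knn_result = tune_and_evaluate(
    KNeighborsClassifier(), {"n_neighbors": [1, 3, 5, 7, 9]}
)
print(knn_result)
```

The same helper could be called once per model with its own grid. The SVM grid has 121 combinations, so that search takes noticeably longer; passing `n_jobs=-1` to `GridSearchCV` is one way to run the fits in parallel.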

Fill the following table with the information.

| Model | Default validation accuracy | Tuned validation accuracy | Selected hyperparameters | Final test set accuracy |
| --- | --- | --- | --- | --- |
| SVM | | | | |
| k-NN | | | | |
| Decision Trees | | | | |
| Logistic Regression | | | | |
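If the results are collected in the notebook as dictionaries, the table can be printed rather than typed by hand. The sketch below is only one way to do this; the helper names `format_cell` and `results_table` are hypothetical, and the rows are deliberately left empty here so that no numbers are implied. Values from your own runs would be added under the matching column names.

```python
COLUMNS = [
    "Model",
    "Default validation accuracy",
    "Tuned validation accuracy",
    "Selected hyperparameters",
    "Final test set accuracy",
]


def format_cell(value):
    """Render one cell: blank if missing, 4 decimals for floats."""
    if value is None:
        return ""
    if isinstance(value, float):
        return f"{value:.4f}"
    return str(value)


def results_table(rows):
    """Build a pipe-delimited table from a list of row dictionaries."""
    lines = [
        "| " + " | ".join(COLUMNS) + " |",
        "| " + " | ".join("---" for _ in COLUMNS) + " |",
    ]
    for row in rows:
        cells = [format_cell(row.get(column)) for column in COLUMNS]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


# Rows start with only the model name; the other cells stay blank
# until real results are filled in.
rows = [
    {"Model": name}
    for name in ["SVM", "k-NN", "Decision Trees", "Logistic Regression"]
]
table_text = results_table(rows)
print(table_text)
```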