New York University
ECE-GY 6143
Lab: Hyper-Parameter Optimization with PCA

PCA is often applied as a pre-processing step before a classifier. When PCA is used this way, one must select the number of principal components (PCs) along with the parameters of the classifier. In this lab, we demonstrate how to perform this hyper-parameter optimization. In doing the lab, you will learn to:

- Combine PCA with data scaling.
- Compute and visualize the principal components.
- Select the number of PCs with K-fold cross-validation.
- Implement the multi-stage classifier pipeline in sklearn.
- Perform automatic parameter search using GridSearchCV in combination with a pipeline.

We first import the basic packages.

In [1]:
    import numpy as np
    import matplotlib
    import matplotlib.pyplot as plt

Downloading the Data

We will use a very simple wine dataset that is commonly used in teaching machine learning classes. The problem is to classify the type of red wine from features of the wine, such as the alcohol content and other chemical components. There are three possible wine types.

In [2]:
    from sklearn.datasets import load_wine
    from sklearn.model_selection import KFold

    data = load_wine()

    # TODO: print the feature names in data.feature_names and the class names in data.target_names
    print('Feature names:', data.feature_names)
    print('Target names:', data.target_names)

Get the data matrix X from data.data and the target values y from data.target. Print the number of samples, the number of features, and the number of classes.

In [3]:
    # TODO
    X = data.data
    y = data.target
    print('Number_Samples: ', X.shape[0])    # rows of X are samples
    print('Number_Features: ', X.shape[1])   # columns of X are features
    print('Number_Classes: ', len(np.unique(y)))  # distinct labels in y

Perform PCA fo
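The objectives listed at the top of the lab (scaling, PCA, a classifier pipeline, and GridSearchCV with K-fold cross-validation) can be sketched together as follows. This is a minimal illustration and not the lab's own solution: the choice of an RBF-kernel SVC as the classifier, the candidate values in param_grid, the 5 folds, and the random seed are all assumptions made for the example.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load the wine data as a feature matrix X and label vector y.
X, y = load_wine(return_X_y=True)

# Three-stage pipeline: scale features, project onto PCs, then classify.
# The SVC classifier is an assumption; the lab text does not name one here.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("pca", PCA()),
    ("clf", SVC(kernel="rbf")),
])

# Hypothetical search grid: number of PCs and the SVC regularization C.
# Keys use the "<step name>__<parameter>" convention of sklearn pipelines.
param_grid = {
    "pca__n_components": [2, 4, 6, 8],
    "clf__C": [0.1, 1.0, 10.0],
}

# K-fold cross-validation; shuffling matters because the wine samples
# are stored ordered by class.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Each parameter combination is refit on the training folds only, so the
# scaler and PCA never see the held-out fold.
search = GridSearchCV(pipe, param_grid, cv=cv)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best CV accuracy:", search.best_score_)
```

Placing the scaler and PCA inside the pipeline, rather than fitting them once on all of X, keeps the cross-validation estimate from being influenced by the held-out samples.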