University of Waterloo
STAT 441
Practice Problems 1
Statistical Learning – Classification
STAT 441 / STAT 841 / CM 763
version: 2023-01-30 21:30:33

Conceptual

1. Suppose that for K = 2, the Bayes classifier is given by g_Bayes(x) = I{h(x) > c}. Now suppose that instead of x, we observe z = G(x) for some invertible function G. What is the Bayes classifier g_Bayes(z) as a function of z?

2. So far we’ve assumed that the training data (y_1, x_1), ..., (y_n, x_n) are iid samples from some distribution p(y, x) = p(y | x) p(x), and our goal is to estimate p_0(y | x). Let’s see what happens if the assumption is relaxed.

   a. Suppose that instead of being sampled from p(y, x), the training data are iid samples from a different distribution, q(y, x), such that q(y, x) = p(y | x) q(x). In other words, the conditional distribution of y given x is the same under both p(y, x) and q(y, x); only the marginal distribution of x differs. Can you apply a discriminative classifier to the data generated from q(y, x) and expect things to work the same way as if the data were from p(y, x)? Justify your answer.

   b. Can you do the same with a generative classifier, i.e., apply it to data generated from q(y, x) and expect things to work the same way as if they were generated from p(y, x)? Justify your answer.

   c. Now suppose that instead of obtaining iid samples from p(y, x), we sample n_k data points iid from

          p_k(x) = p(x | y = k) = p(y = k, x) / Pr(y = k)

      for each k = 1, ..., K. Suppose we are also given the true probabilities

          π_k = Pr(y = k) = ∫ p(y = k, x) dx

      for k = 1, ..., K. How would you set up a generative classifier in this situation? To make things concrete, you can answer this for the specific case of QDA.

Logistic Regression

1. The dataset gradschool contains data on admissions to graduate school for n = 967 applicants on the following variables:
• gpa: The applicant’s college GPA.
• reference: The strength of the applicant’s reference: weak, strong, or none.
• parent: Whether a parent previously attended the university: no or yes.
• admitted: Whether the application wa
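For Conceptual problem 2c, one possible setup can be sketched in Python with NumPy. This is an illustrative sketch under stated assumptions, not the course's official solution: the function names fit_qda and qda_predict are hypothetical, classes are indexed 0, ..., K-1 rather than 1, ..., K, and each class-conditional density p_k(x) is assumed Gaussian as in QDA. The idea is that the n_k points sampled from p_k(x) are used only to estimate the class mean and covariance, while the supplied π_k is plugged in directly instead of the sample proportion n_k / n, which carries no information about Pr(y = k) under this sampling scheme.

```python
import numpy as np


def fit_qda(samples_by_class, priors):
    """Estimate Gaussian parameters per class; priors are supplied, not estimated.

    samples_by_class: list of (n_k, d) arrays, the k-th drawn iid from p_k(x).
    priors: list of known class probabilities pi_k (assumed to sum to 1).
    """
    params = []
    for X_k, pi_k in zip(samples_by_class, priors):
        X_k = np.asarray(X_k, dtype=float)
        mu_k = X_k.mean(axis=0)                             # class mean
        Sigma_k = np.atleast_2d(np.cov(X_k, rowvar=False))  # class covariance
        params.append((mu_k, Sigma_k, pi_k))
    return params


def qda_predict(X, params):
    """Return the class index maximizing log pi_k + log N(x; mu_k, Sigma_k)."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    scores = []
    for mu_k, Sigma_k, pi_k in params:
        diff = X - mu_k
        _, logdet = np.linalg.slogdet(Sigma_k)
        # Squared Mahalanobis distance of each row of X to mu_k
        maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(Sigma_k), diff)
        # The constant -(d/2) log(2 pi) is shared by all classes and omitted
        scores.append(np.log(pi_k) - 0.5 * logdet - 0.5 * maha)
    return np.argmax(np.column_stack(scores), axis=1)


# Simulated illustration: the class sample sizes (200 vs 50) are deliberately
# unrelated to the supplied priors (0.3 vs 0.7).
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2))
X1 = rng.normal(loc=[6.0, 6.0], scale=1.0, size=(50, 2))
params = fit_qda([X0, X1], priors=[0.3, 0.7])
print(qda_predict([[0.0, 0.0], [6.0, 6.0]], params))
```

The only difference from QDA fitted on iid draws from p(y, x) is the source of the prior term: here log π_k comes from the given probabilities rather than from class frequencies in the training data.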