Georgia Institute of Technology, ISYE 6501
ISYE 6501 Midterm 2 : Intro Analytics Modeling - ISYE-6501-OAN/O01/QCH/A
ISYE 6501 Midterm 2
| Due Jul 12 at 2am | Points 100.02 | Questions 48
| Available after Jul 1 at 2am | Time Limit 90 Minutes |
7/26/2021 ISYE 6501 Midterm 2 : Intro Analytics Modeling - ISYE-6501-OAN/O01/QCH/A
Attempt History
Attempt Time Score
LATEST Attempt 1 24 minutes 86.02 out of 100.02
Score for this quiz: 86.02 out of 100.02
Submitted Jul 11 at 3:25pm
This attempt took 24 minutes.
90 Minute Time Limit
Instructions
Work alone. Do not collaborate with or copy from anyone else.
You may use any of the following resources:
One sheet (both sides) of handwritten (not photocopied or
scanned) notes
If any question seems ambiguous, use the most reasonable
interpretation (i.e. don't be like Calvin):
If you experience any technical issues (i.e. images not loading)
you may refresh the page without interrupting your exam
attempt. If the issue persists, then please finish the exam and let
the Instructors know about the issue in a private Piazza post
afterwards.
Good Luck!
INSTRUCTIONS FOR QUESTIONS 1-5
For each of the following five questions, select the probability distribution
that could best be used to model the described scenario. Each distribution
might be used, zero, one, or more than one time in the five questions.
These scenarios are meant to be simple and straightforward; if you're an
expert in the field the question asks about, please do not rely on your
expertise to fill in all the extra complexity (you'll end up making the
questions below more difficult than I intended).
| Question 1 1.4 / 1.4 pts
| Number of people clicking an online banner ad each hour
Binomial
Exponential
Geometric
Poisson
Weibull
| Question 2 1.4 / 1.4 pts
| Time from the beginning of Fall until the first snowflake is seen
Weibull
Binomial
Exponential
Geometric
Poisson
| Question 3 1.4 / 1.4 pts
| Number of arrivals to a flu shot clinic each minute
Binomial
Exponential
Geometric
Poisson
Weibull
| Question 4 1.4 / 1.4 pts
| Time between hits on a real estate web site
Exponential
Binomial
Geometric
Poisson
Weibull
| Question 5 1.4 / 1.4 pts
| Number of arrivals to the ID-check queue at an airport each minute
Poisson
Binomial
Exponential
Geometric
Weibull
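The count scenarios above (events per hour or per minute) and the time-between-events scenario are two views of the same arrival process: when the gaps between events are exponentially distributed, the number of events in a fixed window is Poisson. A minimal Python sketch of that link, assuming a hypothetical rate of 3 arrivals per minute and a fixed seed (neither comes from the exam):

```python
import random

def counts_per_window(rate, n_windows, seed=0):
    """Simulate arrivals with exponential gaps and count arrivals per unit-length window."""
    rng = random.Random(seed)
    counts = [0] * n_windows
    t = rng.expovariate(rate)          # time of the first arrival
    while t < n_windows:
        counts[int(t)] += 1            # window index is the integer part of the arrival time
        t += rng.expovariate(rate)     # exponential time until the next arrival
    return counts

# Hypothetical rate: 3 arrivals per minute, observed over 10,000 one-minute windows
counts = counts_per_window(rate=3.0, n_windows=10_000)
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
print(f"mean={mean:.2f} variance={var:.2f}")
```

For a Poisson count the mean and variance are both equal to the rate, so the two printed values should be close to each other and to 3.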
INFORMATION FOR QUESTIONS 6-7
Five classification models were built for predicting whether a
neighborhood will soon see a large rise in home prices, based on public
elementary school ratings and other factors. The training data set was
missing the school rating variable for every new school (3% of the data
points).
Because ratings are unavailable for newly-opened schools, it is believed
that locations that have recently experienced high population growth are
more likely to have missing school rating data.
Model 1 used imputation, filling in the missing data with the average
school rating from the rest of the data.
Model 2 used imputation, building a regression model to fill in the
missing school rating data based on other variables.
Model 3 used imputation, first building a classification model to
estimate (based on other variables) whether a new school is likely to
have been built as a result of recent population growth (or whether it
has been built for another purpose, e.g. to replace a very old school),
and then using that classification to select one of two regression
models to fill in an estimate of the school rating; there are two different
regression models (based on other variables), one for neighborhoods
with new schools built due to population growth, and one for
neighborhoods with new schools built for other reasons.
Model 4 used a binary variable to identify locations with missing
information.
Model 5 used a categorical variable: first, a classification model was
used to estimate whether a new school is likely to have been built as a
result of recent population growth; and then each neighborhood was
categorized as "data available", "missing, population growth", or
"missing, other reason".
| Question 6 5 / 5 pts
If school ratings can be reasonably well-predicted from the other factors,
and new schools built due to recent population growth cannot be
reasonably well-classified using the other factors, which model would you
recommend?
Model 1
Model 2
Model 3
Model 4
Model 5
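To make the difference between the imputation models and the indicator-variable models concrete, here is a minimal Python sketch of the two simplest ones: mean imputation in the style of Model 1 and a binary missing-data flag in the style of Model 4. The function names and the sample ratings are hypothetical and are not part of the exam.

```python
def impute_mean(values):
    """Model 1 style: replace missing entries (None) with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def missing_indicator(values, fill=0.0):
    """Model 4 style: return (values with a placeholder fill, 0/1 flags marking missing entries)."""
    flags = [1 if v is None else 0 for v in values]
    filled = [fill if v is None else v for v in values]
    return filled, flags

# Hypothetical school ratings; None marks a newly opened school with no rating yet
ratings = [7.0, None, 9.0, 5.0, None]
mean_filled = impute_mean(ratings)
flag_filled, flags = missing_indicator(ratings)
print(mean_filled)
print(flag_filled, flags)
```

Models 2 and 3 would replace the mean in `impute_mean` with predictions from one or two regression models, and Model 5 would replace the 0/1 flag with a three-level category.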
| Question 7 0 / 5 pts
| In which of the following situations would you recommend using Model 3?
Ratings can be well-predicted, and reasons for building schools can be
well-classified
Ratings can be well-predicted, and reasons for building schools cannot be
well-classified
Correct Answer
You Answered
Ratings cannot be well-predicted, and reasons for building schools can be
well-classified
Ratings cannot be well-predicted, and reasons for building schools cannot
be well-classified
INFORMATION FOR QUESTIONS 8-12
In a diet problem (like we saw in the lessons and homework), let $x_i$ be the
amount of food $i$ in the solution ($x_i \ge 0$), and let $M$ be the maximum
amount that can be eaten of any food.
Suppose we added new variables $y_i$ that are binary (i.e., they must be
either 0 or 1): if food $i$ is eaten in the solution, then it is part of the solution
($y_i = 1$); otherwise $y_i = 0$.
INSTRUCTIONS FOR QUESTIONS 8-12
For each of the following five questions, select the mathematical
constraint that best corresponds to the English sentence. Each constraint
might be used, zero, one, or more than one time in the five questions.
| Question 8 1.4 / 1.4 pts
| Select the mathematical constraint that corresponds to the following
English sentence:
Out of peanut butter and cheese sauce, exactly one must be eaten.
ypeanutbutter = 1 - ycheesesauce
ypeanutbutter + ycheesesauce = 0
ybroccoli ≤ ycheesesauce + ypeanutbutter
ybroccoli + ycheesesauce + ypeanutbutter ≤ 2
xcheesesauce ≤ M ycheesesauce
ycheesesauce = 1
xbroccoli ≤ M ypeanutbutter
xbroccoli ≥ M ypeanutbutter
| Question 9 1.4 / 1.4 pts
| Select the mathematical constraint that corresponds to the following
English sentence:
Broccoli can only be eaten if either cheese sauce or peanut butter (or
both) is also eaten.
ybroccoli ≤ ycheesesauce + ypeanutbutter
ypeanutbutter + ycheesesauce = 0
ypeanutbutter = 1 - ycheesesauce
ybroccoli + ycheesesauce + ypeanutbutter ≤ 2
xcheesesauce ≤ M ycheesesauce
ycheesesauce = 1
xbroccoli ≤ M ypeanutbutter
xbroccoli ≥ M ypeanutbutter
| Question 10 1.4 / 1.4 pts
Select the mathematical constraint that corresponds to the following
English sentence:
Either cheese sauce or peanut butter (or both) must be eaten with
broccoli.
ypeanutbutter + ycheesesauce = 0
ypeanutbutter = 1 - ycheesesauce
ybroccoli ≤ ycheesesauce + ypeanutbutter
ybroccoli + ycheesesauce + ypeanutbutter ≤ 2
xcheesesauce ≤ M ycheesesauce
ycheesesauce = 1
xbroccoli ≤ M ypeanutbutter
xbroccoli ≥ M ypeanutbutter
| Question 11 1.4 / 1.4 pts
| Select the mathematical constraint that corresponds to the following
English sentence:
Broccoli, cheese sauce, and peanut butter all can't be eaten together.
ypeanutbutter + ycheesesauce = 0
ypeanutbutter = 1 - ycheesesauce
ybroccoli ≤ ycheesesauce + ypeanutbutter
ybroccoli + ycheesesauce + ypeanutbutter ≤ 2
xcheesesauce ≤ M ycheesesauce
ycheesesauce = 1
xbroccoli ≤ M ypeanutbutter
xbroccoli ≥ M ypeanutbutter
| Question 12 1.4 / 1.4 pts
| Select the mathematical constraint that corresponds to the following
English sentence:
If cheese sauce and peanut butter are not eaten, then broccoli can't be
eaten either.
ybroccoli ≤ ycheesesauce + ypeanutbutter
ypeanutbutter + ycheesesauce = 0
ypeanutbutter = 1 - ycheesesauce
ybroccoli + ycheesesauce + ypeanutbutter ≤ 2
xcheesesauce ≤ M ycheesesauce
ycheesesauce = 1
xbroccoli ≤ M ypeanutbutter
xbroccoli ≥ M ypeanutbutter
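One way to check that a binary constraint matches an English sentence is to enumerate every 0/1 assignment and compare the constraint with the logical statement. The sketch below does this for three of the constraints above; the dictionary keys and lambda names are hypothetical, with `b`, `c`, and `p` standing for the y variables of broccoli, cheese sauce, and peanut butter.

```python
from itertools import product

# Each entry pairs a linear constraint on binary variables with the logical statement it encodes.
checks = {
    "exactly one of peanut butter / cheese sauce": (
        lambda b, c, p: p == 1 - c,
        lambda b, c, p: (c == 1) != (p == 1),
    ),
    "broccoli only if cheese sauce or peanut butter": (
        lambda b, c, p: b <= c + p,
        lambda b, c, p: (b == 0) or (c == 1) or (p == 1),
    ),
    "not all three together": (
        lambda b, c, p: b + c + p <= 2,
        lambda b, c, p: not (b == 1 and c == 1 and p == 1),
    ),
}

# A constraint is a faithful encoding if it agrees with the statement on all 8 assignments.
results = {
    name: all(
        constraint(b, c, p) == meaning(b, c, p)
        for b, c, p in product((0, 1), repeat=3)
    )
    for name, (constraint, meaning) in checks.items()
}
print(results)
```

The same truth-table check applies to any sentence in this group, since Questions 9, 10, and 12 are different wordings of the same implication.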
| Question 13 5 / 5 pts
| A company has created a stochastic discrete-event simulation model of its
customer service call center, including call arrivals, resource usage
(workers who specialize in answering each type of calls, supervisors,
etc.), and call duration.
The call center is not first-come-first-served; a call from a major client will
be answered first, ahead of even long-waiting callers with smaller
accounts.
When a new call comes in, the call center will run the simulation to quickly
give the caller an estimate of the expected wait time before being helped.
How many times does the company need to run the simulation for each
new caller (i.e., how many replications are needed)?
Many times, because of the variability and randomness
Once, because the outcome will be the same each time
Once, because each patient is unique
INFORMATION FOR QUESTIONS 14-17
The figure above shows the average of the first x simulated wait times, as
new replications ("runs") are run and added into the overall average. It is
not showing the wait time just for each replication. For example,
after x=101 replications, the wait time of the 101st replication is not
necessarily 72, but the average of those 101 replications is about 72.
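The distinction between a single replication and the running average can be sketched in a few lines of Python. The wait-time distribution here (exponential with mean 70) and the seed are assumptions chosen only for illustration; they are not taken from the figure.

```python
import random

def running_average(samples):
    """Return the average of the first k samples for k = 1..len(samples)."""
    averages, total = [], 0.0
    for k, s in enumerate(samples, start=1):
        total += s
        averages.append(total / k)
    return averages

# Hypothetical replications: 500 simulated wait times with mean 70
rng = random.Random(1)
waits = [rng.expovariate(1 / 70) for _ in range(500)]
avgs = running_average(waits)

print(f"smallest single wait: {min(waits):.1f}")
print(f"running average after 101 runs: {avgs[100]:.1f}")
print(f"running average after 500 runs: {avgs[-1]:.1f}")
```

Individual replications can sit far from the running average, which is why a plot of the average alone says little about any one run.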
INSTRUCTIONS FOR QUESTIONS 14-17
For each of the statements in Questions 14-17, select the choice that
makes the statement true. (Notice that Question 14 has two parts, each
of which must be answered.)
| Question 14 2 / 2 pts
i. The simulation COULD have been stopped after 400 runs (replications).
ii. The simulation COULD even have been stopped after 300 runs
(replications).
Answer 1: COULD
Answer 2: COULD
| Question 15 0 / 1 pts
| The simulated wait time WAS 50 or less just once out of all the runs
(replications).
Answer 1:
You Answered: WAS
Correct Answer: WAS NOT
| Question 16 1 / 1 pts
| The expected wait time of simulated runs (replications) IS LIKELY to be
between 65 and 75.
Answer 1:
IS LIKELY
| Question 17 1 / 1 pts
| Answer 1:
There IS NOT very little variability in the simulated wait time of the runs
(replications).
IS NOT
| Question 18 6 / 6 pts
| Suppose it is discovered that simulated wait times are 50% higher than
actual wait times, on average. What would you recommend that they do?
Investigate to see what's wrong with the simulation, because it's a poor
match to reality.
Scale down all estimates by a factor of 1/1.50 to get the average
simulation estimates to match the average actual wait times.
Use the 50%-higher estimates, because that's what the simulation output
is.
INFORMATION FOR QUESTIONS 19-25
For each of the seven optimization problems below, select its most
precise classification. In each model, $x$ are the variables, all other letters
($a$, $b$, $c$) refer to known data, and the values of $c$ are all positive.
Each classification might be used zero, one, or more than one time in the
seven questions.
Convex program
Convex quadratic program
General non-convex program
Integer program
| Question 20 1 / 1 pts
| Minimize
Convex program
Convex quadratic program
General non-convex program
Integer program
Linear program
| Question 21 1 / 1 pts
| Minimize $\sum_i c_i |x_i - 6|$
subject to $\sum_i a_{ij} x_i \ge b_j$ for all $j$
all $x_i \ge 0$
Convex program
Convex quadratic program
General non-convex program
Integer program
Linear program
| Question 22 1 / 1 pts
| Minimize $\sum_i c_i x_i$
subject to $\sum_i a_{ij} x_i \ge b_j$ for all $j$
all $x_i \in \{0, 1\}$
Convex program
Convex quadratic program
General non-convex program
Integer program
Linear program
| Question 23 1 / 1 pts
| Minimize
Linear program
Convex program
Convex quadratic program
General non-convex program
Integer program
INSTRUCTIONS FOR QUESTION 26
Answer all three parts of Question 26.
| Question 26 4 / 12 pts
| A supermarket is analyzing its checkout lines, to determine how many
checkout lines to have open at each time.
At busy times (about 10% of the times), the arrival rate is 5
shoppers/minute. At other times, the arrival rate is 2 shoppers/minute.
Once a shopper starts checking out (at any time), it takes an average of 3
minutes to complete the checkout.
[NOTE: This is a simplified version of the checkout system. If you have
deeper knowledge of how supermarket checkout systems work, please do
not use it for this question; you would end up making the question more
complex than it is designed to be.]
a. The first model the supermarket tries is a queuing model with 4 lines
open at all times. We would expect the queuing model to show that wait
times are [ Select ] .
b. The second model the supermarket tries is a queuing model with 20
lines open during busy times and 10 lines open during non-busy times.
We would expect the queuing model to show that wait times are
[ Select ]
.
The supermarket now has decided that, when there are 20 people waiting
(across all lines), the supermarket will open an express checkout line,
which stays open until nobody is left waiting.
The supermarket would like to model this new process with a Markov
chain, where each state is the number of people waiting (e.g., 0 people
waiting, 1 person waiting, etc.).
Notice that now, the transition probabilities from a state like "3 people
waiting" depend on how many lines are currently open, and therefore
depend on whether the system was more recently in the state "20 people
waiting" or "0 people waiting".
c. The process is memoryless and the Markov chain is an appropriate
model ONLY if the arrivals follow the Poisson distribution and the
checkout times follow the Exponential distribution.
Answer 1:
You Answered: low at non-busy times and high at busy times
Correct Answer: high at both busy and non-busy times
Answer 2:
low at both busy and non-busy times
Answer 3:
You Answered: is memoryless and the Markov chain is an appropriate model ONLY if the
arrivals follow the Poisson distribution and the checkout times follow the
Exponential distribution.
Correct Answer: is not memoryless, so the Markov chain model would not be well-defined.
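Parts (a) and (b) can be checked with a quick load calculation: the offered load is the arrival rate times the mean checkout time, and dividing by the number of open lines gives the utilization. The sketch below uses the rates and service time stated in the question; the function names are hypothetical, and treating a utilization of 1 or more as "the queue keeps growing" is the standard steady-state reading, not something the exam spells out.

```python
def offered_load(arrival_rate, mean_service_time):
    """Average number of busy checkout lines needed to keep up with arrivals."""
    return arrival_rate * mean_service_time

def utilization(arrival_rate, mean_service_time, servers):
    """Fraction of capacity in use; 1 or more means arrivals outpace service."""
    return offered_load(arrival_rate, mean_service_time) / servers

SERVICE_MINUTES = 3          # average checkout time from the question
BUSY, NON_BUSY = 5, 2        # shoppers per minute from the question

scenarios = {
    ("4 lines", "busy"): utilization(BUSY, SERVICE_MINUTES, 4),
    ("4 lines", "non-busy"): utilization(NON_BUSY, SERVICE_MINUTES, 4),
    ("20 lines", "busy"): utilization(BUSY, SERVICE_MINUTES, 20),
    ("10 lines", "non-busy"): utilization(NON_BUSY, SERVICE_MINUTES, 10),
}
for (lines, period), rho in scenarios.items():
    status = "overloaded" if rho >= 1 else "stable"
    print(f"{lines:>8} {period:>8}: utilization {rho:.2f} ({status})")
```

With 4 lines the utilization exceeds 1 in both periods, and with the 20/10 plan it stays below 1 in both, which agrees with the graded answers for parts (a) and (b).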
INSTRUCTIONS FOR QUESTION 27
Answer both parts of Question 27.
| Question 27 10 / 10 pts
| A retailer is testing two different customer retention approaches. The
retailer is using A/B testing: For each customer, the retailer randomly
selects one approach or the other to use. The results after 2000 trials are
shown below.
[Table: Trials, Customer loss rate, and 95% confidence interval for each approach]
Note: The "customer loss rate" is the fraction of customers who stop doing
business with the retailer. Lower customer loss rates are better.
a. What should the retailer do?
More exploration (test both options; it is unclear yet which is better)
Later, the retailer developed 7 new options, so they used a multi-armed
bandit approach where each option is chosen with probability proportional
to its likelihood of being the best. The results after 2000 total trials are
shown below.
[Table: Customer loss for each option]
b. If the retailer's main goal is to find the option that has the lowest
customer loss rate (lowest fraction of customers who stop doing business
with the retailer), which type of test should they use to see if the option
that appears best is significantly better than each of the other options?
Binomial-based test
Answer 1: More exploration (test both options; it is unclear yet which is better)
Answer 2: Binomial-based test
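The multi-armed bandit rule described in this question, choosing each option with probability proportional to its likelihood of being the best, can be sketched with a Monte Carlo estimate. This is one possible implementation under stated assumptions: a uniform Beta(1, 1) prior on each loss rate, three options instead of seven, and made-up counts, since the exam's results table is not available here.

```python
import random

def prob_best(losses, trials, draws=5000, seed=0):
    """Estimate P(option has the lowest loss rate) by sampling Beta posteriors."""
    rng = random.Random(seed)
    wins = [0] * len(losses)
    for _ in range(draws):
        # One posterior draw per option: Beta(1 + customers lost, 1 + customers kept)
        samples = [rng.betavariate(1 + l, 1 + n - l) for l, n in zip(losses, trials)]
        wins[samples.index(min(samples))] += 1   # lower loss rate is better
    return [w / draws for w in wins]

# Hypothetical counts (customers lost, customers tried) for three options
probs = prob_best(losses=[10, 30, 50], trials=[100, 100, 100])

# Choose the next option with probability proportional to its chance of being best
chooser = random.Random(1)
next_option = chooser.choices(range(len(probs)), weights=probs)[0]
print(probs, next_option)
```

As evidence accumulates, the probabilities concentrate on the option with the lowest loss rate, so exploration tapers off on its own.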
INFORMATION FOR QUESTIONS 28-31
For each of the mathematical optimization models, select the variable-selection/regularization
method it most-precisely represents (or select
"none of the above" if none of the other choices are appropriate). In each
model, $x$ is the data, $y$ is the response, $a$ are the coefficients, $n$ is the
number of data points, $m$ is the number of predictors, and $T$ and $\lambda$ are
appropriate constants.
Each of the choices might be used zero, one, or more than one time in the
four questions.
| Question 28 1 / 1 pts
| Minimize $\sum_{i=1}^{n} (y_i - (a_0 + \sum_{j=1}^{m} a_j x_{ij}))^2$
subject to $\sum_{j=1}^{m} a_j^2 \le T$
Ridge regression
Elastic net
Lasso regression
None of the above
| Question 29 1 / 1 pts
| Minimize $\sum_{i=1}^{n} (y_i - (a_0 + \sum_{j=1}^{m} a_j x_{ij}))^2$
subject to $\sum_{j=1}^{m} |a_j| \le T$
Elastic net
Lasso regression
Ridge regression
None of the above
| Question 30 1 / 1 pts
| Minimize $\sum_{i=1}^{n} (y_i - (a_0 + \sum_{j=1}^{m} a_j x_{ij}))^2$
subject to $\lambda \sum_{j=1}^{m} |a_j| + (1 - \lambda) \sum_{j=1}^{m} a_j^2 \le T$
Elastic net
Lasso regression
Ridge regression
None of the above
| Question 31 1 / 1 pts
| Minimize $\sum_{i=1}^{n} (y_i - (a_0 + \sum_{j=1}^{m} a_j x_{ij}))^2$
Elastic net
Lasso regression
Ridge regression
None of the above
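The four models in Questions 28-31 share the same least-squares objective and differ only in the constraint on the coefficients. A minimal Python sketch of the three constraint left-hand sides follows; the function names and the sample coefficient vector are hypothetical, and the intercept $a_0$ is left out because the constraints above sum over $j = 1, \dots, m$ only.

```python
def l1_penalty(coefs):
    """Lasso constraint left-hand side: sum of absolute coefficient values."""
    return sum(abs(a) for a in coefs)

def l2_penalty(coefs):
    """Ridge constraint left-hand side: sum of squared coefficients."""
    return sum(a * a for a in coefs)

def elastic_net_penalty(coefs, lam):
    """Elastic net constraint left-hand side: weighted mix of the two penalties."""
    return lam * l1_penalty(coefs) + (1 - lam) * l2_penalty(coefs)

# Hypothetical coefficients a_1..a_m (intercept excluded)
coefs = [0.5, -2.0, 0.0]
print(l1_penalty(coefs))                 # 2.5
print(l2_penalty(coefs))                 # 4.25
print(elastic_net_penalty(coefs, 0.5))   # 3.375
```

Setting `lam` to 1 recovers the lasso constraint and setting it to 0 recovers the ridge constraint; with no constraint at all the problem is ordinary linear regression.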
INFORMATION FOR QUESTIONS 32-35
Elastic net, lasso regression, linear regression, and ridge regression are
four models that can be used when doing regression. Out of the four
models, one usually selects the fewest variables, two select the most
variables, and one selects a mid-range number of variables (in the middle
of the others).
For each of the models in the questions below, specify whether the model
usually would select the fewest, mid-range, or most variables.
| Question 32 1 / 1 pts
| Lasso regression
Fewest
Mid-range
Most
| Question 33 1 / 1 pts
| Elastic net
Fewest
Mid-range
Most
| Question 34 1 / 1 pts
| Linear regression
Most
Fewest
Mid-range
| Question 35 1 / 1 pts
| Ridge regression
Most
Fewest
Mid-range
INFORMATION FOR QUESTION 36
For each of the following three statements, select whether it is a reason
you might want to use stepwise regression, lasso, etc. to limit the number
of factors in a model.
| Question 36 6 / 6 pts
| i. Because there isn't enough data to avoid overfitting a model with many
factors: YES
ii. To find a more-complex model: NO
iii. To find a simpler model: YES
Answer 1: YES
Answer 2: NO
Answer 3: YES
| Question 37 3 / 3 pts
| In the simple linear regression model
minimize $\sum_{i=1}^{n} (y_i - (a_0 + \sum_{j=1}^{m} a_j x_{ij}))^2$
i. x are variables from a regression perspective.
ii. a are variables from an optimization perspective.
Answer 1: x
Answer 2: a
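The two perspectives can be seen in code: the data `xs` and `ys` are fixed inputs, and the optimization solves for the coefficients. The sketch below handles the single-predictor case ($m = 1$) with the standard closed-form least-squares solution; the function name and sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a0 + a1 * x; a0 and a1 are the optimization variables."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a1 = sxy / sxx                 # slope that minimizes the sum of squared errors
    a0 = mean_y - a1 * mean_x      # intercept that follows from the slope
    return a0, a1

# Hypothetical data lying exactly on y = 1 + 2x
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
a0, a1 = fit_line(xs, ys)
print(a0, a1)
```

Here `xs` plays the role of the regression variables (the predictors), while `a0` and `a1` are what the minimization actually chooses.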
INSTRUCTIONS FOR QUESTIONS 38-43
Put the following six steps in order, from what is done first to what is done
last.
| Question 38 1.17 / 1.17 pts
| Answer 1:
First Remove outliers
First
| Question 39 1.17 / 1.17 pts
| Answer 1:
Sixth Test model on another different set of data to estimate quality
Sixth
| Question 40 1.17 / 1.17 pts
| Answer 1:
Third Fit lasso regression model on all variables
Third
| Question 41 1.17 / 1.17 pts
| Answer 1:
Fifth Pick model to use based on performance on a different data set
Fifth
| Question 42 1.17 / 1.17 pts
| Fourth Fit linear regression, regression tree, and random forest models
using variables chosen by lasso regression
Answer 1:
Fourth
| Question 43 1.17 / 1.17 pts
| Answer 1:
Second Impute missing data values and scale data
Second
INSTRUCTIONS FOR QUESTIONS 44-48
For each of the following five questions, select the most appropriate
model/approach to answer the question/analyze the situation described.
Each model/approach might be used zero, one, or more than one time in
the five questions.
| Question 44 1.4 / 1.4 pts
| In the MS Analytics program, which groups of electives are often taken by
the same students?
Louvain algorithm
Game-theoretic analysis
Non-parametric test
Queuing
Stochastic optimization
| Question 45 1.4 / 1.4 pts
| Determine the best marketing strategy, given that a competitor will react
to your choice in his/her decisions.
Game-theoretic analysis
Louvain algorithm
Non-parametric test
Queuing
Stochastic optimization
| Question 46 1.4 / 1.4 pts
| Find the best airline flight schedule given uncertain weather-related
delays and maintenance delays.
Game-theoretic analysis
Louvain algorithm
Non-parametric test
Queuing
Stochastic optimization
| Question 47 1.4 / 1.4 pts
| What is the best route for a delivery vehicle to take, given uncertainties in
upcoming traffic?
Stochastic optimization
Game-theoretic analysis
Louvain algorithm
Non-parametric test
Queuing
| Question 48 1.4 / 1.4 pts
| How much money should someone offer when buying a house, if there
are other potential buyers bidding strategically too?
Game-theoretic analysis
Louvain algorithm
Non-parametric test
Queuing
Stochastic optimization