Statistical Learning

## Exam 1 Due Date: 3/22/2021

### (1) Using the dataset provided

– Consumer Price Index for All Urban Consumers: All Items in U.S. City Average (CPIAUCSL)

– Unemployment Rate (UNRATE)

a. Generate a scatterplot for each decade, beginning with the 1980s, with the changes in the unemployment rate on the y-axis and inflation on the x-axis.

b. Estimate a simple regression for each decade in Excel with the unemployment rate as your dependent variable and inflation as your explanatory variable (include an intercept term in your regression). Report the R-squared for each regression and include a scatterplot with the “line of best fit.”

d. Provide an intuitive interpretation of the R-squared values and of the p-values for the coefficients of the explanatory variable.
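
The exam asks for the regressions in Excel; as a possible cross-check, the per-decade fits can be sketched in R. This is only an illustration under stated assumptions: the data below are simulated, and the column names `date`, `inflation`, and `unrate` are hypothetical stand-ins for whatever the provided dataset uses.

```r
# Simulated monthly data, 1980-2019; NOT the exam dataset.
set.seed(1)
n <- 480
df <- data.frame(
  date      = seq(as.Date("1980-01-01"), by = "month", length.out = n),
  inflation = rnorm(n, mean = 0.25, sd = 0.3)
)
df$unrate <- 6 - 0.5 * df$inflation + rnorm(n, sd = 1)

# Label each observation with its decade (1980, 1990, ...).
df$decade <- (as.integer(format(df$date, "%Y")) %/% 10) * 10

# One simple regression with an intercept per decade.
fits <- lapply(split(df, df$decade), function(d) lm(unrate ~ inflation, data = d))

# Collect the slope, its p-value, and the R-squared for each decade.
results <- do.call(rbind, lapply(names(fits), function(k) {
  s <- summary(fits[[k]])
  data.frame(
    decade    = k,
    slope     = s$coefficients["inflation", "Estimate"],
    p_value   = s$coefficients["inflation", "Pr(>|t|)"],
    r_squared = s$r.squared
  )
}))
print(results)
```

The R-squared is the share of the variation in the dependent variable that the fitted line accounts for within that decade, while the p-value measures how surprising the estimated slope would be if the true slope were zero.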

### (2)

(a) Compute the matrix of correlations between the variables using the function cor().

(b) Use the lm() function to perform a multiple linear regression with inflation as the response and all other variables except name as the predictors. Use the summary() function to print the results.

(c) Is there a relationship between the predictors and the response?

(d) Which predictors appear to have a statistically significant relationship to the response?

(e) Use the plot() function to produce diagnostic plots of the linear regression fit. Do the residual plots suggest any unusually large outliers? Does the leverage plot identify any observations with unusually high leverage?
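
One way parts (a), (b), and (e) might fit together in R is sketched below. The data are simulated, and the predictor names `unrate` and `indpro` are assumptions made for illustration; only `name` and `inflation` come from the question text. The cut-offs used to flag points (studentized residuals beyond 3 in absolute value, leverage above twice the average) are common rules of thumb, not requirements of the exam.

```r
# Simulated stand-in data; NOT the exam dataset.
set.seed(2)
n <- 200
df <- data.frame(
  name   = paste0("obs", seq_len(n)),
  unrate = rnorm(n, mean = 6, sd = 1.5),
  indpro = rnorm(n, mean = 0.2, sd = 0.7)
)
df$inflation <- 0.3 - 0.05 * df$unrate + 0.4 * df$indpro + rnorm(n, sd = 0.3)

# (a) Correlation matrix of the numeric variables (drop the "name" column).
num <- df[, names(df) != "name"]
cor_mat <- cor(num)
print(round(cor_mat, 2))

# (b) Multiple regression of inflation on all other numeric variables.
fit <- lm(inflation ~ ., data = num)
print(summary(fit))

# (e) The four standard diagnostic plots on one page.
par(mfrow = c(2, 2))
plot(fit)

# Rule-of-thumb flags for outliers and high-leverage points.
lev <- hatvalues(fit)
high_leverage <- which(lev > 2 * mean(lev))
large_resid   <- which(abs(rstudent(fit)) > 3)
```

For part (c), the overall F-statistic at the bottom of the summary() output tests whether at least one predictor is related to the response; for part (d), the individual t-test p-values in the coefficient table apply.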

### (3) The above regression used a lot of independent variables.

a) Use forward selection to select explanatory variables. Set the rule so that an independent variable enters the model only if its p-value is less than 0.05.

b) Use backward selection to select explanatory variables. Set the rule so that an independent variable stays in the model only if its p-value is less than 0.05.

c) Describe the potential strengths and weaknesses of the model.
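
A minimal sketch of p-value-based forward and backward selection in base R follows. It assumes all predictors are numeric (so each has a single coefficient and t-test), uses simulated data, and the function names `forward_select` and `backward_select` are hypothetical helpers, not part of any package. Note that R's built-in step() selects by AIC rather than by p-value, which is why the loops are written out by hand here.

```r
# Simulated data: y depends on x1 and x2 only; x3 and x4 are noise.
set.seed(3)
n <- 300
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
df$y <- 1 + 2 * df$x1 - 1.5 * df$x2 + rnorm(n)

# Forward selection: repeatedly add the candidate with the smallest
# p-value, stopping once no candidate has a p-value below alpha.
forward_select <- function(data, response, alpha = 0.05) {
  candidates <- setdiff(names(data), response)
  selected <- character(0)
  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break
    pvals <- sapply(remaining, function(v) {
      f <- reformulate(c(selected, v), response = response)
      summary(lm(f, data = data))$coefficients[v, "Pr(>|t|)"]
    })
    if (min(pvals) >= alpha) break
    selected <- c(selected, names(pvals)[which.min(pvals)])
  }
  selected
}

# Backward selection: start from the full model and repeatedly drop the
# predictor with the largest p-value until all p-values are below alpha.
backward_select <- function(data, response, alpha = 0.05) {
  selected <- setdiff(names(data), response)
  while (length(selected) > 0) {
    f <- reformulate(selected, response = response)
    coefs <- summary(lm(f, data = data))$coefficients
    pvals <- coefs[selected, "Pr(>|t|)"]
    if (max(pvals) < alpha) break
    selected <- setdiff(selected, selected[which.max(pvals)])
  }
  selected
}

fwd <- forward_select(df, "y")
bwd <- backward_select(df, "y")
print(fwd)
print(bwd)
```

Because each step reuses the same data to run many tests, the p-values of the final model tend to be optimistic, which is one of the weaknesses part (c) invites you to discuss.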

### (4) Use the variable INDPRO1 in the data set to create a recession dummy variable. If INDPRO1 < 0, the dummy variable = 1, and if INDPRO1 > 0, the dummy variable = 0.

a) Use the full data set to perform a logistic regression with recession dummy variable as the response and the rest of the variables as predictors.

b) Use the summary() function to print the results. Do any of the predictors appear to be statistically significant? If so, which ones?

c) Compute the confusion matrix and overall fraction of correct predictions. Explain what the confusion matrix is telling you about the types of mistakes made by logistic regression.
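
The steps in this problem could be sketched in R as follows. The data are simulated, and `unrate_chg` and `inflation` are hypothetical predictor names. One assumption is worth flagging: INDPRO1 itself is left out of the predictors, because the dummy is defined from its sign and including it would separate the classes perfectly. The 0.5 probability cut-off is also an assumption.

```r
# Simulated stand-in data; NOT the exam dataset.
set.seed(4)
n <- 400
df <- data.frame(
  unrate_chg = rnorm(n),
  inflation  = rnorm(n, mean = 0.25, sd = 0.3)
)
df$INDPRO1 <- 0.1 - 0.5 * df$unrate_chg + rnorm(n, sd = 0.6)

# Recession dummy: 1 when INDPRO1 is negative, 0 otherwise.
df$recession <- as.integer(df$INDPRO1 < 0)

# Logistic regression on all remaining variables except INDPRO1.
fit <- glm(recession ~ . - INDPRO1, data = df, family = binomial)
print(summary(fit))

# Classify with a 0.5 cut-off on the fitted probabilities.
pred <- as.integer(predict(fit, type = "response") > 0.5)

# Confusion matrix: rows are predictions, columns are actual outcomes.
conf <- table(
  predicted = factor(pred, levels = 0:1),
  actual    = factor(df$recession, levels = 0:1)
)
print(conf)

# Overall fraction of correct predictions (the diagonal of the table).
accuracy <- mean(pred == df$recession)
print(accuracy)
```

The off-diagonal cells separate the two kinds of mistake: predicted 1 with actual 0 is a false recession signal, and predicted 0 with actual 1 is a missed recession.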

### (5) Similar to problem 4, use the variable INDPRO1 in the data set to create a recession dummy variable. If INDPRO1 < 0, the dummy variable = 1, and if INDPRO1 > 0, the dummy variable = 0.

a) Use the data from 1980 – 2010 as your training dataset to generate a logistic regression with recession dummy variable as the response and the rest of the variables as predictors.

b) Use the data from 2010 – 2018 as the validation dataset for the logistic regression from part a. How well does your model perform?
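
A sketch of the train/validation split in R is given below, again on simulated data with hypothetical column names (`date`, `unrate_chg`, `inflation`). The two date ranges in the question share the year 2010; this sketch assumes 2010 belongs to the validation set, which is a choice you should state explicitly in your answer. As in problem 4, INDPRO1 is excluded from the predictors and a 0.5 cut-off is assumed.

```r
# Simulated monthly data, 1980-2018; NOT the exam dataset.
set.seed(5)
dates <- seq(as.Date("1980-01-01"), as.Date("2018-12-01"), by = "month")
n <- length(dates)
df <- data.frame(
  date       = dates,
  unrate_chg = rnorm(n),
  inflation  = rnorm(n, mean = 0.25, sd = 0.3)
)
df$INDPRO1   <- 0.1 - 0.5 * df$unrate_chg + rnorm(n, sd = 0.6)
df$recession <- as.integer(df$INDPRO1 < 0)

# Assumption: observations before 2010 train the model, 2010 onward validate it.
cutoff <- as.Date("2010-01-01")
train  <- df[df$date <  cutoff, ]
valid  <- df[df$date >= cutoff, ]

# Fit on the training period only.
fit <- glm(recession ~ unrate_chg + inflation, data = train, family = binomial)

# Predict on the held-out validation period.
prob <- predict(fit, newdata = valid, type = "response")
pred <- as.integer(prob > 0.5)

conf <- table(
  predicted = factor(pred, levels = 0:1),
  actual    = factor(valid$recession, levels = 0:1)
)
print(conf)

valid_accuracy <- mean(pred == valid$recession)
print(valid_accuracy)
```

Comparing the validation accuracy with the share of the majority class in the validation period shows whether the model beats the trivial rule of always predicting the more common outcome.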