Exploratory data analysis

PSet 4

Note: Grading is based on both your graphs and your verbal explanations. Follow all best practices as discussed in class, including choosing appropriate parameters for all graphs. Do not expect the assignment questions to spell out precisely how the graphs should be drawn. Sometimes guidance will be provided, but the absence of guidance does not mean that all choices are ok.

1. mycars missing patterns

Create a missing values plot for the mycars dataset created below (slightly different from the one in the lecture slides). Your plot should be in the style of extracat::visna() (no longer available on CRAN), using ggplot2 to create the main plot and two side plots and patchwork to put them together. It should accurately show the missing patterns and the counts for missing by column and missing by pattern, and it should make clear which row represents complete cases.

Bars in the side plots should be sorted and correspond to the rows and columns of the main plot. An example is provided, though the aesthetics of your plot do not have to conform precisely to the example. Some code is provided to get you started with finding missing patterns. (Keep in mind that in the next question you will be turning this code into a function, so to save yourself time later on, write it as generically as possible.)
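As a rough illustration of the layout described above, the following sketch arranges three placeholder plots with patchwork. The `toy` data frame and the plot names (`top_plot`, `main_plot`, `side_plot`, `combined`) are assumptions made up for this example. It shows only one possible arrangement, not the required design.

```r
library(ggplot2)
library(patchwork)

# Toy data standing in for the real missing-pattern summaries (hypothetical)
toy <- data.frame(
  variable = c("a", "b", "c"),
  pattern  = c("1", "2", "3"),
  n        = c(3, 2, 1)
)

# Placeholder plots: column counts on top, patterns in the middle,
# pattern counts on the right
top_plot  <- ggplot(toy, aes(x = variable, y = n)) + geom_col()
main_plot <- ggplot(toy, aes(x = variable, y = pattern)) + geom_tile()
side_plot <- ggplot(toy, aes(x = n, y = pattern)) + geom_col()

# Two-column grid; plot_spacer() leaves the top-right cell empty
combined <- top_plot + plot_spacer() + main_plot + side_plot +
  plot_layout(ncol = 2, widths = c(4, 1), heights = c(1, 4))
combined
```

The `widths` and `heights` ratios are arbitrary starting values. Adjust them so that the main plot dominates the figure.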

library(tidyverse)
library(patchwork)

# Add NAs to mtcars dataset
set.seed(5702)
mycars <- mtcars
mycars[1:25, "gear"] <- NA
mycars[10:20, 3:5] <- NA
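Before plotting, it can help to confirm what the assignment code above actually produced. The sketch below uses only base R. The names `missing_by_col` and `n_complete` are chosen for this example and do not come from the assignment.

```r
# Rebuild mycars exactly as in the assignment code (mtcars ships with base R)
mycars <- mtcars
mycars[1:25, "gear"] <- NA
mycars[10:20, 3:5] <- NA

# Number of missing values in each column
missing_by_col <- colSums(is.na(mycars))
missing_by_col

# Number of rows with no missing values at all
n_complete <- sum(complete.cases(mycars))
n_complete
```

With this setup, gear has 25 missing values, columns 3 to 5 (disp, hp, drat) have 11 each, and 7 rows are complete. The side plots should reproduce these numbers.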

Hints:

  • You can use this code to find and count the missing patterns:
missing_patterns <- data.frame(is.na(mycars)) %>%
  group_by_all() %>%
  count(name = "count", sort = TRUE) %>%
  ungroup()
  • To highlight the complete cases row you need a different fill scale than the one used to show missing vs. non-missing in the main plot (purple and grey in the example). This is a little tricky since ggplot2 allows only one fill scale per plot. You can either use alpha to control the highlight with scale_alpha_manual(values = …) or use the ggnewscale package, which provides for multiple fill scales in the same graph.
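The alpha option in the second hint could be sketched as follows. This is a minimal illustration, not a full solution. The helper columns `pattern` and `complete`, the object names `patterns_long` and `main_plot`, and the specific colors and alpha values are all assumptions made for this example.

```r
library(tidyverse)

# Rebuild the data from the assignment so the sketch is self-contained
mycars <- mtcars
mycars[1:25, "gear"] <- NA
mycars[10:20, 3:5] <- NA

missing_patterns <- data.frame(is.na(mycars)) %>%
  group_by_all() %>%
  count(name = "count", sort = TRUE) %>%
  ungroup()

# One row per (pattern, variable) cell.
# `complete` flags the pattern with no missing values;
# `pattern` is just the row index of the pattern (both names are hypothetical).
patterns_long <- missing_patterns %>%
  mutate(complete = rowSums(across(-count)) == 0) %>%
  mutate(pattern = factor(row_number())) %>%
  pivot_longer(
    -c(pattern, count, complete),
    names_to = "variable",
    values_to = "missing"
  )

# fill encodes missing vs. non-missing; alpha highlights the complete cases row
main_plot <- ggplot(
  patterns_long,
  aes(x = variable, y = fct_rev(pattern), fill = missing, alpha = complete)
) +
  geom_tile(color = "white") +
  scale_fill_manual(values = c(`TRUE` = "purple", `FALSE` = "grey60")) +
  scale_alpha_manual(values = c(`TRUE` = 1, `FALSE` = 0.5), guide = "none") +
  labs(x = "variable", y = "missing pattern")
main_plot
```

Using alpha keeps everything within ggplot2. The ggnewscale route mentioned in the hint is an alternative if a second, independent fill scale is preferred.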

2. Missing value plot function

a. Create a function for creating missing plots based on your code from question 1. It should provide an option to show either missing counts or missing percent. The percent option for mycars is shown below. You can either put the function code in a separate .R file or include it in the .Rmd file.

source("plot_missing.R")

plot_missing(mycars, percent = TRUE)
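One way to think about the counts / percent switch is to compute counts first and convert only at the end. The helper below, `missing_summary()`, is hypothetical and is not the `plot_missing()` function the assignment asks for. It only illustrates how a `percent` argument might behave for the column totals.

```r
# Hypothetical helper: per-column missing counts, or percent of rows if requested
missing_summary <- function(df, percent = FALSE) {
  n_missing <- colSums(is.na(df))
  if (percent) {
    # Express each count as a percentage of the number of rows
    n_missing <- 100 * n_missing / nrow(df)
  }
  # Sorted so the most-missing column comes first, as in the side plot
  sort(n_missing, decreasing = TRUE)
}

# Rebuild mycars as in question 1
mycars <- mtcars
mycars[1:25, "gear"] <- NA
mycars[10:20, 3:5] <- NA

counts   <- missing_summary(mycars)
percents <- missing_summary(mycars, percent = TRUE)
```

The same idea applies to the pattern counts, dividing by the total number of rows. This sketch does not handle a data frame with zero rows.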

b. Show the output for both options (counts / percent) for the economics dataset in the ggplot2 package. (This is a test to see if your function works if there are no missing values.)

c. Show the output for both options (counts / percent) for the HollywoodMovies2011 dataset in the Lock5withR package. You can shorten the column names so they don’t overlap in the plot.
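Base R's `abbreviate()` is one possible way to shorten column names before plotting. The data frame below is a made-up stand-in with long names, not the real HollywoodMovies2011 data, and `minlength = 6` is an arbitrary choice.

```r
# Hypothetical data frame with long column names (not the actual dataset)
movies_toy <- data.frame(
  AverageAudienceScore   = c(70, NA, 85),
  OpeningWeekendRevenue  = c(12.5, 30.1, NA),
  WorldwideGrossEarnings = c(NA, 210.0, 95.3)
)

original_names <- names(movies_toy)

# abbreviate() shortens each name to roughly `minlength` characters
# while keeping the results distinct
short_names <- unname(abbreviate(original_names, minlength = 6))
names(movies_toy) <- short_names
names(movies_toy)
```

Renaming a copy of the data keeps the original column names available for the written interpretation.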

3. Set up your GitHub final project repo

a. Set up your final project repository following the EDAVproject template (https://github.com/jtr13/EDAVtemplate). You can either choose one team member’s GitHub account or create an organization to house the final project. Be sure to follow all of the steps in the README so your bookdown book renders with your information, not the placeholders in the template. Edit the link below to point to your rendered book:

https://[ACCOUNT NAME].github.io/[REPO NAME]

b. Make sure that all team members have write access to the repository and have practiced making contributions. Edit the link below to point to your contributors page, showing that all team members have made contributions to the repo. (Note that we do not have the ability to see who has write access, only who has contributed.)

https://github.com/[ACCOUNT NAME]/[REPO NAME]/graphs/contributors

c. Discuss a plan for dividing up the work for the final project and briefly summarize what each person will do.

4. Missing values chapter

Write a first draft of the missing values chapter of your final project. You do not have to include all of the data you use in the final project. Choose one file and analyze it using the techniques for missing values discussed in class. Include a plot made with your function from Q2 as well as a verbal interpretation of the plot. Edit this link to point to your chapter:

https://[ACCOUNT NAME].github.io/[REPO NAME]/missing-values.html

 
