Exploratory Data Analysis Assignment Help - Data Analysis Assignment Help - R Assignment Help

Note: Grading is based both on your graphs and verbal explanations. Follow all best practices as discussed in class, including choosing appropriate parameters for all graphs. Do not expect the assignment questions to spell out precisely how the graphs should be drawn. Sometimes guidance will be provided, but the absence of guidance does not mean that all choices are OK.

Read Graphical Data Analysis with R, Chapters 6 and 7.

1. Crime

[10 points]

Data source: https://data.ny.gov/Public-Safety/Index-Crimes-by-County-and-Agency-Beginning-1990/ca8h-8gjq

You do not need to submit the data with your assignment. You may either download and read from your local copy or read directly from the web site with df <- read_csv("https://data.ny.gov/api/views/ca8h-8gjq/rows.csv").

  1. Create a parallel coordinates plot showing the number of crimes in each of the categories listed for 2020. Show actual counts; do not rescale. Your graph should have one line for each county in New York State. (Use GGally::ggparcoord())
  2. Now experiment with alpha blending, splines, and rescaling to create the clearest version of the plot that you can. What patterns do you observe? (Use GGally::ggparcoord())
  3. Create an interactive parallel coordinates plot of the same data, coloring by Region. Discuss outliers, clusters, and correlations in detail.
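
As a rough illustration of the two static plots, the sketch below calls GGally::ggparcoord() on a tiny toy data frame. The county labels, region labels, column names, and counts here are all made up for illustration and are not taken from the New York State data; the real file would first have to be filtered to 2020 and reduced to one row per county. In ggparcoord(), scale = "globalminmax" leaves the values unscaled, which matches the "actual counts" requirement, while the second call shows one possible combination of rescaling, alpha blending, and splines.

```r
library(GGally)  # provides ggparcoord()

# Toy stand-in for county-level 2020 totals.
# ASSUMPTION: every name and number below is hypothetical, not real data.
crimes <- data.frame(
  County   = c("A", "B", "C", "D"),
  Region   = factor(c("Region 1", "Region 1", "Region 2", "Region 2")),
  Murder   = c(120, 80, 3, 1),
  Robbery  = c(3000, 2100, 40, 15),
  Burglary = c(4000, 2500, 150, 60),
  Larceny  = c(20000, 15000, 900, 400)
)

crime_cols <- 3:6  # positions of the numeric crime-category columns

# Actual counts on a shared axis: "globalminmax" applies no rescaling.
p_raw <- ggparcoord(crimes, columns = crime_cols, scale = "globalminmax")

# One possible tuned version: each axis rescaled to [0, 1],
# semi-transparent lines, spline-smoothed, colored by region.
p_tuned <- ggparcoord(
  crimes,
  columns      = crime_cols,
  groupColumn  = "Region",
  scale        = "uniminmax",
  alphaLines   = 0.5,
  splineFactor = 10
)

print(p_raw)
print(p_tuned)
```

With unscaled counts, a high-volume category such as larceny tends to dominate the vertical axis, which is one reason to compare against a rescaled version before describing patterns.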

2. Sleep

[10 points]

Data: SleepStudy from Lock5withR package

Draw the following graphs and answer the questions.

  1. Is there an association between ClassYear and AnxietyStatus? Between ClassYear and NumEarlyClass? Justify your answers with mosaic plots.
  2. Perform chi-square tests to test for associations between the pairs of variables graphed in part 1. What are the results? Discuss in relation to the mosaic plots.
  3. How is the relationship between anxiety status and number of early classes affected by class year? Create a mosaic plot showing all three variables, treating anxiety status as the dependent variable. Discuss the results.
  4. Use the base pairs() function to draw a mosaic pairs plot of all categorical (factor) variables in SleepStudy. (Note: The vcd package must be loaded for pairs() to find the correct method.) Name a pair of variables which appear to have a very strong association. Name a pair of variables which appear not to be associated.
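
The general workflow for these questions can be sketched with the built-in Titanic table as a stand-in, so that the example runs without the Lock5withR package. This is only an illustration of the calls involved (vcd::mosaic(), chisq.test(), and pairs() on a table); the choice of Titanic and the variables Class, Sex, and Survived are assumptions made for the example and are not part of the assignment. For SleepStudy, the categorical columns would first need to be cross-tabulated into a table.

```r
library(vcd)  # mosaic() and the pairs() method for contingency tables

# Two-way mosaic plot: explanatory variable split vertically,
# dependent variable (left of ~) split horizontally and highlighted.
mosaic(Survived ~ Class, data = Titanic, direction = c("v", "h"))

# Chi-square test of independence on the same two-way table.
# Dimensions 1 and 4 of Titanic are Class and Survived.
class_by_survival <- margin.table(Titanic, c(1, 4))
chi_res <- chisq.test(class_by_survival)
print(chi_res)

# Three-variable mosaic plot, again with the dependent variable split last.
mosaic(Survived ~ Class + Sex, data = Titanic,
       direction = c("v", "v", "h"))

# Mosaic pairs plot of every pair of variables in the table.
# With vcd loaded, pairs() dispatches to its table method.
pairs(Titanic)
```

A small p-value from chisq.test() should line up with visibly unequal splits of the dependent variable across the columns of the mosaic plot; near-equal splits suggest little or no association.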

3. Wait List

[10 points]

The file stats_wl.csv contains information about waitlist movement for a Fall 2021 Columbia U undergraduate statistics class.

There are 640 rows and 4 variables:

Name: name of student (actual names were replaced with names generated from the randomNames package)

Date: since SSOL updates overnight, waitlist positions were collected each morning during the change of program period

Priority: position in waitlist, for example 1 = top position on list

Status: final outcome, Registered = received a place in class and remained; Dropped Class = received a place in class and left; Left List = left waiting list; Joined = remained on waiting list at the end of the change of program period. (Note that the status reflects what ultimately happened, not what the status was on a particular date.)

Create an alluvial diagram that shows waitlist movement during the change of program period. It is not necessary to include the Name column in the diagram, but it should be possible to observe movement of individual students: for example, that the student who was 22nd in the waitlist on Sept 9th moved up to 15th place on Sept 16th and then left the list.
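
One way such a diagram could be built is with the ggalluvial package, treating each date as an axis, each waitlist position as a stratum, and each student as an alluvium. The sketch below uses a three-student toy data frame in place of stats_wl.csv; the names, dates, and positions are hypothetical, the helper column n is introduced only for this example, and ggalluvial itself is an assumption since the assignment does not name a package. The toy data is deliberately complete (every student appears on every date), so handling students who leave the list partway through is not covered here.

```r
library(ggplot2)
library(ggalluvial)  # geom_alluvium(), geom_stratum()

# Toy stand-in for stats_wl.csv.
# ASSUMPTION: all values are hypothetical; only the column names
# Name, Date, Priority, Status follow the description above.
wl <- data.frame(
  Name     = rep(c("Student A", "Student B", "Student C"), each = 3),
  Date     = rep(c("Sep 09", "Sep 10", "Sep 11"), times = 3),
  Priority = c(1, 1, 1,  3, 2, 2,  2, 3, 3),
  Status   = rep(c("Registered", "Joined", "Joined"), each = 3),
  n        = 1  # hypothetical helper: each student counts once per date
)

# Treat the waitlist position as a discrete stratum.
wl$Priority <- factor(wl$Priority)

p_wl <- ggplot(wl, aes(x = Date, stratum = Priority,
                       alluvium = Name, y = n)) +
  geom_alluvium(aes(fill = Status)) +   # one ribbon per student
  geom_stratum() +                      # one box per position per date
  labs(x = "Date", y = "Students", fill = "Final status")

print(p_wl)
```

Because each ribbon corresponds to a single student, an individual's movement between positions can be followed from one date to the next without plotting the Name column itself.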

 
