
Applied Statistics (ECS764P) – Lab 3

 17 Nov 2022


1 Theory

1. It can be shown that the sum of two independent $\chi^2$-distributions is a $\chi^2$-distribution. Specifically:

$$\chi^2(k_1) + \chi^2(k_2) = \chi^2(k_1 + k_2)$$

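Show that the sample mean of $N$ independent and identically distributed $\chi^2$ distributions $\chi^2(k)$ is exactly given by a Gamma distribution with shape parameter $Nk/2$ and scale parameter $2/N$, i.e.

$$\frac{1}{N}\sum_{i=1}^{N} \chi^2(k) = \mathrm{Gamma}\left(\frac{Nk}{2}, \frac{2}{N}\right)$$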

Hint: recall from the lecture notes that we've computed the density of $\alpha d$ (where $\alpha$ is some positive number and $d$ is some distribution) in terms of the density of $d$. Look up the densities of the $\chi^2$ and Gamma distributions online. The rest is just simple algebra.
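For reference, the ingredients the hint points to can be written out as follows. This is only a summary of the standard textbook densities, assuming the shape/scale parameterisation of the Gamma distribution; it is not the derivation itself, and the symbols $f_d$, $a$ and $\theta$ are notation introduced here for illustration.

```latex
% Density of a rescaled distribution \alpha d, for \alpha > 0:
f_{\alpha d}(x) = \frac{1}{\alpha}\, f_d\!\left(\frac{x}{\alpha}\right)

% Density of \chi^2(k), for x > 0:
f_{\chi^2(k)}(x) = \frac{x^{k/2 - 1}\, e^{-x/2}}{2^{k/2}\, \Gamma(k/2)}

% Density of the Gamma distribution with shape a and scale \theta, for x > 0:
f_{\mathrm{Gamma}(a, \theta)}(x) = \frac{x^{a - 1}\, e^{-x/\theta}}{\Gamma(a)\, \theta^{a}}
```

Comparing the last two lines shows that $\chi^2(k)$ is itself a Gamma distribution with shape $k/2$ and scale $2$, which is a convenient starting point.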

2. In this exercise you will learn how to normalise/standardise a normal distribution, i.e. show that you can always reduce the computation of a probability mass under $\mathrm{Norm}(\mu, \sigma)$ to the computation of a probability mass under $\mathrm{Norm}(0, 1)$. You will then use this to prove the weak Law of Large Numbers for normal distributions.

(a) Show from first principles (i.e. from the definition of the pushforward probability measure, see slides) that

(b) Show from first principles that


(c) Conclude that

and express this quantity in terms of the standard normal cdf Φ.

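(d) In the previous Lab you were asked to show that the distribution of sample means of $n$ independent and identically distributed normal distributions $\mathrm{Norm}(\mu, \sigma)$ is given by

$$\mathrm{Norm}\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$$

where the second parameter is the standard deviation.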

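Using this fact you can give a simple and direct proof of the weak Law of Large Numbers for the case of normal distributions. For this, fix $\varepsilon, \delta > 0$ and find an $n$ such that the sample mean $\bar{X}_n$ satisfies

$$P\left[\,|\bar{X}_n - \mu| < \varepsilon\,\right] \ge 1 - \delta$$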


In other words, find $n$ such that the probability of a sample mean landing within $\varepsilon$ of the true mean $\mu$ is at least $1 - \delta$. (Hint: use the inverse cdf/quantile function of the standard normal distribution, usually denoted $\Phi^{-1}$.)
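The hint can be checked numerically with a short sketch. It assumes that $\sigma$ denotes the standard deviation and that the sample mean of $n$ normal draws has standard deviation $\sigma/\sqrt{n}$; the function names `min_sample_size` and `coverage` are hypothetical helpers, not part of the lab.

```python
import math
from scipy import stats

def min_sample_size(sigma, eps, delta):
    """Smallest n with P[|sample mean - mu| < eps] >= 1 - delta.

    Assumes the sample mean is Norm(mu, sigma / sqrt(n)), sigma being
    the standard deviation of a single draw.
    """
    z = stats.norm.ppf(1.0 - delta / 2.0)   # Phi^{-1}(1 - delta/2)
    return math.ceil((sigma * z / eps) ** 2)

def coverage(n, sigma, eps):
    """P[|sample mean - mu| < eps], written with the standard normal cdf only."""
    return 2.0 * stats.norm.cdf(eps * math.sqrt(n) / sigma) - 1.0

# Illustrative values, chosen arbitrarily.
sigma, eps, delta = 2.0, 0.1, 0.05
n = min_sample_size(sigma, eps, delta)
print(n, coverage(n, sigma, eps))
```

The helper `coverage` relies on standardisation: the event is rewritten in terms of a $\mathrm{Norm}(0, 1)$ variable, so only $\Phi$ and $\Phi^{-1}$ are needed.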


2 Practice

You can assume that numpy, matplotlib and scipy are installed on the machine of the person who will run and mark your notebook. There is no need to force an install with the ! command. For textual answers please use a markdown cell.

1. It can be shown (see theory exercises) that the sample mean of $N$ independent and identically distributed $\chi^2$-distributions $\chi^2(k)$ is exactly given by a Gamma distribution with shape parameter $Nk/2$ and scale parameter $2/N$ (see https://en.wikipedia.org/wiki/Gamma_distribution). Now follow these steps:

(a) Create a 2-by-3 array of subplots. Fix k = 3 and instantiate an array N = [5, 10, 50] and a variable size = 100,000.

(b) Using a for loop, for each value n in N sample a size × n array of samples from the distribution $\chi^2(k)$.

(c) Compute the sample average along each row (i.e. you should get size sample averages), and plot their histogram in a subplot.

(d) Over the histogram (i.e. in the same subplot), plot the exact density of the distribution of sample averages, which is given by the Gamma distribution described above (Hint: make sure you input the values of N and k correctly, and check how shape and scale parameters are input in the stats.gamma documentation). You should get an extremely good agreement between the pdf and the histogram.

(e) In a separate subplot, display the QQ plot of the sample means versus their exact (Gamma) distribution.
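One possible way to organise steps (a) to (e) is sketched below. It is an illustration, not a model answer: the helper names `sample_means` and `plot_exact_gamma`, the seed, and the bin count are arbitrary choices, and the Gamma parameters (shape $Nk/2$, scale $2/N$) are the ones that follow from the theory exercise.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

def sample_means(k, n, size, rng):
    """Draw a size-by-n array from chi2(k) and average along each row."""
    samples = rng.chisquare(df=k, size=(size, n))
    return samples.mean(axis=1)

def plot_exact_gamma(k=3, N=(5, 10, 50), size=100_000, seed=0):
    """Top row: histogram plus exact Gamma pdf. Bottom row: QQ plot."""
    rng = np.random.default_rng(seed)
    fig, axes = plt.subplots(2, 3, figsize=(15, 8))
    for col, n in enumerate(N):
        means = sample_means(k, n, size, rng)
        shape, scale = n * k / 2, 2 / n      # assumed exact Gamma parameters
        exact = stats.gamma(a=shape, scale=scale)

        ax_hist, ax_qq = axes[0, col], axes[1, col]
        ax_hist.hist(means, bins=100, density=True, alpha=0.5)
        x = np.linspace(means.min(), means.max(), 400)
        ax_hist.plot(x, exact.pdf(x))
        ax_hist.set_title(f"Sample means, N = {n}")

        # sparams is (shape, loc, scale) for stats.gamma
        stats.probplot(means, dist=stats.gamma,
                       sparams=(shape, 0, scale), plot=ax_qq)
        ax_qq.set_title(f"QQ plot vs Gamma, N = {n}")
    fig.tight_layout()
    return fig
```

In a notebook, calling `plot_exact_gamma()` in a cell displays the figure. Note that `stats.gamma` takes the shape as `a` and the scale as the keyword `scale`, which is the point of the hint in step (d).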

2. The Central Limit Theorem (CLT) shows that for sufficiently large values of $N$, the sample mean of $N$ independent and identically distributed $\chi^2$-distributions $\chi^2(k)$ is approximately given by a Normal distribution with mean $E[\chi^2(k)] = k$ and variance $\mathrm{Var}[\chi^2(k)]/N = 2k/N$. Follow these steps (the first three are exactly like in Q1):

(a) Create a 2-by-3 array of subplots. Fix k = 3 and instantiate an array N = [5, 10, 50] and a variable size = 100,000.

(b) Using a for loop, for each value n in N sample a size × n array of samples from the distribution $\chi^2(k)$.

(c) Compute the sample average along each row (i.e. you should get size sample averages), and plot their histogram in a subplot.

(d) Over the histogram (i.e. in the same subplot), plot the approximate density of the distribution of sample averages, which is given by the CLT as described above.

(e) In a separate subplot, display the QQ plot of the sample means versus their approximate distribution.

For which value of N is the approximate density of sample means given by the CLT indistinguishable from the actual density you've plotted in Q1?
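Alongside the plots, the quality of the CLT approximation can also be quantified. The sketch below is one hedged way to do so, and is not part of the steps above: it compares the cdf of the exact Gamma distribution (shape $Nk/2$, scale $2/N$) with that of the CLT normal (mean $k$, variance $2k/N$) on a grid. The helper name `clt_cdf_gap` and the grid bounds are arbitrary choices.

```python
import numpy as np
from scipy import stats

def clt_cdf_gap(k, n, num_points=2001):
    """Largest absolute cdf difference, on a grid, between the exact
    distribution of the sample mean and its CLT normal approximation."""
    exact = stats.gamma(a=n * k / 2, scale=2 / n)
    approx = stats.norm(loc=k, scale=np.sqrt(2 * k / n))
    # Grid covering essentially all of the exact distribution's mass.
    x = np.linspace(exact.ppf(1e-6), exact.ppf(1 - 1e-6), num_points)
    return float(np.max(np.abs(exact.cdf(x) - approx.cdf(x))))

for n in (5, 10, 50):
    print(n, clt_cdf_gap(3, n))
```

The cdf is compared rather than the pdf because the densities become taller and narrower as N grows, so their pointwise difference is a less stable measure of agreement.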

3. Suppose that you're working for the UK Department of Health and Social Care, and that you have been tasked with establishing the prevalence of COVID-19 among primary school children.

For the sake of simplicity we will assume that every class has 30 pupils. The number of COVID-positive children in a class is modelled by a binomial distribution Binom(30, p), and the last time this study was done it was established that p = 0.1. You want to test if this has changed significantly, where significantly means at confidence level $\alpha = 95\%$ (i.e. a 5% significance level). State your H0.

You start your study with 4 classes, test every child and report a total of 20 cases. Since Binom(N, p) is a sum of N independent Bernoulli distributions with parameter p, it is clear that the sum of four independent Binom(30, p) is Binom(120, p). In your notebook compute:

  • The probability under H0 of having observed 20 cases.
  • The probability under H0 of having observed an outcome at least as extreme as 20 cases.

Discuss whether to reject H0 or not. What does it say about the epidemic?
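The two probabilities for the small-scale study can be computed along the following lines. This is a sketch that takes H0 to be p = 0.1; how "at least as extreme" should be read (one-sided or two-sided) is a modelling choice, so both readings are shown, and all variable names are illustrative.

```python
import numpy as np
from scipy import stats

n_children = 4 * 30   # four classes of 30 pupils
p0 = 0.1              # prevalence under H0 (assumed: H0 is p = 0.1)
observed = 20

null = stats.binom(n_children, p0)

# Probability of observing exactly 20 cases under H0.
prob_exact = null.pmf(observed)

# One-sided reading of "at least as extreme": 20 cases or more.
p_one_sided = null.sf(observed - 1)   # sf(k) = P[X > k]

# Two-sided reading: every outcome that is no more likely than the observed one.
outcomes = np.arange(n_children + 1)
pmf = null.pmf(outcomes)
p_two_sided = pmf[pmf <= prob_exact * (1 + 1e-9)].sum()

print(f"P[X = 20]       = {prob_exact:.4f}")
print(f"one-sided p     = {p_one_sided:.4f}")
print(f"two-sided p     = {p_two_sided:.4f}")
```

The two-sided sum has to visit every possible outcome and evaluate its probability, which is worth keeping in mind when the number of children grows.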

Your research gets extended, and you now test 300 classes. With so many classes, it is no longer practical to use an exact test with a Binomial distribution (N = 9000). Explain why in at most two sentences. (Hint: think about what you have just computed.)

You therefore use the CLT to test for the value of p. The variance of Binom(30, p) can be easily computed exactly, so it makes sense to use the Z-test. Your confidence level remains at $\alpha = 95\%$ and you now observe a total of 945 cases.

Compute the Z-statistic. For this, make sure you can answer the following:

  • What is the theoretical mean under H0?
  • What is the observed sample mean?
  • What is the sample size?
  • What is the variance of Binom(30, p)?

Now compute the p-value in your notebook and discuss whether you would reject your H0 or not, and what it says about the epidemic and your previous, small-scale trial.
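A minimal sketch of the Z-test computation follows. It assumes that H0 is p = 0.1, that each class counts as one observation (so the sample size is the number of classes), and that a two-sided p-value is wanted; `z_test_mean` is a hypothetical helper name.

```python
import math
from scipy import stats

def z_test_mean(sample_mean, mean0, var0, sample_size):
    """Z-statistic and two-sided p-value for a mean with known variance."""
    z = (sample_mean - mean0) / math.sqrt(var0 / sample_size)
    p_value = 2.0 * stats.norm.sf(abs(z))
    return z, p_value

pupils, p0 = 30, 0.1                 # class size and prevalence under H0
n_classes, total_cases = 300, 945

mean0 = pupils * p0                  # theoretical mean cases per class under H0
var0 = pupils * p0 * (1 - p0)        # variance of Binom(30, p0)
sample_mean = total_cases / n_classes

z, p_value = z_test_mean(sample_mean, mean0, var0, n_classes)
print(f"Z = {z:.3f}, two-sided p-value = {p_value:.3f}")
```

The p-value is then compared with the 5% significance level that corresponds to the 95% confidence level stated in the question.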

 
