The following is a racing chart of COVID-19 death counts over time. I wanted to see how each country was affected by COVID-19 from January 2020 to June 2020. The goal of this report is to test the following hypothesis.
• Is the average number of deaths in the USA greater than 1000 per day?
Load the dataset
# Set the working directory on my PC so the CSV file can be read
setwd("C:/Users/Mufaddal/Desktop")
# fileEncoding = "UTF-8-BOM" strips the byte-order mark that otherwise garbles the first column name
covid <- read.csv("Covid_19dataset.csv", header = TRUE, fileEncoding = "UTF-8-BOM")
Let's take a quick look at the COVID-19 dataset (last updated: 05/17/2020).
head(covid)
## Province_State Country_Region Last_Update Lat Long_
## 1 Alabama US 2020-04-12 23:18:15 32.3182 -86.9023
## 2 Alaska US 2020-04-12 23:18:15 61.3707 -152.4044
## 3 Arizona US 2020-04-12 23:18:15 33.7298 -111.4312
## 4 Arkansas US 2020-04-12 23:18:15 34.9697 -92.3731
## 5 California US 2020-04-12 23:18:15 36.1162 -119.6816
## 6 Colorado US 2020-04-12 23:18:15 39.0598 -105.3111
## Confirmed Deaths Recovered Active FIPS Incident_Rate People_Tested
## 1 3563 93 NA 3470 1 75.98802 21583
## 2 272 8 66 264 2 45.50405 8038
## 3 3542 115 NA 3427 4 48.66242 42109
## 4 1280 27 367 1253 5 49.43942 19722
## 5 22795 640 NA 22155 6 58.13773 190328
## 6 7307 289 NA 7018 8 128.94373 34873
## People_Hospitalized Mortality_Rate UID ISO3 Testing_Rate
## 1 437 2.610160 84000001 USA 460.3002
## 2 31 2.941176 84000002 USA 1344.7116
## 3 NA 3.246753 84000004 USA 578.5223
## 4 130 2.109375 84000005 USA 761.7534
## 5 5234 2.812020 84000006 USA 485.4239
## 6 1376 3.955112 84000008 USA 615.3900
## Hospitalization_Rate
## 1 12.26494527
## 2 11.39705882
## 3
## 4 10.15625
## 5 22.9611757
## 6 18.8312577
nrow(covid)
## [1] 2243
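The t-test further down operates on a data frame called `covid_us`, which the code shown here never creates. A minimal sketch of one plausible way to build it follows, assuming `covid_us` is simply the subset of rows whose `Country_Region` is `"US"`. The names `covid_demo`, `covid_us_demo`, and `n_used` are hypothetical, and the small data frame is a made-up stand-in for the real `covid` data so the block runs on its own.

```r
# Hypothetical stand-in for the real `covid` data frame (same column names,
# illustrative values only).
covid_demo <- data.frame(
  Province_State = c("Alabama", "Alaska", "Elsewhere", "Arizona"),
  Country_Region = c("US", "US", "Other", "US"),
  Deaths         = c(93, 8, 10, NA)
)

# Assumption: covid_us keeps only the US rows of the full dataset.
covid_us_demo <- subset(covid_demo, Country_Region == "US")

# t.test() drops missing values, so count the observations it would use.
n_used <- sum(!is.na(covid_us_demo$Deaths))

print(nrow(covid_us_demo))
print(n_used)
```

On the real data the same pattern would read `covid_us <- subset(covid, Country_Region == "US")`. The reported `df = 1872` implies that 1873 non-missing `Deaths` values entered the test, fewer than the 2243 rows in `covid`, which is consistent with some filtering step of this kind.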
Let us now define our null and alternative hypotheses:
- Null Hypothesis: The mean number of deaths is 1000. \[
H_{0}: \mu = 1000
\]
- Alternative Hypothesis: The mean number of deaths is greater than 1000. \[
H_{1}: \mu > 1000
\]
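For reference, the statistic that a one-sample t-test computes for these hypotheses can be written out as below. The symbols \(\bar{x}\) (sample mean), \(s\) (sample standard deviation), \(n\) (number of non-missing observations), and \(\mu_{0} = 1000\) (the hypothesized mean) are notation introduced here and do not appear in the original report.
\[
t = \frac{\bar{x} - \mu_{0}}{s / \sqrt{n}}, \qquad \text{df} = n - 1
\]
Because the alternative is one-sided, the p-value is the upper-tail probability \(P(T_{n-1} \ge t)\), and \(H_{0}\) is rejected at the 5% level only when that probability falls below 0.05.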
# Run a one-sample, one-sided t-test of the mean number of deaths against 1000.
t.test(covid_us$Deaths, mu = 1000, alternative = "greater")
##
## One Sample t-test
##
## data: covid_us$Deaths
## t = 1.345, df = 1872, p-value = 0.0894
## alternative hypothesis: true mean is greater than 1000
## 95 percent confidence interval:
## 977.6092 Inf
## sample estimates:
## mean of x
## 1100.145
As the output shows, the p-value (0.0894) is greater than 0.05, so we fail to reject the null hypothesis at the 5% significance level. We therefore do not have sufficient evidence to claim that the average number of deaths is greater than 1000 per day.
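As a sanity check on this decision, the p-value and the lower confidence bound can be recomputed from the numbers printed in the `t.test()` output. This is only a sketch: the variable names are hypothetical, and because the inputs are the rounded values from the printout, the results should land close to, not exactly on, the reported 0.0894 and 977.6.

```r
# Values copied from the t.test() output in the report.
x_bar <- 1100.145  # sample mean
mu_0  <- 1000      # hypothesized mean under H0
t_obs <- 1.345     # reported t statistic (rounded)
df    <- 1872      # degrees of freedom (n - 1)
alpha <- 0.05      # significance level used in the conclusion

# One-sided p-value: upper-tail probability of the t distribution.
p_value <- pt(t_obs, df = df, lower.tail = FALSE)

# Standard error implied by t = (x_bar - mu_0) / se.
se <- (x_bar - mu_0) / t_obs

# Lower bound of the one-sided 95% confidence interval.
ci_lower <- x_bar - qt(1 - alpha, df = df) * se

# Decision rule: reject H0 only if the p-value is below alpha.
reject_h0 <- p_value < alpha

print(round(p_value, 4))
print(round(ci_lower, 1))
print(reject_h0)
```

The interval's lower bound sits below 1000, which is the same conclusion as the p-value exceeding 0.05: a mean of 1000 remains compatible with the data.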