Covariance and Correlation: Numerical summary of data
Measures of variability
1. Range: range = max − min (simplest, but not always useful)
2. Variance: based on the difference between each observation and the mean.
Population variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
Sample variance (almost always): $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
Remarks: Variance is always non-negative (≥ 0). Zero variance means there is no variation, i.e. the whole data set…
Words 1049 - Pages 5
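The range and variance formulas in the preview above can be checked numerically. The following is a minimal sketch in base R, assuming a small made-up vector `x` (the values are hypothetical and do not come from the notes); it computes both variances by hand and compares the sample version with R's built-in `var()`.

```r
# Hypothetical data, chosen only for illustration.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

n <- length(x)
data_range <- max(x) - min(x)                 # range = max - min
pop_var    <- sum((x - mean(x))^2) / n        # population form: divide by N
samp_var   <- sum((x - mean(x))^2) / (n - 1)  # sample form: divide by n - 1

# Base R's var() uses the n - 1 denominator, so it should match samp_var.
stopifnot(isTRUE(all.equal(samp_var, var(x))))

# A data set whose values are all identical has zero variance.
zero_var <- var(rep(3, 5))
```

Because the numerator is a sum of squares, neither variance can be negative, and the sample variance is always at least as large as the population variance computed from the same data.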
# Read the tab-delimited data file
FantasyBaseball <- read.table(paste(myDataFolder, 'FantasyBaseball.txt', sep=''), header=TRUE, sep="\t")
a) Produce side-by-side boxplots to compare the selection times for each participant. Comment on these. Also, calculate the average selection time for each participant using the aggregate() function.
boxplot(Time~Person, data=FantasyBaseball)
According to the boxplot, JW appears to have the highest typical selection time and TS the lowest among the participants; note that the line in each box marks the median, not the mean. To calculate the…
Words 1598 - Pages 7
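The per-participant averages requested in the preview above could be obtained along the following lines. This is only a sketch: the real FantasyBaseball file is not reproduced here, so the data frame `toy` and its values are hypothetical, and only the column names `Time` and `Person` and the labels JW and TS are taken from the text.

```r
# Hypothetical stand-in for the FantasyBaseball data; these values are
# made up for illustration and are not the real measurements.
toy <- data.frame(
  Person = c("JW", "JW", "JW", "TS", "TS", "TS"),
  Time   = c(40, 50, 120, 20, 30, 40)
)

# Mean and median selection time per participant, using the formula interface.
means   <- aggregate(Time ~ Person, data = toy, FUN = mean)
medians <- aggregate(Time ~ Person, data = toy, FUN = median)

# Side-by-side boxplots; the thick line in each box is the median.
boxplot(Time ~ Person, data = toy, ylab = "Selection time")
```

In this toy data JW's mean is pulled well above the median by one long selection time, which is why averages read off a boxplot are worth confirming with `aggregate()`.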
a well-labeled histogram, and a boxplot for spending allowing for eight intervals. Are the data skewed? If so, how? Interval width = rounded to 110 to allow for 8 intervals The data are right-skewed in both the histogram and the boxplot (explain the boxplot later with the whiskers and all) because the majority of students who participated in the survey spent $640 or less on their textbooks and supplies. 3. Determine whether a relationship exists between some of the numerical values, GPA…
Words 803 - Pages 4
responses are merely named uniquely. The ordinal level of measurement is the next level. Data recorded at this level are ranked. The data are arranged in some type of order; however, the differences between data values cannot be determined or are meaningless. Differences on an ordinal scale cannot be assumed to be equal; values are only ranked or rated on a relative scale. Ordinal scales usually capture non-numeric concepts such as satisfaction, happiness or discomfort. You can determine…
Words 840 - Pages 4
However, the median in 2013 was 84.5, showing that temperatures in 2013 were lower than in 2014. Every value in the 2014 five-number summary was higher than the corresponding value for 2013. Side-by-side boxplots and histograms are the best graphs for comparing these observations. This is because the histogram groups the values together and clearly graphs the distribution so…
Words 403 - Pages 2
the following topics: (a) Explain the difference between a continuous and a discrete variable, using examples to illustrate your answer. 2 (b) Name two probability sampling methods you have learnt in this course. How would any non-probability sampling method differ from these methods? 2 (c) Explain how ‘descriptive’ and ‘inferential’ statistical methods differ, giving one example of each. 2 (d) Use examples to illustrate the difference between categorical and numerical variables.…
Words 2724 - Pages 11
How do we describe a variable in terms of its Who, What, When, Where, Why and How? Chapter 3: Displaying and Describing Categorical Data (Pages 15-23) How do we identify whether a variable is categorical and choose an appropriate display for it? How do we examine the association between categorical variables by comparing conditional marginal percentages? Chapter 4: Displaying Quantitative Data (Pages 36-46) How do we identify an appropriate display for any quantitative variable? How do we guess the shape of the distribution of a variable by knowing something about…
Words 4395 - Pages 18
INTRODUCTION AJ Davis is a department store chain with many customers who hold credit accounts at the store. The company’s management group wants to analyze the data collected and summarized to determine if there is any connection or relationship between the information gathered from the customers. Sample of 50 credit customers with the following variables: 1) LOCATION (Rural, Urban, Suburban) 2) INCOME (in $1,000’s) 3) SIZE (the number of people living in the household) 4) YEARS (the number…
Words 978 - Pages 4
in {Alice, Bob, Carol} has a 1/3 chance of being the perpetrator. However, as long as none of them has been identified as being the perpetrator with 100% certainty, we can say that the perpetrator remains anonymous.
Association: any relationship between two measured quantities that renders them statistically dependent. The term "association" is closely related to the term "correlation." Both terms imply that two or more variables vary according to some pattern.
Assumptions: a statement that is assumed…
Words 7770 - Pages 32