Metrics and Information Visualization
The mean, standard deviation, and 5-number summary of this sample.
The five-number summary comprises the minimum, first quartile, median, third quartile, and maximum.
Statistic | Value |
Mean | 75.5 |
Standard Deviation | 19.99 |
Maximum | 99 |
1st Quartile | 67.75 |
Median | 80.5 |
3rd Quartile | 87 |
Minimum | 18 |
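The values in the table can be cross-checked outside Excel. The following is a minimal Python sketch, assuming the sample standard deviation (dividing by n - 1, as Excel's STDEV.S does) and quartiles interpolated inclusively (as Excel's QUARTILE.INC does); the variable names are illustrative only.

```python
import statistics

# The 24 civil service examination scores given in the question.
scores = [
    83, 74, 85, 79,
    82, 67, 78, 70,
    18, 93, 64, 27,
    93, 98, 82, 78,
    68, 82, 83, 99,
    96, 62, 93, 58,
]

mean = statistics.mean(scores)

# statistics.stdev is the sample standard deviation (divides by n - 1).
std_dev = statistics.stdev(scores)

# method="inclusive" treats the minimum and maximum as the 0th and 100th
# percentiles and interpolates linearly between neighbouring sorted values.
# This is an assumption about how the quartiles in the table were obtained.
q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")

summary = {
    "Mean": mean,
    "Standard Deviation": std_dev,
    "Maximum": max(scores),
    "3rd Quartile": q3,
    "Median": median,
    "1st Quartile": q1,
    "Minimum": min(scores),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the sketch reproduces the table: a mean of 75.5, a standard deviation of about 19.99, and quartiles of 67.75, 80.5, and 87.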
The 5-number summary depiction in the box plot:
Figure 1: Box Plot for the Five-Number Summary of the Data Sample
The presence of outliers in the dataset
Outliers are values that fall outside the range where most of the data is concentrated (Field, 2018); they are either unusually small or unusually large. In the box plot above, the two dots below the bottom whisker mark two outliers, 18 and 27. The top whisker extends to the maximum value (99), the top of the box represents the third quartile, the line across the middle of the box shows the median, the bottom of the box represents the first quartile, and the bottom whisker ends at the smallest value that is not an outlier (58) rather than at the overall minimum.
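The outliers read off the box plot can also be identified numerically. The sketch below assumes the common 1.5 × IQR rule for drawing whiskers (the essay does not state which rule the chart used) and the same inclusive quartiles as the summary table; the names lower_fence and upper_fence are illustrative.

```python
import statistics

# The 24 civil service examination scores given in the question.
scores = [
    83, 74, 85, 79,
    82, 67, 78, 70,
    18, 93, 64, 27,
    93, 98, 82, 78,
    68, 82, 83, 99,
    96, 62, 93, 58,
]

# Quartiles with inclusive interpolation (assumed to match the table).
q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: points more than 1.5 * IQR beyond the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = sorted(x for x in scores if x < lower_fence or x > upper_fence)
inliers = [x for x in scores if lower_fence <= x <= upper_fence]

# The whiskers end at the most extreme values that are not outliers.
whisker_low, whisker_high = min(inliers), max(inliers)

print("IQR:", iqr)
print("Fences:", lower_fence, upper_fence)
print("Outliers:", outliers)
print("Whisker ends:", whisker_low, whisker_high)
```

With an interquartile range of 19.25, the lower fence sits at 38.875 and the upper fence at 115.875, so only 18 and 27 fall outside, and the whiskers end at 58 and 99.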
Preference between the mean and the median as a measure of central tendency in the dataset
According to Doane and Seward (2019), the median is the better measure of central tendency when data are skewed or contain outliers, because it is not pulled toward extreme values. The data presented above has two low outliers (18 and 27), which drag the mean (75.5) below the median (80.5), so the median gives a more representative picture of a typical score. However, the context also determines which statistic is appropriate: when data are roughly symmetric and free of outliers, the mean is a suitable measure of central tendency because it uses every value in the sample.
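The sensitivity of each measure to the two outliers can be illustrated by recomputing both statistics with 18 and 27 removed. This is only a sketch to show the effect, not a recommendation to drop the observations; the name trimmed is illustrative.

```python
import statistics

# The 24 civil service examination scores given in the question.
scores = [
    83, 74, 85, 79,
    82, 67, 78, 70,
    18, 93, 64, 27,
    93, 98, 82, 78,
    68, 82, 83, 99,
    96, 62, 93, 58,
]

# Same sample with the two outlying scores excluded (illustration only).
trimmed = [x for x in scores if x not in (18, 27)]

# How far each measure moves once the outliers are gone.
mean_shift = statistics.mean(trimmed) - statistics.mean(scores)
median_shift = statistics.median(trimmed) - statistics.median(scores)

print(f"Mean:   {statistics.mean(scores):.2f} -> {statistics.mean(trimmed):.2f}")
print(f"Median: {statistics.median(scores):.2f} -> {statistics.median(trimmed):.2f}")
print(f"Shift in mean: {mean_shift:.2f}, shift in median: {median_shift:.2f}")
```

Removing the two scores moves the mean by about 4.82 points (from 75.5 to roughly 80.32) but moves the median by only 1.5 points (from 80.5 to 82), which is consistent with treating the median as the more resistant measure for this sample.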
References
Doane, D. P., & Seward, L. E. (2019). Applied statistics in business and economics (6th ed.). McGraw-Hill.
Field, A. (2018). Discovering statistics using IBM SPSS statistics. Sage Publications.
Question
Respond to the following in a minimum of 175 words:
Metrics and Information Visualization
The most frequently used measures of central tendency for quantitative data are the mean and the median. The following table shows civil service examination scores from 24 applicants to law enforcement jobs:
83 74 85 79
82 67 78 70
18 93 64 27
93 98 82 78
68 82 83 99
96 62 93 58
Using Excel, find the mean, standard deviation, and 5-number summary of this sample.
Construct and paste a box plot depicting the 5-number summary.
Does the dataset have outliers? If so, which one(s)?
Would you prefer to use the mean or the median as this dataset’s measure of central tendency? Why?