Read More
Date: 22-4-2021
1145
Date: 22-2-2021
1012
Date: 4-4-2021
1267
|
As we said, in order to answer the question, “what is the typical value of the data,” we look at the measures of central tendency. We already mentioned two of these, the median and the mean. Another is the mode.
We shall refer to the number of scores in a distribution as its size. We do not mean just the number of scores; we take the frequency of a score into account. For example, in Worked Example 1.1in(Describing Data) the distribution has size 25. The median is the middle value of the distribution; if a distribution has size 2n + 1, and we list the values as x1,x2,...,x2n+1, where x1 ≤ x2 ≤ ... ≤ x2n+1, then the median is xn+1.
If the number of entries is even, a small modification is needed: the median of x1,x2,...,x2n, where x1 ≤ x2 ≤ ... ≤ x2n is 1/2 (xn +xn+1).
The mean, or average, of a distribution is found by adding all the scores and dividing by the size. We write µ ( mu, the Greek letter equivalent to m) for the mean of a population. The mode is the score with the highest frequency. Sometimes there will be no mode, because the highest frequency is attained by two or more scores.
Sample Problem 1.2 Find the median, mean, and mode of the scores in Worked Example 1.1in(Describing Data).
Solution. The scores, in ascending order, are
3,7,7,7,8,9,9,10,10,12,13,13,13,14,15,15,15,15,16,16,17,17,18,18,19.
There are 25 scores. So the median is the 13th score, which happens to be 13.
The sum of the scores is 317, so the mean is 317/25 = 12.68. The mode, or most common score, is 15.
A histogram is often the best tool for getting an overall picture of the distribution.
In large population, the histogram often looks like a smooth curve (like the examples in Fig. 1.1, below). The highest point in the curve will be the mode; if there are two high points with a drop in between them, the distribution is called bimodal, but only the higher of the two high points is called a “mode.”
People often expect the median to be near the center of the histogram. This is not always true. A symmetric distribution is one in which the part of the curve to the left of the median is roughly a mirror image of the part to the right of the median.
A distribution is called skewed if the observations on one side of the median extend noticeably farther from the median than the observations on the other side. (We say the distribution is skewed to the left if the long tail is to the left, and skewed to the right if it is in the other direction.) When a distribution is skewed, the mean will lie nearer to the tail than the median does. For example, in most towns there are a small number of very wealthy people, and a histogram of family incomes will be skewed to the right. The mean income will be greater than the median income.
Figure 1.1 shows examples of a bimodal histogram and a histogram that is skewed to the left, from large populations.
Fig. 1.1 A bimodal and a skewed histogram
Some histograms will have gaps, whole groups with no entries. An outlier is an individual score that is separated from the rest of the distribution. For example, in the set of quiz scores in Worked Example 1.1(Describing Data), the score 3 is an outlier. If the histogram is symmetric, the mean and median are about the same. If it is skewed to left, the median is usually greater than the mean (although things like outliers can affect this). If it is skewed to right, median is usually smaller than the mean.
People often confuse the median and the mean. For example, a real estate agent who notes that the mean housing price for an area is $175,000 might conclude that half of the houses in the area cost more than that. The real estate agent is confusing the median of the set of data with the mean. If there are outliers in the data (two very expensive houses, with all other homes around $100,000) that can make the mean value of the homes larger, half the homes will not have prices that fall above the mean.
Note that the length of the groups can make a difference in the information we get from a histogram. For example, the outlier in Worked Example 1.1 (Describing Data), does not show up when groups of width 3 are used. As another example, consider the following histogram; the entries are scores 1, 2, 3, . . . .
|
|
علامات بسيطة في جسدك قد تنذر بمرض "قاتل"
|
|
|
|
|
أول صور ثلاثية الأبعاد للغدة الزعترية البشرية
|
|
|
|
|
العتبة الحسينية تطلق فعاليات المخيم القرآني الثالث في جامعة البصرة
|
|
|