Grouping Data
A common way of handling continuous quantitative data is to subdivide the full range of values into a few sub-ranges. By assigning each continuous value to the sub-range or class within which it falls, the data set changes from continuous to discrete.
Grouping is done by defining a set of ranges and then counting how many of the data values fall inside each range. The sub-ranges must not overlap and must cover the entire range of the data set, so that every value belongs to exactly one sub-range.
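The two requirements on the sub-ranges can be checked mechanically. The following is a minimal sketch, not a standard routine: the function name `check_ranges` and the convention that a pair `(low, high)` means `low <= x < high` are assumptions made for illustration. Coverage is tested here in the practical sense that every data value falls inside some range.

```python
def check_ranges(ranges, data):
    """Check half-open ranges (low, high), each meaning low <= x < high."""
    ordered = sorted(ranges)
    # No overlap: each range starts at or after the end of the one before it.
    no_overlap = all(prev[1] <= cur[0] for prev, cur in zip(ordered, ordered[1:]))
    # Coverage: every data value lies inside at least one range.
    covers_data = all(
        any(low <= x < high for low, high in ordered) for x in data
    )
    return no_overlap and covers_data


# Hypothetical ranges and values, chosen only to exercise the check.
print(check_ranges([(0, 10), (10, 20)], [3, 10, 19]))
print(check_ranges([(0, 10), (5, 20)], [3, 10, 19]))
```

The first call uses ranges that meet end to start, while the second uses ranges that overlap between 5 and 10, so the check fails.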
One way of visualising grouped data is as a histogram. A histogram is a collection of rectangles, where the base of a rectangle (on the \(x\)-axis) covers the values in the range associated with it, and the height of a rectangle corresponds to the number of values in its range.
Example
Question
The heights in centimetres of \(\text{30}\) learners are given below.
| \(\text{142}\) | \(\text{163}\) | \(\text{169}\) | \(\text{132}\) | \(\text{139}\) | \(\text{140}\) | \(\text{152}\) | \(\text{168}\) | \(\text{139}\) | \(\text{150}\) |
| \(\text{161}\) | \(\text{132}\) | \(\text{162}\) | \(\text{172}\) | \(\text{146}\) | \(\text{152}\) | \(\text{150}\) | \(\text{132}\) | \(\text{157}\) | \(\text{133}\) |
| \(\text{141}\) | \(\text{170}\) | \(\text{156}\) | \(\text{155}\) | \(\text{169}\) | \(\text{138}\) | \(\text{142}\) | \(\text{160}\) | \(\text{164}\) | \(\text{168}\) |
Group the data into the following ranges and draw a histogram of the grouped data:
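\begin{align*} 130 \le h < 140 \\ 140 \le h < 150 \\ 150 \le h < 160 \\ 160 \le h < 170 \\ 170 \le h < 180 \end{align*}
(Note that the ranges do not overlap: each one starts where the previous one ended, and the upper end of each range is excluded. A height of \(\text{140}\), for example, falls in the second range and not the first.)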
Count the number of values in each range
| Range | Count |
| \(130\le h<140\) | \(\text{7}\) |
| \(140\le h<150\) | \(\text{5}\) |
| \(150\le h<160\) | \(\text{7}\) |
| \(160\le h<170\) | \(\text{9}\) |
| \(170\le h<180\) | \(\text{2}\) |
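The counts in the table can be reproduced with a short script. This is a sketch under stated assumptions: the variable names `heights`, `ranges` and `counts` are chosen here for illustration, and each pair `(low, high)` is read as \(low \le h < high\), matching the ranges in the question.

```python
# The 30 heights (in centimetres) from the question.
heights = [
    142, 163, 169, 132, 139, 140, 152, 168, 139, 150,
    161, 132, 162, 172, 146, 152, 150, 132, 157, 133,
    141, 170, 156, 155, 169, 138, 142, 160, 164, 168,
]

# Each pair (low, high) stands for the range low <= h < high.
ranges = [(130, 140), (140, 150), (150, 160), (160, 170), (170, 180)]

# Count how many heights fall inside each range.
counts = {
    (low, high): sum(1 for h in heights if low <= h < high)
    for low, high in ranges
}

for (low, high), count in counts.items():
    print(f"{low} <= h < {high}: {count}")

# Every height should have been counted exactly once.
print("Total:", sum(counts.values()))
```

A useful check on any grouping is that the counts add up to the number of data values, which is \(\text{30}\) here.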
Draw the histogram
Since there are \(\text{5}\) ranges, the histogram will have \(\text{5}\) rectangles. The base of each rectangle is defined by its range. The height of each rectangle is determined by the count in its range.
The histogram makes it easy to see in which range most of the heights are located and provides an overview of the distribution of the values in the data set.
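As a rough illustration of how the rectangles relate to the counts, the sketch below prints a text version of the histogram. It is turned on its side, so each range gets a horizontal bar whose length equals its count, rather than a vertical rectangle. The function name `text_histogram` and the `#` symbol are assumptions chosen for this example; the counts are taken from the table above.

```python
# Ranges and counts from the grouping step.
rows = [
    ("130 <= h < 140", 7),
    ("140 <= h < 150", 5),
    ("150 <= h < 160", 7),
    ("160 <= h < 170", 9),
    ("170 <= h < 180", 2),
]


def text_histogram(rows, symbol="#"):
    """Return one line per range, with a bar as long as the count."""
    width = max(len(label) for label, _ in rows)
    return [
        f"{label:<{width}} | {symbol * count} ({count})"
        for label, count in rows
    ]


for line in text_histogram(rows):
    print(line)
```

The longest bar belongs to the range \(160\le h<170\), which is the range containing the most heights.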