Eric Gaze is the Director of the Quantitative Reasoning Program at Bowdoin College and a Senior Lecturer in Mathematics. The example outlined here is an activity he uses in his Quantitative Reasoning course for non-mathematics majors. It can also be used in an introductory statistics course.
Histograms in general can be a major conceptual hurdle for students, as can the idea of sampling variability. Understanding the Central Limit Theorem requires students to visualize how repeated random sampling produces the theoretical distribution of sample means. This instructional example starts with imagining a data set of 9,000 random digits from 1 to 9. Students then brainstorm in groups of three or four about what the distribution of this data set would look like and estimate its mean and standard deviation. Gaze has repeatedly been surprised by how few of the student groups correctly predict a uniform distribution when brainstorming. Asking students to sketch possible histograms and predict the measures of center and spread is a great metacognitive task, requiring a deep understanding of these concepts that simple calculation alone doesn’t immediately develop.
Next, students create this distribution in Excel using the RANDBETWEEN(1,9) function, compute the mean and standard deviation using the AVERAGE and STDEV.P functions, and create a histogram, which shows a roughly uniform distribution. This activates prior knowledge and provides active learning through peer collaboration. Students can tangibly see why the mean of this data set with 9,000 random digits should be 5 and the standard deviation about half of that, around 2.5 (the exact value for equally likely digits 1 through 9 is about 2.58). The class moves on to simulate random sampling by simply generating 1,000 samples of size 30 using the same RANDBETWEEN(1,9) function. This is much more intuitive than generating random samples using more technical statistical programs like R and SPSS. Students can actually see all 1,000 random samples and their respective sample means. Students are again asked to brainstorm and sketch what they think the distribution of sample means will look like, and to estimate its measures of center and spread.
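For readers who want to check the predicted values outside a spreadsheet, the first step can be sketched in Python. This is a minimal sketch, not part of Gaze's activity: `rng.randint(1, 9)` stands in for RANDBETWEEN(1,9), `statistics.pstdev` for STDEV.P, and the seed and variable names are assumptions chosen for illustration.

```python
import math
import random
import statistics

# Fixed seed (an assumption for this sketch) so the run is repeatable.
rng = random.Random(2024)

# 9,000 random digits from 1 to 9, like 9,000 cells of RANDBETWEEN(1,9).
digits = [rng.randint(1, 9) for _ in range(9000)]

mean = statistics.fmean(digits)   # spreadsheet AVERAGE
sd = statistics.pstdev(digits)    # spreadsheet STDEV.P (population SD)

# Frequency of each digit: a text stand-in for the histogram.
counts = {d: digits.count(d) for d in range(1, 10)}

# Theoretical values for equally likely digits 1..9.
theory_mean = (1 + 9) / 2
theory_sd = math.sqrt((9**2 - 1) / 12)

print(f"simulated mean = {mean:.3f}, theoretical = {theory_mean}")
print(f"simulated SD   = {sd:.3f}, theoretical = {theory_sd:.3f}")
for d, c in counts.items():
    print(d, "#" * (c // 25))  # each '#' represents 25 occurrences
```

Each digit should appear roughly 1,000 times, which is the flat shape students are asked to predict before building the histogram.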
The activity wraps up with a robust discussion of why the standard error must be smaller than the population standard deviation but the means of both distributions should be identical. Students consider questions like, “Why is it almost impossible for a sample of size 30 to have a mean of 7?”
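One way to make the "mean of 7" question concrete is a short worked calculation. The sketch below assumes the nine digits are equally likely and uses the normal approximation for the distribution of sample means; the symbols σ, n, SE, and z are introduced here for illustration.

```latex
\sigma = \sqrt{\frac{9^2 - 1}{12}} \approx 2.58, \qquad
\mathrm{SE} = \frac{\sigma}{\sqrt{n}} = \frac{2.58}{\sqrt{30}} \approx 0.47, \qquad
z = \frac{7 - 5}{\mathrm{SE}} \approx 4.2
```

A sample mean of 7 sits more than four standard errors above the population mean of 5. Under the normal approximation, that is on the order of a 1-in-100,000 event for any single sample.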
Digital Resources
Excel or any other spreadsheet
Creating the histogram of the 1,000 sample means in Excel results in a beautiful approximation to the normal distribution! The mean and standard deviation of the sample means are computed and compared to the values predicted by the Central Limit Theorem. The standard error is very nearly equal to the population standard deviation divided by the square root of the sample size. Students can then hit the F9 key to recalculate all the random values instantly, causing the histograms to “dance,” illustrating the variability inherent in such modeling.
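The comparison between the observed and predicted standard error can also be sketched in Python. As before, this is an illustrative stand-in for the spreadsheet rather than the activity itself; the seed, constants, and variable names are assumptions.

```python
import math
import random
import statistics

# Fixed seed (an assumption); change it to mimic pressing F9 in Excel.
rng = random.Random(7)

SAMPLE_SIZE = 30     # digits per sample
NUM_SAMPLES = 1000   # number of samples, one spreadsheet row each

# Each inner list is one sample of 30 digits from RANDBETWEEN(1,9).
samples = [
    [rng.randint(1, 9) for _ in range(SAMPLE_SIZE)]
    for _ in range(NUM_SAMPLES)
]
sample_means = [statistics.fmean(s) for s in samples]

# Center and spread of the distribution of sample means.
mean_of_means = statistics.fmean(sample_means)
observed_se = statistics.pstdev(sample_means)

# Prediction: population SD divided by the square root of the sample size.
population_sd = math.sqrt((9**2 - 1) / 12)
predicted_se = population_sd / math.sqrt(SAMPLE_SIZE)  # about 0.471

# Share of samples whose mean reaches 7 or more.
share_at_least_7 = sum(m >= 7 for m in sample_means) / NUM_SAMPLES

print(f"mean of sample means = {mean_of_means:.3f}")
print(f"observed SE  = {observed_se:.3f}")
print(f"predicted SE = {predicted_se:.3f}")
print(f"share of sample means >= 7: {share_at_least_7:.3f}")
```

Rerunning with a different seed plays the role of the F9 key: the individual sample means change, while the mean of the means stays near 5 and the observed standard error stays near the predicted value.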
Active learning like this is particularly valuable for students with weaker math foundations, allowing them to tangibly see and discuss the abstract concepts involved. Gaze recommends giving students practice in advance with creating formulas and computing basic descriptive statistics in Excel or another spreadsheet. This allows all students to move quickly through the instructions and see the efficiency of the technology, rather than getting frustrated by a new tool. Overall, this is a collaborative and visual activity that emphasizes why the normal distribution is so important for the discipline of statistics.
Digital Enablement
Modeling with spreadsheets allows students to tangibly see and discuss the abstract concepts involved in the Central Limit Theorem. Being able to rerun the simulation over and over by hitting the F9 key in Excel is a powerful way to get students to appreciate the sampling variability that the Central Limit Theorem describes. Asking students to first sketch the distributions and estimate measures of center and spread prepares them to fully process the “answers” when they create the spreadsheet model for themselves. Using Excel like this is like having students actually play an instrument rather than just talk about playing it.