The other day, I tweeted out the following photo and link:
As advertised, the survey (click here if you’d like to respond) consisted of one question: “In your opinion, what’s the perfect banana ripeness? (For eating, NOT for making banana bread.)” The options were “More green than 1,” “1,” “2,” “3,” … and so on until “15,” and “More brown than 15.”
Here’s what I learned:
Y’ALL LOVE TALKING ABOUT BANANAS.
You wrestled with the choices, shamed your fellow MTBoS-ers for their preferences, and pondered why some might lean towards one end of the scale or the other. So far, I’ve received 1609 responses… and counting. Clearly, I struck a nerve.
Now, some of you may have assumed that I was collecting the data for a brilliant lesson in my classroom… well, prepare to be disappointed (although I do offer some suggestions for lessons below). In fact, the question emerged in my Grade 9 classroom when I saw a student eating a very green banana. Given that my Banana Number is 7 (I expect this metric to become no less iconic than the Erdős number) and that I find any banana less ripe than this to be basically inedible, I was perplexed. I brought up the following image on Google and did a quick poll:
Like you, my students were eager to debate the issue, which made me think it might offer an opportunity to take a brief detour into some data analysis. That night, I found the 1-15 photo (by Rebecca Wright; click here for original; I’m not sure who added the numbers) and made the poll, which I tweeted out and also had my students take in class. We briefly discussed what they expected the results to look like, and we moved on to that day’s lesson on fraction multiplication, as I was hoping to get a few more responses before analyzing them with my students. I expected a few dozen, maybe a hundred responses… so it would be an understatement to say that I was bowled over by the overwhelming reaction.
So, what now?
First, here are the (preliminary, given that responses are still rolling in) results (click here to see the data in Google Sheets):
The mean ripeness preference in this sample is 7.34 and the standard deviation is 1.96. About 74.8% of respondents fall within 1 standard deviation of the mean (that is, they prefer bananas between 6 and 9 on the ripeness scale), and about 95.8% fall within 2 standard deviations (between 4 and 11 on the ripeness scale).
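If you’d like to reproduce summary numbers like these from a table of response counts, here is a minimal Python sketch. It rests on a few assumptions the post doesn’t settle: the function names are made up for illustration, the standard deviation is the population version (dividing by n), and the two open-ended options (“More green than 1” and “More brown than 15”) would need a numeric code of your choosing, such as 0 and 16. The counts in the example are toy values, not the survey data.

```python
import math

def summarize(counts):
    """Return (mean, standard deviation) for a {rating: count} table."""
    n = sum(counts.values())
    mean = sum(rating * c for rating, c in counts.items()) / n
    # Population variance (divide by n); an assumption, since the post
    # does not say which version of the standard deviation was used.
    variance = sum(c * (rating - mean) ** 2 for rating, c in counts.items()) / n
    return mean, math.sqrt(variance)

def share_within(counts, k):
    """Fraction of responses within k standard deviations of the mean."""
    mean, sd = summarize(counts)
    n = sum(counts.values())
    inside = sum(c for rating, c in counts.items() if abs(rating - mean) <= k * sd)
    return inside / n

# Made-up toy counts for illustration only, NOT the survey results.
toy_counts = {6: 1, 7: 2, 8: 1}
mean, sd = summarize(toy_counts)
print(mean, round(sd, 3))
print(share_within(toy_counts, 1))
print(share_within(toy_counts, 2))
```

Swapping the real counts from the spreadsheet in for `toy_counts` should give figures comparable to the ones above, though the exact values depend on how the two open-ended options are coded.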
What I find interesting is that the frequency decreases on both sides of the mean, but begins to tick up again at both 1 and 14. This gives the distribution “heavy tails,” with more extreme values than a normal distribution would predict. (For example, the normal distribution would predict that essentially 0 people in the same sample would choose 15 as their most preferred banana to eat, as compared to 14 in the actual sample.) I wonder if this may be because there are no values above 15 or below 1, so the frequencies “pool” there (i.e., if the photo included bananas that were even more ripe than 15 and even more green than 1, this would distribute the frequencies currently at 1 and 15 to neighbouring numbers, because these groups would now be able to fine-tune their preference).
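The “essentially 0 people” claim can be checked with a short calculation. The sketch below takes the mean (7.34), standard deviation (1.96), and response count (1609) quoted in this post, and assumes that a rating of 15 corresponds to the interval from 14.5 to 15.5 on a continuous normal curve. That continuity convention and the helper names are my assumptions, not something the survey defines.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def expected_count(rating, mean, sd, n):
    """Expected number of respondents picking `rating` under a normal model.

    Assumption: an integer rating r stands for the interval [r - 0.5, r + 0.5).
    """
    p = normal_cdf(rating + 0.5, mean, sd) - normal_cdf(rating - 0.5, mean, sd)
    return n * p

# Values quoted in the post.
MEAN, SD, N = 7.34, 1.96, 1609

predicted_15 = expected_count(15, MEAN, SD, N)
print(f"Normal model predicts about {predicted_15:.2f} people choosing 15")
```

The prediction comes out well under one respondent, compared with the 14 who actually chose 15.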
More importantly: What might you do with this in your classroom?
This is where I hope you will chime in with your ideas. Like I said, I don’t expect to spend a lot of time on this with my own students, because it’s outside the scope of our Grade 9 curriculum. But here are some directions I might take it if I had the time and/or if I were lucky enough to teach statistics:
- Have students predict the shape of the distribution (after having taken the survey themselves). After analyzing the results and noting the main features of the distribution (how far you take this depends on your students and their background knowledge), ask students to design and carry out a survey where they think they might get a similar distribution. For example, Nat suggested that a similar distribution might be observed for milk & coffee preferences, with extreme left being black – no milk – and extreme right being milk – no coffee – (although I think this might be skewed towards the left if polling adults, and right if polling teenagers). (N.b. for SK readers: This may be an appropriate task for outcome SP9.2 – Demonstrate an understanding of the collection, display, and analysis of data through a project, or an introductory task for outcome FM20.6 – Demonstrate an understanding of normal distribution.)
- Connect to the topic of probability by asking students to make estimates for various samples. For example, in a sample of 500 people, how many would you expect to have a banana number of 5? 10?
- Have students refine the experiment to consider the effect of different factors on banana ripeness preference, such as age. (Others suggested nationality, but that would be more difficult to test.) Are there other ways to improve the experiment? (Of this, I am sure – I spent not more than 3 minutes designing the survey, certainly not expecting it to go as far as it did.)
- As Marla Goldberg suggested, reverse the task by presenting the results first and asking students, “What’s Going On in This Graph?” For instance, you might show students the following sequence, asking them to reflect after the first two and predict / explain the axis labels:
- Statistics students can test the data for normality. As I noted earlier, the distribution has heavy tails, and although the normal distribution predicts that 99.7% of the data would fall within 3 standard deviations of the mean, only 97.3% of the data does so in this sample (the difference seems small, but the data sample is quite large). I haven’t performed the necessary analysis, so I leave it to you. Might the distribution be multimodal?
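To see why the gap between 99.7% and 97.3% in the last bullet is bigger than it looks, it helps to turn both percentages into head counts. This rough sketch uses the 1609 responses mentioned earlier in the post; since responses were still coming in, the percentages may have been computed on a slightly different total, so treat the output as approximate. It is not a formal normality test.

```python
import math

N = 1609                      # responses reported in the post
OBSERVED_WITHIN_3SD = 0.973   # share within 3 SD reported in the post

# Share of a normal distribution lying more than 3 SD from the mean.
normal_outside = math.erfc(3 / math.sqrt(2))   # roughly 0.0027
observed_outside = 1 - OBSERVED_WITHIN_3SD

expected_n = N * normal_outside      # respondents a normal model would put out there
observed_n = N * observed_outside    # respondents the sample actually has out there

print(f"Normal model: about {expected_n:.1f} respondents beyond 3 SD")
print(f"This sample:  about {observed_n:.1f} respondents beyond 3 SD")
```

That works out to roughly 4 people expected versus roughly 43 observed, about ten times as many in the tails as a normal curve would suggest.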
I’d love to hear your ideas. (Here’s the link to the data again.)
What’s your toast number?