# Probability, Spreadsheets, and the Citizen Database

I've grown tired of the standard probability questions involving numbers of red, blue, and green marbles. Decks of cards are culturally biased and require a lot of background information to get in the game, as I wrote about a while ago. It seems that if there's any place where computational thinking should come into play, it's with probability and statistics. There are lots of open data sets out there, but few of them are (1) easy to parse for what a student might be looking for and (2) are in a form that allows students to easily make queries.

If you know of some that you've used successfully with classes, by all means let me know.

A couple of years ago, I built a web programming exercise to use to teach students about database queries. Spreadsheets are a lot more accessible though, so I re-wrote it to generate a giant spreadsheet of data for my Precalculus students to dig into as part of a unit on counting principle, probability, and statistics. I call it the Citizen Database, and you can access it here.

I wanted a set of data that could prompt all sorts of questions that could only be answered easily with a spreadsheet counting command. The citizens in the database can be described as follows:

• Each citizen belongs to one of twelve districts, numbered 1 - 12.
• Citizens are male or female.
• Citizens have their ages recorded in the database. Citizens 18 and below are considered minors. Citizens older than 18 and younger than 70 are adults. All citizens aged 70 and above are called seniors.
• Citizens each prefer one of the two sports teams: the Crusaders or the Orbiters.
• If a citizen is above the age of 18, they can vote for Mayor. There are two families that always run for mayor: the Crenshaw family and the Trymenaark family.
• Each citizen lives in either a home, apartment, villa, or mansion.
• A citizen above the age of 18 also uses some type of vehicle for transportation. They may rent a car, own a car, have a limousine, or take a helicopter.

I wrote another document showing how to do queries on a spreadsheet of data using some commands here. My students asked for some more help on creating queries using the COUNTIFS command on Google Sheets, so I also created the video below.

The fun thing has been seeing students acknowledge the fact that answering these questions would be a really poor use of the human brain, particularly given how quickly the computer comes up with an answer. One student went so far as to call this side-trip into spreadsheet usage "really actually useful", a comment which I decided only to appreciate.

Programming in Javascript, Python, Swift, whatever is great, but it takes a while to get to the point where you can do something that is actually impressive. Spreadsheets are an easy way in to computational thinking, and they are already installed on most student (and teacher) computers. We should be using them more frequently than we probably are in our practice.

If you are interested in how I generated the database, you can check out the code here at CodePen:

See the Pen CitizenDatabaseCreator by Evan Weinberg (@emwdx) on CodePen.