Making Groups – A Genetic Algorithm Experiment

I’ve wanted to experiment with genetic algorithms for a long time, but never quite found the time to make it happen. I guess part of it was that I never believed the algorithm would work and find what I wanted.  I decided recently to pick this project back up after a long time (with some prodding from good people) and actually make it work. I think the major push came when I realized that I wanted the ability to make mixed groups, or homogeneous groups, and balance gender, and prevent certain students from being together. Was I over-constraining? Was the possibility that I was over-constraining keeping me from even trying to do this in the first place?

Thus, I decided to actually make this happen. I also decided I wanted to make it look much nicer than the Python version I’ve been using now for over four years.

You can see the code directly here at CodePen, or play with it below.

[codepen_embed height=”265″ theme_id=”0″ slug_hash=”rmebEQ” default_tab=”js,result” user=”emwdx”]See the Pen GroupMaker by Evan Weinberg (@emwdx) on CodePen.[/codepen_embed]

The basic algorithm is this:

  • Fill in a list of students with names, genders, skill levels, and an optional list of names with whom a given student should not be grouped. Line 2
  • Generate a bunch of random groups of students with these properties. For each group, calculate a series of metrics that the fitness of a given group. Lines 45-156
  • Calculate the score of a grouping, which consists of a full set of groups that contain all of the students of the class. Line 200
  • Generate a bunch of groupings, sort them according to score. Take the top 10 groups, and make swaps of students between groups to make a long list of groups. (This is lines 214-224.) This is the mutation step of the genetic algorithm. Sort them again according to score.
  • Repeat for a few generations, then take the top group.

It’s pretty fascinating to watch it work. I made the scoring step uniform for gender by tweaking the coefficients in Line 200. You could also make this score value gender balance, a range of abilities, or anything else.

This is screaming for a nicer UI, which I have in the works. For now, it’s out there in its fairly un-commented state. If you want to hack this to use with your own students, you’ll want to tweak the student list in Line 2, the numbers of groups of each size (groups of 1, groups of 2, groups of 3, and so on) in Line 242, and possibly the values I use in the score generation line in line 200.

Automating conference scheduling using Python

I’ve always been interested in the process of matching large sets of data to a set of constraints – apparently the Nobel committee agreed this past week in awarding the economics prize. The person in charge of programming at my school in the Bronx managed to create an algorithm that generated a potential schedule for over four thousand students given student requests and needs. There was always some tweaking that needed to be done at the end to make it work, but the fact that the computer was able to start the process always amazed me. How do you teach a computer to do this sort of matching in an efficient way?

This has application within my classroom as well – generating groups based on ability, conflicting personalities, location – all complex situations that required time and attention to do correctly. In the end though, this is the same problem as arranging the schedules. It’s easy to start with a random arrangement and then make adjustments based on experience. There has to be a way to do this in an automated way that teaches the computer which placements work or do not. Andy Rundquist does this using genetic algorithms – I must know more about how he does it, as this is another approach to this type of problem.

This became a more tangible challenge for me to attempt to solve last year when I saw that the head of school was doing the two days of parent-teacher conference scheduling by hand. This is a complex process given the following constraints he was working to fulfill:

  • Parent preferences for morning/afternoon conference times.
  • Consecutive conference times for parents that had siblings so that the amount of time parents had to wait around was minimized.
  • Balanced schedules between the two days for all teachers.
  • Teachers with children had breaks in their schedule to attend conferences of their children.

This was apparently a process of 4 – 5 hours that sometimes required starting over because he discovered that the schedule he had started putting together was over constrained and could not meet all requirements. During this process, however, he had figured out an algorithm for what was most likely to work. Schedule the families with the largest number of children first, and work down the list in order of decreasing size. Based on the distribution of younger vs. older children in the school, start by scheduling the youngest children in a family first, and move to the older ones. Save all families with single children for last.

Hearing him talk about this process was interesting and heartbreaking at the same time – he works incredibly hard on all aspects of his job, and I wanted to provide some way to reduce the requirements of at least this task on his schedule. I was also looking for a reason to really learn Python, so this challenge became my personal exercise in problem based learning.

It took a while to figure out all of the details, but I broke it down into stages. How do you input the family data based on how it is already stored by the front office? (I didn’t want to ask the hard-working office staff to reformat the data to make it easier for me – this was supposed to make things easier on everyone.) How do you create a structure for storing this data in Python? How do you implement the algorithm that the head of school used and balance it with the idea of fairness and balance to all families and teachers?

Over the following few months, I was able to piece it together. It was, needless to say, a really interesting exercise. I learned how to ask the right questions that focused on the big picture needs of the administration, so that I could wrestle with the details of how to make it happen. The students learned that I was doing this (“Mr. Weinberg is using his robots to schedule conferences!”) and a few wanted to know how it worked. I have posted the code here as a gist.

I put in more than the 4-5 hours required to do this by hand. It was a learning experience for me. It also paid serious dividends when we needed to schedule conferences again for this year. We wanted to change the schedule slightly to be one full day rather than two half days, and it was a simple task adjusting the program to do this. We wanted to change the times of conferences so that the lower and upper schools had different amounts of time for each, rather than being a uniform twenty minutes each. (This I was not able to figure out before we needed conferences to go out, but I see a simple way to do it now.)

The big question that administration was about the upper school conferences. Last year we had seven different rooms for simultaneous conferences, and the question was whether we could reduce the number to five. I ran the program with five rooms a number of different times, and it was unable to find a working schedule, even with different arrangements and constraints. It was able to find one that worked with six rooms though, which frees administrators from needing to be in individual conference rooms so that they can address issues that come up during the day. Answering that question would not have been possible if scheduling was done by hand.

The question of using computers to automate processes that are repetitive has been in my head all this year. I’ve come to recognize when I am doing something along these lines, and try to immediately switch into creating a tool in Python to do this automatically. On the plane during a class trip last week, we needed to arrange students into hotel rooms, so I wrote a program to do this. I used it this week to also arrange my Algebra 2 students in groups for class. Generating practice questions for students to use as reassessment? I always find myself scrambling to make questions and write them out by hand, but my quiz generator has been working really well for doing this. Yesterday I had my first day of generating quizzes based on individual student needs.

The students get a kick out of hearing me say that I wrote a Python program to do XXX or YYY, and their reactions certainly are worth the effort. I think it just makes sense to use programming solutions when they allow me to focus on more important things. I have even had some success with getting individual students to want to learn to do this themselves, but I’ll write more about that later.