## Making Groups - A Genetic Algorithm Experiment

I've wanted to experiment with genetic algorithms for a long time, but never quite found the time to make it happen. Part of it was that I never believed the algorithm would actually work and find what I wanted. I recently decided to pick this project back up (with some prodding from good people) and make it work. The major push came when I realized that I wanted the ability to make mixed groups or homogeneous groups, balance gender, and prevent certain students from being together. Was I over-constraining? Was the possibility that I was over-constraining keeping me from even trying in the first place?

Thus, I decided to actually make this happen. I also decided I wanted to make it look much nicer than the Python version I've been using now for over four years.

You can see the code directly here at CodePen, or play with it below.

See the Pen GroupMaker by Evan Weinberg (@emwdx) on CodePen.

The basic algorithm is this:

• Fill in a list of students with names, genders, skill levels, and an optional list of names with whom a given student should not be grouped. (Line 2)
• Generate a bunch of random groups of students with these properties. For each group, calculate a series of metrics that determine the fitness of a given group. (Lines 45-156)
• Calculate the score of a grouping, which consists of a full set of groups containing all of the students in the class. (Line 200)
• Generate a bunch of groupings and sort them by score. Take the top 10 groupings and swap students between groups to make a long list of new groupings. (Lines 214-224) This is the mutation step of the genetic algorithm. Sort them again by score.
• Repeat for a few generations, then take the top group.
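As a rough sketch of those steps, here's a miniature version of the whole loop. The data, function names, and penalty coefficients below are my own illustration, not the ones from the actual CodePen:

```javascript
// Minimal sketch of the grouping GA. Names, weights, and group sizes
// are illustrative, not the values from the real CodePen.
const students = [
  { name: "A", gender: "F", skill: 3, avoid: ["D"] },
  { name: "B", gender: "M", skill: 1, avoid: [] },
  { name: "C", gender: "F", skill: 2, avoid: [] },
  { name: "D", gender: "M", skill: 3, avoid: ["A"] },
  { name: "E", gender: "F", skill: 1, avoid: [] },
  { name: "F", gender: "M", skill: 2, avoid: [] },
];

// Shuffle the roster, then cut it into groups of the requested size.
function randomGrouping(students, groupSize) {
  const pool = [...students].sort(() => Math.random() - 0.5);
  const groups = [];
  for (let i = 0; i < pool.length; i += groupSize) {
    groups.push(pool.slice(i, i + groupSize));
  }
  return groups;
}

// Fitness of one group: penalize skill spread, gender imbalance,
// and forbidden pairings. Lower is better.
function groupFitness(group) {
  const skills = group.map(s => s.skill);
  const spread = Math.max(...skills) - Math.min(...skills);
  const females = group.filter(s => s.gender === "F").length;
  const imbalance = Math.abs(2 * females - group.length);
  const conflicts = group.filter(s =>
    s.avoid.some(n => group.some(o => o.name === n))).length;
  return 1 * spread + 2 * imbalance + 100 * conflicts;
}

// Score of a full grouping = sum of its groups' fitness values.
const score = grouping => grouping.reduce((t, g) => t + groupFitness(g), 0);

// Mutation step: swap one student between two randomly chosen groups.
function mutate(grouping) {
  const copy = grouping.map(g => [...g]);
  const [a, b] = [0, 1].map(() => Math.floor(Math.random() * copy.length));
  const i = Math.floor(Math.random() * copy[a].length);
  const j = Math.floor(Math.random() * copy[b].length);
  [copy[a][i], copy[b][j]] = [copy[b][j], copy[a][i]];
  return copy;
}

// Generate, sort by score, keep the top 10, mutate, repeat.
let population = Array.from({ length: 50 }, () => randomGrouping(students, 2));
for (let gen = 0; gen < 20; gen++) {
  population.sort((x, y) => score(x) - score(y));
  const elite = population.slice(0, 10);
  population = elite.concat(elite.flatMap(g =>
    Array.from({ length: 4 }, () => mutate(g))));
}
const best = population.sort((x, y) => score(x) - score(y))[0];
```

Weighting the conflict term much more heavily than the others is one way to treat "keep these students apart" as a near-hard constraint while leaving skill mix and gender balance as softer preferences you can tune.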

It's pretty fascinating to watch it work. I made the scoring step uniform for gender by tweaking the coefficients in Line 200. You could also make this score value gender balance, a range of abilities, or anything else.

This is screaming for a nicer UI, which I have in the works. For now, it's out there in its fairly un-commented state. If you want to hack this to use with your own students, you'll want to tweak the student list in Line 2, the numbers of groups of each size (groups of 1, groups of 2, groups of 3, and so on) in Line 242, and possibly the values I use in the score-generation line (Line 200).

## An Experiment: Swapping Numerical Grades for Skill-Levels and Emoji

I decided to try something different for my pre-Calculus class for the past three weeks. There was a mix of factors that led me to do this when I did:

• The quarter ended one week, with spring break beginning at the end of the next. Not a great time to start a full unit.
• I knew I wanted to include some conic sections content in the course since it appears on the SAT II, and since the graphs appear in IB and AP questions. Some familiarity might be useful. In addition, conic sections also appear as plus standards within CCSS.
• The topic provides a really interesting opportunity to connect the worlds of geometry and algebra. Much of this connection, historically, is wrapped up in algebraic derivations. I wanted to use technology to do much of the heavy lifting here.
• Students were exhibiting pretty high levels of stress around school in general, and I wanted to provide a bit of a break from that.
• We are not in a hurry in this class.

Before I share the details of what I did, I have to share the other side to this. A long time ago, I was intrigued by the conversation started around the Twitter hashtag #emojigrading, a conversational fire stoked by Jon Smith, among many others. I like the idea of using emoji to communicate, particularly given my frustrations over the past year with how communicating grades as numbers distorts their meaning and implies precision that doesn't exist. Emoji can be used to communicate quickly, but can't be averaged.

I was also very pleased to find out that PowerSchool comments can contain emoji, and will display them correctly based on the operating system being used.

So here's the idea I pitched to students:

• Unit 7 standards on conic sections would not be assessed with numerical grades, ever. As a result, these grades would not affect their numerical average.
• We would still have standards quizzes and a unit exam, but instead of grades of 6, 8, and 10, there would be some other designation that students could help select. I would grade the quizzes and give feedback during the class, as with the rest of the units this year.
• Questions related to Unit 7 would still appear on the final exam for the semester, where scores will be point based.

I also let students submit some examples of an appropriate scale. Here's what I settled on based on their recommendations:

I also asked them for their feedback before this all began. Here's what they said:

• Positive Feedback:
• Fourteen students made some mention of a reduction in stress or pressure. Some also mentioned that the grade being less specific was a good thing.
• Three students talked about being able to focus more on learning as a result. Note that since I already use a standards-based grading system, my students are pretty aware of how much I value learning being reflected in the grade book.
• Constructive Feedback:
• Students were concerned about their own motivation about studying or reassessing knowing that the grades would not be part of the numerical average.
• Some students were concerned about not having knowledge about where they are relative to the boundaries of the grades. Note: I don't see this by itself as a bad thing, but perhaps as the start of a different conversation. Instead of "How do I raise my grade?", the question becomes "How do I develop the skills needed to reach a higher level?"
• There were also mentions of 'objectivity' and how I would measure their performance relative to standards. I explained during class that I would probably do what I always do: calculate scores on individual standards, and use those scores to inform my decisions on standards levels. I was careful to explain that I wasn't going to change how I generate the standards scores (which students have previously agreed are fair) but how I communicate them.

I asked an additional question about what their parents would think about the change. My plan was to send out an email to all parents informing them of the specifics of the change, and I wanted students to think proactively about how their parents would respond. Their response in general: "They won't care much." This was surprising to me.

So I proceeded with the unit. I used a mix of direct instruction, some Trello-style lists of tasks from textbooks, websites, and Desmos, and lots of circulating and helping students individually where they needed it. I tried to keep the only major change in this unit to the communication of scores through the grade book using the emoji and a verbal designation of beginner, intermediate, or expert. As I said earlier, I gave skills quizzes throughout.

The unit exam was a series of medium level questions that I wanted to use to gauge where students were when everything was together. As with my other units, I gave a review class after the spring break where students could work on their own and in groups, asking questions where they needed it. Anecdotally, the class was as focused and productive as for any other unit this year.

The fact that the stress level was the same, if not lower, was good to see. The effort level did drop for a couple of students, but for the most part there wasn't any major change. This class as a whole values working independently, so I'm not surprised that none reported working harder during this unit.

I also asked them to give me general feedback about the no-numerical-grades policy. Some of them deleted their responses before I could take a look, but here's some of what they shared:

• Three students confirmed a lower stress level. One student explained that since there was no numerical grade, she "...couldn't force/motivate [her]self to study."
• Five students said the change made little to no difference to them. One student summed it up nicely: "It wasn't much different than the numerical grades, but it definitely wasn't worse."
• One student said this: "The emojis seemed abstract so I wasn't as sure of where I was within the unit compared to numbers." This is one of a couple of the students that had concerns about knowing how to move from one level to the next, so the unit didn't change this particular student's mind.

This was a really thought-provoking exercise. A move away from numerical grades is a compelling proposition, but a frequent argument against it is that grades motivate students. By no means have I disproven that claim with the results of my small case study. But if a move like this can have a minimal effect on motivation, and students still get the feedback they need to improve, it offers an opportunity to consider similar experiments in my other classes.

There are a couple of questions I still have about this. The first: will students choose to reassess on the learning standards from unit 7, given that those reassessments won't change the numerical average once we return to numerical grades for unit 8? The second involves longer-term retention of this material: how will students do on these questions when they appear on the final exam?

## Trello for Class Organization

Our school hosted the Vietnam Technology Conference this past February.

(Yes, I'm just getting around to talking about it. Don't judge.)

One of the sessions I attended was about agile development in education, specifically as a way to organize the classroom into a room of independently functioning teams that are all trying to reach the goal of learning content. The full details on the philosophy can be found at http://eduscrum.com. I most certainly am not following the full implementation described there.

My interest was piqued by the possibility of using a Trello board to organize tasks for my classroom. I always make a digital handout for each class that consists of a series of tasks, links, problems, and chunks of information. Within the class block, I weave these together in a mix of direct instruction, group tasks, PearDeck activities, Desmos explorations, and so on. I advise students not to just do every problem on my handouts from start to finish because there is an order to my madness. I have a plan for students to go through the different activities, but I don't always clearly indicate that plan on these handouts.

This is where Trello came in. For my past two units in PreCalculus, I broke up the tasks on my digital handout into tasks on a Trello board. This consists of a list of tasks, and then three columns labeled 'to-do', 'in progress', and 'completed'.

I put students in groups, and then shared this Trello board here with them. Their group needed to make a Trello board for their group, and then copy the day's tasks onto their group's board. I told students how long a 'sprint' (an agile development term) was going to be, and the group would decide which tasks they would collectively (or individually) do during that time. They moved these tasks into the appropriate column of the board. As I continued to use the system, I realized that I could color code tasks according to the learning standards, and identify them according to the day of class. This helped students to understand the context of individual tasks later on.

The thing I liked the most about this experiment was that it actually enabled students to take charge of what they were doing during the lesson. I sometimes said that I was going to go over a concept at a particular time during the class block, and that teams could pay attention or not depending on their needs. This could be replaced by videos of this direct instruction to allow for more asynchronous learning for the students that weren't ready at that time. There were some great conversations between students about what they did and didn't understand. I could circulate and interject when I saw the need.

This put me in the position of curating interesting and productive tasks related to the course content, which is a lot more fun than planning a lecture. The students also liked being able to talk to each other and work at their own pace. Here's some feedback from the students:

#### What they liked:

• "I think it was nice how I could do things by whatever pace I felt more comfortable with to an extent since we did things as a small group rather than as an entire class."
• "It kept me working and thinking the whole class. It also helped me work out problems independently which helped my understanding."
• "I liked the ability to keep track of all my work, as well as knowing all the problems beforehand. I also like being able to have more time to discuss with friends to understand how we each came up with various solutions."

#### What could be improved:

• "Maybe I rather stick with traditional teaching methods. This is borderline self-taught and it's not so much better with group of people that I don't know well."
• "I think it would be better to go through the theory and concepts of the standard first, meaning how to do a problem as a class before splitting into smaller groups for individual/team work."
• "For future classes, I would also like informative videos to be included so that we can learn new topics this way."

This feedback made it easy to adjust for the next classes, and I continued to tweak things in the next unit. The students really liked the act of moving tasks between the different columns on the Trello board, too. I really like the ease with which students can copy tasks, move them around, and plan their time independently. There are some good habits here that I'll be thinking about expanding to other classes later this semester or for the next school year.

## Generating the Mandelbrot Set with PearDeck

One of the benefits of being a digital packrat with a digital file cabinet is that every old file can be a starting point for something new.

In PreCalculus, I decided to do a short conic sections unit to fill the awkward two weeks between the end of the third quarter and the start of spring break. We've centered all of our conversations around the idea of a locus of points. I realized yesterday afternoon that the Algebra 2 activity I described here would be a great way to have some inquiry and experimentation on the day before break.

The online collaborative tools have improved considerably since 2012 when I first did this. I put much of the lesson into Google Docs and PearDeck which made sharing answers for the final reveal much easier. Here's what the students had for values that either "escaped" or were "trapped" in the Complex Plane:

I compared this to the pixelated Mandelbrot set I hacked together in Processing from Daniel Shiffman's code five years ago. Still works!
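The "escaped or trapped" test behind both the student activity and the Processing sketch is just the Mandelbrot iteration. A minimal version (the iteration cap is my choice; the escape bound |z| > 2 is the standard one):

```javascript
// Decide whether a complex number c = a + bi is "trapped" in the
// Mandelbrot set: iterate z -> z^2 + c starting from z = 0 and see
// whether |z| stays bounded. If |z| ever exceeds 2, it escapes.
function isTrapped(a, b, maxIter = 100) {
  let x = 0, y = 0; // z = x + yi
  for (let i = 0; i < maxIter; i++) {
    const nx = x * x - y * y + a; // real part of z^2 + c
    const ny = 2 * x * y + b;     // imaginary part of z^2 + c
    x = nx;
    y = ny;
    if (x * x + y * y > 4) return false; // escaped: |z| > 2
  }
  return true; // still bounded after maxIter steps: trapped
}
```

Values like c = 0 and c = -1 come back trapped, while anything well outside the set, like c = 1, escapes within a few iterations, which matches the sorting the students did by hand.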

You can access the entire digital lesson with links as a Google Doc here.

## My Reassessment Queue

We're almost at the end of the third quarter over here. Here's the current plot of the number of reassessments over time this semester:

I'm energized, though, that the students have bought into the system, and that my improved workflow from last semester is making the process manageable. My pile of reassessment papers grows faster than I'd like, but I've also improved the physical process of managing the paperwork.

While I'm battling performance issues on the site now that there's a lot of data moving around on there, the thing I'm more interested in is improving participation. Who are the students that aren't reassessing? Why aren't they reassessing? How do I get them involved?

There are lots of issues at play here. I'm loving how I've been experimenting a lot lately with new ways of assessing, structuring classes, rethinking the grade book, and just plain trying new activities out on students. I'll do a better job of sharing out in the weeks to come.

## SBG and Leveling Up, Part 3: The Machine Thinks!

Read the first two posts in this series here:

...or you can read this quick review of where I've been going with this:

• When a student asks to be reassessed on a learning standard, the most important inputs that contribute to the student's new achievement level are the student's previously assessed level, the difficulty of a given reassessment question, and the nature of any errors made during the reassessment.
• Machine learning offers a convenient way to find patterns that I might not otherwise notice in these grading patterns.

Rather than design a flow chart that arbitrarily figures out the new grade given these inputs, my idea was to simply take different combinations of these inputs, and use my experience to determine what new grade I would assign. Any patterns that exist there (if there are any) would be determined by the machine learning algorithm.

I trained the neural network methodically. These were the general parameters:

• I only did ten or twenty grades at any given time to avoid the effects of fatigue.
• I graded at different times of day: morning, afternoon, before and after lunch, and some at night.
• I spread this out over a few days to minimize the effects of any one particular day on the training.
• When I noticed there weren't many grades at the upper end of the scale, I changed the program to generate instances of just those grades.
• The permutation-fanatics among you might be interested in the fact that there are 5*3*2*2*2 = 120 possibilities for numerical combinations. I ended up grading just over 200. Why not just grade every single possibility? Simple - I don't pretend to think I'm really consistent when I'm doing this. That's part of the problem. I want the algorithm to figure out what, on average, I tend to do in a number of different situations.
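The combination count in that last bullet can be checked with a quick enumeration. The specific ranges below are my assumptions about the five inputs (previous level, difficulty, and three error flags), but the counts multiply out the same way:

```javascript
// Enumerate every combination of the five grading inputs.
// Levels and difficulty labels are assumptions for illustration;
// the point is that 5 * 3 * 2 * 2 * 2 = 120 combinations exist.
const levels = [6, 7, 8, 9, 10];          // previous assessed level
const difficulties = ["low", "mid", "high"]; // question difficulty
const flags = [false, true];               // error made or not

const combos = [];
for (const level of levels)
  for (const difficulty of difficulties)
    for (const conceptual of flags)
      for (const algebraic of flags)
        for (const arithmetic of flags)
          combos.push({ level, difficulty, conceptual, algebraic, arithmetic });

console.log(combos.length); // 120
```

Grading more than 120 scenarios therefore guarantees repeats, which is exactly what lets the algorithm average over any inconsistency in how the same scenario gets graded at different times.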

After training for a while, I was ready to have the network make some predictions. I made a little visualizer to help me see the results:

You can also see this in action by going to the CodePen, clicking on the 'Load Trained Data' button, and playing around with it yourself. There's no limit to the values in the form, so some crazy results can occur.

The thing that makes me happiest is that there's nothing surprising in the results.

• Conceptual errors are the most important ones limiting students from making progress from one level to the next. This makes sense: once a student has made a conceptual error, I generally don't let that student increase their proficiency level.
• Students with low scores that ask for the highest difficulty problems probably shouldn't.
• Students that have an 8 can get to a 9 by doing a middle-difficulty problem, but can't get to a 10 in one reassessment without doing the highest-difficulty problem. On the other hand, a student at a 9 that makes a conceptual error on a middle-difficulty problem is brought back to a 7.

When I shared this with students, the thing they seemed most interested in was using the tool to decide what sort of problem to request for a given reassessment. Some students with a 6 have come in asking for the simplest level question so they can be guaranteed a rise to a 7 if they answer correctly. A lot of level 8 students want to become a 10 in one go, but often make a conceptual error along the way and are limited to a 9. I clearly have the freedom to classify these different types of errors as I see fit when a student comes to meet with me. When I ask students what they think about having this tool available to them, the response is usually that it's a good way to be fair. I'm pretty happy about that.

I'll continue playing with this. It was an interesting way to analyze my thinking around something that I consider to still be pretty fuzzy, even this long after getting involved with SBG in my classes.

## SBG and Leveling Up - Part 2: Machine Learning

In my 100-point scale series last June, I wrote about how our system does a pretty cruddy job of classifying students based on raw point percentages. In a later post in that series, I proposed that machine learning might serve as a way to make sense of our intuition around student achievement levels and help provide insight into refining a rubric to better reflect a student's ability.

In my last post, I wrote about my desire to become more methodical about deciding how a student moves from one standard level to the next. I typically know what I'm looking for when I see it. Identifying a student's level requires observing students and their skills relative to a given set of tasks. Defining the characteristics of different levels is crucial to communicating those levels to students and parents, and to being consistent among different groups. This is precisely what we intend to do when we define a rubric or grading scale.

I need help relating my observations of different factors to a numerical scale. I want students to know clearly what they might expect to get in a given session. I want them to understand my expectations of what is necessary to go from a level 6 to a level 8. I don't believe I have the ability to design a simple grid rubric that describes all of this to them though. I could try, sure, but why not use some computational thinking to do the pattern finding for me?

In my last post, I detailed some elements that I typically consider in assigning a level to a student: previously recorded level, question difficulty, number of conceptual errors, and numbers of algebraic and arithmetic errors. I had the goal of creating a system that lets me go through the following process:

• I am presented with a series of scenarios with different initial scores, arithmetic errors, conceptual errors, and so on.
• I decide what new numerical level I think is appropriate given this information. I enter that into the system.
• The system uses these examples to make predictions for what score it thinks I will give a different set of parameters. I can choose to agree, or assign a different level.
• With sufficient training, the computer should be able to agree with my assessment a majority of the time.
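To make the shape of that train-then-predict loop concrete, here's a miniature version with a simple nearest-neighbor model standing in for the neural network. The actual CodePen uses a machine learning library; this stand-in only illustrates the workflow, and the example grades are invented:

```javascript
// Stand-in for the trained model: store graded examples, then predict
// a new score by averaging the scores of the k closest stored examples.
// This is NOT the neural network from the CodePen, just the same
// train/predict workflow in miniature.
const examples = []; // { input: [level, difficulty, conceptual, algebraic], score }

function train(input, score) {
  examples.push({ input, score });
}

// Squared Euclidean distance between two input vectors.
function distance(a, b) {
  return a.reduce((t, v, i) => t + (v - b[i]) ** 2, 0);
}

// Suggest a score for a new scenario from the k nearest graded ones.
function predict(input, k = 3) {
  const nearest = [...examples]
    .sort((x, y) => distance(x.input, input) - distance(y.input, input))
    .slice(0, k);
  const avg = nearest.reduce((t, e) => t + e.score, 0) / nearest.length;
  return Math.round(avg);
}

// The teacher grades a few scenarios (invented examples)...
train([6, 1, 0, 0], 7);  // level 6, easy question, clean work -> 7
train([6, 1, 1, 0], 6);  // same, but with a conceptual error -> stays 6
train([8, 2, 0, 0], 9);  // level 8, middle difficulty, clean -> 9
train([8, 3, 0, 0], 10); // level 8, hardest difficulty, clean -> 10

// ...then the system suggests scores for new scenarios, and the
// teacher either agrees or corrects it with another train() call.
predict([6, 1, 0, 0]);
```

With enough graded examples, the suggestions start matching what the teacher would have entered, which is the "agree with my assessment a majority of the time" goal from the list above.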

After a lot of trial and error, more learning about React, and figuring out how to use a different machine learning library than I used previously, I was able to piece together a working prototype.

You can play with my implementation yourself by visiting the CodePen that I used to write this. The first ten suggested scores are generated by increasing the input score by one, but the next ten use the neural network to generate the suggested scores.

In my next post in this series, I'll discuss the methodology I followed for training this neural network and how I've been sharing the results with my students.

## Exploring Dan Meyer's Boat Dock with PearDeck

In PreCalculus, I tend to be application heavy whenever possible. This unit, which has focused on analytic trigonometry, has been pretty high on the abstraction ladder. I try to emphasize right triangle trigonometry in nearly everything we do so that students have a way in, but that's still pretty abstract. I decided it was time to do something more on the application side.

Enter Dan Meyer's Boat Dock, a makeover concept he put together a year ago on his blog.

I decided to put some of it into Pear Deck to allow for efficient collection of student responses. The start of my activity was the same as what Dan suggested in his blog post:

After collecting the data, I asked students to clarify what they meant by 'best' and 'worst'. Student comments were focused on safety, cost, and limiting the movement of the ramp.

I shared that the maximum safe angle for the ramp was 18°, and then called upon PearDeck to use one of its best features to see what the class was thinking visually. I asked students to draw the best ramp.

After having them draw it, I had them calculate the length of the best ramp. This is where some of the best conflict arose. Not everyone responded, for a number of reasons, but the spread was pretty awesome in terms of stoking conversation. Check it out:

The source of some of the conflict was this commonly drawn triangle, which prompted lots of productive discussion.

When students built their safest ramp using the Boat Dock simulator, it sent the modelling cycle back to the start, which is always great to be able to do.

I then asked students to create a tool using a spreadsheet, program, or algorithm by hand for finding the safest ramp of least cost for every random length of the ramp in the simulator. This open-ended request led to a lot of students nodding their heads about concepts learned in their programming classes being applied in a new context. It also led to a lot of confusion, but productive confusion.
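At its core, the tool the students were building reduces to one right-triangle calculation: the ramp is the hypotenuse, the dock height is the opposite side, and the steepest safe angle gives the shortest ramp. A sketch, assuming cost scales with ramp length and using the 18° limit from the activity:

```javascript
// Shortest ramp that keeps the incline at or below the maximum safe
// angle. The ramp is the hypotenuse of a right triangle whose opposite
// side is the dock height, so length = height / sin(angle). The
// steepest allowed angle gives the shortest (and cheapest) safe ramp.
function shortestSafeRamp(heightMeters, maxAngleDegrees = 18) {
  const radians = (maxAngleDegrees * Math.PI) / 180;
  return heightMeters / Math.sin(radians);
}

shortestSafeRamp(1); // about 3.24 m of ramp per 1 m of dock height
```

A spreadsheet version is the same formula in a cell; the interesting conversation is why a shallower angle (safer-feeling to many students) always costs more ramp.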

This was a lot of fun - I need to do this more often. I say that a lot about things like this though, so I also hope I follow my own advice.

## Standards Based Grading and Leveling Up

I've been really happy since joining the SBG fan club a few years ago.

As I've gained experience, I've been able to hone my definitions of what it means to be a six, eight, or ten. Much of what happens when students sign up to do a reassessment is based on applying my experience to evaluating individual students against these definitions. I give a student a problem or two, ask him or her to talk to me about it, and based on the overall interaction, I decide where students are on that scale.

And yet, with all of that experience, I still sometimes fear that I might not be as consistent as I think I am. I've wondered whether my mood, fatigue level, or the time of day affects my assessment of that level. From a more cynical perspective, I also really, really hope that past experiences with a given student, gender, nationality, and other characteristics don't enter into the process. I don't know how I would measure the effect of all of these to confirm they are not significant, if they exist at all. I don't think I fully trust myself to be truly unbiased, as well intentioned as I might try to be or think I am.

Before the winter break, I came up with a new way to look at the problem. If I can define what demonstrated characteristics should matter for assessing a student's level, and test myself to decide how I would respond to different arrangements of those characteristics, I might have a way to better define this for myself, and more importantly, communicate those to my students.

I determined the following to be the parameters I use to decide where a student is on my scale based on a given reassessment session:

1. A student's previously assessed level. This is an indicator of past performance. With measurement error and a whole host of other factors affecting the connection between this level and where a student actually is at any given time, I don't think this is necessarily the most important. It is, in reality, information that I use to decide what type of question to give a student, and as such, is usually my starting point.
2. The difficulty of the question(s). A student that really struggled on the first assessment is not going to get a high level synthesis question. A student at the upper end of the scale is going to get a question that requires transfer and understanding. I think this is probably the most obvious out of the factors I'm listing here.
3. Conceptual errors made by the student during the reassessment. In the context of the previous two, this is key in whether a student should (or should not) advance. Is a conceptual error in the context of basic skills the same as one of application of those skills? These apply differently at a level six versus a level eight. I know this effect when I see it and feel pretty confident in my ability to identify one or more of these errors.
4. Arithmetic/Sign errors and Algebraic errors. I consider these separately when I look at a student's work. Using a calculator appropriately to check arithmetic is something students should be able to do. Deciding to do this when calculations don't make sense is a sign of a more skilled student in comparison to one that does not. Observing these errors is routinely something I identify as a barrier to advancement, but not necessarily in decreasing a student's level.

There are, of course, other factors to consider. I decided to settle on the ones mentioned above for the next steps of my winter break project.

I'll share how I moved forward on this in my next post in the series.

## After Individualized Learning, What Comes Next?

This was my classroom in the latter part of the last block of the day.

I should point out that I usually have students seated closer together in groups. Conversation happens more organically in that configuration. I gave a quiz where I didn't want to set hard time limits. As each finished, I nudged them to work on their own on a PearDeck assignment.

This is what it looks like when everyone is working at their own pace. Each student with a single screen, each solving problems and answering questions.

I like that I can drift from student to student and either ask or answer questions when the time seems right. I can see each student's answers on the online teacher dashboard. I can decide which conversations I need to have. Students can also decide if they need to have conversations with me. I involved myself in student learning with surgical precision.

Some claim this is the future of learning in schools.

For me, the silence in the room today was unsatisfying. No sharing of ideas. No excitement shared between friends. Nothing that might compel a student to contemplate the other living, breathing beings in the room.

I don't do this every day, so I know this isn't how it will always be. Whenever I do this type of lesson, I know that the students are better off when they get what they need. I know it is good for them. My thinking always goes to the next step. What will we do when we are back together in a big group, or at least in groups larger than one?

I asked the students the following question:

We have all worked independently today. What is the best way to use our time when we are back together?

Their answers gave me the direction I needed to think about the next steps:

• Either going over answers so people who haven't put the answers in will know what to do and what the correct answer is.
• Maybe review the questions people were confused with or got wrong a lot. This would help to review what we did.
• Go over some of the answers together or answer some of the difficult problems.
• I don't know.
• To check where the majority of us got stuck and had trouble, and discuss how to figure out those problems. That, and to possibly discuss new concepts that we have yet to master.
• Going through the questions that could be tricky and solve challenging problems together
• I think it is good to have a mini lesson to quickly teach what we are learning and to go over, but also save time for students to work independently.
• Give us a few examples at the beginning of class so we don't forget what we learned, and then learn new things.
• To quickly go through and review all the topics we've learnt about polynomials.
• Learn something new

Knowing what to go over is certainly where the online tools are helpful - they make incorrect answers or misconceptions stand out.

I know that the students prefer the social aspects of the classroom. Whatever our next step is, it should involve coming together and acknowledging and appreciating we are in a room of people learning together.

We need to make sure that we are social when being social is productive to learning.

We need to make sure students have time to learn and think on their own.

We need to make sure students can also learn what they need to know in the hands of an experienced guide.

All of these are crucial. Any one channel we use loses its effectiveness to learning when it becomes routine. I think this is especially the case when that routine involves staring at an entity that can't talk or laugh back.