
SBG and Leveling Up, Part 3: The Machine Thinks!

Read the first two posts in this series here:

SBG and Leveling Up, Part 1
SBG and Leveling Up, Part 2: Machine Learning

...or you can read this quick review of where I've been going with this:

  • When a student asks to be reassessed on a learning standard, the most important inputs that contribute to the student's new achievement level are the student's previously assessed level, the difficulty of a given reassessment question, and the nature of any errors made during the reassessment.
  • Machine learning offers a convenient way to find patterns in my grading that I might not otherwise notice.

Rather than design a flow chart that arbitrarily figures out the new grade given these inputs, my idea was to simply take different combinations of these inputs and use my experience to decide what new grade I would assign. Any patterns that exist there would be left for the machine learning algorithm to find.
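To make that concrete, each graded combination can be represented as a single training example, with the inputs scaled to the 0-1 range that neural network libraries generally expect. The sketch below is illustrative only; the field names, the 0-10 level scale, and the three difficulty levels are assumptions, not the actual code behind the CodePen.

    // One graded scenario encoded as a training example.
    // Inputs are scaled to 0-1; the output is the new level assigned.
    function encodeScenario(previousLevel, difficulty, conceptual, algebraic, arithmetic, newLevel) {
      return {
        input: {
          previousLevel: previousLevel / 10, // previously assessed level on a 0-10 scale
          difficulty: difficulty / 3,        // 1 = simplest, 3 = most difficult
          conceptual: conceptual ? 1 : 0,    // conceptual error made?
          algebraic: algebraic ? 1 : 0,      // algebraic error made?
          arithmetic: arithmetic ? 1 : 0     // arithmetic error made?
        },
        output: { newLevel: newLevel / 10 }  // the level I decided to assign
      };
    }

    // Example: a level 6 student answering the simplest question correctly,
    // graded as a 7.
    const example = encodeScenario(6, 1, false, false, false, 7);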

I trained the neural network methodically. These were the general parameters:

  • I only did ten or twenty grades at any given time to avoid the effects of fatigue.
  • I graded at different times: in the morning, in the afternoon, before and after lunch, and also some at night.
  • I spread this out over a few days to minimize the effects of any one particular day on the training.
  • When I noticed there weren't many grades at the upper end of the scale, I changed the program to generate instances of just those grades.
  • The permutation-fanatics among you might be interested in the fact that there are 5*3*2*2*2 = 120 possible combinations of these inputs. I ended up grading just over 200. Why not just grade every single possibility once? Simple - I don't pretend to be really consistent when I'm doing this. That's part of the problem. I want the algorithm to figure out what, on average, I tend to do in a number of different situations. (A sketch of generating and sampling these combinations follows this list.)
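Here is roughly what enumerating those 120 combinations and drawing a small batch to grade might look like. This is a reconstruction for illustration; the generator, the five starting levels, and the batch size are assumptions rather than the program I actually used.

    // Enumerate all 5 * 3 * 2 * 2 * 2 = 120 input combinations,
    // then grade a random batch of 10-20 in one sitting.
    const previousLevels = [6, 7, 8, 9, 10];  // assumed five starting levels
    const difficulties = [1, 2, 3];           // simplest to most difficult
    const flags = [false, true];              // error made or not

    const combinations = [];
    for (const previousLevel of previousLevels) {
      for (const difficulty of difficulties) {
        for (const conceptual of flags) {
          for (const algebraic of flags) {
            for (const arithmetic of flags) {
              combinations.push({ previousLevel, difficulty, conceptual, algebraic, arithmetic });
            }
          }
        }
      }
    }

    // Fisher-Yates shuffle so each grading session sees a different mix.
    for (let i = combinations.length - 1; i > 0; i--) {
      const j = Math.floor(Math.random() * (i + 1));
      [combinations[i], combinations[j]] = [combinations[j], combinations[i]];
    }
    const batch = combinations.slice(0, 15); // one sitting's worth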

After training for a while, I was ready to have the network make some predictions. I made a little visualizer to help me see the results:

You can also see this in action by going to the CodePen, clicking on the 'Load Trained Data' button, and playing around with it yourself. There's no limit to the values in the form, so some crazy results can occur.
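Under the hood, asking for a suggested level amounts to running the trained network on a scenario encoded the same way as the training data and scaling the result back to the 0-10 scale. The sketch below assumes a brain.js-style run() call; the post doesn't name the library it uses, so treat this as a guess at the shape of the code.

    // Ask a trained network for a suggested new level.
    // `net` is a trained network whose run() accepts the same encoding used in training.
    function suggestLevel(net, previousLevel, difficulty, conceptual, algebraic, arithmetic) {
      const result = net.run({
        previousLevel: previousLevel / 10,
        difficulty: difficulty / 3,
        conceptual: conceptual ? 1 : 0,
        algebraic: algebraic ? 1 : 0,
        arithmetic: arithmetic ? 1 : 0
      });
      return Math.round(result.newLevel * 10); // back to the 0-10 scale
    }

    // e.g. suggestLevel(net, 9, 2, true, false, false)
    // asks about a level 9 student with a conceptual error on a middle-difficulty problem.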

The thing that makes me happiest is that there's nothing surprising about the results.

  • Conceptual errors are the most significant factor limiting progress from one level to the next. This makes sense: once a student has made a conceptual error, I generally don't let that student increase their proficiency level.
  • Students with low scores who ask for the highest-difficulty problems probably shouldn't.
  • Students who have an 8 can get to a 9 by doing a middle-difficulty problem, but can't get to a 10 in one reassessment without doing the highest-difficulty problem. On the other hand, a student who is a 9 and makes a conceptual error on a middle-difficulty problem is brought back to a 7.

When I shared this with students, the thing they seemed most interested in doing with it was deciding what sort of problem to ask for in a given reassessment. Some students with a 6 have come in asking for the simplest-level question so they can be guaranteed a rise to a 7 if they answer correctly. A lot of level 8 students want to become a 10 in one go, but often make a conceptual error along the way and are limited to a 9. I clearly have the freedom to classify these different types of errors as I see fit when a student comes to meet with me. When I ask students what they think about having this tool available to them, the response is usually that it's a good way to be fair. I'm pretty happy about that.

I'll continue playing with this. It was an interesting way to analyze my thinking around something that I consider to still be pretty fuzzy, even this long after getting involved with SBG in my classes.

SBG and Leveling Up, Part 2: Machine Learning

In my 100-point scale series last June, I wrote about how our system does a pretty cruddy job of classifying students based on raw point percentages. In a later post in that series, I proposed that machine learning might serve as a way to make sense of our intuition around student achievement levels and help provide insight into refining a rubric to better reflect a student's ability.

In my last post, I wrote about my desire to become more methodical in deciding how a student moves from one standard level to the next. I typically know what I'm looking for when I see it. Identifying a student's level usually requires observing the student's skill relative to a given set of tasks. Defining the characteristics of the different levels is crucial to communicating those levels to students and parents, and to staying consistent across different groups. This is precisely what we intend to do when we define a rubric or grading scale.

I need help relating my observations of different factors to a numerical scale. I want students to know clearly what they might expect to get in a given session. I want them to understand my expectations of what is necessary to go from a level 6 to a level 8. I don't believe I have the ability to design a simple grid rubric that describes all of this to them though. I could try, sure, but why not use some computational thinking to do the pattern finding for me?

In my last post, I detailed some of the elements I typically consider in assigning a level to a student: the previously recorded level, the question difficulty, the number of conceptual errors, and the numbers of algebraic and arithmetic errors. I had the goal of creating a system that lets me go through the following process (sketched in code after the list):

  • I am presented with a series of scenarios with different initial scores, arithmetic errors, conceptual errors, and so on.
  • I decide what new numerical level I think is appropriate given this information. I enter that into the system.
  • The system uses these examples to predict the score it thinks I would give for a different set of parameters. I can choose to agree, or assign a different level.
  • With sufficient training, the computer should be able to agree with my assessment a majority of the time.
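As a rough illustration of that loop, the sketch below uses brain.js as a stand-in for whatever library actually powers the CodePen: collect graded examples, train, and compare the network's suggestion against the level I would assign, retraining whenever I disagree. The two sample gradings mirror cases described elsewhere in this series; everything else is assumed.

    // Assumes brain.js is loaded (e.g., from a CDN on the CodePen page).
    // Each example pairs an encoded scenario with the level I actually assigned.
    const trainingExamples = [
      { input: { previousLevel: 0.6, difficulty: 1 / 3, conceptual: 0, algebraic: 0, arithmetic: 0 },
        output: { newLevel: 0.7 } }, // a 6 answering the simplest question correctly -> 7
      { input: { previousLevel: 0.9, difficulty: 2 / 3, conceptual: 1, algebraic: 0, arithmetic: 0 },
        output: { newLevel: 0.7 } }  // a 9 making a conceptual error on a middle-difficulty problem -> 7
      // ...the full set had just over 200 graded scenarios
    ];

    const net = new brain.NeuralNetwork();
    net.train(trainingExamples);

    // Present a new scenario, see what the network suggests, then agree or
    // override it, add the corrected example, and train again.
    const suggestion = net.run({
      previousLevel: 0.8, difficulty: 2 / 3, conceptual: 0, algebraic: 1, arithmetic: 0
    });
    console.log(Math.round(suggestion.newLevel * 10));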

After a lot of trial and error, more learning about React, and figuring out how to use a different machine learning library than I used previously, I was able to piece together a working prototype.

You can play with my implementation yourself by visiting the CodePen that I used to write this. The first ten suggested scores are generated by increasing the input score by one, but the next ten use the neural network to generate the suggested scores.
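One plausible way to structure that switchover, offered purely as a guess at the shape of the code rather than the CodePen's actual implementation, is a rule-based suggestion until enough examples have been graded, then deferring to the network:

    // Suggest previousLevel + 1 until enough examples exist, then use the network.
    const MIN_EXAMPLES = 10; // the first ten suggestions are rule-based

    function nextSuggestion(net, gradedCount, previousLevel, encodedScenario) {
      if (gradedCount < MIN_EXAMPLES) {
        return Math.min(previousLevel + 1, 10); // simple +1 bump, capped at 10
      }
      const result = net.run(encodedScenario);  // same 0-1 encoding as the training data
      return Math.round(result.newLevel * 10);  // back to the 0-10 scale
    }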

In my next post in this series, I'll discuss the methodology I followed for training this neural network and how I've been sharing the results with my students.