
My Reassessment Queue

We're almost at the end of the third quarter over here. Here's the current plot of the number of reassessments over time this semester:

I'm energized though that the students have bought into the system, and that my improved workflow from last semester is making the process manageable. My pile of reassessment papers grows faster than I'd like, but I've also improved the physical process of managing the paperwork.

While I'm battling performance issues on the site now that there's a lot of data moving around on there, the thing I'm more interested in is improving participation. Who are the students that aren't reassessing? Why aren't they reassessing? How do I get them involved?

There are lots of issues at play here. I've been experimenting a lot lately with new ways of assessing, structuring classes, rethinking the grade book, and just plain trying new activities out on students, and I'm loving it. I'll do a better job of sharing all of this out in the weeks to come.

SBG and Leveling Up, Part 3: The Machine Thinks!

Read the first two posts in this series here:

SBG and Leveling Up, Part 1
SBG and Leveling Up, Part 2: Machine Learning

...or you can read this quick review of where I've been going with this:

  • When a student asks to be reassessed on a learning standard, the most important inputs that contribute to the student's new achievement level are the student's previously assessed level, the difficulty of a given reassessment question, and the nature of any errors made during the reassessment.
  • Machine learning offers a convenient way to find patterns in my grading that I might not otherwise notice.

Rather than design a flow chart that arbitrarily figures out the new grade given these inputs, my idea was to simply take different combinations of these inputs and use my experience to decide what new grade I would assign. Any patterns that exist there (if there are any) would be found by the machine learning algorithm.

I trained the neural network methodically. These were the general parameters (a sketch of the training setup follows the list):

  • I only did ten or twenty grades at any given time to avoid the effects of fatigue.
  • I graded in the morning and afternoon, before and after lunch, and also some at night.
  • I spread this out over a few days to minimize the effects of any one particular day on the training.
  • When I noticed there weren't many grades at the upper end of the scale, I changed the program to generate instances of just those grades.
  • The permutation-fanatics among you might be interested in the fact that there are 5*3*2*2*2 = 120 possibilities for numerical combinations. I ended up grading just over 200. Why not just grade every single possibility? Simple - I don't pretend to think I'm really consistent when I'm doing this. That's part of the problem. I want the algorithm to figure out what, on average, I tend to do in a number of different situations.
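
For the curious, here's a minimal sketch of what that training setup can look like. I'm assuming the brain.js library here, which is a guess based on the CodePen demo, and the property names and example values are mine rather than from the actual project:

var net = new brain.NeuralNetwork();

// Each training example pairs one combination of inputs with the new level I
// assigned by hand. brain.js works with values scaled between 0 and 1.
var trainingData = [
  {
    input: {
      previousLevel: 8 / 10,  // previously assessed level (5-10 scale)
      difficulty: 2 / 3,      // question difficulty (1-3 scale)
      conceptualError: 0,     // 1 if that type of error occurred, 0 otherwise
      algebraError: 0,
      arithmeticError: 0
    },
    output: { newLevel: 9 / 10 }
  }
  // ...and so on, for the 200+ combinations I graded
];

net.train(trainingData);

// The trained network can then predict a new level for any input combination:
net.run({ previousLevel: 9 / 10, difficulty: 2 / 3, conceptualError: 1, algebraError: 0, arithmeticError: 0 });
// might return something like { newLevel: 0.7 }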

After training for a while, I was ready to have the network make some predictions. I made a little visualizer to help me see the results:

You can also see this in action by going to the CodePen, clicking on the 'Load Trained Data' button, and playing around with it yourself. There's no limit to the values in the form, so some crazy results can occur.

The thing that makes me happiest is that there's nothing surprising in the results.

  • Conceptual errors are the most significant factor limiting students from progressing from one level to the next. This makes sense: once a student has made a conceptual error, I generally don't let him or her increase in proficiency level.
  • Students with low scores that ask for the highest difficulty problems probably shouldn't.
  • Students that have an 8 can get to a 9 by doing a middle difficulty problem, but can't get to a 10 in one reassessment without doing the highest difficulty problem. On the other hand, a student that is a 9 and makes a conceptual error on a middle difficulty problem is brought back to a 7.

When I shared this with students, the thing they seemed most interested in was using it to decide what sort of problem to request for a given reassessment. Some students with a 6 have come in asking for the simplest level question so they can be guaranteed a rise to a 7 if they answer correctly. A lot of level 8 students want to become a 10 in one go, but often make a conceptual error along the way and are limited to a 9. I clearly have the freedom to classify these different types of errors as I see fit when a student comes to meet with me. When I ask students what they think about having this tool available to them, the response is usually that it's a good way to be fair. I'm pretty happy about that.

I'll continue playing with this. It was an interesting way to analyze my thinking around something that I consider to still be pretty fuzzy, even this long after getting involved with SBG in my classes.

Standards Based Grading and Leveling Up

I've been really happy since joining the SBG fan club a few years ago.

As I've gained experience, I've been able to hone my definitions of what it means to be a six, eight, or ten. Much of what happens when students sign up to do a reassessment is based on applying my experience to evaluate individual students against these definitions. I give a student a problem or two, ask him or her to talk to me about it, and based on the overall interaction, I decide where the student is on that scale.

And yet, with all of that experience, I still sometimes fear that I might not be as consistent as I think I am. I've wondered whether my mood, fatigue level, or the time of day affects my assessment of that level. From a more cynical perspective, I also really, really hope that past experiences with a given student, gender, nationality, and other characteristics don't enter into the process. I don't know how I would measure the effects of all of these to confirm they aren't significant, if they exist at all. I don't fully trust myself to be truly unbiased, however well intentioned I might try to be or think I am.

Before the winter break, I came up with a new way to look at the problem. If I can define what demonstrated characteristics should matter for assessing a student's level, and test myself to decide how I would respond to different arrangements of those characteristics, I might have a way to better define this for myself, and more importantly, communicate those to my students.

I determined the following to be the parameters I use to decide where a student is on my scale based on a given reassessment session:

  1. A student's previously assessed level. This is an indicator of past performance. With measurement error and a whole host of other factors affecting the connection between this level and where a student actually is at any given time, I don't think this is necessarily the most important. It is, in reality, information that I use to decide what type of question to give a student, and as such, is usually my starting point.
  2. The difficulty of the question(s). A student that really struggled on the first assessment is not going to get a high level synthesis question. A student at the upper end of the scale is going to get a question that requires transfer and understanding. I think this is probably the most obvious out of the factors I'm listing here.
  3. Conceptual errors made by the student during the reassessment. In the context of the previous two, this is key in deciding whether a student should (or should not) advance. Is a conceptual error in the context of basic skills the same as one in the application of those skills? These apply differently at a level six versus a level eight. I know these errors when I see them and feel pretty confident in my ability to identify one or more of them.
  4. Arithmetic/sign errors and algebraic errors. I consider these separately when I look at a student's work. Using a calculator appropriately to check arithmetic is something students should be able to do. Deciding to do this when calculations don't make sense is the sign of a more skilled student in comparison to one that does not. I routinely identify these errors as a barrier to advancement, but not necessarily as grounds for decreasing a student's level.

There are, of course, other factors to consider. I decided to settle on the ones mentioned above for the next steps of my winter break project.
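
To make this concrete, here's one hypothetical way a single reassessment instance could be encoded using those four parameters. This is a sketch, not code from my actual system:

var instance = {
  previousLevel: 8,       // 1. previously assessed level (5-10)
  difficulty: 2,          // 2. question difficulty, e.g. 1 (basic) to 3 (synthesis)
  conceptualError: false, // 3. did a conceptual error occur?
  algebraError: true,     // 4. algebraic errors, considered separately...
  arithmeticError: false  //    ...from arithmetic/sign errors
};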

I'll share how I moved forward on this in my next post in the series.

Scaling Reassessments, Yet Again (Part 3)

I've been quietly complaining recently to myself about how the reassessment sign-up and quiz distribution tool I created (for myself) isn't meeting my needs. Desmos, Peardeck, and the other online tools I use have a pretty impressive responsiveness when it comes to requests for features or queries about bugs, and that's at least partly because they have teams of expert programmers ready to go at any given moment. When you are an ed-tech company of one, there's nobody else to blame.

This is the last week when reassessment is permitted, so the size of the groups I've had for reassessments has been pretty large. Knowing this, I worked hard this past Sunday to update the site's inner workings to be organized for efficiency.

Now I know what my day looks like in terms of which reassessments students have signed up for, what their current grade is, and when they plan to see me during the day:

[Screenshot: the day's reassessment sign-ups, showing standards, current grades, and meeting times]

With one or two reassessments at a time, I got along just fine with a small HTML select box with names that were roughly sorted. Clicking those one at a time and then assigning a given question does not scale well at all. I can now see all of the students that have signed up for a reassessment, and then easily assign a given question to groups of them at a time:

[Screenshot: assigning a question to a group of signed-up students at once]

The past two days have broken the reassessment records that I wrote about at the end of the first quarter; today, for example, there are over sixty-five students taking quizzes. In the past, in the scramble to get everyone their quizzes, I made concessions by giving simpler questions that may not honestly assess students at either end of my 5-10 learning standard scale.

With the user experience smoother now, I have been able to focus this week on making sure that the questions I assign truly assess what students know. I could not do this without the computer helping me out. It feels great to know things are working at this higher scale, and I'm looking forward to having this in place when we get going again in January.

Getting Grade Data from PowerSchool Pro (#TeachersCoding)

Given that I use standards based grading with most of my classes, the grades I assign to students change quickly. I'm modifying those scores multiple times a day in some cases in my school's instance of PowerSchool Pro.

What the system currently lacks is an easy way to get that data out. For whatever reason, the only export format is PDF. This makes it difficult to get things into a spreadsheet.

After some hacking around in the console, I was able to put together a script that scrapes a class scoresheet page for the student names, assignment names, and grades, and stores the result in a variable called exportData. This code is included below, and is also here in a gist. Paste the entire code into the console and run it. Then type in exportData and the scraped data will appear.

[Screenshot: the scraped data string displayed in the browser console]

You can then copy and paste the resulting string (leaving out the quotes) into Excel, OpenOffice, or Google Sheets and the data will appear there, ready to be spreadsheet-ified.

The only place where this doesn't work perfectly is when there are more students than will fit on the page. As far as I could tell after poking around, the grade data is re-rendered to fit the page as scrolling occurs. I didn't work that hard to see if the data is stored somewhere else on the page, so someone with a bit more insight might be able to improve upon my work.

Here is the full code:

// Assignment names render in <var> tags on the scoresheet page
var assignmentElements = $('var').toArray();
var assignments = [];
var names = [];
var assignmentNumber;

// Collect the assignment names from the header
assignmentElements.forEach(function (element) {
  assignments.push(element.innerHTML);
});

// Each student row has an id containing 'std'
var rows = $("tr[id*='std']").toArray();

rows.forEach(function (row) {
  var currentName = $(row).find('.student-name')[0].innerHTML;
  var gradeElements = $(row).find('var');

  // The first <var> in each row isn't an assignment grade, so skip it
  gradeElements = gradeElements.slice(1, gradeElements.length).toArray();
  var grades = [];

  gradeElements.forEach(function (grade) {
    // NaN never equals itself, so test the parsed value with isNaN()
    var parsed = parseFloat(grade.innerHTML);
    grades.push(!isNaN(parsed) ? parsed : '');
  });

  if (grades.length > 0) {
    names.push([currentName, grades]);
  }
});

assignmentNumber = names[0][1].length;

// Build a tab-separated header row; assignment names occupy every other element
var assignmentString = 'Name\t';
for (var i = 0; i < 2 * assignmentNumber - 1; i += 2) {
  assignmentString += assignments[i] + '\t';
}

// Build one tab-separated line per student
var gradeString = '';
names.forEach(function (name) {
  var currentString = name[0] + '\t';
  name[1].forEach(function (grade) {
    currentString += grade + '\t';
  });
  gradeString += currentString + '\n';
});

var exportData = assignmentString + '\n' + gradeString;
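
One small convenience if your browser supports it: Chrome's DevTools console has a copy() utility that puts the string directly on the clipboard, skipping the manual select-and-copy step.

copy(exportData); // Chrome DevTools console utility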

The How and Why of Standards Based Grading @ Learning2.0

For those of you that are readers of my blog, you already know that I've become a believer in the power of standards based grading, or SBG. It's amazing looking back at my first post on my decision to commit to it four years ago. Seeing how this system has changed the way I plan my lessons, think about class activities, and interact with students about learning makes me wonder where I would be at this point without it.

I'm now trying to help others see how standards based grading might bring a similar change to their classrooms. I'm running a one-hour workshop this Friday at 1:30 in room C315 to introduce Learning2 attendees to how a teacher might go about this. More important for those considering a change to such a system is the fact that I run mine in a non-standards based PowerSchool environment. Here's the workshop description:

Suppose a student has earned a 75 in your class. How do you describe that student's progress? What has that student learned in your class? Obviously a student with an 85 has done better than the student with a 75, but what exactly has the 85 student achieved that the other student has not? Is it ten percent more understanding? Two more homework assignments during the quarter? Perhaps most importantly, what can the 75 student do to become an 85 student?

Grades are part of our school culture and likely aren't going anywhere soon. We can work to tweak how we generate and communicate the meaning of those grades in a way that better represents what students have actually learned. One approach for doing this is called Standards Based Grading, or SBG.

In this one hour workshop, you will learn about SBG and how it can clarify the meaning of grades, as well as how it can be implemented effectively within a traditional reporting system. You will also learn how a SBG mindset encourages productive changes to the process of planning units, activities, and assessments. We will also discuss the ways such a system can be run in the context of various subject areas.

It's a lot to cover in an hour, but I'm hoping I can nudge a few folks to try this out moving forward.

The link to my workshop is here.

I'm really excited about the Learning 2.0 conference this year. I first attended back in 2011 in Shanghai and the experience was what prompted me to become active on Twitter and begin blogging back then. I know the next few days will be filled with inspiring conversations and ideas that challenge my thinking and push me to grow as a teacher.

Stay tuned to the blog and to Twitter to see what I'm up to over the weekend.

Scaling Reassessments, Part 2

A quick comment before hitting the hay after another busy day: the reassessment system has hit it big in my new school.

Some facts to share:

  • In the month since my reassessment sign-up system went up, 87% of my students have done at least one self-initiated reassessment, with 69% doing more than one. This is much more usage than my system has had, well, ever.
  • Last Friday set an all-time high of 53 reassessments over the course of a day. I will not be doing that again, ever.
  • Students are not hoarding their credits, they are actually using them. I've committed to expiring them if they go unused, and they will all be expired by the end of the quarter, which is essentially tomorrow.

I need to come up with some new systems to manage the volume. I'll likely limit the number of slots available in the morning, at lunch, and after school to encourage them to spread these out throughout the upcoming units instead of waiting, but more needs to be done. This is what I've been hoping for, and I need to capitalize on the enthusiasm students are showing for the system. Now I need to make it so I don't pull all my hair out in the process.

Scaling up SBG for the New Year

In my new school, the mean size of my classes has doubled. The maximum size is now 22 students, a fact about which I am not complaining. I've missed the ease of getting students to interact when simple proximity is the major factor.

I have also been given the freedom to continue with the standards based grading system that I've used over the past four years. The reality of needing to adapt my systems of assessment to these larger sizes has required me to reflect upon which aspects of my system need to be scaled, and what (if anything) needs to change.

That reflection identified three elements that need to remain in my system:

  • Students need to be assessed frequently through quizzes relating to one to two standards maximum.
  • These quizzes need to be graded and returned within the class period to ensure a short feedback cycle.
  • There must still be a tie between work done preparing for a reassessment and signing up for one.

Including the first element requires planning ahead. If quizzes are going to take up fifteen to twenty minutes of a class block, the rest of the block needs to be appropriately planned to ensure a balance between activities that respond to student learning needs, encourage reinforcement of old concepts, and allow interaction with new material. The second element dictates that those activities need to provide me time to grade the quizzes and enter them as standards grades before returning them to students. The third happens a bit later in the cycle as students act on their individualized needs to reassess on individual standards.

The major realization this year has been a refined need for standards that can be assessed within a twenty minute block. In the past, I've believed that a quiz that hits one or two aspects of the topic is good enough, and that an end of unit assessment will allow complete assessment of the whole topic. Now I see that a standard that needs to have one component assessed on a quiz and another component assessed on a test really should be broken up into multiple standards. This has also meant that single standard quizzes are the way to go. I gave one quiz this week that tested a previously assessed standard, and then also assessed two new ones. Given how frantic I was in assessing mastery levels on three standards, I won't be doing that again.

The other part of this first element is the importance of writing efficiently targeted assessment questions. I need students to arrive at a right answer by applying their knowledge, not by accident or application of an algorithm. I need mistakes to be evidence of misunderstanding, not management of computational complexity. In short, I need assessment questions that assess what they are designed to assess. That takes time, but with my simplified schedule this year, I'm finding the time to do this important work.

My last post was about my excitement over using the Numbas web site to create and generate the quizzes. A major bottleneck in grading these quizzes quickly in the past has been not having answers on hand for the questions I give. Numbas allows me to program and display calculated answers based on the randomized values used to generate the questions.

Numbas has a feature that allows students to take the exam entirely online and enter their answers to be graded automatically. In this situation, I have students pass in their work as well. While I like the speed this offers, that advantage primarily exists in cases where students answer questions correctly. If they make mistakes, I look at the written work to figure out what went wrong, and individualized values require that I recalculate along the way. This isn't a huge problem, but it calls into question the need for individualized values, which are (as far as I know right now) the only option for the fully online assessment. The option I like more is the printed worksheet theme that allows generation of printable quizzes. I make four versions and pass these out, and then there are only four sets of answers to compare student work against.
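
Numbas handles the randomization and answer calculation internally, but the underlying idea is simple enough to sketch in plain JavaScript. The question here is a made-up example, not one of my actual standards:

// Generate randomized versions of a question along with their computed answers.
function randInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

function makeVersion() {
  var a = randInt(2, 9);
  var b = randInt(1, 10);
  return {
    question: 'Solve ' + a + 'x + ' + b + ' = 0 for x.',
    answer: -b / a // the calculated answer travels with the question
  };
}

// Four printed versions means only four answer sets to grade against.
var versions = [];
for (var i = 0; i < 4; i++) {
  versions.push(makeVersion());
}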

With the answers, I can grade the quizzes and give feedback where needed on wrong answers in no more than ten or fifteen minutes total. This time is divided into short intervals throughout the class block while students are working individually. The lesson and class activities need to be designed to provide this time so I can focus on grading.

The third element is still under development, but my credit system from previous years is going to make an appearance. Construction is still underway on that one. Please pardon the dust.


P.S.:

If you're an ed-tech company that wants to impress me, make it easy for me to (a) generate different versions of good assessment questions with answers, (b) distribute those questions to students, (c) capture the student thinking and writing that goes with that question so that I can adjust my instruction accordingly, and (d) make it super easy to share that thinking in different ways.

That step of capturing student work is the roughest of the four parts of the user experience. At this time, nothing beats looking at a student's paper for evidence of their thinking, and then deciding what comes next based on experience. Snapping a picture with a phone is the best I've got right now. Please don't bring up using tablets and a stylus. We aren't there yet.

Right now there are solutions that hit two or three of these, but I'm greedy. Let me know if you know about a tool that might be what I'm looking for.

Standards Based Grading & Streamlining Assessments

I give quizzes at the beginning of most of my classes. These quizzes are usually on a single standard for the course, and are predictably on whatever we worked on two classes before. I also give unit exams as ways to assess student mastery of the standards all together. Giving grades after exams usually consists of me looking at a single student's exam, going standard by standard through the entire paper, and then adjusting their standards grades accordingly. There's nothing groundbreaking happening here.

The two downsides to this process are that it is (a) tedious and (b) subject to my discretion at a given moment. I'm not confident that I'm consistent between students. While I do go back and check myself when I'm not sure, I decided to try a better way. If you're a frequent reader of my blog, you know that either a spreadsheet or programming is involved. This time, it's the former.

[Screenshot: the standards map sheet]

One sheet contains what I'm calling a standards map, and you can see this above. This relates a given question to the different standards on an exam. You can see above that question 1 is on only standard 1, while question 4 spans both standards 2 and 3.

The other sheet contains test results, and looks a lot like what I used to do when I was grading on percentages, with one key difference. You can see this below:

[Screenshot: the test results sheet]

Rather than writing in the number of points for each question, I simply rate a student's performance on that question as a 1, 2, or 3. The columns S1 through S5 then tally those performance levels according to the standards associated with each question, and scale the result to a value between zero and one.
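
The tally logic itself is straightforward. Here's a sketch of the equivalent computation in JavaScript; the exact scaling (dividing by three times the number of associated questions) is my reading of how the sheet behaves, so treat the details as an assumption:

// Map question ratings (1-3) onto standards, then scale each tally to 0-1.
function standardScores(standardsMap, ratings) {
  var totals = {}, counts = {};
  Object.keys(standardsMap).forEach(function (q) {
    standardsMap[q].forEach(function (s) {
      totals[s] = (totals[s] || 0) + ratings[q];
      counts[s] = (counts[s] || 0) + 1;
    });
  });
  var scores = {};
  Object.keys(totals).forEach(function (s) {
    scores[s] = totals[s] / (3 * counts[s]); // 3 is the maximum rating
  });
  return scores;
}

// Question 1 is on only standard 1; question 4 spans standards 2 and 3.
standardScores({ q1: ['S1'], q4: ['S2', 'S3'] }, { q1: 3, q4: 2 });
// returns { S1: 1, S2: 0.667, S3: 0.667 }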


This information was really useful when going through the last exam with my ninth graders. The spreadsheet handles the association between questions and standards through the standards map, so I can focus my time going through each exam and deciding how well a student completed a given question rather than remembering which standard I'm considering. I also found it much easier to make decisions on what to do with a student's standard level. Student 2 was an 8 on standard 1 before the exam, so it was easy to justify raising her to a 10 after the exam. Student 12 was a 7 on standard 4, and I left him right where he was.


I realize that there's a subtlety here that needs to be mentioned: some questions that are based on two or three standards might not effectively communicate a student's level through a single 1, 2, or 3. If a question is on solving systems graphically, a student might graph the lines correctly, but completely forget to identify the intersection. This situation is easy to address though: questions like this can be broken down into multiple entries on the standards map. I could give a student a 3 on the entry for this question on the standard for graphing lines, and a 1 for the entry related to solving systems. Not a big deal.
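
In the sketch above, that fix amounts to giving the question two entries in the map, each with its own rating (names are again hypothetical):

var map = { 'q7-graphing': ['S-graphing'], 'q7-intersection': ['S-systems'] };
var ratings = { 'q7-graphing': 3, 'q7-intersection': 1 };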

I spend a lot of time thinking about what information I need in order to justify raising a student's mastery level. Having the sort of information that is generated in this spreadsheet makes it much clearer what my next steps might be.


You can check out the live spreadsheet here:

Standards Assessment - Unit 5 Exam

Another WeinbergCloud Update

I decided a full overview of my online WeinbergCloud application was in order, so I recorded a screencast of me going through how it currently works. It's kind of neat that this has been a project under development for nearly two years. I've learned a lot about HTML, Javascript, the Meteor framework, and programming in general in the process, and it has been a lot of fun.

Stay for as long as you like, and then let me know your thoughts in the comments.