Reflections on My First Year of Standards-Based Assessment, Part 2
After one year of keeping a disaggregated gradebook, I can confidently say that I will never go back to my old way of doing things. That’s not to say that I will repeat exactly how I did things last year. There is certainly plenty of room for growth and a need for refinement, but the things I explored during the 2013/14 academic year will be at the heart of my assessment system for years to come.
In this post I’d like to examine a few of my policies and procedures from the year, what I liked and think went well, and where I see room for improvement. (For background and a bit about the basic set-up of my system last year, see Part 1 of this series. Part 3 is here. My course syllabus with all policies outlined is available here.)
In my calculus classes last year, I gave 23 quizzes, 0 tests, and 2 exams. With the exception of the two summative semester exams, no assessment was any more significant than another. Gone is my old hand-wringing about destroying anyone’s grade by catching students with a test on a bad day. Also, although it took some of my students well into the second semester, I think we were eventually able to do away with assessment anxiety. It took a while since I was working with seniors who had been dealing with permanent grade-punishment (or -reward) for many years. By the end of the year, students weren’t choking on assessments, I don’t remember seeing any tears, and the worst (actually, the best) I would hear as students filed out of the room was, “Man, I need to work on [learning goal].” I can’t quite claim the student excitement about assessment that Dan Meyer reported, but I am certainly much happier with the way things went.
There were added benefits of shifting how I assessed that weren’t immediately obvious to me when I first decided to make the leap, but that I now see as crucial. One is that with assessments no longer scored as a percentage of possible points, I am now free to differentiate to my heart’s content during in-class quizzes. If I include a really challenging question that requires mastery of a particular learning goal (or goals), no harm is done to students who can’t answer it. They simply won’t get a score reflecting that mastery, instead receiving a mark that shows how well they did on the more straightforward questions involving that goal. Since the quiz isn’t a tool to accumulate points, no points have been lost; rather, some just haven’t been gained yet. Previously I would include just one or two tough, concept-extending questions per test, and they ended up separating the As from the Bs. Any more than that and my average students suffered excessively. Now I can continually challenge my strongest students without driving everyone else to madness.
Another non-obvious benefit of changing my assessment method, namely the switch to a five-point scale, helped students on the other side of mastery.
Before, when my percentage grades were supposed to indicate what fraction of the covered material a given student understood, someone encountering a problem they weren’t sure how to solve (maybe having difficulty with where to begin, maybe with some other crucial step) was forced to stare at the problem until serendipity struck and they figured it out. Most of the time they didn’t get there. What could have been a useful learning experience was wasted, instead converted into minutes of idle frustration, leaving my students victims of summative assessment. With my new scoring scale and a focus on assessment as formative, it was clear that I could instead provide floundering students with a gentle nudge, enough to get them back on track and making use of the time, without clouding the issue of what score they should receive. I simply gave a hint in pen to any student who wanted one and took the fact that I had done so into account when scoring their paper later. It worked beautifully. I’m not saying that this isn’t possible outside of a standards-based approach, but in this system it just seems so natural.
The screenshot at the top of this post nicely demonstrates one of the chief reasons I changed my assessment method. For those of you unfamiliar with ActiveGrade, one of the built-in views for the gradebook is ‘Improvement,’ where red shows a decline in score for a learning goal and green shows improvement, the darker the color the more extreme the shift. I sleep better at night with the knowledge that I’m allowing and encouraging my students to continually demonstrate improved understanding.
Last year I made sure that we had regular, in-class assessments on every learning goal at least two or three times. After that, students were on their own to schedule a reassessment if they felt they could demonstrate that they had improved since I last checked. In order to reassess, my students had to fill out a Google Form à la Sam Shah. After the logistics of name, learning goal(s) to reassess, and time to reassess, the meat of the request form was two questions:
1. Why do you think you have not done well with this goal (or these goals) in the past? Please be specific.
2. Since your last assessment, what things have you done to improve your understanding? Please be specific.
Below is a sample of student responses.
At my most cynical, this is concrete evidence of my 32 students coming in to practice more problems or attempt to explain concepts to me 303 times last year. Not at my most cynical, I’m ecstatic to see my students reflecting on their learning and making an effort to improve, which is exactly what I want to see happening. However, the form and my reassessment system need some revisions.
Last year, I allowed a maximum of two goals per reassessment and a minimum of 48 hours’ notice. That worked well, but I also allowed students to schedule reassessments at any time that worked for both of us. It was generally more stressful for me and, unfortunately, turned into a bit of a madhouse toward the end of grading periods. That, together with the fact that next year I’ll be leading a student’s directed study which will regularly require a portion of my time outside of class, has led to my decision to hold all reassessments once per week during a common time next year. Since they won’t be able to reassess as often, I’ll probably raise the maximum number of goals per reassessment.
Although my students typically had trouble articulating a good answer to the first request form question above (Why have you not done well?), I think I’ll leave that in because I want them to try. The second question (What have you done to improve?) didn’t really convey what I wanted to know from my students, though. Invariably, what I ended up with was a laundry list of people the student had sought help from, websites they had looked at, and things of that nature. My fault; bad question. Next year it will be something more like:
2. Convince me that you deserve a chance to reassess. Explain how you now understand the goal(s) better than before. Please be specific.
Last year I required my students to keep tracking binders, compiling evidence and charting their performance on each learning goal throughout the year. I had noble intentions, but I didn’t really follow through in a way that accomplished my goals. The metacognition piece is an important aspect to me, but we didn’t really devote class time to reflecting on what was showing up in the tracking binders. I foolishly trusted that my students would use the binders to self-reflect and put the pieces together on their own. As much as I want students to leave my course with a clear understanding of the complexity of the notion of ‘rate of change’, I will feel even more accomplished as an educator if they leave with a clear understanding of how they learn and strategies that help them be better learners. I also ran into students not really grasping the arc of their learning and therefore not feeling like we were accomplishing as much as we were. By spending a lot more time with the tracking binders next year (and, incidentally, probably going digital with them), I hope to address both of these issues. I’m thinking about trying some journaling activities based around tracking the learning goals, which will aid metacognition and more explicitly highlight our progress throughout the year.
This is the toughest one for me and it still occupies my mind the most. When coming up with last year’s learning goals, I spent a lot of time worrying about grain-size and I still feel like I’ve got my work cut out for me there; a few goals feel too specific and others feel too vague. One thing I realized early on in the year was that I hadn’t thought through the consequences of having basic skills as learning goals quite well enough. My two goals for basic algebra skills quickly presented a problem: should I go back and mark a student’s score down every time they made an algebra mistake for the rest of the year? I thought not, so away those goals went. Instead I opted to remember that my scoring scale allows for ‘insignificant mistakes’ and that not everything has to be tied to a learning goal.
The bigger shift I’m considering next year came as a result of looking back over my second-semester goals and involves a rethinking of the conjunctive way I determined students’ grades (i.e., that I required students to perform acceptably on every single learning goal for the year in order to pass). At the time I wrote the goals, I felt like I had done a pretty good job not only of establishing the basic concepts of differential calculus, but also of providing a window into some application. Looking back over the second semester, I don’t think I did as good a job with integral calculus. Matters were exacerbated by the fact that we didn’t make it as far into differential equations as I normally do; the list of goals looked like basics and then a bunch of calculation methods, with solids of revolution as the lone application. I can’t really be proud of that, but in thinking of what to do about it, I see an opportunity.

What if, instead of having thirty-ish learning goals all with the same level of importance, I divided the goals into multiple tiers, perhaps two? In this way, I could differentiate basic conceptual underpinnings and computational skills from more complex, analytic skills. Within each tier, I could require proficiency for some (everyone needs to be able to apply the Fundamental Theorem) but not for others (maybe not everyone needs to be able to integrate by parts). Having another tier really opens up opportunities for more interesting things, too. Maybe you don’t have to know how to solve first-order linear differential equations by hand, but instead you could focus on more authentic applications where WolframAlpha handles the messy equation(s) for you. Or maybe I don’t use tiers where some goals from each are required and others not, but instead just have a mix of mandatory and non-mandatory goals. I’m still thinking about this, so we’ll see where it ends up by August.
Last year I started with a 75% – 25% weighted averaging method for each learning goal, where 75% of a goal’s average score came from the most recent assessment and 25% from the prior average. I chose this method because I wanted my students’ averages to reflect where they currently were in their learning, and it seemed like this would do a much better job than the typical mean. However, I was hesitant to completely throw out information from initial assessments, hence the 25%. This stemmed from a fear that, because I was opening up the ability of my students to reassess on anything they wanted, they would blow off early assessments. While the method probably did help to keep some students serious about initial assessments, it also led to a consequence that bothered me. Routinely, a student who initially scored poorly on some goal (but had since significantly improved their understanding) would have to reassess as many as three times before bringing their average up to a desirable level. I’m happy with requiring consistent demonstration of understanding, but often it was clear after the first or second reassessment that the student was good to go. The averaging method was creating a lot more work for everyone involved and, more importantly, frustrating a significant number of students.
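To see why the averaging created so much reassessment work, here is a minimal sketch in Python of the 75/25 update rule described above. The scale values are an assumption (I’m treating the five-point scale as 0–4); the function and variable names are mine, not part of any gradebook software.

```python
def updated_average(prior_average, new_score, recent_weight=0.75):
    """Blend the newest assessment score with the running average:
    75% from the most recent assessment, 25% from the prior average."""
    return recent_weight * new_score + (1 - recent_weight) * prior_average

# Hypothetical student on an assumed 0-4 scale: a rough start (1),
# followed by perfect reassessments (4).
avg = 1.0
history = []
for attempt in range(3):
    avg = updated_average(avg, 4)
    history.append(round(avg, 2))

print(history)  # -> [3.25, 3.81, 3.95]
```

Even after two perfect reassessments the reported average still lags the student’s demonstrated understanding, which matches the “as many as three times” frustration described above.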
Halfway through the year, and with the unanimous consent of my students, we switched away from averaging, instead counting only a student’s most recent score for a given learning goal. A handful of students took this shift as license to goof off (actually, they had been goofing off beforehand anyway), but most did not, and continued (or at least appeared) to focus and put forth their best effort all the way through. It is unclear to me whether that is how the majority of students would have reacted from the beginning of the year anyway, or whether they had learned from the first semester that it would take a good deal of hard work to do well in the class. For that reason, I’m still hesitant to completely do away with prior scores from the outset.
My plan for next year is to base my reporting for learning goals on students’ last two assessments. Specifically, their score for any given goal will be the lower of the scores from their most recent two assessments on it. This will require students to consistently demonstrate understanding, but no one will ever be more than two reassessments away from the score they want regardless of how they have done in the past.
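The new rule is simple enough to sketch in a few lines of Python. This is an illustration under my own naming, not any particular gradebook’s implementation, and it assumes scores are kept in chronological order:

```python
def reported_score(scores):
    """Report the lower of the two most recent scores for a goal.
    With only one assessment so far, that single score is reported."""
    return min(scores[-2:])

# Hypothetical score histories on an assumed 0-4 scale:
print(reported_score([1, 4, 4]))  # -> 4: the early 1 no longer matters
print(reported_score([1, 4]))     # -> 1: understanding must be shown twice
```

The second example shows the “consistent demonstration” property: one strong score after a weak one isn’t enough, but two in a row always are, no matter the earlier history.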
Selling It to Students
Last year I spent entirely too much time at the beginning of the year selling the new system to my students. Since it was a pilot program, and since I generally want my students to understand why things are set up the way they are, I ended up taking the first two days to exclusively talk through all of the procedures and the philosophy behind them. In the end, it didn’t work well at all; three weeks into the year I was finding that students still didn’t really understand the reassessment procedure or how the averaging worked. Instead, what I need to do is perhaps mention briefly that this system is likely different from what students have experienced in the past, but then go about teaching my course and take care of the discussions about particular aspects of the system as they come up. Until context has made the procedures real for students, talking about theories on assessment and reporting will have little meaning and do less good.