Grades Are Stupid

And Education Has Failed Us

Jan 11, 2025

This was originally the final paper I wrote for one of my English classes in college, which I edited for this blog. I went… a little overboard with it. I’m not a writer, not really anyway. I’ll get right into it; I hope you enjoy my freshman year hyperfixation.

I have been a victim of the U.S. grading system for as long as I remember. I have slogged through test after test, hoping my instructors would give me a good grade, even though I crammed for it the previous night and had forgotten most of the information within a few days. I have stressed myself out over these, because if I got a low grade then colleges and employers would see that, and I wouldn’t get to do the things I wanted, and the people I care about would be disappointed in me, and I wouldn’t have a job I like, and I wouldn’t have a life I like, and everything would go wrong because there’s so much riding on every single test grade so I tell myself I might as well stress myself out now and not screw up so I can maybe avoid some of that later.

It’s not fun. The worst part is, I’m no exception.

Many students are constantly stressed out over grades, and they’re suffering for it. However, there are a multitude of problems with the grading system as it is beyond student stress. It should go without saying that we need to identify and solve these problems, because education is how we grow as a society, and it determines the future of a lot of people. School systems should not grade their students or administer standardized tests, because grades and tests fail to motivate students in a healthy way, do not accurately reflect intelligence, are often outright discriminatory, and cause significant mental health issues.

Background

Grades were first introduced in the U.S. at Yale. Students got one of 4 possible grades, and the grades were hidden from students (Durm). Since then, grades have steadily increased in popularity, and the hybrid percentage/A-F/Pass-Fail/0.00-4.00 system we have today slowly emerged. They serve a few purposes: they motivate students, measure learning, and communicate that learning to students, teachers, colleges, and more (Bee). Those people can then use those grades to, in theory, determine which people are smarter than others.

As a side note: do people put inline sources in blog posts? I’m still not very familiar with the medium. I’ve left them in because they’re pretty harmless and it’s always good to credit your sources. I’ll repeat this at the bottom, but I have a spreadsheet with all the sources.

(Side side note… is there a better way to add small notes like these? This is header formatting.)

Most people seem to agree that grades are a flawed system. The part people don’t always agree on is just how flawed they are, and what should be done, if anything. Most of the debate circles around standardized testing, which is definitely a significant part of what makes the grading system what it is, but it’s far from the whole thing. This paper will focus on grading as a whole, without dwelling on standardized tests but without forgetting them either.

Motivation

One common practice in psychology is to divide motivation into two categories: intrinsic and extrinsic. Extrinsic motivation involves some external factor, like a reward or punishment attached to an action. It can also be a combination of those, or lack thereof. For example, if you go to the gym regularly, you’ll gain muscle and maintain your health. On the other hand, if you don’t exercise, you’re much more likely to have worse health, and you won’t build muscle. Being more fit acts as a reward and poor health acts as a punishment. Intrinsic motivation, on the other hand, is derived from the action itself. Someone who is intrinsically motivated to do something wants to do that thing for its own sake. For example, many people enjoy going to the gym, regardless of the muscle gain and health benefits. Working out has an intrinsic value for them, and they would do it even if there were no external benefits. Intrinsic motivation tends to be more effective than extrinsic motivation, but both have their place.

Intrinsic and extrinsic motivation can and often do coexist. Many people love going to the gym, while also enjoying the health benefits. However, studies in cognitive evaluation theory show that this can only go so far. Consider the overjustification effect: an extrinsic motivator will usually devalue an existing intrinsic motivator. Imagine you had a plate of your favorite food in front of you. Assuming you’re not too full, you’d probably want to eat it. This is an example of intrinsic motivation simply because you enjoy eating that kind of food. However, if you were presented with the same plate of food and subsequently offered $100 to eat it, something might feel off. You might wonder if someone had hidden something bad in it, or if there was some other reason you wouldn’t eat it unless you had those $100. Of course, $100 is $100, so you may choose to eat it anyway, but the fact that it’s your favorite food is no longer as relevant. In this case intrinsic motivation was turned into extrinsic motivation due to the appearance of a reward. In some cases, the intrinsic aspect of the motivation is diminished even further. If you were paid $100 every single time you ate that food, you would be more likely to associate the food with money. If that money were to be taken away, you would no longer find reason to eat the food at all, despite the fact that it was previously your favorite.

This is an extremely general problem, and it applies to much more than food. Extrinsic rewards can also diminish enthusiasm, creativity, and overall quality of whatever action is being rewarded. There is a lot of evidence backing this claim, in addition to what we can see in our everyday lives. However, as with many scientific theories, it is important to acknowledge that it is not perfect, and things aren’t always this simple. Some types of rewards can increase intrinsic motivation in certain situations. Still, cognitive evaluation theory and the overjustification effect have value. The nuances in the relationship between intrinsic and extrinsic motivation become harder and harder to navigate. This can be seen in everyday applications, a prime example of which is the current academic grading system.

Grades are a primarily extrinsic motivator. While students sometimes feel good for doing well on an assignment or test, ultimately they did it for the grade. Grades themselves are also tied to external factors; while it’s possible for students to get a kick out of seeing an A, that A is only going to get colleges or employers to think more highly of them. It’s the same with an F. Students might feel horrible (more on that later), but the more pressing reason to avoid an F is the fact that institutions will see that F later on. They’re also a two-sided motivator, meaning that they are simultaneously a reward and punishment, almost regardless of the grade received (100% and 0% being the usual exceptions). This makes the extrinsic motivation stronger, and the overjustification effect along with it.

Being an extrinsic motivator, grades (and systems adjacent to grades) are subject to the overjustification effect. When a teacher assigns something with no grade attached to it, students consistently ignore it. This is because they have come to expect a grade in return for doing class assignments, and even if they enjoy learning, they see no point in doing an assignment if the reward is taken away. They are more generally encouraged to do the “bare minimum”, whatever that may be. If a homework assignment at the end of the semester won’t change their letter grade in the class, students usually won’t do it. If there’s no point value directly attached to being creative or going above and beyond, students are encouraged to not do either of those and instead spend that time on other assignments. If students are given a rubric or outline for what an essay should look like, they (usually) will stick to that outline without taking a critical look at it and understanding why it’s a good structure. If it’s not a good structure, they won’t make an effort to use a better structure either. Grades transform into something that makes students tell themselves “do our work quickly and quietly and behave ourselves and sit down and shut up” (Bee 2021).

What Grades Are For

There is also a dissonance between what grades encourage and what they are trying to encourage, which is connected to many previous points. Grades should be encouraging learning; instead, they encourage getting higher grades. These two goals do not have to be aligned, and in fact often aren’t. Instead of writing an essay to aid in learning about a topic, a student can prompt an AI to write it for them, submit the AI-written essay, and get full credit without having learned anything. This may be considered cheating, but it is (often) unverifiable cheating, which as far as grades are concerned is just as good as academic integrity. On a smaller scale, students may figure out how to efficiently plug numbers into formulas, without ever understanding why the formulas work or the deeper mathematical concepts behind them. That strategy will earn them the grade just fine, but it won’t foster any real learning. This pattern can be seen in almost every graded assignment.

Another example of this dissonance is the concept of cramming. Grades, and tests in particular, only require that students know the content at a certain point in time. This makes it possible, and even encouraged, for students to learn everything the night before an exam, take the exam and get a high score, and forget everything the next day. Not only is this stressful and generally unhealthy for the student, but it means that test scores do not reflect what a student will know in the future, even if that’s as close as a month from when the test was taken. This is especially problematic because colleges and other institutions look at those scores to determine what a student knows; if the scores are no longer accurate, those institutions cannot make an informed decision.

This is just part of the more general problem that grades do not accurately reflect students’ knowledge of the subject, and even more generally that they don’t do what they’re supposed to do. Before delving into this, it is important to understand what is meant by knowledge. For the purposes of this paper, “knowledge” will refer to both the literal facts taught in class and a deeper understanding of how and why those concepts are present, how they interact with each other, and other nuances that go beyond what can be memorized and regurgitated. This is not the same as “knowledge” in the colloquial sense, but it is similar, and more useful for this discussion.

What Do Grades Try To Measure?

One important property of knowledge is that it does not lie on a single spectrum. One person may know a lot about math but very little about history, and another person may have those switched. These are evidently not part of a single axis, and it is not the case that one of these people “knows more” than the other. They simply have different sets of knowledge. The same thing applies within a specific field; one person may be an expert at the theory of aerodynamics, while another may be much better at actually designing a plane. Different people have different skills, which is good because of specialization, but makes it harder to evaluate them.

Brian A. Jacob, an Education and Education Policy professor at the University of Michigan, claims that “At best, test scores are ‘ordinal’ measures, meaning that they allow you to order students on a continuum from lowest to highest ability” (Jacob). However, ordering students along a single continuum is inherently at odds with knowledge, which, as discussed above, does not lie on a single axis. There is no way for a grade to fully capture a student’s knowledge, so meaning will always be lost. This may not be a strictly bad thing for reasons discussed later, but it definitely means that grades can’t fully capture knowledge accurately.

When discussing accuracy, it is also important to ask what grades are trying to do. If they are inaccurate in something they aren’t trying to do, the argument is void. Possible goals for grades that should be considered are to measure the student’s current knowledge on the subject, their achievement after being instructed, and their learning potential (Gong and Keng 2020). These seem similar but are subtly different in important ways. The context and timing of these differ in ways that fundamentally change how they should be measured. The way most commonly used forms of assessment are set up, they ask students to answer questions at the time of the test, meaning they don’t measure learning potential. They also do not necessarily measure achievement after being instructed, because students (particularly in college) can learn the content on their own instead of showing up to class, and still take the test legitimately. While there may be other potential purposes, a measurement of current knowledge is by far the most common one. Standardized tests solidify this further; they usually ask the same types of questions, without any standardized instruction to go with them. The College Board’s SAT is a standardized test whose original purpose was to test innate intelligence (Carlton 2022). It of course fails at this in many ways, and that claim was retracted long ago.

Testing is noisy. There are a lot of factors that contribute to a test score, in addition to the student’s knowledge. Factors such as the people in the room, how much desk space students get, and literal noise can affect test scores, and those are just properties of the testing environment. If the student didn’t eat that morning, or they didn’t get the right amount of sleep the previous night, or they’re going through a rough patch mentally, or they really like their teacher, or they have a catchy song stuck in their head, or they’re thinking about the next test they have to take in an hour, or they’re thinking about the test they just bombed 2 hours ago, or realistically anything else, their test scores can be affected. This is just a tiny portion of the multitude of factors that affect test scores that nobody meant to test for. Many people try to control these factors, by telling students about tests ahead of time and controlling the environment, but there’s only so much they can do. Even the fact that there are “test taking strategies” contributes to this problem; test scores can be affected just by how good a student is at taking tests, which is a skill that has nothing to do with knowledge. If there’s so much noise that goes into a test score, is it really communicating something valuable?

Tests aren’t the only thing that is prone to this; when trying to finish this paper, I ended up getting heat exhaustion and had to recover instead of working on this. The next day, I had 2 final exams which took up most of my mental energy, and the day after I performed in 2 concerts, followed by another final exam the following day. I just barely squeezed it in on the day of 2 exams, but I’m confident that that noise affected the parts of the paper that I wrote last. (But, I don’t feel like going through and changing everything, and I think I got my points across just fine).

The Bare Minimum

Another dynamic encouraged by tests is what’s known as “teaching to the test”. This is the act of teaching students with the sole (or primary) intention of having them pass a test. This is commonly seen with standardized tests, but it can apply to any system that uses tests. Teachers and students sacrifice a deep understanding of the content in exchange for a higher test score. This is actually a very effective strategy for getting better grades, and it’s no surprise that people use it. The problem is that it forgets why grades are implemented in the first place: learning. In fact, one study conducted by researchers at Harvard and MIT showed that teaching to the AP Calculus exam hurt students with a weaker math background in the long run (Sonnert et al.). Even if class is not taught in this way, students will ask what content will be on the test, if instructors don’t provide a list themselves. Students will use that information to focus only on that particular content, while ignoring the rest. It is infeasible to test students on every single concept discussed in class, so some amount of content will be lost in this way every time.

This activity is part of the larger trend of students doing the bare minimum. Colleges and other institutions only see the grade at the end of the course, not the students’ contributions towards getting that grade, so students don’t usually have a good reason to do anything that won’t improve their grade. In other words, they are encouraged to do the bare minimum. In a student’s eyes, rubrics don’t look like a set of guidelines. They look like a list of checkboxes to fill out, and once they’re all checked off, the assignment is complete and any potential improvements can be ignored. Going above and beyond can often help students truly learn something, yet this system stifles that.

In the same vein, students are encouraged to simply memorize facts, rather than understand content. One study even showed that when students were supposed to learn critical thinking skills, under a testing system it devolved into rote memorization of responses to questions about critical thinking (Zohar & Alboher Agmon). A test can’t tell the difference between a memorized answer that’s identical to a well thought out answer, so students (and teachers) choose the path of least resistance and, once again, sacrifice understanding for a test score. Plus, in multiple choice tests, students can get lucky and see a correct answer choice they would have forgotten otherwise. They can even get lucky enough to guess the correct answer without having any knowledge of it. The chances are often as high as 25%!

Grades also discourage students from taking risks, because a bad grade sticks. Because so much weight is put onto grades, this means that taking a risk is almost never optimal, and students play things safe instead (“3 Reasons”). This stifles creativity, which means that it also stifles learning (Bee). Students and others are constantly told that failure is how people learn, yet grades punish students for failure.

Feedback

Grades can also suppress other forms of feedback. A grade on a paper or other open-ended assignment tells the student nothing about how they can improve for the future, but other forms of feedback (such as written feedback) can. As long as that is present, students should be able to know how to improve for the future, even if there is also a grade next to it. However, research shows that “if a paper is returned with both a grade and a comment, many students will pay attention to the grade and ignore the comment” (Qtd. in Schinske & Tanner; Winstone et al.). If the comment is ignored, it is just as useful as if there were no comment at all, so student improvement is impeded. In fact, a study shows that withholding grades can improve overall performance (Jackson & Marks).

When it comes to students and learning, the end result has become more important than the means (Tagg). The way one author put it, grades “may be more of a reflection of a students’ ability to understand and play the game of school than anything to do with learning” (Schinske & Tanner). From a game design standpoint, this is completely true, rather than just a “maybe”. After all, if getting a good grade is the local end goal of school, the only thing grades will measure is how skilled students are at getting good grades. If that overlaps with learning, that’s great. The problem is that they don’t overlap anywhere near as often as they should. Another author described the system by claiming that “[s]chool administrators have been using with confidence an absolutely uncalibrated instrument” (Qtd. in Durm).

Discrimination

Perhaps the worst part is that while there are many reasons grades are not an accurate reflection of knowledge, there is still one major reason that’s more harmful than the rest: discrimination. Grading systems are constantly used to perpetuate racism, ableism, sexism, and other forms of discrimination, whether intentionally or not. It goes without saying that this is a bad thing for many reasons. It is morally wrong, it means MANY institutions are hypocritical, and it means that grades are based on yet another major factor that has nothing to do with knowledge.

People in different demographic groups score differently on tests. That is, unfortunately, a fact of the system. While it is absolutely possible that there are legitimate trends dictating that certain groups should score better than others, these trends are no doubt highly bloated. This is particularly true when tests don’t deliberately take into account equity directly in the coursework, even though that’s not something you’d expect most classes to do (Yilmaz et al.). One writer, talking specifically about the SAT, claims that it not only started out as a way to separate the immigrants from the WASPs, but that it still serves that purpose today (Hammond). Eugenics is a scientifically inaccurate form of “scientific racism” that claims that intelligence is genetic and stupidity should be filtered out of humanity (“Eugenics and Scientific Racism”). It is the basis for many standardized tests, including the SAT, and unfortunately they still do their jobs, because they haven’t changed in very fundamental ways (Hammond).

Research shows that it’s actually “possible to predict the percentages of students who will score proficient or above on some standardized tests,” just by looking at demographic data and community data (Tienken). This research did not take into account any factors that were properties of the schools themselves, such as teacher quality. Therefore, demographic data has a substantial impact on passing rates of standardized tests, making them discriminatory while further reducing their accuracy. This fits with a broader theme of “moral luck” playing into grades, or the idea that the circumstances someone is in that are completely out of their control can affect their grades (Klauber). For example, students who have to work a job, take care of family members, or experience particular mental conditions may not have enough time to complete their homework, which would result in lower grades than those of students without such constraints. This is another form of discrimination, which is just as bad despite not fitting neatly into “racism” or some other category.

tests arent the only thing thats discriminatory, essays are too. they expect you to talk in whats known as the acrolect, which is basically the “elite” form of language. its what ive been using for like this entire essay up to this paragraph. but thats not how everyone learned to talk. lots of ppl first learned black vernacular, or a creole, or southern speak, or whatever else. in an essay (or blog post), tho, ppl are graded just as much on dialect as actual content (Lowther). ppl who first learned to speak in these basilects are discriminated against because they gotta go through the added barrier of learning the acrolect (even tho the dialect thats natural to them is totally consistent) to code switch to it while more privileged ppl already speak it basically (Young).

Disability is another area with major potential for discrimination. While schools are required to provide some form of accommodations for disability, many people (including myself) can attest that they are usually inadequate (“Section 504”). Regardless, instructors often design their rubrics in ways that are discriminatory towards neurodivergent students. Even something as simple as grading attendance can make things difficult for them (Birdwell & Bayley). Autism can affect a student’s mode of communication, which makes certain assignment formats (e.g., presentations) substantially more difficult than they have to be (Birdwell & Bayley).

Beyond that, turning assignments in on time can be a barrier for those with executive functioning difficulties. This affected me directly in one of my classes recently. A close friend in the class and I both have ADHD, which is known for its executive functioning difficulties. We would actively participate in class discussions, ask questions, and generally understand the material, but turning assignments in was difficult for both of us. The teacher told us this, and that she wanted to give both of us very high grades, but she couldn’t because she would get fired for giving us points when there was no submission, even though we clearly were doing well in the class by her standards.

Rejection Sensitive Dysphoria (RSD) is a state of extreme emotional pain triggered by rejection, commonly associated with ADHD (“Rejection Sensitive Dysphoria”). Speaking from personal experience, it can be very hard to find the motivation to do anything in this state, even enjoyable things. Schoolwork is the last thing on a student’s mind in this state, yet that same work can cause it pretty directly. If a student works hard on an assignment, but receives a poor grade, that can trigger RSD. Even getting a 0 on an assignment they didn’t do can occasionally trigger RSD, which creates a vicious cycle. Another thing that school is good at triggering is PTSD-associated panic attacks. When in this state, people with PTSD “may be flooded with anxiety to the point of struggling to draw breath, and feeling disoriented, dizzy and nauseated,” creating another of those vicious cycles (Manne). Discrimination on its own is clearly bad, but it can have negative impacts on individual students at another level: mental health.

I apologize if I’ve used any terminology incorrectly; I think I said things the right way but I’m no expert. I just know what I’ve felt firsthand; this isn’t something I’ve studied rigorously.

Mental Health

Grades can have a serious impact on students’ mental health. Ask almost any student; they’ll tell you all about how their latest exam stressed them out, or how they stayed up all night writing a paper and weren’t able to enjoy their weekend because of it, or that poor grade they just got that has them feeling miserable. School takes a serious toll on students, but it doesn’t have to be this way. While perhaps not the only contributing factor, grades are a significant reason for the state of students’ mental health.

Studies have shown that cortisol, a hormone related to stress, increases with daily academic stressors, and with declining grades in particular (Lee et al.). While there are good reasons for school to be challenging, it should not be to the point of stressing students out. It’s important to understand that not all challenges are created equal. Challenge is what promotes learning because overcoming challenges means that students must have the necessary skills and knowledge to do so. The general idea of challenge is not good in itself, and if the challenge of school comes from areas other than what students need to learn, it can only hinder learning. If grades are causing such a tangible increase in stress, it is clear that schools are adding too much unnecessary challenge, and it stresses students out (creating new unnecessary challenges).

Research also shows that grades can take precedence over general happiness. One study of British schools showed that “the British government's focus on standardized testing in schools stifles students' emotional development” and student wellbeing is ignored in favor of tests (Mansell). It is imperative that school systems see students as actual human beings and not just numbers in a system. One big thing they can do to make that happen is to show genuine care for their mental health. Many schools do that, by providing counseling and wellness workshops, which is good, but it conveniently avoids addressing what caused that stress in the first place.

This stress can go very far, even to the point of suicide. Joshua Eyler, the director of the Center for Teaching Excellence at Rice University, makes a point about many schools acknowledging that grades are a major stressor, but focusing their efforts instead on other stressors (Eyler). Particularly in schools where academic pressure is high due to competition, this can drive students to suicide. Worcester Polytechnic Institute even had at least 3 suicides over the course of about a year (Eyler). If academic pressure kills even one person, we should rush to fix whatever system is causing it, yet it keeps causing suicides and we keep doing too little about it. WPI isn’t alone here. Lots of schools, even less competitive ones, suffer from this problem.

The College Board in particular makes most of the existing problems with grades even worse. Standardized tests are even less accurate than teacher-graded tests, are often more discriminatory than them, and are a much more unhealthy motivator, putting beyond unreasonable pressure on students. They’re also predatory, taking money from vulnerable students left and right so they can have a chance at college, whilst paying their CEO millions and calling themselves a 501(c)(3) (“The Real College Board”). There’s a lot more to say, but I have a finite amount of time, so I’ll leave this topic here.

Counterarguments

Despite everything, many people believe that grades are necessary, and that school would not function without them. One function of grades is to motivate students, which is definitely important. Grades do turn out to be effective at this. While some proponents of grades may agree that the way they do this is unhealthy to some extent, they argue that grades are the only way to effectively motivate students, or at least that any other effective system would have the same problems. However, this generally stems from the idea that there are only two possibilities, which are to reward or punish students (Bee). Another option is to foster an interest from within the students. This is definitely more difficult to do, but if students create their own projects with their own goals and the teacher acts as a guide, it can work. Teachers should be getting students excited about what they’re learning anyways; removing grades makes that so much easier because the overjustification effect is no longer in the picture. Students will for the most part need to have some legitimate degree of control over their work in order for intrinsic motivation to be able to take over, which is perfectly reasonable as long as the teacher can step in as the final arbiter as to what does and doesn’t constitute adequate work for the class. And even if some work doesn’t count toward the class, students should (often) be encouraged to do that work anyways provided they have the time, because they can still learn something from it.

Even under systems that stick to extrinsic motivation, there are healthier ways to handle it. A system that stresses students out to the degree that grades do is clearly terrible, so removing the factors that make it stressful by altering the motivation would be beneficial. Grades have a lot of weight attached to them, which is a big part of why they’re stressful, so reducing how impactful they are would work towards that. Students don’t tend to stress themselves out over a 1-point 3-sentence assignment. Removing tests and exams can help with this, because on top of importance, they have time pressure.

Part of that weight comes from colleges, jobs, scholarships, and whatever else a student might want that looks at their grades to determine whether or not they get it. A school can either send a list of letter grades to these institutions, or they can send an actual letter detailing the student’s abilities and accomplishments (Bee). The former is a rough estimation of how much a student knows, while the latter is a detailed explanation of it, which is clearly more useful.

Some people argue that the ordinality of grades is a good thing. Despite losing some information, they believe that it is more useful to have something to compare students with than to keep all the details and lose ordinality. The main problem with this argument is that different institutions evaluate students differently. What one college thinks is A work, another college may think is B work, while another considers it F work. Nobody is “wrong” here; there is no right answer. More nuance is needed in order to effectively evaluate students, no matter who is doing it. A written letter is also less prone to teacher bias, as teachers are stating fairly objective facts, instead of sending in a number they have control over without explaining how that number was chosen.

Additionally, proponents of standardized testing claim that the tests are fair (“Pros and Cons”). Standardized tests create a near-identical testing environment so that everyone who takes the test is evaluated in the same way. While they succeed at that, there’s more that goes into making an effective measurement of knowledge. If a standardized test does not accurately reflect knowledge, then it doesn’t matter that it’s graded with extreme consistency; it’s still inaccurate.

There is also the matter of certification. Most people don’t want a surgeon to perform an operation on them if the surgeon isn’t qualified. However, an alternative to grades can be used here: binary certification. This is similar to grades, with the primary difference being that in place of more granular grades, the only result given (or the only result shown to others outside the classroom) is a “pass” or a “fail”, with nothing in between. This loses almost all of the information that students get from grades, in exchange for a much clearer form of grading, which will (presumably) be accompanied by feedback anyway.

Certification is very good for two types of education: general education and terminal education. General education classes, the types of classes students take outside of their specialization so they can be more well-rounded, don’t really need to provide much information in the long term; schools only need to see that the student is adequate in the subject, so they can move on to other classes. Many students already have the mindset of “I only need a C- in this class”, and a proper pass/fail system would no longer punish their GPA for focusing on their more important classes. Terminal education is similar. If someone is in a graduate program, or otherwise doesn’t intend to continue their education, their grade will not be used for anything in the future, so all they need to know is whether or not they were good enough at what they were doing.

However, outside of these cases, there is one major problem remaining: schools need to improve, and grades are very legible to schools. In order to improve, schools need to know how well they’re doing, and to know how well they’re doing, they need to know how well students are doing. Not only that, but they need to know how well all of the students are doing, and they need to be able to process that information every year. Something like written feedback is simply too much information for a system like this to work (Lowther). Faculty can’t process the thousands of pieces of written feedback they would receive if they got even a single statement from each teacher about each student, so unless that feedback is condensed, which loses a lot of information about how students are doing, schools can’t effectively improve. It’s similar with college admissions; admissions offices can’t process nearly as many applications as they want to if they’re given too much information (Lowther).

Solutions

Because of this, giving detailed feedback to students and nothing else is not a viable solution. It’s what’s best for the students, and arguably for the teachers too, but it’s not sustainable for the school as a whole. It would, however, solve the vast majority of the problems with grading, so alternative solutions should seek to follow in its footsteps. There are many possibilities to consider, all of which have benefits and drawbacks. The general trend is that the better a system works, the fewer people it is effective for. For example, the current 0-100 system has lots of flaws but isn’t that heavily affected by who is subject to it, while a standards-based system, in which students are graded subjectively by the teacher on a set of standards, works really well, but is extremely prone to discrimination, almost nullifying its effectiveness (Will; Lowther).

There is no one-size-fits-all solution, which makes sense. If people are different, why would they respond the same way to a given system? Interval systems (e.g. 0-100) and certification systems (pass/fail) have already been discussed, but there are also discrete systems, which are partially in effect already and are a middle ground between the two. In a discrete system, there are only a handful of possible levels of feedback students can receive. The A-F system is an example of a discrete system if you disconnect it from the 0-100 system entirely. An “A” is no longer at least 90%, but rather the highest of the options available. While discrete systems (particularly without changing much else) keep many of the same problems as interval systems, those problems are mostly not as severe.

Perhaps the easiest grading reform that doesn’t suffer from this legibility problem of having too much information is to hide grades from students. Grades can be made into a much more internal mechanism, which will stress students out a lot less, and get rid of many of the other problems that come with grading. In order to encourage students to get good grades (which is needed so that schools can see how well they’re doing), the grades need to matter in at least some capacity to students, so it makes sense to send those grades to colleges and others to accomplish that. That makes it questionable at best to hide grades from students entirely, so students should be able to request their current grade in any given class. This is a practice that does get used at some schools, and it works well; many students don’t even request their grades once (Lowther).

Another very feasible course of action is a revision of assessment. Timed tests (including standardized tests) in particular are flawed, as described before. There are plenty of ways to assess a student’s knowledge, and different methods are better for different subjects and different students. Examples which are already commonly in use include projects, papers, and presentations. Other, less commonly used examples include 1-on-1 meetings where the student explains a concept to the teacher, graded in-person or text-based discussions (where students are graded on quality of content in addition to quantity), and something similar to a portfolio of student-driven projects even outside of art classes (this could work especially well for programmers). There are, of course, other viable alternatives as well.

Another important change is in how questions are asked. Questions should be truly open-ended so that students can show more of their knowledge and generally have a deeper understanding of the topic, rather than a half-open question (or, even worse, a closed-ended question) which limits creativity and expression of knowledge. For example, if a question asks something like “find the net force on this object”, once the student finds that vector they’re done. However, if it asks something more like “discuss the forces acting on the object”, they could talk about how gravity and normal force with the ground cancel each other out, and how there’s tension from a rope that pulls it forward but doesn’t pull it up strongly enough to cancel gravity, and they could also comment on the object’s acceleration. This is an extremely arbitrary example, but the principle applies in general. Each question will of course take longer to answer and be harder to grade, but students will get so much more out of it, and won’t really be able to regurgitate memorized facts without having a deeper understanding of the content.

Here’s a spreadsheet with my sources. Thank you for reading. Hopefully I’ll include some more pictures next time.

Additional Thoughts From 2025

I’ve done some more researching and thinking since I first wrote this, and I have a clearer image of what I think school should look like. It’s different from the ground up. Teachers are instead mentors, coaching a group of students on their educational journeys. They have connections with experts in the students’ fields of interest, and help students fulfill their own goals. Grades aren’t the only thing gone; semesters are as well. Who had the idea that students should learn at the same pace? What a weird thought. Assignments as we know them may or may not exist, but the mindset would be “this is a thing to help you learn and achieve your goals”, rather than “do this thing so I can give you points”. Students would gain autonomy over time (because it’s not exactly practical to have 5-year-olds guide their own long-term learning), and whenever it makes sense for them in their situation, schooling would naturally shift towards securing a job. I’m sure there are flaws in my idealized system, but I have other hyperfixations to get to now.

A little guy, for your time.
