4

The New SAT I and Recentering

In the first part of this chapter, I will describe in detail how the SAT I has changed and what the effects of “recentering” are, in both layman's and more technical terms. In the second part of the chapter, I will explain what effects these changes have had on highly selective admissions. Anyone over the age of eighteen should read this section carefully, because the SAT I that you took has undergone such dramatic changes that old scores cannot be compared directly with new scores.

When the SAT was first developed, the median score was set at 500 for both the math and the verbal. Keep in mind that the median represents the midpoint score: if half the class scored over 50 on a test and half scored under 50, the median score would be 50. Thus, in 1950, if you got a 560 on the verbal and a 480 on the math, you were above the national median in verbal and below it in math. In other words, it was fair to say you were stronger in the verbal area. However, over the last forty years or so, the number of people taking the SAT has increased dramatically and the composition of the group has changed, so the medians have shifted; verbal and math now have different median scores, neither of which is 500. In 1995, the median verbal score was 428, while the median math score was 482. Thus, in 1995, if a student scored 540V and 540M, he was not equal in these two areas; in fact, that student was stronger in verbal, since the verbal median was actually lower than the math median.

Why did the median score change from the original 500V, 500M scale? The simple answer is that the population of test takers changed dramatically in both number and composition. In 1941, ten thousand students from private secondary schools took the SAT, averaging 500V, 500M. The incredible thing is that from 1942 until 1995, all SAT takers were compared with this class of 1941, even though by 1995 more than 2 million people took the SAT, a third of whom were minority students. Needless to say, these 2 million students made up a much more diverse group of test takers than the 1941 group.

Another side effect of not adjusting the test scale was that scores were not distributed in an even bell curve over the range of 200 to 800. The curve peaked and then glided down toward 800, but not all scores were represented. If you had a perfect score on the verbal section, you would get an 800, but one or two wrong answers would take you down to a 760, then a 720, and so on, with no one getting a 790, 780, or 770. It was nearly impossible to score over a 700, since you had to make very few mistakes, and each mistake would cost you far more in this high range than it would if you were going from a 500 to a 490. In other words, a one-point difference in raw score (your raw score is the number of questions you answered correctly minus one-quarter of the number you answered incorrectly) might cause a thirty-point drop at the high end of the scale, whereas in a lower range it might not change the final reported score at all. Thus, on the old scale, only 0.1 percent of all test takers scored between 750 and 800 on the verbal section. In order to get a perfect 800, you had to answer every question correctly. Only 0.9 percent scored between 700 and 750 on the old verbal, so in total, a paltry 1 percent of all test takers scored over 700 on the verbal, whereas a whopping 75 percent scored below 500. In the math section, only 4.2 percent of the population scored over 700.
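For readers who want to see the arithmetic, here is a minimal sketch in Python of the raw-score formula just described. The quarter-point penalty comes straight from the text; the question counts and the scaled-score equivalents in the comments are illustrative assumptions, not official ETS figures.

```python
def raw_score(num_correct: int, num_incorrect: int) -> float:
    # Raw score as described above: one point per correct answer,
    # minus one-quarter point per incorrect answer.
    # (Omitted questions neither add nor subtract points.)
    return num_correct - 0.25 * num_incorrect

# At the top of the old scale, one extra wrong answer was costly.
# (The 78-question total and the 800/760 pairing follow the example
# given later in this chapter; they are illustrative, not official.)
print(raw_score(78, 0))   # 78.0  -> scaled 800
print(raw_score(77, 1))   # 76.75 -> scaled roughly 760, a 40-point drop

# In the middle of the scale, the same one-question difference
# might not change the reported score at all.
print(raw_score(40, 10))  # 37.5  -> perhaps a 500
print(raw_score(39, 11))  # 36.25 -> perhaps still a 500
```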

Although from experience I knew when I took the SAT that the high scores on the verbal section were stretched out and the low scores were scrunched together, I didn't really understand why until recently—that is, until I read the special report prepared by the College Board (distributed in May 1995) for the Ivy League deans of admissions to explain how score recentering would affect the Ivy League applicant pools. After I read the official explanation, I immediately understood why no one ever has the slightest idea what the College Board is talking about. Let me summarize the main points of their explanation regarding why the old SAT was scored unfairly.

When the statisticians at the Educational Testing Service equated raw scores with scaled scores, they always found a few raw scores (earned by about 1 percent of the test-taking population) that fell below the 200 scale minimum. Because no score can be reported lower than 200, all of these low scores were reported as 200. Likewise, in the upper range, it was almost impossible to get a perfect raw score, so the statisticians were forced to extrapolate raw scores in the upper end to reach a scaled score of 800. Since these scores were essentially stretched out over the high range, there were gaps, so that a raw score of 78 might correspond to an 800, but the next-highest score (a raw 77) would be a 760, then a 740, and so on. Thus, a difference of two points on the raw scale resulted in a difference of sixty points on the reported scale.

Believe it or not, this “two raw points equals sixty scaled points” relationship did not hold all along the scale, only at the high end. In the lower ranges, where the scores were more bunched together, a two-point raw-score difference might equal a scaled-score difference of only ten, twenty, or thirty points. The problem was the lack of a consistent relationship between raw score and scaled score, a fact that caused many scoring inequities on the old SAT. The extrapolated scores in the upper range were apparently not based on actual data, so high scores were not consistent with scores in other parts of the test range.

In simpler terms, it was impossible to equate each actual raw score with a scaled score between 200 and 800, because some students would have scored under 200 (not allowed), while at the top of the scale almost no one earned a raw score high enough to anchor an 800 with real data. To remedy this situation, ETS had to extrapolate at the extreme ends of the scale based on its own judgment, a judgment that seems not to have rested on hard data. I do not mean to criticize their procedures, but there seems to be a disturbing lack of mathematical method in their assignment of scores in both the very high and the very low ranges. It almost sounds as if they had to fudge the data to some extent.

From all this, what we really need to understand is that on the old scale, most people scored in the middle range, and many of those scores were bunched together, so that the difference between a 530 verbal and a 500 verbal might have been only one or two questions on the entire three-hour test. In addition, very few scored over 700 in either verbal or math. Finally, a 600 verbal score was not directly comparable to a 600 math score, since the medians of the two tests were different.

In any case, because of all the inequities in scoring, the College Board decided in 1995 to set the median once again at a true 500 for both verbal and math. This decision also had the benefit of restoring a true bell curve, one that peaks at 500 and slopes gently down in both directions toward 200 and 800. Remember, the old curve was no longer a true bell curve, so there is no simple one-to-one shift between old scores and new scores. In other words, you can't just add a constant number of points to an old score to calculate the new one; you have to use the College Board's conversion tables, which equate every old score with a new score.

In general, since both the verbal and the math medians were below 500, scores on the recentered test look higher, although the percentiles remain the same. At Dartmouth, for example, the class of 1999 had an average of about 640 verbal and 710 math (unrecentered scores), while the class of 2000 averaged 705 verbal and 710 math. The verbal average is really almost the same: recentering inflates verbal scores in this range by as much as sixty points, so the real rise was perhaps five points, even though the new number seems much higher. The math average shows only a slight real rise, since recentered math scores in the upper range are very similar to the old ones. As you can see, the math hardly changed at all in the upper range, while the verbal changed somewhat dramatically because of recentering.

Rather than listing the complete official ETS conversion table for original SAT Is to recentered SAT Is (which is available from the College Board and from most high school guidance counselors), I have excerpted scores in various ranges for the sake of example. To assure yourself that there is no constant number you can use to change old scores to new, look at some of the following conversions from old scores to new scores:

Original Verbal    Recentered Verbal    Original Math    Recentered Math
800                800                  800              800
790                800                  790              800
780                800                  780              800
770                800                  770              790
730                800                  730              730
720                790                  720              720
670                730                  670              660
660                720                  660              650
650                710                  650              650
600                670                  600              600
590                660                  590              600
580                650                  580              590
400                480                  400              440
300                380                  300              340
200                230                  200              200

Notice that in just the examples cited above, the verbal conversion varies from zero points to eighty points, depending upon where you are on the scale. The math conversion varies from a ten-point drop (an old 670 becomes a 660) to a forty-point gain, again depending upon where you are on the scale.
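Because the conversion is a lookup in a published table rather than a formula, recentering a score amounts to a dictionary lookup. Here is a minimal sketch in Python using only the excerpted verbal values above; the function name and error handling are my own illustrative choices, and any real conversion should use the College Board's complete table.

```python
# Old-to-new verbal conversions excerpted from the table above.
# (The official College Board table covers every score from 200 to 800.)
VERBAL_OLD_TO_NEW = {
    800: 800, 790: 800, 780: 800, 770: 800, 730: 800,
    720: 790, 670: 730, 660: 720, 650: 710, 600: 670,
    590: 660, 580: 650, 400: 480, 300: 380, 200: 230,
}

def recenter_verbal(old_score: int) -> int:
    """Return the recentered equivalent of an old verbal score."""
    if old_score not in VERBAL_OLD_TO_NEW:
        raise ValueError(f"{old_score} is not in this excerpt; "
                         "consult the full conversion table.")
    return VERBAL_OLD_TO_NEW[old_score]

print(recenter_verbal(670))  # 730: a sixty-point jump in this range
print(recenter_verbal(800))  # 800: no change at the very top
```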

If you want to see more sobering statistics about the new SATs, you need only look at Harvard's numbers on the recentered SAT I for the class of 2000. Of the more than 18,000 applications received in 1995, 9,400 students scored higher than 1,400 combined; 1,600 had perfect verbal scores of 800; and 1,900 had perfect math scores of 800. Before recentering, only about twenty to thirty students worldwide scored a perfect 1,600 combined in a given year, whereas with the new test, Harvard alone turned down 165 (out of about 365) applicants with perfect 1,600 combined SAT I scores. Nationwide, about 545 students scored a 1,600 in 1996.

Many people have not gotten used to the new scores, especially if they had older children who took the old test. In 1995, the governor of New Hampshire, Stephen Merrill, mistakenly compared the new state average (935) with the old state average (924) and concluded that New Hampshire students had shown a marked jump in intelligence since the previous year, when in fact the scores were nearly identical, if not lower once adjusted for recentering. He went so far as to say, “The SAT results are a tribute to the excellence of the teachers of New Hampshire and our outstanding educational system,”* although he forgot to mention that more than 25 percent of New Hampshire students (compared to under 15 percent for most states) attend private schools (St. Paul's and Exeter, to name the two most prestigious), in which most of the students are from out of state to begin with, and that New Hampshire has one of the lowest minority populations of any state in the country.

If the governor could make such a mathematically egregious error, imagine what the average parent could invent: “Your younger sister must be smarter, since her scores were higher!”—even if the converted scores were identical. If you are trying to compare siblings, use the conversion tables at least to equalize scores within your family. For example, if your first child had a verbal score of 670 and a math score of 710 on the old test and your second child (taking the recentered test) scored a 720 verbal and a 700 math, they had essentially the same math score, while the first child had the higher verbal score, even though at first glance the second child's scores appear higher.
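The sibling comparison can be checked mechanically. Here is a minimal sketch using the one verbal conversion the example needs (670 old to 730 new, per the excerpted table); the variable names are my own, and a real comparison would consult the full table for both sections.

```python
# One entry from the excerpted conversion table: 670 old verbal -> 730 new.
VERBAL_OLD_TO_NEW = {670: 730}

first_child_old_verbal = 670    # took the old (unrecentered) test
second_child_new_verbal = 720   # took the recentered test

# Put both children on the recentered scale before comparing.
first_child_on_new_scale = VERBAL_OLD_TO_NEW[first_child_old_verbal]  # 730

print(first_child_on_new_scale > second_child_new_verbal)  # True:
# the first child's verbal is actually higher, even though
# 670 looks lower than 720 at first glance.
```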

There has been tremendous confusion over the new scores, exacerbated by the fact that the year before recentering, the College Board also changed the duration of the test (an extra half hour was added), the name of the test (from Scholastic Aptitude Test to Scholastic Assessment Test), and certain sections of the test (the antonym/vocabulary part was eliminated, the vocabulary became more context-based, and more reading passages were added) after some forty years of stability. No wonder there has been so much confusion.

IMPLICATIONS FOR SELECTIVE ADMISSIONS

The most obvious result of recentering is that what used to be considered an unusually high score on the SAT I is now just an average score. For years, a combined score of 1,400 was considered the dividing line between normal scores and extremely high scores at highly selective colleges. This is no longer the case. The average combined score for the admitted class of 2000 at Dartmouth was over 1,410. In fact, for the class of 2000, the average scores of all 11,400 applicants to Dartmouth (this includes the weakest applicants in the pool) were 662V, 677M, almost 1,340 combined. For the class of 2001, the averages were even higher: 664V, 683M.

While it used to be extremely rare to see students with verbal scores over 700 (remember, only 1 percent of all students taking the unrecentered test scored over a 700, and those students could not have missed more than three or four questions on the whole test to achieve that score), now it is much more common. In fact, half of the members of the class of 2000 scored over 710 on the verbal section, about the same as the number who scored over 710 on the math section.

Here is an SAT I breakdown of the acceptance rates at Dartmouth for the class of 2000:

Score Range    Verbal SAT    Math SAT
399 or less    0%            0%
400–449        1.6%          2.4%
450–499        4.1%          6.3%
500–549        5.4%          5.4%
550–599        7.5%          7.1%
600–649        10.1%         9.5%
650–699        14.8%         15.1%
700–749        28.2%         24.4%
750–800        52.2%         44.0%

All these statistics taken together mean that parents, students, and counselors need to readjust their standard for excellence dramatically in terms of scores. When a college counselor or a parent calls an admissions officer to ask whether or not a student has a chance of being admitted, he usually starts out by announcing that the student has very high SAT I scores. The problem is, when the officer asks what he means by “very high,” he usually says, “Over fourteen hundred,” which, as we have just seen, is not “very high” at all; it is actually below the class average.

If I had to come up with an artificial cutoff for what constitutes truly impressive combined scores on the recentered SAT I, I would say over 1,490 or 1,500. Those are still considered very high scores, ones that are not easy to attain, no matter how bright a student is.

I hesitate to give this example, because, as always, it is not the combined score that matters as much as the breakdown. It is still, as it has always been, more impressive to see high verbal scores than high math scores, since most students at the highly selective colleges will be doing much more writing and reading than math. Verbal ability is still a good indicator of how strong a reader the student is. The ability to read well will ultimately have a bigger impact on most college students than the ability to do SAT I math very well, especially since the level of SAT math is not particularly high. There are many students who do terribly on the SAT I math and yet who manage to get the highest score of 5 on the AP calculus exam. If any math is useful at the college level, it is calculus, not the basic math covered on the SAT I. Therefore, SAT I scores of 750V, 630M would be much more impressive for most highly selective colleges than a 640V, 780M, even though the latter score has a higher combined total by forty points.

The final point I want to make about interpreting recentered SAT I scores is that because most admissions officers are not gifted in math, there is still a tendency to use the 700 cutoff as the magic number between good and excellent scores, even though, as we have seen, the average admitted score is over 1,410 combined. All admissions officers should be intimately familiar with the information in this chapter (although I'm sure if you handed out a pop quiz asking officers what percentage of SAT I takers scored over 700 on both the old and the new SAT, hardly anyone would know the true numbers), but few have intellectualized and internalized the new system to the point that they can be completely consistent. Remember, probably 99 percent of all current admissions officers took the old, nonrecentered SAT I. In their own personal histories, 1,400 has always been a high score. It is extremely difficult to consider applicants under a totally different system from the one you had in high school. In the past, a 670V was a good score, not a great score. Now that same 670V is a 730V, which somehow seems higher, even though it is actually not.

Oddly enough, and I saw this during my last two years in admissions, officers still respond to any scores over 700 by saying that they are strong, whereas if the scores are in the mid to high 600s, they tend to refer to them as average. This is simply not accurate. In Ivy applicant pools, a 700V, 700M is now an average, even below-average, score, although some officers will not be able to draw the correct conclusion when analyzing student scores.

Despite this inability on the officers’ part to interpret your scores accurately 100 percent of the time, they will undoubtedly improve as the years go on and they see years’ worth of recentered scores. In the meantime, all applicants should be aware of the reality of the SAT I: if you want to stand out in the Ivy applicant pool, you should aim for a combined score of about 1,490 or higher,* or at least a verbal score over 730 or so, which would allow you to have a modest 650 math score and still look strong in the applicant pool.

One final point regarding SAT Is: How many times should a student take them? As I mentioned earlier, scores will generally go up a few points each time the student takes the test, as the student's familiarity with it increases. The problem is, once a student has taken the test two or three times, scores usually plateau, so there is no need to keep taking it. All the highly selective colleges and Ivies will take into account only the single highest verbal score and the single highest math score, whether the student takes the test once or ten times. Keep in mind, though, that colleges receive the scores on an official printout from ETS that lists all the scores, including the dates of all the tests. (SAT I scores cannot be suppressed, meaning reported only to the student and withheld from colleges, although SAT II scores can be if the student requests it.) Though officers will officially look only at your highest scores (and use those in the formula discussed in chapter 6), they will see the other scores, too. It does not make a good impression on officers to see that a student has taken the SAT I six or seven times, because they might think to themselves, Doesn't this student have anything better to do with his time? After all, testing isn't everything; maybe the student should have spent the extra time doing better in high school.

There is a certain resentment or even suspicion regarding students who take the test many times. There is really no need to take the test more than two or three times at most, in my judgment. It makes the student look score-obsessed, and it also shows the range of improvement—so I might note that the student originally had a 500V and got it up to a 620, a good jump (although I would probably assume it was after taking a prep course), but still not impressive in the overall pool of applicants, despite all the effort. My suggestion is to take practice tests in your own house as often as you want, but take the actual SAT I only once or twice, three times at the most, if for some reason you anticipate a major rise.

Now that we have made a thorough study of the SAT I, let us turn to the equally important SAT II subject tests.