In this chapter I examine the verification of the Threshold Test Ban Treaty (TTBT) during the Nixon, Ford, and Carter presidencies. Between 1974 and 1990 considerable controversy arose over the yields of Soviet weapons tests, much of it sad and frustrating. This controversy became known as the “yield wars.”
A number of arms control advocates thought the United States should not have entered negotiations for a TTBT in 1974, believing that the testing threshold of 150 kilotons was too high. They much preferred either a full ban on underground testing or a much lower threshold for allowed testing of nuclear weapons. Nevertheless, neither of those was likely to have been attempted as the Nixon administration was coming apart during the Watergate scandal.
I became involved in the TTBT negotiations in 1974, when I was asked on a moment’s notice to take part. I was motivated by the desire to lend my expertise in seismology to help bring about a treaty. Little did I know that the TTBT, signed in 1974, would take until 1990 to be renegotiated and finally to enter into force. The long debate about yields, with U.S. claims that the Soviets were cheating, postponed consideration of a full test ban treaty, or CTBT.
For decades, however, the United States had been using a method that greatly overestimated the yields of Soviet nuclear explosions, and it continued to do so for an additional fifteen years after the treaty was signed in 1974. The political consequences were huge, with the U.S. government incorrectly accusing the Soviet Union of cheating on the TTBT by testing well above its threshold. This occurred at a tense time during the Cold War.
The United States and the USSR each agreed that they would not test above the 150-kiloton limit of the TTBT after March 31, 1976, preventing the testing of megaton nuclear explosions by the two countries thereafter. It occurred too late, however, to prevent the testing of weapons carried by missiles with multiple warheads. In the twenty-one months between the signing of the TTBT in July 1974 and its start date in 1976, the Soviet Union and the United States each conducted a number of weapons tests much larger than the 150-kiloton limit of the treaty. They did not do so afterwards.
Each of the signatories to the TTBT was required to specify its sites for testing nuclear weapons. The United States chose the Nevada Test Site (NTS), and the USSR designated Novaya Zemlya and Eastern Kazakhstan. The treaty called for an exchange of information on the geological and physical properties of each geophysically distinct part of each test site. It also called for exchange of information on yields of two past large explosions within each distinct subarea. Those data were to permit better calibration of past and future nuclear explosions. The exchange of data was to occur upon ratification of the treaty, but this did not happen until 1990.
Contamination of the Eastern Kazakhstan site, the spread of radioactive debris off-site, and reports of widespread cancers led to demonstrations in Kazakhstan over continued nuclear testing. That unrest proved to be a major factor in the breakup of the Soviet Union, as Kazakhstan became an independent country and testing was halted there in October 1989. The Russian Republic conducted tests until 1990 at its Arctic site on Novaya Zemlya.
Having received a secret security clearance from the U.S. government in 1968, I was one of a group of seismologists who were briefed on the determination of seismic magnitudes and yields of Soviet underground explosions using the classified capabilities of the Air Force Technical Applications Center (AFTAC) several months prior to the TTBT negotiations in 1974. Most of the discussion at that meeting, however, was about seismic magnitudes, including Soviet instruments, and why the Soviet Union obtained different magnitudes than the United States for either the same events or the same areas. The classified U.S. magnitude-yield curves were shown to our group but not discussed. That was a mistake, because yield determination was central to ascertaining if the USSR had exceeded the 150-kiloton threshold of the TTBT after March 1976.
Carl Romney of the Defense Department and Eugene Herrin of Southern Methodist University were among the few seismologists in the United States who long had access to and used classified determinations of magnitudes and yields of U.S. explosions. Romney knew that explosions of a given yield in hard rock and salt generated higher seismic amplitudes than those set off in soft rock. Romney mentioned in his 2009 book that early 1960s U.S. tests in Nevada in dry alluvium generated seismic waves (and their associated magnitudes, mb) that were much smaller than those for tests in tuff, a relatively soft volcanic rock. The few U.S. explosions in hard rock in Nevada in the 1960s had generated the largest seismic waves (when differences in yield were taken into account). The Gnome and Salmon explosions in salt in 1961 and 1964 generated seismic waves that were even larger at stations in the central and eastern United States.
Nevertheless, Romney and others in the U.S. government insisted on setting the treaty threshold by magnitude rather than yield. Soviet scientists and officials undoubtedly knew that their numerous nuclear tests at sites in hard rock had generated larger seismic waves for a given yield; they would be at a distinct disadvantage if the threshold were based on magnitude and the United States were permitted to test in alluvium and tuff. Hence, a magnitude threshold was not acceptable to the Soviet Union.
EFFORTS TO DETERMINE SOVIET YIELDS: THE AFTAC PANEL
Once the TTBT was signed in July 1974, AFTAC made a major effort to pull together U.S. seismic data on yield determinations for Soviet and U.S. underground explosions. Thomas Eisenhauer and Robert Zavadil, senior scientists at AFTAC, studied surface waves from Soviet underground tests. They concluded in a classified document that Soviet yields calculated from surface waves were systematically smaller than those determined from short-period P waves (and the magnitudes mb determined from them). The United States used P waves, not surface waves, to calculate yields of Soviet explosions.
I determined to better understand the U.S. classified method of yield estimation, making it a top priority when I returned from the TTBT negotiations in 1974. Soon thereafter I became a member of a classified panel advising AFTAC on the seismic determination of yields of Soviet underground nuclear explosions. Except for its last meeting in California, the panel convened at Patrick Air Force Base in Florida. This was the first of a number of closely controlled panel meetings.
Chaired by Herrin, the panel convened several times until 1977, when AFTAC asked that it no longer meet. The panel was never formally dissolved, yet no reason was ever given for why it did not meet again. Presentations filled the time allotted to the panel, but there was seldom time to discuss major conclusions. Interestingly, Herrin never sent a classified draft summary to each of us for comments. Instead, he either sent his own version directly to officials in the Defense Department or briefed them orally.
The panel first heard the results on yields determined from surface waves by Eisenhauer and Zavadil. Herrin and Romney argued vociferously that surface wave determinations of yield should neither be trusted nor used for yield determination because those waves were contaminated by “tectonic release”: underground explosions in hard rock often trigger the relief of varying amounts of natural stress (pressures) in rocks near a shot point. Its effect was like adding the signal from an earthquake to that of the explosion itself.
For some explosions in Eastern Kazakhstan tectonic release was large, but for others it was quite small. I argued that those events with small tectonic release could be used to calibrate the magnitude-yield curve for P waves, but AFTAC refused to do so because Romney and Herrin were adamant that their formula using P-wave magnitudes, mb, was accurate.
At another panel meeting, the commanding general of the base made an opening statement that we should stick strictly to the written agenda, which I think was put together by either Herrin or Romney or both. Under their powerful influence, that meeting and others had no effect on changing the U.S. procedure for yield estimation. The panel was usually asked to comment on seismic absorption and possible differences in magnitudes (called mb bias) of the Eastern Kazakhstan test site with respect to Nevada. It was never asked, however, to comment on or even discuss larger questions, such as what was the best method of determining Soviet yields or whether the USSR had tested above the 150-kiloton limit of the Threshold Treaty.
The panel examined the existing procedure for yield determination using short-period P waves. It was necessary to use known data for explosions in hard rocks because most Soviet explosions at their two main test sites were situated in those rocks. The catch was that few U.S. explosions were detonated in hard rock. Data existed for only three early underground explosions in hard rock in Nevada—Hard Hat (5.7 kt), Shoal (12 kt), and Pile Driver (62 kt)—as well as magnitudes and yields of two French underground nuclear explosions in southern Algeria in 1963 and 1965. Several of us on the panel thought the seismic data for those U.S. explosions were poor. In addition, the yields, especially those for Hard Hat and Shoal, were much smaller than the 150-kiloton limit of the TTBT. The published yields of the two largest French explosions in granite—Ruby and Saphir—were 52 and 120 kilotons.
The Soviets had released the yield of only one nuclear explosion at their two main test sites: a 125-kiloton peaceful explosion that created a large crater at the Eastern Kazakhstan test site on January 15, 1965. It was difficult to use it for accurate calibration of the yields of contained underground explosions—that is, those that did not blow out at the surface as the event in 1965 did. The United States had released much more information on yields of its nuclear explosions. What was clear to me in 1974 and later was that there was no “magic bullet” for estimating Soviet yields. P waves had one set of problems and surface waves another. Our strategy should have been to come up with procedures that gave what we thought were the best determinations of yields and of their uncertainties. That did not happen.
Herrin and Romney, who were well aware that different magnitude-yield relationships applied to soft rocks like tuff and alluvium, were the principal architects of the mb-yield curve used to calculate yields of Soviet tests in hard rocks. I think they strongly believed that the mb-yield relationship for hard rock applied everywhere and that mb for a given yield was only a function of the type of rock at the site of the explosion (the shot point). That premise, which turned out to be false, carried major scientific and political consequences until the issue finally was resolved fifteen years later.
As discussed in an earlier chapter, much evidence existed in 1974 that the magnitude mb of explosions in Nevada and western Colorado varied in a consistent manner for paths to most seismic stations in the United States and Canada. For paths to stations in eastern and central North America, arrival times were early, indicating faster velocities, and magnitudes were larger, indicating smaller P-wave absorption.
DIFFERENT VIEWS ON YIELD DETERMINATION
Peter Marshall, a British scientist who worked on nuclear test verification, and two scientists from Livermore concluded that those findings for stations in North America likely were associated with differences in the Earth’s upper mantle at depths of about 30 to 125 miles (50 to 200 km) at both the down-going and up-going ends of the paths traversed by P waves. In 1976 Marshall and Springer correlated those differences in travel times and absorption with differences in the speed of seismic P waves, called Pn, that travel in the uppermost mantle of the Earth just below the crust. Pn velocities average about 8.2 kilometers per second (km/s) in the central and eastern United States and about 7.85 km/s in the west.
Marshall and Springer found that differences in average Pn velocities correlated with differences (or residuals) in mb values that averaged about 0.3 of a magnitude. This translates into a variation in yield by a factor of 2.5. Measurements of Pn were widely available for parts of various continents, including the USSR. Thus, they concluded that reliable measurements of Pn could be used to infer differences in the absorption of P waves in the upper mantle beneath various test sites. Hence, mb values for Soviet tests could be corrected to better estimate their yields. Marshall and Springer mentioned that more accurate yields of Soviet tests were needed once the United States had signed the Threshold Test Ban Treaty.
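The step from a magnitude difference to a yield factor can be checked with a short calculation. The sketch below assumes a log-linear magnitude-yield relation, mb = a + b log10(Y), with a slope b of 0.75. That slope is an assumption on my part for illustration, not a value given in this chapter; it is chosen because it reproduces the factor of 2.5 quoted above. The function name is likewise only illustrative.

```python
def yield_factor(delta_mb, slope=0.75):
    """Return the ratio of yields implied by a difference in magnitude.

    Assumes mb = a + slope * log10(Y), so two explosions whose
    magnitudes differ by delta_mb differ in yield by a factor of
    10 ** (delta_mb / slope). The default slope of 0.75 is an assumed,
    illustrative value, not an official or classified one.
    """
    if slope <= 0:
        raise ValueError("slope must be positive")
    return 10.0 ** (delta_mb / slope)


if __name__ == "__main__":
    # A residual of 0.3 magnitude units corresponds to a yield factor
    # of roughly 2.5 under the assumed slope.
    print(round(yield_factor(0.3), 1))
```

Because the relation is logarithmic, even a few tenths of a magnitude unit change the inferred yield severalfold, which is why the corrections mattered so much for a treaty with a fixed threshold.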
It was well known before the advent of plate tectonics in 1967 that the upper mantle beneath Nevada and western Colorado is characterized by slow Pn wave speeds and greater absorption of short-period seismic P waves. These properties clearly differed between young areas like Nevada and regions of older hard rocks, such as those in central and eastern North America and much of the USSR.
In 1979 Marshall of the UK and Springer and Howard Rodean of Livermore went on to quantify Marshall and Springer’s earlier results for seismic stations and Pn velocities globally. For a given yield, they deduced that mb magnitudes for hard rocks in Kazakhstan were about 0.38 magnitude units larger than tests in hard rock at the Nevada Test Site, which translates into a variation in yield by a factor of about 3.2. Their results for Kazakhstan were not at the Soviet test site itself but nearby. Kazakhstan, including the test site, contains large areas of very old hard rocks.
Soviet bulletins published the arrival times of P waves at their seismic stations. The station at Semipalatinsk near the Kazakhstan test site reported arrivals from earthquakes worldwide that were systematically early, like the early arrivals at stations in central and eastern North America. Early arrivals are indicative of high seismic velocities beneath stations.
Norwegian seismologists observed that P waves from Soviet tests recorded by their NORSAR seismic array, located on hard rocks, were rich in frequencies up to at least 8 cycles per second (8 Hz), whereas P waves from NTS explosions recorded in Norway lacked such high frequencies. This was additional evidence that absorption (attenuation) of P waves was low in the upper mantle beneath Soviet test sites and high beneath the Nevada Test Site.
The difference in mb between Nevada and Soviet test sites for explosions of the same yield became known as the magnitude bias (mb bias). It was a systematic effect, and not taking it into account resulted in large overestimates of the yields of Soviet tests. This difference was not trivial: U.S. estimates of Soviet yields were about three times too large. Arguments about magnitude bias and Soviet yields continued until 1988, when they were finally resolved.
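To see how the bias bears on the 150-kiloton threshold, consider the following sketch. It assumes the same kind of log-linear magnitude-yield relation with a slope of 0.75 (an assumed value, consistent with the factors quoted earlier), and it uses a bias of 0.38 magnitude units from the 1979 study. The apparent yield of 300 kilotons is purely hypothetical and does not describe any actual Soviet test; the function names are mine.

```python
THRESHOLD_KT = 150.0  # yield limit of the TTBT, in kilotons


def corrected_yield(apparent_kt, mb_bias, slope=0.75):
    """Remove a magnitude bias from a yield estimated with an uncorrected curve.

    Assumes mb = a + slope * log10(Y). A positive mb_bias means the
    test site produces larger magnitudes for the same yield, so the
    uncorrected (apparent) yield is too high by 10 ** (mb_bias / slope).
    The slope of 0.75 is an assumed, illustrative value.
    """
    return apparent_kt / 10.0 ** (mb_bias / slope)


def exceeds_threshold(yield_kt, threshold_kt=THRESHOLD_KT):
    """Return True if the yield is above the treaty threshold."""
    return yield_kt > threshold_kt


if __name__ == "__main__":
    apparent = 300.0  # hypothetical uncorrected estimate, in kilotons
    corrected = corrected_yield(apparent, mb_bias=0.38)
    # The uncorrected estimate looks like a violation; the corrected one does not.
    print(exceeds_threshold(apparent), exceeds_threshold(corrected))
```

In this made-up case the same seismic observation reads as a test at twice the limit without the correction and as a test comfortably below it with the correction, which is the essence of the dispute.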
At what turned out to be the last meeting, in 1977, of the AFTAC panel on which I served, at least three of us (Thomas McEvilly of UC Berkeley, Springer of Livermore, and I) argued that the U.S. procedure for estimating Soviet yields from P-wave magnitudes (mb) was incorrect and that surface waves needed to be included in some way. The magnitude mb could be used, but it needed to be calibrated for differences in the absorption of P waves, that is, for magnitude bias.
Several of us, including me, asked for a straw vote of the panel concerning these important questions. A majority agreed with us. Herrin and Romney, however, disagreed. They held, and continued to argue for nearly another fifteen years, that it was premature to change the existing procedure for yield determination using P waves. I think the reason our AFTAC panel never met again lies in the strong opinions of Romney and Herrin, who refused to consider other scientific voices. Years later a different AFTAC panel, which Herrin again chaired for many years, was formed to advise on seismic issues. Several of us from the original panel were not included. Herrin died in 2010.
AD HOC PANEL ON YIELD DETERMINATION UNDER THE DEFENSE SCIENCE BOARD
Following the last meeting of the original AFTAC panel, the U.S. Defense Science Board formed an ad hoc group on Soviet yields in 1977. A member of a consulting firm in northern Virginia, who had no expertise in either seismology or other areas of geophysics, chaired the panel. His firm, which had beautiful views of the Pentagon and the Potomac River from its offices in Rosslyn, Virginia, did considerable classified work for the Defense Department. Clearly Romney instigated the formation of the panel, likely picked its members, and made the major presentation. He and I were the only members from the then-defunct AFTAC panel. Romney may have asked that I be a member of the committee so he could claim that dissenting views on yield determination, such as mine, had been heard.
Other members of that panel included seismologists Carl Kisslinger of the University of Colorado and Donald Helmberger of Caltech, rock mechanist Eugene Simmons of MIT, a member of the staff at either the Sandia Weapons Lab or Los Alamos, and Richard Wagner of Livermore. None of them was familiar with yield determination. Helmberger was the only one who had worked on the seismic absorption of P waves. Wagner later became an influential hawk as an assistant secretary of defense during the Reagan administration. When he moved to the Department of Defense, he argued that the Soviets were cheating on the Threshold Test Ban Treaty and other arms control agreements.
Romney never mentioned arguments made by many of us on the recent AFTAC panel against using the Romney-Herrin relationship for yield determination. He mentioned neither the determination of Soviet yields using surface waves nor corrections to the seismic magnitude mb based on differences in Pn velocities. He showed data from a single station in Missouri, where seismic waves were small like those in the western United States, but not data from the large numbers of stations in North America where just the opposite was observed. It was a very biased selection by Romney; it did not occur by chance.
Again, the agenda was set and tightly controlled. I was given about five minutes to counter Romney’s claims and to mention the findings of the previous AFTAC panel. Helmberger was not helpful in stating that the magnitude bias between Nevada and Eastern Kazakhstan likely was small. Simmons unfortunately accepted Romney’s assertions because he regarded him as an expert who worked full-time on seismic verification.
Romney, I discovered with time, was a very cagey person who knew his audience very well, including what he could get away with. He would give one story to a knowledgeable seismologist and another to people, like the others on this ad hoc panel, who were not familiar with what had gone on at the last AFTAC panel or what Marshall and Springer had concluded in 1976. I found Romney’s presentation to the ad hoc panel outrageously deceptive. I never trusted him again. I think he likely regarded me as an enemy for my views on this panel and the AFTAC one that preceded it.
Romney persuaded the ad hoc group to endorse a one-page secret document stating that the Russians could be breaking the treaty by testing weapons with yields much larger than the 150-kiloton limit of the TTBT. How much greater is likely still classified. I came away very dejected with that outcome and angry about Romney’s manipulation of the panel. That classified document “rang a lot of bells” in the U.S. government. At the time that yields of tests at Eastern Kazakhstan were being overestimated by many times, Romney concluded that the actual yields were even higher, not smaller.
REVIEW BY CARTER’S SCIENCE ADVISER
In 1977 Jimmy Carter was inaugurated as president, and seismologist Frank Press became his science adviser and head of the Office of Science and Technology Policy. I wrote an unclassified letter to Frank, whom I had known for a long time, saying I had been part of the AFTAC panel and the recent ad hoc committee. I stated that I strongly disagreed with the one-page statement of the ad hoc committee and that this was a very important matter. Press soon wrote back requesting that I send him the secret document and my views about it via a classified route. I complied and sent Press a classified letter of a few pages describing why I disagreed and why I thought that the ad hoc panel’s conclusions were quite incorrect.
Press convened a panel of his own, in the West Wing of the White House on September 1 and 2, 1977, to explore again the evidence about yields of Soviet explosions. He called Romney to testify. The panel consisted of Springer from Livermore, seismologist Robert Massé of AFTAC, and me. Massé broke ranks with the AFTAC-ARPA view and said that Romney was incorrect about determining yields of Soviet tests. Springer, a very straightforward person, said in a nonacrimonious way that Romney was wrong. I voiced similar views. The panel concluded that the United States was overestimating the yields of Soviet explosions.
The panel recommended that the official classified magnitude-yield formula should be changed to include a correction for magnitude bias using P waves and that surface waves should be incorporated into yield determinations. It was my view that AFTAC needed to do more work on how to merge P-wave and surface-wave data in a thoughtful way. One way was to use Soviet explosions characterized by low to little tectonic release to obtain a better estimate of mb bias. AFTAC and the Defense Department, however, did not permit that work to go forward, stating they needed to be “tasked” to do it.
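One way to picture the calibration I had in mind is sketched below. It is only an illustration under stated assumptions, not the procedure AFTAC or anyone else actually used. For each explosion it takes a yield inferred from P waves with the uncorrected curve and a yield inferred from surface waves, keeps only the events flagged as having low tectonic release, and converts the average logarithmic ratio into a magnitude bias using an assumed slope of 0.75. The event values are invented for the example and the function name is hypothetical.

```python
import math


def estimate_mb_bias(events, slope=0.75):
    """Estimate mb bias from events with low tectonic release.

    events: iterable of (p_wave_yield_kt, surface_wave_yield_kt, low_release)
    where low_release is True when tectonic release is judged small.
    Assumes mb = a + slope * log10(Y), so a yield ratio R between the
    two methods corresponds to a magnitude offset of slope * log10(R).
    The slope of 0.75 is an assumed, illustrative value.
    """
    logs = []
    for p_kt, s_kt, low_release in events:
        if not low_release:
            continue  # surface waves are unreliable when tectonic release is large
        if p_kt <= 0 or s_kt <= 0:
            raise ValueError("yields must be positive")
        logs.append(math.log10(p_kt / s_kt))
    if not logs:
        raise ValueError("no events with low tectonic release")
    return slope * sum(logs) / len(logs)


if __name__ == "__main__":
    # Invented values, for illustration only: (P-wave yield, surface-wave yield, low release?)
    hypothetical_events = [
        (300.0, 100.0, True),
        (150.0, 50.0, True),
        (400.0, 90.0, False),  # excluded: large tectonic release
    ]
    print(round(estimate_mb_bias(hypothetical_events), 2))
```

Averaging in the logarithm rather than in kilotons keeps a single large explosion from dominating the estimate, which seemed to me the sensible choice for data spanning a wide range of yields.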
I was told that Frank Press forwarded our recommendation about yield determination using both P and surface waves to Zbigniew Brzezinski, who headed the National Security Council under Carter, but that Brzezinski overruled Press and our recommendation, saying it was stupid to have two different methods for yield determination.
I also learned that the Carter administration concluded that the United States should drill deeper holes in Nevada to test weapons exceeding the 150-kiloton limit of the TTBT. This was in answer to the Soviets’ alleged cheating on the treaty. I would like to think that the hell a few others and I raised, along with enough valid scientific points, helped keep the United States from setting off explosions above the limit of the Threshold Treaty. Such explosions would have stirred up a hornet’s nest in Russia at a time of tense relations between our two countries.
MORE EVIDENCE OF BIAS IN ESTIMATING SOVIET YIELDS
By 1979 it was clear from a long and thorough publication of Marshall, Springer, and Rodean that the magnitude (mb) bias for explosions in hard rock between the Nevada Test Site and Eastern Kazakhstan was 0.3 to 0.4 magnitude units. Their work also found that the propagation of P waves from the French test site in southern Algeria was very similar to that from Nevada. Romney had claimed that the Algerian site in the Hoggar (Ahaggar) region was situated in old rocks of the West African Shield and hence could be used to calibrate Soviet yields. Instead, Hoggar is one of several young elevated regions in Africa that are known as “hot spots” to those familiar with plate tectonics. It is not an old geological area like the surrounding West African Shield.
Marshall and others found that the Pn velocity beneath Hoggar itself was low, as it was beneath Nevada. Therefore, P waves were absorbed beneath Hoggar in a similar way to those beneath Nevada. Thus, Hoggar data also needed to have a correction made for a bias of about 0.3 to 0.4 magnitude units. The bottom line was that it was not a good analog for Eastern Kazakhstan.
Clearly, the British government was concerned about the U.S. methodology for determining yields of Soviet tests. Otherwise, Marshall, who worked for the UK Ministry of Defense on seismology, would not have stated his views so forthrightly in peer-reviewed journals. The same is true for Springer and Rodean—two knowledgeable scientists who had long worked at Livermore. It is quite surprising, as well as unfortunate, that their views did not carry more weight in the U.S. debate about Soviet yields. That debate festered until 1988.
A FAILED ATTEMPT AT A FULL TEST BAN TREATY
After 1963, a full test ban (a CTBT) did not resurface until the Carter administration entered into negotiations with the Soviet Union and Britain in 1977. The United States decided not to press for ratification of the TTBT and PNET but to push instead for a full ban on nuclear testing. Formal negotiations started in Geneva in July 1977. In his 1981 book, Glenn Seaborg indicates that Herbert York, who led the U.S. negotiations from 1979 to 1980, informed him about what had been accomplished, most of it by the end of 1978. The main outlines of a treaty included automatic seismic detection stations on the territories of the three parties, a system of voluntary on-site inspections with arrangements for challenges and responses, a treaty with a three-year duration, and a moratorium on peaceful nuclear explosions.
After a good start, the CTBT negotiations stalemated when Margaret Thatcher became the UK prime minister, opposition to a CTBT increased in the United States, the Soviet Union was reluctant to accept on-site inspections, and the Soviets invaded Afghanistan. Opponents of a CTBT in the United States cited the need to test weapons for the Trident II missile and directed-energy weapons (called “Star Wars” by opponents). In addition, the Carter administration did not want to push for a CTBT until the Strategic Arms Limitation Treaty, SALT II, was completed.