Aids2 | data.frame | Data on patients diagnosed with AIDS in Australia
before July 1, 1991. |
Animals | data.frame | Average brain and body weights for 28 species of land
animals. |
Boston | data.frame | The Boston data frame
has 506 rows and 14 columns. |
Cars93 | data.frame | The Cars93 data frame
has 93 rows and 27 columns. |
Cushings | data.frame | Cushing’s syndrome is a hypertensive disorder associated
with oversecretion of cortisol by the adrenal gland. The
observations are urinary excretion rates of two steroid
metabolites. |
DDT | numeric | A numeric vector of 15 measurements by different
laboratories of the pesticide DDT in kale, in ppm (parts per
million), using the multiple pesticide residue
measurement. |
GAGurine | data.frame | Data was collected on the concentration of the chemical
glycosaminoglycan (GAG) in the urine of 314 children aged 0 to
17 years. The aim of the study was to produce a chart to help a
pediatrician to assess if a child’s GAG concentration is
“normal.” |
Insurance | data.frame | The data given in data frame Insurance consists of the numbers of
policyholders of an insurance company who were exposed to risk,
and the numbers of car insurance claims made by those
policyholders in the third quarter of 1973. |
Melanoma | data.frame | The Melanoma data
frame has data on 205 patients in Denmark with malignant melanoma. |
OME | data.frame | Experiments were performed on children on their ability
to differentiate a signal in broadband noise. The noise was
played from a pair of speakers, and a signal was added to just
one channel; the subject had to turn his/her head to the channel
with the added signal. The signal was either coherent (the
amplitude of the noise was increased for a period) or incoherent
(independent noise was added for the same period to form the
same increase in power). The threshold used in the original
analysis was the stimulus loudness needed to get 75% correct
responses. Some of the children had suffered from otitis media
with effusion (OME). |
Pima.te | data.frame | A population of women who were at least 21 years old, of
Pima Indian heritage, and living near Phoenix, Arizona, was
tested for diabetes according to World Health Organization
criteria. The data was collected by the National Institute of
Diabetes and Digestive and Kidney Diseases. A total of 532
complete records were used, after dropping the (mainly missing)
data on serum insulin. |
Pima.tr | data.frame | A population of women who were at least 21 years old, of
Pima Indian heritage, and living near Phoenix, Arizona, was
tested for diabetes according to World Health Organization
criteria. The data was collected by the National Institute of
Diabetes and Digestive and Kidney Diseases. A total of 532
complete records were used, after dropping the (mainly missing)
data on serum insulin. |
Pima.tr2 | data.frame | A population of women who were at least 21 years old, of
Pima Indian heritage, and living near Phoenix, Arizona, was
tested for diabetes according to World Health Organization
criteria. The data was collected by the National Institute of
Diabetes and Digestive and Kidney Diseases. A total of 532
complete records were used, after dropping the (mainly missing)
data on serum insulin. |
Rabbit | data.frame | Five rabbits were studied on two occasions after
treatment with saline (control) and after treatment with the
5-HT3 antagonist MDL 72222. After each treatment, ascending
doses of phenylbiguanide were injected intravenously at
10-minute intervals and the responses of mean blood pressure
measured. The goal was to test whether the cardiogenic chemoreflex elicited by
phenylbiguanide depends on the activation of 5-HT3
receptors. |
Rubber | data.frame | Data frame from accelerated testing of tire
rubber. |
SP500 | numeric | Returns of the Standard & Poor’s 500 Index in the
1990s. |
Sitka | data.frame | The Sitka data frame
has 395 rows and 4 columns. It gives repeated measurements on
the log size of 79 Sitka spruce trees, 54 of which were grown in
ozone-enriched chambers and 25 of which were controls. The size
was measured five times in 1988, at roughly monthly intervals. |
Sitka89 | data.frame | The Sitka89 data frame
has 632 rows and 4 columns. It gives repeated measurements on
the log size of 79 Sitka spruce trees, 54 of which were grown in
ozone-enriched chambers and 25 of which were controls. The size
was measured eight times in 1989, at roughly monthly
intervals. |
Skye | data.frame | The Skye data frame
has 23 rows and 3 columns. |
Traffic | data.frame | An experiment was performed in Sweden in 1961–1962 to
assess the effect of a speed limit on the highway accident rate.
The experiment was conducted on 92 days in each year, matched so
that day j in 1962 was
comparable to day j in 1961.
On some days, the speed limit was in effect and enforced, while
on other days there was no speed limit and cars tended to be
driven faster. The speed limit days tended to be in contiguous
blocks. |
UScereal | data.frame | The UScereal data
frame has 65 rows and 11 columns. The data comes from the 1993
American Statistical Association (ASA) Statistical Graphics
Exposition and is taken from the mandatory Food and Drug
Administration (FDA) food label. The data has been normalized
here to a portion of 1 American cup. |
UScrime | data.frame | Criminologists are interested in the effect of punishment
regimes on crime rates. This has been studied using the
aggregate data on 47 states of the United States for 1960 given
in this data frame. The variables seem to have been rescaled to
convenient numbers. |
VA | data.frame | Veterans Administration lung cancer trial from
Kalbfleisch and Prentice. |
abbey | numeric | A numeric vector of 31 determinations of nickel content
(ppm) in a Canadian syenite rock. |
accdeaths | ts | A regular time series giving the monthly totals of
accidental deaths in the United States. |
anorexia | data.frame | The anorexia data
frame has 72 rows and 3 columns. Weight change data for young
female anorexia patients. |
bacteria | data.frame | Tests of the presence of the bacteria
H. influenzae in children with otitis media in the
Northern Territory of Australia. |
beav1 | data.frame | Reynolds describes a small part of a study of the
long-term temperature dynamics of the beaver (Castor
canadensis) in north-central Wisconsin. Body
temperature was measured by telemetry every 10 minutes for four
females, but data from a period of less than a day for each of
two animals is used here. |
beav2 | data.frame | Reynolds describes a small part of a study of the
long-term temperature dynamics of the beaver (Castor
canadensis) in north-central Wisconsin. Body
temperature was measured by telemetry every 10 minutes for four
females, but data from a period of less than a day for each of
two animals is used here. |
biopsy | data.frame | This breast cancer database was obtained from the
University of Wisconsin Hospitals, Madison, from Dr. William H.
Wolberg. He assessed biopsies of breast tumors for 699 patients
up to July 15, 1992; each of nine attributes has been scored on
a scale of 1 to 10, and the outcome is also known. There are 699
rows and 11 columns. |
birthwt | data.frame | The birthwt data frame
has 189 rows and 10 columns. The data was collected at Baystate
Medical Center, Springfield, Massachusetts, during 1986. |
cabbages | data.frame | The cabbages data set
has 60 observations and 4 variables. |
caith | data.frame | Data on the cross-classification of people in Caithness,
Scotland, by eye and hair color. This region of the United
Kingdom is particularly interesting, as there is a mixture of
people of Nordic, Celtic, and Anglo-Saxon origin. |
cats | data.frame | The heart and body weights of samples of male and female
cats used for digitalis experiments. The cats were all adult,
over 2 kg in body weight. |
cement | data.frame | Experiment on the heat evolved in the setting of each of
13 cements. |
chem | numeric | A numeric vector of 24 determinations of copper in
wholemeal flour, in parts per million. |
coop | data.frame | Seven specimens were sent to six laboratories in three
separate batches, and each was analyzed for an analyte. Each
analysis was duplicated. |
cpus | data.frame | A relative performance measure and characteristics of 209
CPUs. |
crabs | data.frame | The crabs data frame
has 200 rows and 8 columns, describing 5 morphological
measurements on 50 crabs for each of 2 color forms and both
sexes, of the species Leptograpsus variegatus, collected at
Fremantle, Western Australia. |
deaths | ts | A time series giving the monthly deaths from bronchitis,
emphysema, and asthma in the United Kingdom, 1974–1979, for both
sexes. |
drivers | ts | A regular time series giving the monthly totals of car
drivers in Great Britain killed or seriously injured from
January 1969 to December 1984. Compulsory wearing of seat belts
was introduced on January 31, 1983. |
eagles | data.frame | Knight and Skagen collected data during a field study on
the foraging behavior of wintering bald eagles in Washington
state. The data concerned 160 attempts by one (pirating) bald
eagle to steal a chum salmon from another (feeding) bald
eagle. |
epil | data.frame | Thall and Vail give a data set on two-week seizure counts
for 59 epileptics. The number of seizures was recorded for a
baseline period of eight weeks, and then patients were randomly
assigned to a treatment group or a control group. Counts were
then recorded for four successive two-week periods. The
subjects’ age is the only covariate. |
farms | data.frame | The farms data frame
has 20 rows and 4 columns. The rows are farms on the Dutch
island of Terschelling, and the columns are factors describing
the management of grassland. |
fgl | data.frame | The fgl data frame has
214 rows and 10 columns. The data was collected by B. German on
fragments of glass gathered in forensic work. |
forbes | data.frame | A data frame with 17 observations on the boiling point of
water and barometric pressure, in inches of mercury. |
galaxies | numeric | A numeric vector of velocities, in kilometers/second, of
82 galaxies from 6 well-separated conic sections of an unfilled survey of the Corona Borealis
region. Multimodality in such surveys is evidence for voids and
superclusters in the far universe. |
gehan | data.frame | A data frame from a trial of 42 leukemia patients. Some
were treated with the drug 6-mercaptopurine, and the rest were
controls. The trial was designed as matched pairs, both
withdrawn from the trial when either came out of
remission. |
genotype | data.frame | Data from a foster feeding experiment with rat mothers
and litters of four different genotypes: A, B, I, and J. Rat
litters were separated from
their natural mothers at birth and given to foster mothers to
rear. |
geyser | data.frame | A version of the eruptions data from the Old Faithful
geyser in Yellowstone National Park, Wyoming. This version comes
from Azzalini and Bowman and is of continuous measurement from
August 1 to August 15, 1985. Some nocturnal duration
measurements were coded as 2, 3, or 4 minutes, having originally
been described as “short,” “medium,” or “long.” |
gilgais | data.frame | This data set was collected on a line transect survey in
gilgai territory in New South Wales, Australia. Gilgais are
natural gentle depressions in otherwise flat land, and sometimes
they seem to be regularly distributed. The data collection was
stimulated by the question: are these patterns reflected in soil
properties? At each of 365 sampling locations on a linear grid
with 4-meter spacing, samples were taken at depths 0–10 cm,
30–40 cm, and 80–90 cm below the surface. pH, electrical
conductivity, and chloride content were measured on a 1:5
soil:water extract from each sample. |
hills | data.frame | The record times in 1984 for 35 Scottish hill
races. |
housing | data.frame | The housing data frame
has 72 rows and 5 variables. |
immer | data.frame | The immer data frame
has 30 rows and 4 columns. Five varieties of barley were grown
in six locations in 1931 and in 1932. |
leuk | data.frame | A data frame of data from 33 leukemia patients. |
mammals | data.frame | A data frame with average brain and body weights for 62
species of land mammals. |
mcycle | data.frame | A data frame giving a series of measurements of head
acceleration in a simulated motorcycle accident; used to test
crash helmets. |
menarche | data.frame | Proportions of female children at various ages during
adolescence who have reached menarche. |
michelson | data.frame | Measurements of the speed of light in air, made between
June 5 and July 2, 1879. The data consists of 5 experiments,
each consisting of 20 consecutive runs. The response is the
speed of light, in kilometers/second, less 299,000. The
currently accepted value, on this scale of measurement, is
734.5. |
minn38 | data.frame | Minnesota high school graduates of 1938 were classified
according to four factors. The minn38 data frame has 168 rows and 5
columns. |
motors | data.frame | The motors data frame
has 40 rows and 3 columns. It describes an accelerated life test
of 10 motorettes at each of four temperatures and has rather
discrete times. |
muscle | data.frame | The purpose of this experiment was to assess the
influence of calcium in solution on the contraction of heart
muscle in rats. The left auricle of 21 rat hearts was isolated,
and on several occasions a constant-length strip of tissue was
electrically stimulated and dipped into various concentrations
of calcium chloride solution, after which the shortening of the
strip was accurately measured as the response. |
newcomb | numeric | A numeric vector giving the “Third Series” of
measurements of the passage time of light recorded by Newcomb in
1882. The given values divided by 1,000 plus 24 give the time,
in millionths of a second, for light to traverse a known
distance. The “true” value is now considered to be
33.02. |
nlschools | data.frame | Snijders and Bosker use as a running example a study of
2,287 eighth-grade pupils (aged about 11) in 132 classes in 131
schools in the Netherlands. Only the variables used in their
examples are supplied. |
npk | data.frame | A classical N, P, K (nitrogen, phosphate, potassium)
factorial experiment on the growth of peas conducted on six
blocks. Each half of a fractional factorial design confounding
the NPK interaction was used on three of the plots. |
npr1 | data.frame | Data on the locations, porosity, and permeability (a
measure of oil flow) on 104 oil wells in the U.S. Naval
Petroleum Reserve No. 1 in California. |
oats | data.frame | The yield of oats from a split-plot field trial using
three varieties and four levels of manurial treatment. The
experiment was laid out in six blocks of three main plots, each
split into four subplots. The varieties were applied to the main
plots and the manurial treatments to the subplots. |
painters | data.frame | The subjective assessment, on an integer scale of 0 to
20, of 54 classical painters. The painters were assessed on four
characteristics: composition, drawing, color, and expression.
The data is due to the 18th-century art critic, de
Piles. |
petrol | data.frame | The yield of a petroleum refining process with four
covariates. The crude oil appears to come from only 10 distinct
samples. This data was originally used by Prater to build an
estimation equation for the yield of the refining process of
crude oil to gasoline. |
phones | list | A list object with the annual number of telephone calls
in Belgium. |
quine | data.frame | The quine data frame
has 146 rows and 5 columns. Children from Walgett, New South
Wales, Australia, were classified by culture, age, sex, and
learner status, and the number of days absent from school in a
particular school year was recorded. |
road | data.frame | A data frame with the annual deaths in road accidents for
half the U.S. states. |
rotifer | data.frame | The data give the numbers of rotifers falling out of
suspension for different fluid densities. |
ships | data.frame | Data frame giving the number of damage incidents and
aggregate months of service by ship type, year of construction,
and period of operation. |
shoes | list | A list of two vectors, giving the wear of shoes of
materials A and B for one foot each of 10 boys. |
shrimp | numeric | A numeric vector with 18 determinations by different
laboratories of the amount (percentage of the declared total
weight) of shrimp in shrimp cocktail. |
shuttle | data.frame | The shuttle data frame
has 256 rows and 7 columns. The first six columns are
categorical variables giving example conditions; the seventh is
the decision. The first 253 rows are the training set, the last
3 the test conditions. |
snails | data.frame | Groups of 20 snails were held for periods of 1, 2, 3, or
4 weeks under carefully controlled conditions of temperature and
relative humidity. There were two species of snail, A and B, and
the experiment was designed as a 4-by-3-by-4-by-2 completely
randomized design. At the end of the exposure time, the snails
were tested to see if they had survived; the process itself is
fatal for the animals. The object of the exercise was to model
the probability of survival in terms of the stimulus variables
and, in particular, to test for differences among species. The
data are unusual in that, in most cases, fatalities during the
experiment were fairly small. |
steam | data.frame | Temperature and pressure in a saturated steam-driven
experimental device. |
stormer | data.frame | The stormer viscometer measures the viscosity of a fluid
by measuring the time taken for an inner cylinder in the
mechanism to perform a fixed number of revolutions in response
to an actuating weight. The viscometer is calibrated by
measuring the time taken with varying weights while the
mechanism is suspended in fluids of accurately known viscosity.
The data comes from such a calibration, and theoretical
considerations suggest a nonlinear relationship among time,
weight, and viscosity of the form
Time = (B1 * Viscosity)/(Weight - B2) + E, where B1 and B2 are
unknown parameters to be estimated and E is error. |
survey | data.frame | This data frame contains the responses of 237 Statistics
I students at the University of Adelaide to a number of
questions. |
synth.te | data.frame | The synth.tr data
frame has 250 rows and 3 columns. The synth.te data frame has 100 rows and 3
columns. It is intended that synth.tr be used for training and
synth.te for testing. |
synth.tr | data.frame | The synth.tr data
frame has 250 rows and 3 columns. The synth.te data frame has 100 rows and 3
columns. It is intended that synth.tr be used for training and
synth.te for testing. |
topo | data.frame | The topo data frame
has 52 rows and 3 columns of topographic heights within a
310-foot square. |
waders | data.frame | The waders data frame
has 15 rows and 19 columns. The entries are counts of waders in
summer. |
whiteside | data.frame | Derek Whiteside of the UK Building Research Station
recorded the weekly gas consumption and average external
temperature at his own house in southeast England for two
heating seasons, one of 26 weeks before, and one of 30 weeks
after cavity-wall insulation was installed. The object of the
exercise was to assess the effect of the insulation on gas
consumption. |
wtloss | data.frame | This data frame gives the weight, in kilograms, of an
obese patient at 52 time points over an 8-month period of a
weight rehabilitation program. |
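
The stormer entry in the table above describes a relationship of the form Time = (B1 * Viscosity)/(Weight - B2) + E. As a minimal sketch of what that formula computes, the Python function below evaluates its noise-free part (everything except E). The function name `stormer_time` and the parameter values are hypothetical, chosen only for illustration; they are not estimates from the data set.

```python
def stormer_time(viscosity, weight, b1, b2):
    """Noise-free part of the stormer model:
    Time = (B1 * Viscosity) / (Weight - B2).
    """
    if weight == b2:
        # The formula is undefined when the denominator is zero.
        raise ValueError("weight must differ from b2")
    return (b1 * viscosity) / (weight - b2)


# Hypothetical inputs: b1=30 and b2=2 are made-up values, not fitted ones.
print(stormer_time(viscosity=100.0, weight=50.0, b1=30.0, b2=2.0))  # 62.5
```

In practice B1 and B2 would be estimated from the calibration data by nonlinear least squares; this sketch only shows how the two parameters enter the prediction.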