8. Human Craniometric Variation Supports Discontinuity at the Late Glacial Maximum in Europe
Skeletal differences between early Upper Palaeolithic and later modern human groups have often been interpreted as evidence of substantial gene flow between the former group and Neanderthals. However, this pattern may also arise if there was a high degree of selection and/or lack of genetic continuity between early Upper Palaeolithic populations and those which followed. The impact of the Last Glacial Maximum (LGM) on Upper Palaeolithic humans is critical to such an interpretation. We use a subsample (n=76) from a larger, well-dated craniometric dataset to explore both temporal and geographic variation in Upper Palaeolithic and Mesolithic populations. In all analyses, the early Upper Palaeolithic consistently showed the greatest divergence, suggesting the LGM as a disruptive event in the genetic composition of Europe. No clear divisions were detected between the late Upper Palaeolithic, early Mesolithic and late Mesolithic. Based on the results presented here we caution against interpreting differences between Upper Palaeolithic populations and recent human populations in Europe as evidence for substantial gene flow between modern and archaic humans in the late Middle Palaeolithic and early Upper Palaeolithic.
Two main theories have dominated the study of human origins for almost a century. First, the ‘multiregional continuity’ theory hypothesises that anatomically modern humans (AMH) evolved in parallel from archaic forms in various Old World regions (for example Brace 1964; Frayer et al. 1993; Hawks et al. 2000; Wolpoff 1989; Wolpoff et al. 1984, 2000, 2001). The second theory, known as the ‘Out of Africa’ model, proposes that all modern humans descend from an AMH ancestor that lived in Africa around 160,000 years ago (Cann et al. 1987; Forster 2004; McDougall et al. 2005; Ramachandran 2005) and whose descendants dispersed into Eurasia after 100,000 years ago, replacing all ‘archaic’ hominins without substantial admixture (Bar-Yosef 2002). Over the last three decades, the recent African origin model has gained increasing support from population genetics (DeGiorgio et al. 2009; Hellenthal et al. 2008; Jobling et al. 2004; Nei 1995), craniometric studies on modern populations (Manica et al. 2007; von Cramon-Taubadel and Lycett 2008), and evidence from the fossil record (Stringer and Andrews 1988). However, recent genomic data from archaic hominins have added complexity (Green et al. 2010; Reich et al. 2010), and some have suggested intermediate theories that allow limited admixture between incoming AMH and resident ‘archaic’ groups (e.g. Bräuer 1984; Bräuer and Rimback 1990).
Against this backdrop, the study of Upper Palaeolithic and Mesolithic humans took a backseat (Frayer 1984; Meiklejohn 1974). Some researchers who favoured a replacement model argued that modern European cranial characteristics were already established by the early Upper Palaeolithic (Campbell 1992; Howell 1984; Klein 1989; Stringer 1974). In some cases the underlying assumption was that human evolution had come to a standstill with the appearance of ‘Cro-Magnon’ (Klein 1992, 1995) and that studying morphometric variation among Upper Palaeolithic humans would, therefore, not contribute to major questions in human evolution. Such a view derived validation from the work of Morant (1930; see also von Bonin 1935), who concluded that Upper Palaeolithic skulls were strikingly homogeneous in space and time and “modern in almost all respects”. Within this context it was accepted that changes were cultural rather than biological.
While some researchers (for example Henke 1989) found the people of the Upper Palaeolithic to be relatively homogeneous in cranial traits, others (e.g. Frayer et al. 2006; Rougier et al. 2007; Soficaru et al. 2006; Trinkaus 2007; Wolpoff et al. 2006) pointed to what they saw as clear cranial differences between early Upper Palaeolithic and later humans – features such as the variable expression of external occipital protuberances, pronounced brow ridges and broader faces. Others pointed to differences between Upper Palaeolithic populations in western and central Europe (Gambier 1997; Vlček 1970).
Early Upper Palaeolithic (EUP) skeletal remains which fall outside the range of modern human variability may be interpreted as evidence for substantial gene flow between modern humans and Neanderthals. However, this disparity may also arise if there was a high degree of selection and/or lack of genetic continuity between early Upper Palaeolithic populations and those which followed. The impact of the Last Glacial Maximum (LGM) on Upper Palaeolithic humans is critical to such an interpretation. Global temperatures began to deteriorate after the warmer Hengelo event at around 34,000 BP, culminating in the LGM at around 20,000 BP. In the Northern Hemisphere this catastrophic climatic event resulted in massive land-based ice sheets (Boulton et al. 2001) and a fall in sea level to as much as 130 m below its current position (Lambeck 2002; Peltier 1994). Average global surface air temperatures were at least 5°C lower than modern values. Such climatic changes would have presented considerable challenges for humans in northern latitudes (see Soffer and Gamble 1990).
The impact of the LGM on human settlement is apparent from the archaeological record. With the exception of possible cryptic refugia (Gamble et al. 2004, 2005; Terberger and Street 2002), an archaeological hiatus extends from southern Britain to Poland for up to six millennia. With few exceptions, no dates occur between the LGM and ~14,000 BP (Terberger et al. 2009). Groups living at higher latitudes were forced to move southward. Archaeological evidence points to an influx of immigrants into more southern regions such as Franco-Cantabria, the Balkans, the Italian Peninsula, and the Black Sea littoral zone.
Some researchers have proposed evidence of biological differences between pre- and post-LGM groups. It has been suggested that the harsh LGM conditions may have produced a more cold-adapted body type (Holliday 1997; and see Weaver and Steudel-Numbers 2005 for a more recent discussion). Most obvious here is the sharp decrease in long bone length between pre- and post-LGM populations (Meiklejohn and Babb 2011), with no further change through the Mesolithic, a finding that refutes the earlier suggestion of a reduction in limb length through the post-LGM into the early Holocene (Meiklejohn et al. 1984). Changes noted in earlier studies should be treated with some reserve (e.g. Formicola and Giannechini 1999; Holliday 1997; Holt 2003; Jacobs 1985; Shackelford 2007). They either relate to issues other than bone length per se, or are subject to queries about sampling (including sample size and material used) or methodology. If stature is used as the primary variable, issues arise that include the use of regression equations and the effect of applying different equations for disparate parts of the sample (see discussion in Meiklejohn and Babb 2011).
In contrast with the postcranial skeleton, overall cranial shape variation in modern human populations has been found to result largely from neutral evolutionary forces (González-José et al. 2004; Harvati and Weaver 2006; Relethford 2001, 2002, 2004; Roseman 2004; Roseman and Weaver 2004; von Cramon-Taubadel 2009). This correspondence between craniometric and neutral genetic data makes the former a useful proxy for reconstructing population histories. This is particularly important for ancient bones, where issues surrounding contamination and extraction, as well as cost, can make DNA studies prohibitive.
Given this background, the aim of this study was to explore the relative variation of Upper Palaeolithic and Mesolithic groups using a subsample (n=76) from a comparatively large craniometric dataset. We use this subsample to assess the craniometric patterns of pre- and post-LGM European Upper Palaeolithic and Mesolithic specimens by testing the hypothesis that EUP European cranial morphology was significantly different from that of post-LGM late Pleistocene and early Holocene populations, as would be expected if the LGM represented a disruption to gene flow patterns within the Upper Palaeolithic. Alternatively, if continuity was maintained through gene flow, we would expect no clear separation between temporally adjacent groups. This represents the first assessment of the effects of the LGM on patterns of craniometric variation in European late Pleistocene and Holocene populations.
Material and Methods
Archaeological context, chronology and the definition of groups
The craniometric dataset that provides the source material for this study was compiled from various sources by two of the authors (CM and RP) and Winfried Henke. It includes data from the published literature and unpublished measurements taken by CM and RP as part of earlier work. Our aim was to maximise the size of the dataset, while at the same time using strict criteria to evaluate which specimens would be included. Wherever possible, we used data collected in person by CM and RP. However, published data were used in instances where we did not have access to material.
Martin and Saller (1957) code | Variable
M1 | Maximum cranial length
M8 | Maximum cranial breadth
M9 | Least frontal breadth
M17 | Basibregmatic height
M45 | Bizygomatic breadth
M48 | Nasoalveolar height
M51 | Orbital breadth
M52 | Orbital height
M54 | Nasal breadth
M55 | Nasal height
Table 8.1. Measurements used in this study.
For the analyses of temporal differences, specimens were divided into four discrete periods: early Upper Palaeolithic (EUP), late Upper Palaeolithic (LUP), early Mesolithic and late Mesolithic. The LGM, dated to ~20,000 BP, was used as the EUP/LUP boundary. In archaeological terms this places Aurignacian and Gravettian material into the EUP, and Magdalenian and contemporary groups into the LUP. The boundary between the LUP and Mesolithic is seen as coeval with the Pleistocene/Holocene boundary. The division between the early and late Mesolithic is more arbitrary in nature. For this study, specimens dated to 7000 BP (uncalibrated) or older were assigned to the early Mesolithic; younger specimens were assigned to the late Mesolithic.
Similar studies in the past were frequently hampered by a lack of reliable dates, with many specimens assigned to the wrong chronological period. Misclassification can seriously bias the results of statistical analyses, particularly in cases where the dataset is small to begin with. The dating and redating of reputed Upper Palaeolithic and Mesolithic skeletal remains behoves us to reassess material from these periods (Trinkaus 2005). We have incorporated the most current dating information in this study. We have only included specimens in our analyses for which the chronological attribution and archaeological association can be reliably determined.
Statistical analysis
Crania were selected from a larger dataset, mentioned in the previous section. Only adult specimens with associated radiocarbon dates or secure provenance were used in the analyses. Ten standard Martin and Saller (1957) craniometric measurements were used (see Table 8.1), corresponding to height, width and length dimensions of the cranial vault and face (including orbital and nasal regions). Specimens missing three or more of these measurements were removed from the original dataset. Multiple regressions were used to estimate missing values (7% of measurements were estimated for the entire dataset).
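The exclusion rule described above can be illustrated with a minimal Python sketch. It covers only the filtering step (dropping specimens that lack three or more of the ten measurements), not the regression-based estimation of the remaining gaps. The record layout, the function names and the example values are assumptions made for illustration and are not taken from the study's data or code.

```python
# Martin and Saller (1957) codes for the ten measurements listed in Table 8.1.
MEASUREMENTS = ["M1", "M8", "M9", "M17", "M45", "M48", "M51", "M52", "M54", "M55"]


def count_missing(specimen):
    """Count how many of the ten measurements are absent (None or not recorded)."""
    return sum(1 for code in MEASUREMENTS if specimen.get(code) is None)


def retain_specimens(specimens, max_missing=2):
    """Keep specimens missing at most `max_missing` values, i.e. fewer than three."""
    return [s for s in specimens if count_missing(s) <= max_missing]


# Hypothetical records (values in mm, invented purely for illustration).
complete = {code: 100.0 for code in MEASUREMENTS}
two_missing = dict(complete, M45=None, M48=None)
three_missing = dict(complete, M45=None, M48=None, M55=None)

specimens = [
    dict(complete, id="A"),
    dict(two_missing, id="B"),
    dict(three_missing, id="C"),
]
kept_ids = [s["id"] for s in retain_specimens(specimens)]
print(kept_ids)  # ['A', 'B']
```

Specimen C is dropped because it lacks three measurements; the gaps left in specimen B would then be candidates for estimation.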
Cranial measurements were transformed to size-adjusted Mosimann shape variables (Mosimann and James 1979). Darroch and Mosimann (1985) define size as the geometric mean of all variables. Mosimann shape variables were created by dividing each value by the geometric mean of all the variables for each observation.
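As a minimal sketch of this transformation, the Python snippet below divides each measurement of a single specimen by the geometric mean of all of that specimen's measurements. The function names and the three example values are hypothetical; the study used ten measurements per specimen and carried out the calculation in R.

```python
import math


def geometric_mean(values):
    """Geometric mean, computed on the log scale for numerical stability."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def mosimann_shape(measurements):
    """Divide each measurement by the specimen's geometric mean (its 'size')."""
    size = geometric_mean(measurements.values())
    return {code: value / size for code, value in measurements.items()}


# Hypothetical specimen with three measurements in mm (illustration only).
specimen = {"M1": 190.0, "M8": 140.0, "M9": 98.0}
shape = mosimann_shape(specimen)
print(shape)
```

After the transformation the geometric mean of each specimen's shape variables equals 1, so differences in overall size no longer contribute, while the ratios between measurements are preserved.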
Sampling methods were used in order to control for the regional bias in the data. For instance, all specimens from Belgium, Britain, Greece, Luxembourg, Norway, Serbia, and the Ukraine were early Mesolithic. Similarly, all specimens from the Czech Republic were EUP, with other EUP specimens restricted to France, Italy, and Russia. Such biases make it difficult to discern whether changes across the LGM were due to different geographic locations being sampled, rather than population disruption at the LGM, as hypothesised. In an effort to control for this factor, analyses were carried out on a sample which was matched by region.
Equal-sized samples were chosen for each time period. Since specimens from Russia were present only in the EUP group, these were removed. The two smallest samples, EUP and LUP, were used as the baseline for choosing the countries to use. Specimens were eligible for resampling if they came from the following countries: Italy, France, the Czech Republic, Germany, and Switzerland. A MANOVA was carried out to explore whether significant differences existed in size-adjusted cranial measurements across the four periods. MANOVA is most reliable when groups are relatively equal in size, because unbalanced designs are more likely to violate the assumption of equality of covariance matrices. While Pillai’s trace is generally robust to violations of this assumption for samples with equal group sizes (Olson 1974, 1976, 1979), it is not robust to larger departures (Hakstian et al. 1979; Holloway and Dunn 1967). The smallest group (EUP, n=19) therefore dictated the size of the groups. Each non-EUP group was randomly sampled to contain 19 individuals from the above countries. This resulted in a sample consisting of 76 specimens (see Table 8.2).
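The sampling logic can be sketched in Python as follows: restrict the data to the listed countries, then draw the same number of specimens at random from each period, with that number set by the smallest group. This is a simplified illustration under stated assumptions. The record layout, the function names, the fixed seed and the toy records are hypothetical, and the sketch draws a simple random sample within each period rather than stratifying by country, since the exact matching procedure is not spelled out in the text.

```python
import random

# Countries retained for the region-matched sample, as listed in the text.
REGIONS = frozenset({"Italy", "France", "Czech Republic", "Germany", "Switzerland"})


def balanced_sample(specimens, regions=REGIONS, seed=0):
    """Filter by country, then sample each period down to the smallest group size."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_period = {}
    for s in specimens:
        if s["country"] in regions:
            by_period.setdefault(s["period"], []).append(s)
    n = min(len(group) for group in by_period.values())
    return {period: rng.sample(group, n) for period, group in by_period.items()}


def make(period, countries):
    """Build hypothetical specimen records for one period."""
    return [{"id": f"{period}-{i}", "period": period, "country": c}
            for i, c in enumerate(countries)]


# Toy data, invented for illustration only.
toy = (
    make("EUP", ["France", "Italy", "Russia"])
    + make("LUP", ["France", "Germany", "Switzerland"])
    + make("early Mesolithic", ["France", "Italy", "Germany", "Serbia"])
    + make("late Mesolithic", ["France", "Italy", "Germany", "Switzerland"])
)
sample = balanced_sample(toy)
sizes = {period: len(group) for period, group in sample.items()}
print(sizes)  # every period is reduced to 2 specimens in this toy example
```

In the toy data the EUP group shrinks to two specimens once the Russian record is removed, so every other period is sampled down to two as well; in the study the corresponding group size was 19.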
A linear discriminant analysis (LDA) was performed to assess the extent of cranial shape disparity between groups. Mahalanobis squared distances (Mahalanobis 1936) were calculated to determine the strength of the canonical variates in discriminating between group means. The Mahalanobis distance expresses the separation between group means relative to the within-group variation, so larger values identify the groups that differ most. All LDAs were carried out using the MASS package (Venables and Ripley 2002) in R 2.15 (R Core Team 2012).
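For reference, the squared Mahalanobis distance between the means of two groups can be written as below. The notation is an assumption introduced here rather than taken from the chapter: the vectors are the mean shape vectors of groups i and j, and S is the pooled within-group covariance matrix, which is the usual choice in the LDA setting.

```latex
D^{2}_{ij} = (\bar{\mathbf{x}}_{i} - \bar{\mathbf{x}}_{j})^{\mathsf{T}} \, \mathbf{S}^{-1} \, (\bar{\mathbf{x}}_{i} - \bar{\mathbf{x}}_{j})
```

Because the difference between means is scaled by the inverse of S, variables that vary widely within groups contribute less to the distance than variables that are tightly clustered.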
Post-hoc pairwise MANOVAs were carried out to determine which comparisons were significant at α = 0.05, adjusted for multiple comparisons. Significance levels were corrected by controlling the false discovery rate using the method of Benjamini and Hochberg (1995). This correction is less conservative, and therefore more powerful, than family-wise type I error rate corrections of the Bonferroni type.
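The Benjamini and Hochberg (1995) step-up adjustment can be sketched in a few lines of Python. This is an illustrative reimplementation rather than the code used in the study, which was run in R. The function name and the example p-values are hypothetical and are not the values reported in Table 8.3.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four comparisons (illustration only).
raw = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(raw)
print([round(p, 4) for p in adjusted])  # [0.04, 0.0533, 0.0533, 0.5]
```

A Bonferroni correction would multiply every p-value by four, giving 0.04, 0.16, 0.12 and 1.0 for the same input, which illustrates why the false discovery rate approach retains more power.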
Given that multiple sources were not available for many of the specimens used in this analysis, interobserver error was not calculated. However, this factor should be kept in mind when interpreting the results. All data preparation and statistical analyses were carried out in R.
A MANOVA was carried out on the sample. Using Pillai’s trace, the effect of time period on the size-adjusted craniometric measurements was significant (V=0.89, F(30, 195)=2.75, p<0.001). A robust MANOVA was carried out on rank-transformed measurements, using the WRS package (Wilcox 2005) in R. As with the parametric MANOVA, a significant effect was found using Munzel and Brunner’s (2000) method (F=2.22, p=0.002) and Choi and Marden’s (1997) method (H(30)=85.46, p<0.001).
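As a consistency check on the reported test statistic, the standard F approximation for Pillai's trace can be evaluated from the values given in the text: V = 0.89, ten measurements, four periods and 76 specimens. The Python sketch below is illustrative only; the function name and argument names are assumptions, and the formula is the textbook approximation rather than anything specific to this study.

```python
def pillai_f_approximation(V, p, g, N):
    """Approximate F statistic and degrees of freedom for Pillai's trace.

    V: Pillai's trace, p: number of response variables,
    g: number of groups, N: total sample size.
    """
    q = g - 1                      # hypothesis degrees of freedom
    s = min(p, q)
    m = (abs(p - q) - 1) / 2
    n = (N - g - p - 1) / 2
    df1 = s * (2 * m + s + 1)      # numerator degrees of freedom
    df2 = s * (2 * n + s + 1)      # denominator degrees of freedom
    F = (df2 / df1) * V / (s - V)
    return F, df1, df2


F, df1, df2 = pillai_f_approximation(V=0.89, p=10, g=4, N=76)
print(round(F, 2), df1, df2)  # 2.74 30.0 195.0
```

The approximation gives numerator and denominator degrees of freedom of 30 and 195, and an F value of about 2.74, which agrees with the reported 2.75 once the rounding of V to two decimal places is taken into account.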
MANOVA pairwise comparisons across periods were carried out on the untransformed data, while adjusting for false discovery rate. The results are shown in Table 8.3. The EUP sample was significantly different from all other groups in pairwise comparisons. However, the LUP and Mesolithic groups were not significantly different from each other.
The MANOVA was followed by a discriminant function analysis, which resulted in three discriminant functions (Table 8.4). The first function explained 48% of the variance, while the other two explained 30% and 22% respectively. A plot of the first two discriminant functions, together with a separate plot of the group mean scores, is shown in Figure 8.1. The discriminant function plot shows that the EUP is discriminated from the other groups along the first discriminant function. The coefficients of the discriminant functions revealed that the first function differentiated nasal and orbital dimensions. A similar pattern was also seen for the second discriminant function. Box plots of nasal and orbital height and width measurements are shown in Figure 8.2. The pattern suggests that the EUP group had greater values for nasal dimensions and smaller values for orbital dimensions. The squared Mahalanobis distances between group means were calculated for this sample and are presented in Table 8.5. Again, the greatest distances were between the EUP and all other groups.
Table 8.3. MANOVA pairwise comparisons of groups.
Table 8.4. Function loadings of LDA for size-adjusted craniometric data.
Figure 8.1. LDA plot of specimens and group means.
Figure 8.2. Box plots of orbital and nasal measurements.
These results are similar to those of Brewster et al. (2014), a study that differed somewhat in its methodology. A MANOVA on a sample matched by geographic location found significant differences across periods. Pairwise MANOVA comparisons found statistically significant differences between the EUP and all other groups, while no significant differences were detected between any of the post-LGM groups. In a linear discriminant analysis, the first discriminant function discriminated between the EUP and all other groups. The Mahalanobis squared distances between the group means were consistently larger for the EUP group. These findings are congruent with the hypothesis that the LGM had a significant effect on European cranial morphology.
The results of the MANOVA and LDA confirm the EUP to be the most divergent group, which is congruent with the hypothesis that the LGM had a major effect on European genetic and morphological diversity. This study is in line with the conclusions of van Vark et al. (2003), who found little craniometric evidence to suggest continuity between EUP populations and more recent European inhabitants (although see Jantz and Owsley 2003 for an opposing viewpoint). Here we propose that the LGM had a significant role to play. Gaps in the archaeological record for much of northern and central Europe suggest that people abandoned these regions at this time, and these gaps coincide with a marked increase in the number of sites in southern France (Bocquet-Appel and Demars 2000; Bocquet-Appel et al. 2005) and Iberia (Straus et al. 2000). It is likely that populations survived in relatively small numbers in refugia. Patterns of human genetic diversity within and between refugia would have been affected by drift and founder events. Genetic studies of uniparentally inherited DNA show that many haplotypes were lost during this period, while new mutations arose. Studies of mtDNA (Achilli et al. 2004, 2005; Álvarez-Iglesias et al. 2009; Forster 2004; Loogväli et al. 2004; Pereira et al. 2005; Torroni et al. 1998, 2001, 2006) and Y chromosomes (Semino et al. 2000; Wells et al. 2001; Zei et al. 2003) have shown a number of haplogroups that likely arose in the Franco-Cantabrian refugium. Evidence for new haplogroups originating in the Balkans (Marjanovic et al. 2005; Peričić et al. 2005; Rootsi et al. 2004) and Ukraine (Peričić et al. 2005; Semino et al. 2000) adds weight to claims that these areas were also important refugia during the LGM (Dolukhanov 2000). Recent studies suggest that the Italian refugium played only a marginal role as a source of genetic diversity in Europe as a whole (Pala et al. 2009). In the wake of the LGM, founder groups gradually moved northward to occupy previously deserted areas. These groups would have carried region-specific haplotypes with them.
A limitation of the pan-European approach adopted here is its inability to detect regional patterns of craniometric variation. Although not necessarily reflective of population events, archaeological evidence for continuity across the LGM varies between regions of the continent, as does the sequence of documented technocomplexes. Continuity between the Solutrean and Magdalenian is suggested at some sites with thick stratigraphic sequences in Cantabria, Spain (Aura et al. 2012; Straus et al. 2012), while changes in bone and antler artefacts (Stettler 2000) may reflect rapid ecological shifts. Some researchers (e.g. Banks et al. 2011; Ducasse 2012) have proposed a sharp break between the Solutrean and Badegoulian. However, the nature of the latter technocomplex is not fully understood and may have an eastern influence (Gamble et al. 2006; see also Kozłowski 2012; Terberger and Street 2002). In contrast, evidence of continuity can be seen in backed blade or bladelet technologies from the Gravettian into the so-called Epigravettian in Central and Eastern Europe, as well as in the Italian and Balkan peninsulas (Bietti 1990; Mussi 2001). The Epigravettian is largely contemporaneous with the Solutrean through to the Azilian of Western Europe. Mortuary practices in Italy from the Gravettian into the Epigravettian may further hint at continuity (Mussi 1986).
Table 8.5. Squared Mahalanobis distances between group means.
While there are detectable craniometric differences between the EUP and later groups, it is not clear to what extent these result from neutral evolutionary forces and/or natural selection. The largest loadings for the LDA were on facial measurements, specifically orbital and nasal dimensions, which may indicate climate-driven selection (Harvati and Weaver 2006; Hubbe et al. 2009; Noback et al. 2011; Roseman 2004; Roseman and Weaver 2004). However, von Cramon-Taubadel (2011) found facial shape, including nasal and orbital shape, to be congruent with neutral genetic markers. Hence, it is possible that both stochastic effects, due to reduced gene flow and genetic drift, and selective evolutionary forces contributed to changes in cranial morphology throughout the late Pleistocene.
This study supports the division of the Upper Palaeolithic into two discrete periods separated by the LGM. From a biological perspective the boundary between the Upper Palaeolithic and Mesolithic must be viewed as somewhat arbitrary. A growing number of archaeologists propose that the Mesolithic should be viewed as a continuation of changes dating back to the LGM (Bailey and Spikins 2008). The division of the Mesolithic into early and late is likewise arbitrary. We suggest that studies seeking to ascertain the extent to which modern humans and Neanderthals may have interbred should use only contemporaneous EUP material. Grouping specimens from the EUP and LUP into a single ‘Upper Palaeolithic’ category is likely to produce indeterminate findings. Based on the results presented here, the use of a combined LUP and Mesolithic craniometric sample is justified when necessary.
In this study we used a subsample of craniometric data from a larger dataset to explore both temporal and geographic variation in Upper Palaeolithic and Mesolithic populations. To date, this is the largest well-dated dataset for these periods. The EUP showed the greatest divergence in our analyses. This points to the LGM as a disruptive event in the genetic composition of Europe. No clear division between the LUP and Mesolithic was detected, suggesting that the division between the two is arbitrary, at least from a biological point of view. Based on the results presented here we caution against interpreting differences between Upper Palaeolithic populations and recent human populations in Europe as evidence for substantial gene flow between modern and archaic human populations in the late Middle Palaeolithic and early Upper Palaeolithic. The findings of this paper offer testable hypotheses for future studies. A number of alternative explanations exist for the results presented here. In the interest of space, we have limited our discussion to the most parsimonious ones, namely those supported by anthropological, archaeological and genetic data.
