radiocarbon

Straightening the calibration curve

Michael Balter reports on a new radiocarbon calibration called INTCAL09. The calibration curve purports to provide a calendar age calibration up to 50,000 years ago for AMS radiocarbon dates.

Balter's report gives a good account of the basics. The atmospheric concentration of carbon-14 varied over time, so that organisms from some ancient times started with a higher proportion and other times started with a lower proportion. The radiocarbon dating technique depends on knowing this initial carbon-14 proportion. But we can only figure this out by comparing the present carbon-14 proportion in things whose ages we know -- like tree rings. Before 25,000 years ago, good non-radiocarbon chronologies are hard to come by, so up to now there has been no good calibration curve.
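To make the dependence on the starting proportion concrete, here is a minimal sketch in Python. It assumes simple exponential decay with the 5730-year half-life, and the `initial_ratio` values are invented for illustration rather than taken from any calibration data set.

```python
import math

HALF_LIFE = 5730.0                    # years
MEAN_LIFE = HALF_LIFE / math.log(2)   # about 8267 years

def apparent_age(true_age, initial_ratio):
    """Age we would infer if we wrongly assumed a modern starting level.

    initial_ratio is a hypothetical atmospheric carbon-14 level at the time
    of death, expressed relative to the modern reference level.
    """
    remaining = initial_ratio * math.exp(-true_age / MEAN_LIFE)
    return -MEAN_LIFE * math.log(remaining)

# Illustrative starting levels only, not measured values.
for ratio in (1.0, 1.2, 1.5):
    print(ratio, round(apparent_age(35000, ratio)))
```

In this toy model, a sample that started with more carbon-14 than the modern level comes out looking younger than it really is, which is the direction of the offset in the older part of the radiocarbon range.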

More recently, however, thanks to new and more accurate data from foraminifers, corals, and other sources--plus some fancy statistical treatments that help predict which way data gaps bend the curve--the INTCAL group has been able to resolve most of the discrepancies. "It took the group quite a while to come together and agree," says INTCAL team leader Paula Reimer, a geochronologist at Queen's University Belfast in Northern Ireland. But the new data, combined with what Reimer calls a "real sense of necessity" among team members to resolve the debates, won the day.

I'm skeptical when I see calibrated dates because they seldom report the calibration error. I like "fancy statistical treatments" that actually report their error. The entire reason a calibration model like INTCAL09 looks good is that it represents only one component of variability within a large set of separate chronometric datasets. The "debates" are more or less about whether that component is time, and if not what other factors must be controlled. Resolving the debates doesn't mean that the model will reduce the error associated with calibrating a given date -- it (hopefully) means that calibrated dates will be unbiased.

In principle, calibration is good because it facilitates comparison between radiocarbon and other dating methods, like OSL or ESR. It also gives a more accurate view of the temporal scale of events -- the radiocarbon chronology compresses the period between 40,000 and 10,000 years ago into 25,000 radiocarbon years instead of 30,000 calendar years. The difference matters, if for no other reason, because the uncalibrated chronology makes the initial Upper Paleolithic look more rapid than it really was.

Julien Riel-Salvatore ruminates on similar issues ("Paleolithic radiocarbon legerdemain").

The really bad dating problem happens at points where the atmospheric carbon-14 declined. Some declines occurred at nearly the same rate as the actual decay of carbon-14. A younger sample may then end up with the same carbon-14 proportion as an older sample, with no way to tell between them. (I discussed this problem as applied to initial Upper Paleolithic-era dates in "Radiocarbon fudgery".)
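A toy model shows why such a decline is so damaging. The sketch below assumes a made-up atmosphere whose carbon-14 level falls at exactly the decay rate between 34,000 and 33,000 calendar years ago; the plateau boundaries and the function names are hypothetical.

```python
import math

MEAN_LIFE = 5730.0 / math.log(2)

def atmosphere(t_bp, plateau_start=34000, plateau_end=33000):
    """Toy atmospheric level at calendar age t_bp (years before present).

    Between plateau_start and plateau_end the level declines toward the
    present at exactly the decay rate; outside that window it is constant.
    """
    t = min(max(t_bp, plateau_end), plateau_start)
    return math.exp((t - plateau_end) / MEAN_LIFE)

def radiocarbon_age(t_bp):
    """Age inferred from the remaining carbon-14, assuming a modern start."""
    remaining = atmosphere(t_bp) * math.exp(-t_bp / MEAN_LIFE)
    return -MEAN_LIFE * math.log(remaining)

for t in (32000, 33000, 33500, 34000, 35000):
    print(t, round(radiocarbon_age(t)))
```

Every sample that died inside the window returns the same radiocarbon age in this model, so a thousand years of real time collapses onto a single value.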

Because different datasets vary in their results, apparent declines in atmospheric carbon-14 seem more common in those individual datasets than in the model that reflects their common features. The atmospheric carbon should be better reflected by the model -- after all, there's only one atmosphere, so these datasets should reflect the same value.

But any single series of dates ought to have temporal stochasticity more like an individual dataset. When we take dates from bone collagen -- which is not one of the kinds of data with chronological controls -- there ought to be a separate, source-specific error that we can't control by a calibration model.

Does it matter? I think we should assume the resolution of a 40,000-year-old calibrated radiocarbon date is no better than 3000 years. In some cases it will be coarser, depending on the atmospheric trend. If one date is 3000 years earlier than the other, I think there's a very good likelihood that the earlier date really did happen first.

Too conservative? I'd like to see somebody run the numbers on it.
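As a first pass at running those numbers, here is a minimal sketch. It assumes the two dates have independent Gaussian errors and that nothing else is known about their order; the sigma values are placeholders for whatever the real calibrated uncertainty turns out to be.

```python
import math

def prob_truly_older(diff_years, sigma_a, sigma_b):
    """Probability that date A is really older than date B.

    diff_years is the observed difference (A minus B); sigma_a and sigma_b
    are assumed 1-sigma errors on each date, treated as independent.
    """
    z = diff_years / math.hypot(sigma_a, sigma_b)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

# Hypothetical error sizes, in years, for two dates 3000 years apart.
for sigma in (1000, 1500, 3000):
    print(sigma, round(prob_truly_older(3000, sigma, sigma), 3))
```

The answer depends heavily on what "resolution" means: the same 3000-year gap is far more convincing when each date carries a 1000-year sigma than when each carries a 3000-year sigma.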

Goddess on a cave bottom

I don't have much value to add to the "figurative art" angle to the Hohle Fels Venus figurine. It seems very interesting that there is a concentration of carved iconic figures in the Swabian Aurignacian. That has two elements -- first, the concentration itself; second, the focus on carved ivory. Other regional Upper Paleolithic variants have their own concentrations of unique artifacts, sometimes tools (like the Solutrean leaf point), other times found objects (like the fossil shells in the Belgian and German Magdalenian). And we know that other times and places in the Upper Paleolithic have carved objects, so here we have the combination of both, in a very early Upper Paleolithic culture.

I do think it's worth discussing the date of the figurine a little more closely. Conard's paper includes a nice short discussion of the difficulties of establishing an accurate chronology -- a bunch of dates are available spanning much of the sequence, and there is substantial mixing of older and younger dates across the sequence.

There is no simple explanation for the variable radiocarbon dates from Hohle Fels and Geißenklösterle. The noisy signals result from a combination of factors including variable sample preparation, variable levels of atmospheric carbon, taphonomic mixing and excavation error. Given the lack of reproducibility within and between radiocarbon laboratories, I prefer to emphasize the stratigraphic context of the finds, and to use the highly variable radiometric dates as rough indicators of age. Although there is no generally accepted calibration for radiocarbon dates over 30 kyr bp, preliminary calibrations suggest that dates of 32 kyr bp correspond to roughly 36 kyr bp in calendar years. If the early dates are correct, the Venus would be even older. The fact that the Venus is overlain by five Aurignacian horizons, containing a dozen stratigraphically intact anthropogenic features with a total thickness of 1 m, suggests that the figurine is of an age corresponding to the start of the Aurignacian, around 40,000 calendar years ago.

The paper also includes a very nice picture showing the stratigraphic profile of the site in terms of artifact positions, color-coded by level. The Venus does lie beneath a well-stratified Aurignacian, with a depth in this area of more than half a meter, although I am also impressed by the overlying meter of "Gravettian-Aurignacian transition." Conard's text is slightly more definitive than the figure, since two of the five "overlying" Aurignacian layers are not represented directly above the artifact, and one appears mostly to underlie it.

The research report is accompanied by a perspective piece by Paul Mellars. He frames the importance of the site by referring to its early date:

Fragments of the figure were excavated from archaeological deposits in the Hohle Fels cave in south Germany, dated by a range of more than 30 radiocarbon measurements to at least 35,000 years in age (in terms of the newly 'calibrated' radiocarbon timescale) (Mellars 2009:176).

This is a tricky statement to parse. Conard provides eight radiocarbon dates for objects in the lowest Aurignacian level (Vb) at Hohle Fels, and only two of these are older than 35,000 radiocarbon years. Mellars refers to calibrated dates, not radiocarbon dates. On that basis, the statement is likely correct but a little misleading in comparison with later, Gravettian-associated figurines, whose dates are reported in uncalibrated years.

For those not familiar with the arcana of radiocarbon dating, the atmospheric proportion of carbon-14 varied during the last 40,000 years, so that there was actually more or less of it at some times than others. For the oldest radiocarbon dates, up above 25,000 BP, the age reported in half-lives is systematically younger than the real age of an object in calendar years (given in "years ago" or some such). Over the span above 30,000 years ago, the difference is up to 5000 years or more -- so that a radiocarbon date of 30,000 BP might correspond to a calendar date older than 35,000 years ago.

This creates the potential for much confusion when describing dates. In this case, what does it mean to see that a Venus figurine from the Aurignacian is "more than 35,000 years old" when other figurines from the Gravettian date to "25,000 BP"? There's a 5000-year gap between those two timescales -- one that amounts to a sixth of the total age of an artifact. And when we read that an object is "more than 35,000 years old" and remember that Neandertals lived up to "29,000 BP" it is very easy to forget that these dates may well be synchronous. So we have to continually remind ourselves to use comparable timescales when talking about objects in the Upper Paleolithic.

I've discussed the problems with radiocarbon calibration at some length, in association with some earlier work by Mellars. Sometimes I find that reading and learning more about a subject actually clarifies matters a bit. In the case of radiocarbon chronology, it seems that the more I learn, the more confused things really are.

Given the error associated with calibration and atmospheric variation, it is no surprise (as Conard reports in the paper) that the radiocarbon dates in a site over around 30,000 BP should be somewhat mixed and confused. The problem is not so much that ten objects from the same moment will have different proportions of carbon-14; it is that ten objects from different times may have the same proportion. So it is especially important to understand the stratigraphy of a site completely. This appears to be a good, conservative example, and it will be interesting to see what happens if the excavation progresses further into the deeper underlying Mousterian.

But meanwhile there are other sites, excavated in a range of circumstances, in which the stratigraphy was not so carefully documented, or may have been more mixed. I suspect we'll be hearing more confusion before we get a lot more clarification.

References:

Mellars P. 2009. Origins of the female image. Nature 459:176-177. doi:10.1038/459176a

Conard NJ. 2009. A female figurine from the basal Aurignacian of Hohle Fels Cave in southwestern Germany. Nature 459:248-252. doi:10.1038/nature07995

Rats in the radiocarbon (or vice versa)

The story of the New Zealand rat bones is a bit deeper than the press reports (e.g., this AP report). The main idea is that the rat radiocarbon dates support an initial habitation of New Zealand that was relatively late, around 1200 AD. That's not a big surprise, since no human archaeological site or remains have been found to have earlier dates.

I don't have any opinion about New Zealand prehistory, really. It seems to me that the rats are a very good source of evidence, because their population growth is potentially much much faster than human population growth. If rats arrive on an island, there's a good chance of finding them early. I could imagine that humans might escape leaving archaeology for some time. I doubt very much that they could remain invisible for over a thousand years, but that depends on the intensity of archaeological research. But rats are not going to stay invisible. When you have extinct predators who ate rats, and they leave rat bones in their feces that you can sample, and none of those rat bones are more than 800 years old, well that's a sign.

So what's the real story here? The Oxford Radiocarbon Accelerator Unit keeps changing sample preparation protocols! These changes have brought in a number of new ways to take contamination and recent carbon out of the sample. I noted the redating of Vindija G1, which was based on a new sample preparation method using filtration to purify collagen from the bone. At the time, this was one among several new methods attempting to improve the accuracy of AMS dates. The cumulative effect of the advent of AMS dating, coupled with these later improvements, has added substantial precision to our knowledge of Europe during the last 40,000 years, as I reviewed here. Tom Higham, who was behind the new dates in the New Zealand paper, also worked out the Vindija G1 redating.

The problem is that every new sampling method raises the prospect that a lot of currently accepted dates are actually wrong. That is what has happened in the case of the New Zealand rats. The rat case demonstrates the depth of the problem: Holdaway (1996) presented seven AMS dates on rat bones whose confidence intervals are significantly older than 1000 AD (calibrated), two of which are significantly older than 500 AD. The present study by Wilmshurst et al. must claim that all those rats were contaminated with old carbon.

Since the half-life of carbon-14 is 5730 years, an elevation of more than 500 years in a date represents a very substantial deficit of carbon-14 -- on the order of five percent of the maximum amount. Such deficits might be possible, either due to conditions after burial or consumption of marine carbon by the animals during their lives. But in his original study, Holdaway closely considered these effects:

Potential sources of error include the addition of 'old' or reservoir carbon to the bone gelatin before death in the diet, or after deposition via unremoved humics or diagenetic processes in carbonate sedimentary environments, especially for small specimens.

Dietary influences were not apparent. Two individuals of known death date give calibrated ages that include their death dates. In addition, 14C dates on bone gelatin from two herbivorous birds (equilibrium carbon consumers) are not significantly different from those on rat bones from comparable levels. Humic contamination is unlikely, most being removed by gelatinization, but must still be considered for earlier 'collagen' dates. Environmental carbonates were removed by an acid pre-wash, eliminating carbonate contamination. Measured ages were not related to whole-sample mass.

Longer-term diagenetic changes do not appear to have a significant effect. Samples of moa eggshell (species unknown) and bird bone from close proximity in sediment enclosed by two undisturbed volcanic tephras give indistinguishable ages.... These materials were prepared using different treatments. Finally, a rat dentary excavated from beneath the Taupo Tephra gives an age of 1,775±93 yr BP. In addition to the radiocarbon age being consistent with that of the covering tephra, the bone's position beneath the undisturbed layer provides independent evidence that Pacific rats were established in the North Island before the Taupo eruption (Holdaway 1996:226).

Yes, you read that right. He had a rat under a well-dated volcanic tephra.
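For anyone who wants to check the size of the carbon-14 deficit that a 500-year error implies, here is a quick sketch. It assumes simple exponential decay with the 5730-year half-life; the function name is my own.

```python
import math

MEAN_LIFE = 5730.0 / math.log(2)   # about 8267 years

def c14_deficit(age_offset_years):
    """Fractional shortfall in carbon-14 needed to make a sample look
    age_offset_years older than it really is."""
    return 1.0 - math.exp(-age_offset_years / MEAN_LIFE)

for offset in (100, 500, 1000):
    print(offset, round(100 * c14_deficit(offset), 1), "percent")
```

A 500-year offset works out to a shortfall of between five and six percent, which is a lot of old carbon to sneak into a small bone.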

The current paper claims that all the oldest dates for rat remains have come from a single lab, all before a single date:

Subsequent dating of Pacific rat bones sampled from both laughing owl (32) and archaeological sites (33-35) failed to duplicate the early series of old rat bone dates (35-38). The most telling criticism of the original dates is that they fall into two distinct groups according to when the bones were processed in the same dating laboratory (22, 36, 37) (see Fig. 1). The early series of rat bone dates processed in 1995 and 1996 are all older than the oldest-dated archaeological evidence (1280 A.D.), but all bones dated after 1996 are younger (36, 37) (Fig. 1). Moreover, some rat bones from archaeological assemblages that were processed in 1995 and 1996 are significantly older than consistent dates on diverse materials from the same stratigraphic contexts (34, 35). Critics argued that this unusual bimodal distribution of ages according to when the bones were processed was due to inadequate pretreatment of small bones (33, 35-37). It has also been argued that some of the old 1995-1996 rat bone dates are older than their "true" age because of dietary uptake of carbon depleted in 14C (e.g., refs. 39-40).

Well, there you have it. The argument has to be that the dates are wrong due to the different sample preparation methods. The "dietary carbon-14" argument can't be the explanation, because some of the more recently dated samples ought to show the same deficit, and they don't. I personally don't see how they deal with the rat under the tephra -- they don't address the question. The only possibility that makes sense with their argument is that the samples were technically processed in a way that led to older dates.

Again, I have no opinion about New Zealand settlement. The recent chronology proposed here sounds reasonable to me, but mainly because people in a massively expanding population shouldn't remain archaeologically invisible.

I just want to point out how much our knowledge of the archaeological sequence depends on the technical details of dating methods, known only to a small number of researchers. To be sure, technology advances. But we have thrown out an awfully large number of radiocarbon dates in the last few years, due to small but important changes in methods. And the New Zealand case shows that this problem is not confined to the upper limits of AMS dating, where the preserved carbon-14 fraction is at its lowest. In the European case, the biggest problem has been supposed Aurignacian specimens that turned out to be Holocene in age.

This raises the obvious question: how much weight should we give to current date estimates?

References:

Wilmshurst JM, Anderson AJ, Higham TFG, Worthy TH. 2008. Dating the late prehistoric dispersal of Polynesians to New Zealand using the commensal Pacific rat. Proc Nat Acad Sci 105:7676-7680. doi:10.1073/pnas.0801507105

Holdaway RN. 1996. Arrival of rats in New Zealand. Nature 384:225-226. doi:10.1038/384225b0

Radiocarbon fudgery

I skipped last week's (9/15/2006) Science, and so missed this article by Michael Balter on radiocarbon dating. But some online discussion boards have been talking about it, and this passage especially is worth reading:

Encouraged by their recent successes, radiocarbon researchers now have their eyes on the bigger prize of the 50,000-year limit. Indeed, when the IntCal group began work on the 2004 curve, it had high hopes of extending it back to this final barrier. Yet it was not to be. Although the marine data sets were reasonably consistent with each other up to 26,000 years ago, after that they began to scatter and diverge, in some cases by up to several millennia. Geochronologist Paula Reimer of Queen's University in Belfast, Northern Ireland, who coordinates the working group, says that the differences--among the raw data as well as among the researchers--were just too great: "We had four or five people, all of whom thought their records were right." So the group settled for publishing in Radiocarbon a comparison of the data sets earlier than 26,000 years, which they ironically called "NotCal"--meaning, Reimer and other members say, that it was not intended to be used as a calibration curve.
But archaeologist Paul Mellars of the University of Cambridge in the U.K. used the published data to essentially do just that. Mellars was eager to get the most accurate dates for possibly contemporaneous Neandertal and modern human sites in Europe. So he used the midpoint of the differing "NotCal" curves to approximately calibrate the radiocarbon ages of 19 hominid sites ranging from Israel in the East to Spain in the West. Using this best-guess method, Mellars found that modern humans had not only spread across Europe faster than previously thought, but that they had overlapped with Neandertals during a shorter interval: only about 6000 years rather than 10,000 years in Europe as a whole, and as little as 1000 years in some parts of the continent. Mellars concluded in the 23 February 2006 issue of Nature that Neandertals must have "succumbed much more rapidly to competition" from modern humans than many had assumed.
But Reimer and others say Mellars should not have used the NotCal data as he did. "It is dangerous to draw too fine conclusions using these data sets," says Reimer, because they have not been finalized and the divergences between them have yet to be reconciled. Other researchers have started asking van der Plicht whether they can use the "Mellars curve" for calibration. "This is a bad thing," says van der Plicht.
Mellars insists that archaeologists can't wait for a final calibration curve. "Are we all really expected to keep studies of modern human origins on hold for the next 5 years, until they decide they've finally got the calibration act together?" he asks. The working group, he argues, "has hijacked the term 'calibration' to mean an absolutely agreed, rubber stamped, legalistic, signed, sealed, and delivered curve." And even when the experts agree on a curve, Mellars says, it will not be "final and absolute" but "simply the best estimate from the data at the time."

Now, even something that isn't officially approved by geochronologists might still be correct. So the question is whether errors were introduced by Mellars into the chronology by using the "NotCal" not-calibrated calibration (and yes, the Mellars paper uses without noting the irony "the recent NotCal04 'best estimation' calibration curve").

The problems are noted in a communication to Nature last week (9/14/2006) by Chris Turney, Richard Roberts and Zenobia Jacobs:

Atmospheric 14C variability has not followed a simple, smooth pattern, as suggested by Mellars. Instead, smoothing took place during the statistical analysis of these data sets to develop the NotCal04 mean best-fit line. By using the mid-point of the mean best-fit line, Mellars artificially improves the apparent precision of calibrated ages in his Fig. 3; even 'infinitely' old ages are reported with improved precision, whereas calibration almost invariably results in age ranges that are significantly larger than the radiocarbon measurement error.

It's a bad sign when your method improves the precision of infinite dates. In fact, you always add to the variance of a measurement when you multiply it by some correction that itself entails measurement error.
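A small sketch makes the point about variance. It treats the calibrated date as a measurement times an independent correction factor and uses the exact variance of a product of independent variables; the numbers (a 35,000 ± 500 date and a 1.14 ± 0.03 correction) are hypothetical.

```python
def product_sd(mean_x, sd_x, mean_c, sd_c):
    """Standard deviation of x * c for independent x and c."""
    variance = (mean_x * sd_c) ** 2 + (mean_c * sd_x) ** 2 + (sd_x * sd_c) ** 2
    return variance ** 0.5

date, date_sd = 35000.0, 500.0      # hypothetical measurement
factor, factor_sd = 1.14, 0.03      # hypothetical correction and its error

exact = product_sd(date, date_sd, factor, 0.0)        # error-free correction
noisy = product_sd(date, date_sd, factor, factor_sd)  # correction with error

print("relative error, uncorrected:", date_sd / date)
print("relative error, exact correction:", exact / (date * factor))
print("relative error, noisy correction:", noisy / (date * factor))
```

An error-free correction leaves the relative error unchanged; any uncertainty in the correction can only push it upward, never down.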

But Mellars' central point was not principally about the ages of particular sites, but instead about the total time taken by modern humans to invade Europe. Can we still get an estimate of a reduced time period for this "invasion" if we use the calibrated dates properly?

To answer that, we need to look at two graphs. First, the graph used by Mellars (2006) to support the idea that the total time taken by modern humans to occupy Europe was short:

Radiocarbon dates for sites from West Asia and Europe, Figure 3 of Mellars (2006). Sites numbered as in original text.

You can see in this graph the effect of "calibration", at least according to Mellars -- it reduces the statistical error associated with each date. Indeed, the caption to this figure says:

Owing to the slope of the calibration curves, the error bars (1 s.d.) on the calibrated dates are smaller than those on the uncalibrated dates.

A moment's reflection reveals this to be a nonsensical statement. The error bars may be attributable either to measurement error (in the proportion of 14C) or to calibration error (relating the current proportion to the original atmospheric proportion). But these error estimates are applied to dates in "radiocarbon years" -- meaning that they don't include possible error in the original atmospheric proportion. Indeed, if they did include this error, these error bars would have to stretch to cover the NotCal-produced dates!

But the "slope of the calibration curve" certainly can't reduce error due to measurement in the sample. At best, the current calibration can predict that a given date must represent a slightly larger number of half-lives than the uncalibrated date, because the original atmospheric proportion of 14C was higher than today. It can't reduce the standard error due to measurement, and therefore won't reduce the confidence interval on the date as reported in radiocarbon years (it certainly may reduce the total error, which is usually overlooked).

To understand how Mellars came to this erroneous conclusion -- and to see how it affects his assertion regarding the time period of modern human dispersal -- we need to consider the NotCal04 calibration curve itself:

Radiocarbon calibration data, Figure 1 from Turney et al. (2006). I added the pink rectangular regions. The lower pink rectangles represent the low variance on dates drawn from the NotCal correlation region, as apparently Mellars did. The upper, wide, pink rectangles represent the error that might be assigned to the full calibration process, including uncertainties in the underlying calibration data. In other words, they encompass the range of calendar dates that might be attributed to different samples of a single radiocarbon date. This error range does not necessarily include measurement error (except insofar as error on calibration samples is distributed just like error on fossil samples).

OK, you probably read the caption, so you get the gist of this picture and my pink rectangle additions. In short, the width of the NotCal calibration is very narrow, because it is a summary of many data sets. But the dispersion within those original data sets is very high. This means that for any given radiocarbon date, there is actually a very wide interval of possible calibrated dates that it might represent. The range of this dispersion is high partly because it includes dates from different regions and raw materials (here, mostly coral and shells) -- and these are exactly the kinds of problems that create variance in archaeological radiocarbon dates. So we should be looking at wide error bars -- much wider than we are used to seeing.

Making this source of error explicit certainly doesn't decrease the error bars of measurements -- it vastly increases the error bars. This is a good thing -- it is a more accurate understanding of the potential error in radiocarbon dates.
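The difference between the narrow band around a fitted mean and the wide scatter of the underlying records can be sketched in a few lines. The calendar ages below are invented stand-ins for what several records might give for one radiocarbon age; they are not NotCal04 values.

```python
import math
import statistics

def summarize(calendar_estimates):
    """Mean, spread among records, and standard error of the mean."""
    n = len(calendar_estimates)
    spread = statistics.stdev(calendar_estimates)
    return {
        "mean": statistics.mean(calendar_estimates),
        "sd": spread,                       # dispersion among records
        "se_mean": spread / math.sqrt(n),   # width of the averaged curve
    }

# Hypothetical calendar ages from five records for one radiocarbon age.
records = [38200, 39100, 40300, 41000, 39600]
print(summarize(records))
```

The standard error of the mean shrinks as more records are averaged, but a single archaeological sample behaves like one record, so the spread among records is the more honest guide to its uncertainty.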

But what can we conclude about the time interval represented by early modern humans (or more properly, of early Aurignacian sites, since it is far from demonstrated that they were left by modern humans) and their dispersal across Europe?

Looking at the Mellars graph, his interpretation is apparent from his numbers. The leftmost European site on his graph is number 4, Bacho Kiro. It is estimated at more than 43,000 radiocarbon years; Mellars put it at more than 46,000 "calibrated" years. The youngest site (17, Roc de Combe) is 35,000 radiocarbon years; Mellars put it at 40,000 "calibrated" years. So the interval from oldest to youngest is at least 8,000 radiocarbon years on these figures (43,000 minus 35,000), compared to only 6,000 "calibrated" years, according to Mellars:

[W]e can now see from the new calibrated chronology that this must be shortened to at most about 6,000 yr (at least in the more central and northern parts of Europe), with periods of overlap within the individual regions of Europe (such as western France) of perhaps only 1,000-2,000 yr. Evidently the native Neanderthal populations of Europe succumbed much more rapidly to competition from the expanding biologically and behaviourally modern populations than previous estimates have generally assumed.

But again, this is quite plainly wrong. First, it assumes the reduction in length of the error bars, which the calibration process shows must actually be greatly increased in length. And second, it ignores the visually apparent "kink" in the calibration curve over just the time range represented by the early Aurignacian. That "kink" means that true dates over a very wide time range will come out with the same radiocarbon date estimate.
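To see how a kink widens the answer, here is a toy calibration lookup. The curve is a made-up piecewise-linear function with a reversal between 40,000 and 42,000 calendar years ago; the knots are hypothetical and do not come from any published data set.

```python
# (calendar age BP, radiocarbon age BP) knots of a hypothetical curve
KNOTS = [(38000, 34000), (40000, 35500), (42000, 35000), (44000, 37500)]

def interp(cal_bp, knots):
    """Linear interpolation of radiocarbon age at a calendar age."""
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if x0 <= cal_bp <= x1:
            return y0 + (y1 - y0) * (cal_bp - x0) / (x1 - x0)
    raise ValueError("calendar age outside the toy curve")

def calendar_span(rc_age, sigma, knots, step=10):
    """Earliest and latest calendar ages whose curve value lies within
    sigma of the measured radiocarbon age, scanned on a coarse grid."""
    lo, hi = knots[0][0], knots[-1][0]
    hits = [cal for cal in range(lo, hi + 1, step)
            if abs(interp(cal, knots) - rc_age) <= sigma]
    return (min(hits), max(hits)) if hits else None

span = calendar_span(35250, 250, KNOTS)
print(span, span[1] - span[0])
```

With these invented knots, a radiocarbon date with a 250-year sigma maps onto a calendar span several times wider than its own error bar, because the reversal lets a whole stretch of calendar time return nearly the same radiocarbon age.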

And remember that Neandertals persisted well after 40,000 years in the sequence, with a number of dates after 33,000 years ago now (and possibly as recent as 28,000). These dates are radiocarbon years, and calibrated dates might be older (and closer to the early Aurignacian "calibrated" dates). But they don't fit into the blitzkrieg model very readily.

The real question is whether the radiocarbon data address the pattern of change in biology and archaeology -- a sudden shift might still be a piecewise or mosaic transition, and a long shift might nevertheless have discrete boundaries. I think there is sufficient evidence that the transition in Europe was mosaic in character. From there, the pace of change (and migration or gene flow) might be 6000 years or 20,000; it doesn't much matter from the perspective of pattern. On the other hand, folks interested in climatic forcing and other more time-centric scenarios might care very deeply about whether we are looking at a short or long timeframe.

In any event, it is safe to conclude that the evidence for a rapid dispersal based on these data is pretty much all based on faulty statistics.

By the way -- has anybody else noticed that the vast preponderance of totally wrong research lately has been in Nature?

References:

Mellars P. 2006. A new radiocarbon revolution and the dispersal of modern humans in Eurasia. Nature 439:931-935.

Turney CSM, Roberts RG, Jacobs Z. 2006. Archaeology: Progress and pitfalls in radiocarbon dating. Nature 443:E3.
