It Was A Single Procedural Screw-Up. No Other Area Was Sampled. Is That Enough?
In any case, the most important aspect of this work, from the point of view of physicists working in the field of AMS dating, is the methodological one concerning sampling strategy. First, sampling should always be done in agreement with and under the guidance of scholars and people involved in the historical or archaeological problem. In addition, whenever possible, collecting several samples from the object to be dated (as we did in the case of the two frocks) is definitely the right approach in order to reduce the possibilities of ambiguities.
From "AMS radiocarbon dating of medieval textile relics: The frocks and the pillow of St. Francis of Assisi" by M.E. Fedi, A. Cartocci, F. Taccetti, P.A. Mandò
An interesting thread within a thread brought us to the above important quote. daveb started it off:
The sampling area [of the Shroud of Turin] is clearly anomalous, and unrepresentative of the whole, never mind the details. No other area was sampled. To assert the truth of an hypothesis on such ambiguous evidence would never be accepted in any other scientific endeavour. It merely demonstrates wishful thinking on the part of skeptics and anti-authenticists, not to say their poverty of scientific reasoning. Precisely the same poverty of thought that Yves Delage encountered from the Science Academy in 1905, dominated as it was then by so-called free-thinkers and agnostics. Some things never change!
daveb then amplified this position:
My point is that it is bad science to assert any kind of conclusion from such a poor sampling protocol. Just reflect on the stringent sampling protocols for drug testing for instance. For some 10 years I was engaged in designing sampling systems for a significant Corporate’s Internal Audit function, together with various other sundry applications of Applied Statistics. So I know more than a little about drawing conclusions from sampled data. The laboratories may have had a good grip on the theoretical and technical physics aspects of Carbon 14 dating. But it is only too obvious that they had little idea on how a proper and persuasive scientific conclusion can be reached when they accepted such a poorly constructed sampling regime for their testing, peer reviews notwithstanding. . . . For any further understanding of the problem, I can only recommend any elementary Applied Statistics text book. It does turn out that the sampled area in this case was anomalous, and merely illustrates the folly of it. If it had turned out that the date reached conclusively proved a 1st century date, these very same scientists would have been the first to object at the sampling regime.
Hugh Farey chimes in:
I’m interested in daveb’s comment (although I appreciate that it is a commonly held view, not just his), that “To assert the truth of an hypothesis on such ambiguous evidence would never be accepted in any other scientific endeavour.”
A fairly similar problem to the Shroud arose in deciding where to take the samples from two garments and a pillow associated with St Francis in 2005. The details are in ‘AMS Radiocarbon Dating of Medieval Textile Relics: the Frocks and the Pillow of St Francis of Assisi,’ M. Fedi et al, Science Direct, 2009. (Behind a paywall, I’m afraid).
Beginning with “dating of materials connected to faith is always a delicate matter,” which has a familiar ring to it, the authors discuss where, exactly, they took their radiocarbon samples from. “Samples were taken following the advice of a textile conservator, who examined the manufacture of the relics. No darns or patches were present.” Sounds familiar? “Anyway, [interesting adverb…] we decided to sample several pieces from each frock.”
From one ‘frock,’ “supposed to have covered St Francis just in the moment of his death,” the team took seven samples, about 1 cm² each, three from the hem, two from the end of one of the short sleeves, one from the side, and one from slap in the middle of the back. The frock was made of several pieces of wool sewn together, and the samples came from different pieces. The samples were washed in an ultrasound bath, then in hydrochloric acid, but not in sodium hydroxide as it was thought detrimental to wool.
One of the hem samples and the one from the side fell to pieces during cleaning and couldn’t be used. The others gave dates of between 1155 and 1225, a 70 year spread which was assumed consistent. This compares with the Shroud findings (12 samples from the same place) of between about 1225 and 1315, a 90 year spread.
St Francis died in 1226, which fitted this frock (and the pillow, as it happens) well. The other frock was dated to about 1300, and was therefore considered not a genuine relic, although the authors say, rather charmingly, that “these data are not to be read in a negative way, since the result of its dating can anyway be a valuable element for the reconstruction of the history of religion during Middle Ages.”
The circumstances of the two radiocarbon datings make interesting comparison, I feel. One important point is that, even in 2005, a 1 cm² area is considered a minimum sample for accurate dating, whereas in 1988 every laboratory subdivided its sample further and tested each one separately. No wonder their experimental error bars were so much bigger, especially those of Tucson, which dated roughly 0.2 cm² pieces.
Joe Marino focuses on an important point:
There were a couple of sentences I found most interesting in this paper: “In any case, the most important aspect of this work, from the point of view of physicists working in the field of AMS dating, is the methodological one concerning sampling strategy. First, sampling should always be done in agreement with and under the guidance of scholars and people involved in the historical or archaeological problem. In addition, whenever possible, collecting several samples from the object to be dated (as we did in the case of the two frocks) is definitely the right approach in order to reduce the possibilities of ambiguities.”
This single procedural screw-up (it doesn’t matter who did it or who is to blame) opened the door to the Benford/Marino/Rogers observations that something was amiss; daveb simply says it was “anomalous, and unrepresentative of the whole.” The same screw-up opened the door to worrisome statistical observations that the sample was not sufficiently homogeneous. You can nitpick at this or that: did Rogers do this or that correctly or not? You can squabble about the statistics until you are wondering if chi-squared applies to the problem of how many angels can dance on the head of a pin. In the end, all these questions of detail are like magnifying glasses held over wood shavings in the sun. They ignite, again and again, the single issue: “No other area was sampled.”
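For readers wondering what the chi-squared squabble is actually about, here is a minimal sketch of one common way to check whether several radiocarbon measurements are consistent with a single underlying date: take the error-weighted mean, then compare the scatter around it with what the quoted errors would allow. The function name `homogeneity_test` and every number below are hypothetical, chosen only for illustration; they are not the Shroud results or the St Francis results, and this is not presented as the procedure either team used.

```python
import math

# 5% critical values of the chi-squared distribution, keyed by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def homogeneity_test(ages, sigmas):
    """Check whether measurements agree within their quoted errors.

    ages   -- radiocarbon ages (years BP), one per measurement
    sigmas -- one-sigma uncertainty of each age, in the same units
    Returns (weighted_mean, pooled_sigma, t_stat, df, consistent_at_5pct).
    """
    if len(ages) != len(sigmas) or len(ages) < 2:
        raise ValueError("need at least two ages, each with an uncertainty")
    df = len(ages) - 1
    if df not in CHI2_CRIT_05:
        raise ValueError("this sketch only tabulates up to 5 degrees of freedom")

    # Inverse-variance weights: more precise measurements count for more.
    weights = [1.0 / s ** 2 for s in sigmas]
    weight_sum = sum(weights)
    mean = sum(w * a for w, a in zip(weights, ages)) / weight_sum
    pooled_sigma = math.sqrt(1.0 / weight_sum)

    # Test statistic: squared deviations from the mean, in units of sigma.
    t_stat = sum(((a - mean) / s) ** 2 for a, s in zip(ages, sigmas))
    return mean, pooled_sigma, t_stat, df, t_stat <= CHI2_CRIT_05[df]


# HYPOTHETICAL data, for illustration only.
tight = homogeneity_test([700, 710, 690], [30, 30, 30])
scattered = homogeneity_test([650, 720, 790], [30, 30, 30])

print("tight:     T = %.2f, consistent = %s" % (tight[2], tight[4]))
print("scattered: T = %.2f, consistent = %s" % (scattered[2], scattered[4]))
```

In the first made-up set the scatter is well inside the quoted errors and the test passes; in the second the same errors cannot explain the spread and the test fails. Note what the test cannot do: it only says whether the pieces measured agree with one another. It says nothing about whether the place they were cut from is representative of the whole cloth, and that is the point of “No other area was sampled.”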
Note: The picture from the Reuters story was removed on February 4, 2013. See comments for explanation.