
Who is Responsible for Chemical Attacks in Syria? Guest Blog by Professor Paul McKeigue (Part 2)


Paul McKeigue

Fake news, false flags and the weight of evidence favouring alternative explanations of alleged chemical attacks in Syria

In the last post I outlined the development of the mathematical and philosophical basis for using probability calculus to evaluate evidence.

The framework of probability calculus implies that:-

  • you cannot evaluate the evidence for or against a single hypothesis, only the weight of evidence favouring one hypothesis over an alternative
  • the weight of evidence favouring one hypothesis over another is based on comparing how well each of the two hypotheses would have predicted the observations
  • your assessment of how well a hypothesis would have predicted the observations does not in general depend on your prior degree of belief that this hypothesis is true

That’s as far as we can go without using numbers. As the objective of these posts is to show you how to evaluate evidence for yourself using simple back-of-the-envelope calculations, I’ll recapitulate how to do this.

  • How well a hypothesis would have predicted the observations is quantified by a number called the likelihood. This is calculated as the probability of the observations given that hypothesis. When the observations are fixed and we are comparing different hypotheses, we reverse this dependency and describe this number as “the likelihood of the hypothesis given the observations”. If you find this confusing, you’re not the only one. Likelihoods are not probabilities when they are used to compare hypotheses. “Support” would be a better word than “likelihood” (which in ordinary English is synonymous with probability).
  • The weight of evidence favouring one hypothesis over another is the logarithm of the ratio of the likelihoods. Weights of evidence can be added over independent observations. It’s convenient to use logarithms to base 2, so that the weights are expressed in bits.
  • If you make an assertion about the strength of evidence favouring one hypothesis over another, you are making an assertion about the conditional probabilities from which the ratio of likelihoods is calculated. These conditional probabilities (“expectations” would be a better word than “probabilities”) are based on subjective judgements. You can’t evaluate evidence without making these subjective judgements.
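The recipe in the three points above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `weight_of_evidence_bits` and the example probabilities are hypothetical and are not taken from the incidents discussed later.

```python
import math

def weight_of_evidence_bits(p_obs_given_a, p_obs_given_b):
    """Weight of evidence (in bits) favouring hypothesis A over hypothesis B.

    Each argument is the probability of the same observation under one
    hypothesis; the weight is the base-2 logarithm of the likelihood ratio.
    """
    return math.log2(p_obs_given_a / p_obs_given_b)

# Hypothetical observation that is four times as probable under A as under B.
w1 = weight_of_evidence_bits(0.8, 0.2)
# Hypothetical second, independent observation that is half as probable
# under A as under B, so its weight favouring A is negative.
w2 = weight_of_evidence_bits(0.1, 0.2)
# Weights of evidence add over independent observations.
total = w1 + w2
print(f"{w1:.1f} bits + {w2:.1f} bits = {total:.1f} bits favouring A over B")
```

A negative weight simply means that the observation favours the second hypothesis over the first.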

If you have learned to think of probabilities as objective properties of physical systems, the modern subjectivist interpretation of probability as quantifying degree of belief may be hard to accept. Classical probability theory was based on situations like coin-tossing and throwing dice, where probabilities are imposed by physical symmetries. However the rules for updating subjective probabilities in the light of evidence apply even when there are no such symmetries. One way to elicit your subjective probability of an event is as the price you would offer or accept for a ticket that will pay out £1 if the event occurs, and nothing if the event does not occur. If the prices you specify over various combinations of events are not consistent with probability theory, someone else can construct a “Dutch book” against you – a set of bets that guarantees that they will gain and you will lose.
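A minimal sketch of the simplest Dutch book may help. It assumes, purely for illustration, that you quote one price for a £1 ticket on an event and another for a £1 ticket on its complement, and that you are willing to buy or sell at those prices. The function name `guaranteed_loss` and the prices are hypothetical.

```python
def guaranteed_loss(price_event, price_not_event):
    """Sure loss, in pounds, from quoting these prices for a pair of £1 tickets.

    Exactly one of the two tickets pays out £1, whatever happens.
    """
    total = price_event + price_not_event
    # If total > 1, the other party sells you both tickets:
    # you pay `total` and are certain to receive only £1.
    # If total < 1, the other party buys both tickets from you:
    # you receive `total` and are certain to pay out £1.
    return abs(total - 1.0)

# Prices that do not sum to 1 are inconsistent with probability theory.
print(guaranteed_loss(0.6, 0.6))
# Prices that sum to 1 leave no sure loss on this pair of bets.
print(guaranteed_loss(0.25, 0.75))
```

Only prices that behave like probabilities (here, summing to 1 for an event and its complement) are immune to this construction.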

Although probabilities are subjective, they are not plucked out of nowhere: they have to be consistent with what you already know, and with what you do not know. Usually there is some information that can be used to set conditional probabilities. For instance, in the examples discussed later, where some victims were injured after they were supposedly rescued, information on the frequencies of accidental injuries of these types in different settings can help us to specify the conditional probability of this observation given a hypothesis under which such injuries could only be explained as accidental. Where people differ in their assessment of the strength of evidence contributed by the same observations, eliciting these conditional probabilities will establish where their judgements differ.

We don’t need to get too hung up on the subjectivity of assessing how well a hypothesis would have predicted the observations. In the examples discussed in these posts, serious errors have arisen not because people’s assessments of these conditional probabilities were inconsistent with the available information, but because:-

  • relevant observations have been widely ignored (as we shall see in this section)
  • observations consistent with a hypothesis have been accepted as evidence supporting that hypothesis, without considering alternative hypotheses. An example is how the observation of Volcano rockets in the Ghouta incident was accepted as supporting the hypothesis of a regime attack, though the hypothesis of a “false flag” attack would have predicted this observation at least as well.
  • the evaluation of evidence favouring one hypothesis over another has been confused with assertions of prior belief about the plausibility of one of those hypotheses.

In the context of the Syrian conflict, it is difficult for independent-minded journalists and academics to propose explanations that differ from the official line without being heavily criticized for “speculating” or “conspiracy theorizing”. However you cannot evaluate evidence for a hypothesis without specifying alternative hypotheses and computing the likelihoods of these hypotheses given the observations: this inevitably requires you to “speculate”.

In the discussion below I have linked to the sources of the observations used, but I have not embedded any images as the horrifying nature of some of these images would distract from the formalism of the argument. I am not appealing to your emotions but to your ability to use logic to evaluate evidence for yourself.

Weight of evidence for alternative hypotheses about the alleged chemical attack in Ghouta in 2013

At the end of the last post I listed four alternative hypotheses about the Ghouta event:-

  • H1: a chemical attack was carried out by the Syrian military, authorized by the government
  • H2: a false-flag chemical attack was carried out by the Syrian opposition to implicate the government
  • H3: an unauthorized chemical attack was carried out by a rogue element in the Syrian military
  • H4: there was no chemical attack but a managed massacre of captives, with rockets and sarin used to create a trail of forensic evidence that would implicate the Syrian government in a chemical attack.

The problem is to compute the likelihoods of these four hypotheses given the “dog did not bark” observation that no images of search and rescue operations were released.

Under H1, H2 or H3, it is unlikely that no such images would have been made available. But how unlikely? To assess this, we have to envisage the scenario under H1. Procedures for urban search and rescue are well established. After each home has been searched, it is marked to record how many live victims were rescued and how many dead victims were found. If in eastern Ghouta the area affected covered only one square kilometre (100 hectares) of housing, with 50 homes per hectare, there would have been 5000 homes to search. With at least 400 fatalities, we expect at least as many living but incapacitated individuals to have needed rescuing. The immediate priority would have been to rescue the living, leaving bodies to be removed later. Even if the operation began in the middle of the night soon after the alleged attack, we would expect it to have continued after daybreak.

Of at least 150 videos uploaded, badged as coming from 18 different media outlets, not one shows this search and rescue operation. Most show victims at morgues, hospitals or improvised medical stations: dead and living victims appear to have arrived at these medical stations in the middle of the night. Most people will agree that the probability of this observation given H1 is low: I’ll assign a value of 0.05. It is possible to calculate a number for this conditional probability based on some assumptions about the probability that the output of a single media outlet will include a search and rescue image, but I don’t claim that this is more than a (rather conservative) subjective judgement.
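One way such a calculation could go is sketched below. It rests on an assumption that is not stated in the text: that each of the 18 outlets independently has the same probability `q` of including at least one search and rescue image. The function names are hypothetical.

```python
def prob_no_outlet_shows_rescue(q, n_outlets=18):
    """Probability that none of n_outlets shows a search and rescue image,
    assuming each outlet independently shows one with probability q."""
    return (1.0 - q) ** n_outlets

def implied_per_outlet_prob(p_none, n_outlets=18):
    """Per-outlet probability q that would make the probability of
    'no outlet shows such an image' equal to p_none."""
    return 1.0 - p_none ** (1.0 / n_outlets)

# The per-outlet probability implied by the value of 0.05 assigned under H1.
q = implied_per_outlet_prob(0.05)
print(f"implied per-outlet probability: {q:.3f}")

# How the probability of the observation changes with other per-outlet values.
for q_try in (0.1, 0.2, 0.3):
    print(q_try, f"{prob_no_outlet_shows_rescue(q_try):.4f}")
```

Under this independence assumption, a value of 0.05 corresponds to a per-outlet probability of roughly 0.15; larger per-outlet values would make the observation less probable under H1 than the value assigned here.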

Under H4, the conditional probability of this observation is high – it would have been difficult to stage such operations without the cooperation of large numbers of civilians. We can set a value of 0.8 for the probability that no images of search and rescue operations would be uploaded. This gives a likelihood ratio of 16: a weight of evidence of 4 bits favouring H4 over H1. There are other related observations that should be taken into account, for instance:

  • the observation that all victims were in day clothes though the alleged attack occurred at about 2 am
  • the obviously fraudulent videos of the “Zamalka Ghost House”, in which a group of adults and children, apparently executed several days before the alleged chemical attack and placed in an unfinished building, were presented as a family of victims found in situ.

Each of these observations would add one or two bits to the weight of evidence favouring H4 over H1, giving a total weight of evidence of about 7 bits from the related observations: that there were no images of search and rescue, that the only such images uploaded were fraudulent, that bodies of victims apparently arrived without delay, and that these victims were fully clothed. Even if your prior odds favouring H1 over H4 were 1000 to 1, this weight of evidence (a likelihood ratio of 2^7 = 128) would reduce the odds to about 8 to 1. There are other “dog did not bark” observations of non-occurrence of expected events related to the Ghouta incident that support H4 over H1: for instance under H1 we would expect to see interviews with bereaved survivors who would be able to document, with family photos, that they were relatives of victims seen dead in morgues.
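The step from prior odds to posterior odds is a single division, sketched here in Python. The function name `update_odds` is hypothetical; the inputs are the prior odds of 1000 to 1 and the weight of about 7 bits given above.

```python
def update_odds(prior_odds_h1_over_h4, bits_favouring_h4):
    """Posterior odds for H1 over H4 after evidence favouring H4.

    A weight of w bits corresponds to a likelihood ratio of 2**w, so the
    odds favouring H1 are divided by 2**w.
    """
    return prior_odds_h1_over_h4 / 2 ** bits_favouring_h4

posterior = update_odds(1000, 7)
print(f"posterior odds favouring H1 over H4: about {posterior:.1f} to 1")
```

Each additional bit of evidence halves the odds again, which is why weights of evidence are convenient for back-of-the-envelope work.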

We could proceed to evaluate the weights of evidence for other independent observations that have been made on the Ghouta event, and add them up. However there is one single observation for which I assess the weight of evidence favouring H4 over H1 to be so large as to overwhelm anything else.

The Kafr Batna morgue images

Some of the most harrowing images from the Ghouta incident were from a building identified as an old tuberculosis hospital in the suburb of Kafr Batna, in which living and dead victims were shown in a basement room, and at least 80 dead victims were laid out in a sunlit ground floor room (the “Sun Morgue”). A detailed study of the videos and still images from this site has been released online. This includes a detailed reconstruction of the fate of one victim in the Sun Morgue (pages 184-201). A short video summarizing this reconstruction has been released. The sequence of the videos and still images can be reconstructed from sun angles and from the order in which bodies are laid out and removed. The reconstruction shows that a heavily built male (given the code M-015 in this study) was brought into the morgue and laid on the floor apparently dead with no sign of bleeding. In later images M-015 had clenched his fists to grip his shirt, was bleeding from the neck, and a folded blanket had been placed under his head. In subsequent images the flow of bright red blood had continued, eventually saturating the blanket and spilling on the floor. At the end, when most of the bodies had been removed, the blood-soaked blanket remained. These images show that M-015 was not dead when brought into the morgue (dead people do not clench fists or bleed profusely). The only plausible interpretation of this image sequence is that M-015’s throat was cut when the morgue workers realized he was still alive.

I’ll now try to compute the likelihoods of H4 and H1 given this observation. Under H1 it is possible that a victim would be mistakenly declared dead and begin stirring in the morgue, but it’s almost impossible to explain why subsequent videos showed the victim bleeding bright red blood from the neck, or why the reaction of the emergency workers to someone who was obviously alive and bleeding profusely was to place a blanket under his neck and leave him to die.

The least implausible explanation I can come up with under H1 is that M-015 began stirring in the morgue, that somehow this led to an accident in which he was stabbed in the neck, and that the morgue staff, having no idea how to deal with this and afraid to report the accident, simply placed a blanket under his neck and left him to die. The probability of a patient being accidentally stabbed in the neck in a hospital setting, even in a chaotic response to a major incident, is extremely low. On the basis that I found no reports of such accidents in a brief search, I’ll put the risk of at least one such accident in Ghouta at less than 10⁻⁵. A botched medical procedure, such as an attempted insertion of a central venous catheter via the neck, is not a plausible explanation for the bleeding as there would have been no indication for such a procedure as the first response to someone apparently recovering consciousness, and no space around the patient was cleared to facilitate medical intervention. Based on a probability of 10⁻⁵ that a victim waking up in the morgue would be accidentally stabbed in the neck and a probability of 0.01 (given that under hypothesis H1 the morgue staff are genuine emergency workers) that the reaction of the staff to this accident would be to leave him to die, I compute the likelihood of H1 given this observation as 10⁻⁷. Maybe readers can come up with a better explanation of how this sequence of images could have occurred given hypothesis H1.

Under H4, which postulates that the Ghouta victims were massacred captives most likely killed in gas chambers and that the morgue staff were playing an active part in this operation, such an observation is not unexpected. The probability that in a massacre of more than 400 people at least one victim would survive the gas chamber and that those removing the bodies would fail to detect this is high (0.5). It is to be expected that they would kill such an individual as soon as he began stirring, and it is probable that the method chosen would be throat cutting rather than shooting or strangling (0.5). It’s also probable (0.4) that such an incriminating sequence of images would not be detected by those responsible for editing and uploading the videos and stills. Multiplying these conditional probabilities together gives a likelihood of 0.1. As we’ll see it won’t make much difference to the weight of evidence if these numbers vary by a factor of 2 or so. The weight of evidence is dominated by the very low likelihood of H1.

On this basis I evaluate the likelihood ratio favouring H4 over H1, given the observation of what appears to be a murder in the Kafr Batna morgue, as a million to one: a weight of evidence of 20 bits.
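The arithmetic behind this figure, and the claim that factor-of-2 changes matter little, can be checked with a short sketch. The probabilities are the ones assigned in the preceding paragraphs; the function name `bits_h4_over_h1` and the `scale` parameter are hypothetical, introduced only to vary the H4 likelihood.

```python
import math

# Likelihood of H1: accidental stabbing, then staff leaving the victim to die.
LIK_H1 = 1e-5 * 0.01
# Likelihood of H4: undetected survivor, throat cutting chosen,
# incriminating images not removed during editing.
LIK_H4 = 0.5 * 0.5 * 0.4

def bits_h4_over_h1(scale=1.0):
    """Weight of evidence favouring H4 over H1, with the H4 likelihood
    multiplied by `scale` to test sensitivity to the assigned values."""
    return math.log2(scale * LIK_H4 / LIK_H1)

print(f"likelihood ratio: {LIK_H4 / LIK_H1:,.0f}")
for scale in (0.5, 1.0, 2.0):
    print(f"H4 likelihood x {scale}: {bits_h4_over_h1(scale):.1f} bits")
```

Halving or doubling the H4 likelihood shifts the result by exactly one bit, which is small beside a total of about 20 bits.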

Evidence for alternative explanations of Khan Sheikhoun

We now turn to evaluating the evidence for alternative explanations of the alleged chemical attack on 4 April 2017 in Khan Sheikhoun. With Ghouta as a precedent, we can begin by defining just two alternative hypotheses:

  • H1: the Khan Sheikhoun incident was a chemical attack by the Syrian air force using sarin. The leading proponents of this hypothesis are the US, UK and French governments.
  • H2: the Khan Sheikhoun incident was a planned deception operation intended to bring about US military intervention, in which captives were killed in gas chambers, small quantities of sarin were used to generate a forensic trail and a large-scale media operation was undertaken to support the story of a chemical attack by the Syrian air force. The earliest proponents of this hypothesis were a group of contributors to the wiki A Closer Look on Syria. Under this hypothesis, Khan Sheikhoun is Ghouta version 2, and it is to be expected that a similar trail of evidence will be laid: purported eyewitnesses will describe the attack, videos will show victims purportedly being treated and bodies laid out in morgues, at least one alleged impact site will be shown with the remains of a munition, and both environmental and physiological samples will test positive for sarin.

As before, your prior beliefs about which of these two alternative hypotheses is correct need not prevent you from evaluating the weight of evidence favouring one hypothesis over the other. You may believe that H1 is highly implausible on the basis that the Syrian government had no motive to carry out such an attack, or you may believe that H2 is highly implausible on the basis that such an elaborate deception operation is beyond the capability of the Syrian opposition and their foreign allies. To evaluate the evidence favouring H1 over H2, you have to assess, for each hypothesis in turn, what you would expect to observe if that hypothesis were true. This means that you have to put yourself first in the shoes of a Syrian general planning a chemical attack on the town, and then in the shoes of an opposition commander planning a massacre that would implicate the Syrian government.

For hypothesis H2 we have to envisage how a clever and ruthless al-Qaeda commander, perhaps working with foreign help, would plan such an operation. Although it is disturbing to have to work through this, I’ll now state, as neutrally as I can, how I would expect such an operation to be planned.

  • Captives (most likely religious minorities or families of government supporters) would be held in readiness. Improvised explosive devices and possibly smoke generators could be placed at key locations in the town to panic the civilian population into believing they were under chemical attack. Low doses of sarin could be administered to volunteers so that they would test positive for exposure to sarin (the doses required to generate a positive test are far below those required to cause symptoms). Medical facilities controlled by jihadis would be ready to play their part by showing casualties, real or fake, being “treated”. A few actors could be prepared to play the part of bereaved parents, and provided with photos of children who were to be killed. Captives would be killed in improvised gas chambers, but the preferred agent would be an easily-available gas that leaves no residue, rather than sarin which would endanger those removing the bodies. A well-staffed video editing operation would be ready to edit the raw footage into clips and stills badged with the logos of various opposition media organizations. To make the video images so horrific that those viewing them would be shocked into supporting immediate retaliation against the Syrian government, the planners might choose that some children would not be killed outright by the gas but instead filmed struggling to breathe, before they were finished off by other methods.

In this framework, we can begin by evaluating the “mountain” of evidence – eyewitness reports, footage, crater, positive tests for sarin – that Monbiot invoked. Most of these observations were similar to those from Ghouta: purported eyewitnesses of the attack were made available for interview, images showing victims in morgues or improvised treatment facilities were uploaded, and samples tested positive for sarin. The likelihood of H1 given these observations is close to 1. Under H2, which specifies that Khan Sheikhoun was a repeat of Ghouta, we expect such observations, so the likelihood of H2 is also close to 1. The weight of evidence favouring H1 over H2 given these observations is therefore close to zero. You may have a strong prior belief that H2 is implausible, but that does not influence the likelihood ratio favouring H1 over H2.

How does hypothesis H2 account for this mountain of evidence so easily? A key requirement for a successful deception operation is to create what look like many independent sources of evidence, even though they are all in fact generated by the operation. This principle was brilliantly applied by the legendary Naval Intelligence Division in the deception operations that led German commanders to expect Allied landings in 1943 in Greece rather than Sicily, and in 1944 in Calais rather than Normandy. Thus under H2, if the planners are competent, we expect to see videos badged with the logos of different opposition media agencies and uploaded separately, even though they may all originate from a central video editing operation.

At this point you may reasonably ask: if H2 can so easily account for this mountain of evidence, what possible observations could give a likelihood ratio strongly favouring H1 over H2? Such observations are those that would be expected under H1, but very difficult to generate under H2. For instance if H1 were true, any of the following observations might be expected to contribute evidence favouring H1 over H2:-

  1. if we were presented with convincing and hard-to-fake evidence that the victims seen dead in the images had lived in the locality from which they were supposedly rescued
  2. if interviews with bereaved survivors included convincing and hard-to-fake evidence that the dead victims were their relatives, including family photos showing them with these victims. These family photos should include adult victims, who unlike young children cannot easily be induced to pose in a familiar setting with their captors.
  3. if videos showed the search and rescue operations in which these victims’ bodies were recovered: these operations would be hard to stage on a large scale without the cooperation of civilians.
  4. if a chemical signature match between the environmental sarin samples and Syrian military stocks were reported by scientists prepared to put their names on a report that was detailed enough to be subjected to independent peer review.
  5. if blood tests on purported survivors of the chemical attack showed exposure to sarin at levels high enough to have caused severe and life-threatening poisoning. Modern tests for sarin exposure can detect exposure at levels far lower than those required to cause symptoms. It would be easy for actors to expose themselves to low doses of sarin, but not so easy for them to expose themselves at levels high enough to cause severe symptoms.

It’s also useful to list, before going further, what possible observations might be expected to contribute evidence favouring H2 over H1 if H2 were true:-

  1. if the locations of victims and alleged air strikes were not consistent with records of flight tracks or with wind directions. Under H2, locations of improvised explosive devices would have to be planned in advance, without knowing where a jet would fly or which way the wind would be blowing.
  2. if the uploaded videos contained evidence that scenes were staged or that the victims were captives. Under H2, a weak point in the operation is that dozens of video clips and still images that are meant to show rescue workers dealing with large numbers of victims have to be recorded, edited and uploaded in a few hours, and the editing may fail to remove incriminating material. When all available images are arranged in temporal sequence, using sun angles and other clues to time the images, and the identities of victims are matched in different clips a different story may be revealed, as in Kafr Batna.

With this in mind, we can evaluate the weight of evidence contributed by five observations that have been summarized here.

Weights of evidence contributed by observations

| Observation | Prob (obs given H1) | Prob (obs given H2) | Likelihood ratio H2 / H1 | Weight of evidence (bits) favouring H2 over H1 |
|---|---|---|---|---|
| An individual claiming to be a bereaved survivor was made available for interview, with photos showing him with two children later seen as victims. The lack of photos of his wife was attributed to loss of the family photo album in an airstrike on the family home. | 0.002 | 0.04 | 20 | 4.3 |
| There are no videos of victims being rescued in their homes, or bodies being recovered. | 0.05 | 0.8 | 16 | 4 |
| The flight track of the Syrian jet shown by the Pentagon (single east-west pass just south of the town) is incompatible with the track of the three explosions (north-south axis over the northern part of town) and the alleged impact site of the chemical munition. | 0.01 | 0.8 | 80 | 6.3 |
| The alleged impact site of the chemical munition is upwind of where the casualties were reported (by the rebels) to have occurred. | 0.02 | 0.5 | 25 | 4.6 |
| In the images released by the rebels, several of the children who are seen dead have head and neck injuries. Reconstruction of sequences and matching of identities shows that in two of these children the head injuries were received after they had been supposedly rescued by the White Helmets. | 0.01 | 0.2 | 20 | 4.3 |
| Total | | | | 23.5 |
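The last two columns of the table can be recomputed from the two probability columns. The sketch below does this in Python; the short row labels and the function name `table_weights` are hypothetical, and the probabilities are copied from the table.

```python
import math

# (short label, Prob(obs given H1), Prob(obs given H2)) for each table row
ROWS = [
    ("bereaved survivor, photos lost in airstrike", 0.002, 0.04),
    ("no videos of search and rescue", 0.05, 0.8),
    ("flight track incompatible with explosions", 0.01, 0.8),
    ("impact site upwind of casualties", 0.02, 0.5),
    ("head injuries received after rescue", 0.01, 0.2),
]

def table_weights(rows):
    """Return (label, likelihood ratio H2/H1, weight in bits) for each row."""
    return [(label, p_h2 / p_h1, math.log2(p_h2 / p_h1))
            for label, p_h1, p_h2 in rows]

results = table_weights(ROWS)
for label, ratio, bits in results:
    print(f"{label}: ratio {ratio:.0f}, {bits:.1f} bits")

# Total as in the table (each weight rounded to one decimal place first).
total_rounded = sum(round(bits, 1) for _, _, bits in results)
# Total without intermediate rounding, for comparison.
total_exact = sum(bits for _, _, bits in results)
print(f"total: {total_rounded:.1f} bits (rounded rows), "
      f"{total_exact:.1f} bits (unrounded)")
```

Adding the weights is valid only to the extent that the five observations are treated as independent given each hypothesis; the small difference between the two totals is purely a rounding effect.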

Notes on assignment of likelihoods

  1. In Khan Sheikhoun at least two individuals claiming to be bereaved survivors were interviewed. Most of the interviews were given by Abdelhamid al-Yousef (AHY), who appears to have been serving in the opposition forces as a sniper. AHY reported that his wife and nine-month old twins had been killed in the chemical attack, and produced photos showing him with two children about this age who were among the dead victims. No photos showing AHY with the mother of these children were produced: an interviewer reported that “he does not even have any photos of his beloved wife of two years left to console him, as they were all destroyed in the attack that ripped through his hometown.” and quoted him as saying “In my house all the photos I had of my wife and everything I owned was burnt.”

     Under H1, it is expected that at least one bereaved survivor would be available for interview. However the probability is rather low that the witness’s home would have been destroyed in an air strike at the same time as the alleged chemical attack, given that only three explosions were documented as occurring in Khan Sheikhoun at this time. These explosions were geolocated by smoke plumes, satellite images and ground-based images. The explosions appear to have been relatively small, each destroying only a single house. If, as alleged, these explosions were caused by bombs dropped by an aircraft in a single pass over the northern half of town, we can estimate the area at risk as about 30 hectares, and that about 1500 homes were at risk (based on a typical urban density of 50 homes/hectare). The probability that the witness’s home would have been one of the three buildings destroyed by these explosions is therefore about 3 in 1500, or 1 in 500.

     Under hypothesis H2 that Khan Sheikhoun was version two of Ghouta, there is a moderate probability that at least one actor would have been prepared to play the part of a bereaved survivor, and would have posed for photographs with captive children. I’ll assign a probability of 0.2 to this. The problem for such an actor would be to explain the lack of photographs showing him with the adult victims from the same family. It is much easier to get young children to play happily with an adult who befriends them than it is to induce adults to pose for a family photograph with their captors. Of the possible explanations that such an actor might choose to give, one of the most likely (to emphasize the brutality of the regime) is that the family home was destroyed in an airstrike. I’ll assign a probability of 0.2 that this explanation would be produced. Multiplying the conditional probability under H2 that an actor with photos showing him with the children would be made available for interview by the probability that this actor would invoke destruction of the family photo album in an airstrike to explain the lack of photos showing him with the mother, we get a likelihood of 0.04.

     The likelihood ratio favouring H2 over H1 is 20. Note that this assessment of likelihoods does not make any assessment of whether AHY is telling the truth or lying. We have shown that under H1, it is a rather improbable coincidence that one of the few homes destroyed by three apparently untargeted bombs dropped on a town of at least 20,000 people would be that of the sole survivor of a large extended family killed in a chemical attack at the same time. We also assess that under H2, it is quite probable that an actor playing the part of a bereaved survivor would report the destruction of his home in an airstrike as an explanation for why no family photos showing him with adult victims were available. Computing the ratio of these two likelihoods allows us to make a statement about the strength of the evidence contributed by this observation.
  2. In all the videos and images released by the White Helmets and other opposition media organizations from Khan Sheikhoun, there are no images of urban search and rescue operations. Under H1, we’d expect to see videos of the White Helmets carrying out a search and rescue operation covering the neighbourhood allegedly affected by the chemical attack. The White Helmets are trained in urban search and rescue procedures and are famous for documenting their operations on video. The absence of such videos has low probability (conservatively assessed at 0.05) under H1, but high probability (0.8) under H2 as it would be difficult to stage such scenes without involving large numbers of civilians.
  3. The flight track of the Syrian jet shown at the Pentagon’s press conference shows only a single east-west pass just south of the town, passing no closer than 2 km from the crater that was the alleged impact site of the chemical munition. The three high explosive detonations, mapped by OPCW based on witness reports, and by others based on geolocation of smoke plumes and images (satellite and ground-based) of explosive damage, are in the northern half of town in a north-south line. From the scatter of the points that were plotted on the Pentagon’s map, we can estimate the accuracy of the flight track (presumably based on airborne radar). By inspection of other east-west passes on this map, I estimate that the standard deviation of the errors in a north-south direction is less than 1 km. For the jet to have passed over the alleged impact site, at least two data points would have had to have been plotted too far south by at least two standard deviations: the probability of this is about 1 in 1000. Even more unexpected under H1 is that the flight track does not show the north-south pass that would have been required to drop three bombs corresponding to the three documented high-explosive detonations.

     As the Pentagon’s map appears to include at least one false-positive data point (an outlying data point southwest of Homs city that does not appear to be part of a flight track), it is reasonable to allocate a small but nonzero probability to false-negative results: specifically a failure to detect a north-south pass. To be conservative, I’ll assign a value of 0.01 to the probability under H1 that the Pentagon’s map of the track of the Syrian jet would match neither the position nor the alignment of the reported impact sites.

     Under H2, the explosions were generated by IEDs, and the arrival of the jet was the cue to set off these explosions. The probability that the pre-planned line-up of IEDs would not match the flight path of the jet is high – I assign a probability of 0.8 to this.
  4. The videos of the smoke plumes from the three high explosive detonations, recorded by opposition cameramen and said to have occurred just before the alleged chemical attack, show that the wind was blowing steadily from southwest to northeast. The OPCW’s map of the area in which casualties allegedly occurred, based on reports from eyewitnesses, shows that this area is southwest – i.e. upwind – of the alleged impact crater. Under H1, this is difficult to explain: we have to postulate some unusual local reversal of wind direction at ground level. I assign a probability of 0.02 to this. Under H2, in which the locations from which casualties were to be reported and the location of the impact crater were planned in advance, the probabilities that the specified casualty location would be upwind or downwind of the impact crater are about equal, so the probability of an upwind location is about 0.5.
  5. The images of the victims are so horrific that most of us find it difficult to look at them further. Detailed frame-by-frame analyses of the many video clips and still images can take many months. A few citizen journalists in different countries, sharing their work for peer review, have made some progress with this. Careful examination of the videos and still images, using sun angles to time them, has allowed them to be ordered in temporal sequence and the identities of the same individuals to be matched in different videos. Several of the children seen dead in improvised morgues have obvious and recent head injuries. In at least two of these children, it is possible to establish that these head injuries were received after they had been “rescued” by the White Helmets. Under H1, the probability that at least two victims would receive traumatic injuries after they had been rescued is very low. The most plausible explanation under H1 is a traffic accident while they were being transported in an ambulance or a pickup truck. A rough estimate for the rate of serious injuries from road traffic accidents in a low-income country like Syria in wartime is 1 per million vehicle-kilometres. Allowing for a tenfold higher rate per vehicle-km in vehicles used as emergency ambulances, and a total distance of 200 vehicle-km travelled by vehicles transporting casualties in the Khan Sheikhoun incident, the probability of an accident causing injuries to some of these casualties is about 0.002. Note that this is the risk of a single accident that is assumed to account for all injuries received after rescue; if the injured children did not travel in the same ambulance, we have to postulate multiple accidents, for which the probability is far lower. Again, to be conservative, I’ll assign a conditional probability of 0.01 to these injuries occurring by accident under H1.
    Under H2, it is probable that some victims would survive the gas, either by accident or by design (if the plan was to film some children while still alive for maximal emotional impact). These victims would have to be finished off with physical violence, and the probability is high that this would include blows to the head or neck.
    The probability that editing of the videos would fail to remove the incriminating sequence of images is also moderately high, given the large number of videos that had to be edited and uploaded over a few hours. I assign a probability of 0.2 to this observation given H2.
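
Two of the back-of-the-envelope figures quoted in items 3 and 5 can be checked with a few lines of Python. This is only a rough sketch: the function names are illustrative, and the Poisson model for accidents and the assumption of independent Gaussian errors in the plotted track points are simplifications introduced here, not something established by the data. The input numbers are the ones stated in the text.

```python
import math

def prob_at_least_one_event(rate_per_km: float, distance_km: float) -> float:
    """Probability of one or more events over a distance, assuming a Poisson process."""
    expected_events = rate_per_km * distance_km
    return 1.0 - math.exp(-expected_events)

def normal_upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Item 5: serious road-traffic injuries (figures from the text).
base_rate = 1e-6            # accidents per vehicle-km
ambulance_multiplier = 10   # tenfold higher rate for emergency vehicles
distance_km = 200           # total vehicle-km travelled
p_accident = prob_at_least_one_event(base_rate * ambulance_multiplier, distance_km)

# Item 3: two track points each plotted at least two standard deviations
# too far south, treated as independent Gaussian errors (an assumption).
p_two_points = normal_upper_tail(2.0) ** 2

print(f"accident probability:   {p_accident:.4f}")
print(f"two-point track error:  {p_two_points:.5f}")
```

Under these assumptions the accident probability comes out at about 0.002, and the two-point error probability at roughly 1 in 2000, the same order of magnitude as the "about 1 in 1000" used in item 3. Both are smaller than the value of 0.01 assigned under H1 in each case.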

From this evaluation, I assess the total weight of evidence favouring H2 over H1 as about 23 bits, giving a likelihood ratio of about 8 million to 1. This might be described as a mountain of evidence.
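
To see how a total of this kind is assembled, the sketch below converts the conditional probabilities assigned to observations 3 to 5 into weights of evidence in bits and adds them. It covers only those three observations, since the values for observations 1 and 2 are not repeated in this part of the post; the subtotal is therefore smaller than the stated total, and the remainder presumably comes from the earlier observations. The names used are illustrative, and the addition of weights assumes the observations are independent.

```python
import math

def weight_bits(p_obs_given_h2: float, p_obs_given_h1: float) -> float:
    """Weight of evidence favouring H2 over H1, in bits (log base 2 of the likelihood ratio)."""
    return math.log2(p_obs_given_h2 / p_obs_given_h1)

# (P(observation | H1), P(observation | H2)) as assigned in the text.
observations = {
    "3: flight track": (0.01, 0.8),
    "4: wind direction": (0.02, 0.5),
    "5: injuries after rescue": (0.01, 0.2),
}

weights = {name: weight_bits(p_h2, p_h1) for name, (p_h1, p_h2) in observations.items()}
subtotal_bits = sum(weights.values())

for name, bits in weights.items():
    print(f"{name}: {bits:.2f} bits")
print(f"subtotal for observations 3-5: {subtotal_bits:.2f} bits")

# Converting a total weight in bits back to a likelihood ratio.
ratio_for_23_bits = 2 ** 23
print(f"23 bits corresponds to a likelihood ratio of {ratio_for_23_bits:,} to 1")
```

Changing any of the assigned probabilities and re-running shows how sensitive the subtotal is to each judgement: halving or doubling a likelihood ratio moves the weight by exactly one bit.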

Although these assignments of the conditional probabilities of the observations given H1 and H2 entail subjective judgements on my part, it should be possible for people with different prior odds to reach consensus on these conditional probabilities, and thus on the likelihood ratios. You may be able to improve on and correct my judgements of the conditional probabilities, using additional information. For instance:-

  • By fitting smoothed curves to the points shown on the Pentagon’s map of the flight track, it should be possible to make a better estimate of the probability distribution of the errors in the data points that make up the flight track.
  • Someone with meteorological expertise may be able to assign a more realistic probability of a local reversal of wind direction at ground level.
  • Further analysis of the videos may establish whether a single traffic accident to an ambulance can account for all children who were injured after they had been rescued.


From this evaluation of the likelihoods of alternative explanations of the alleged chemical attacks in Ghouta in 2013 and Khan Sheikhoun in 2017 given some key observations, I assess that the evidence favouring the hypothesis of a managed massacre of captives over the hypothesis of a regime chemical attack is overwhelming (at least 20 bits) both for the Ghouta attack and for Khan Sheikhoun. The calculations and subjective judgements on which these assessments are based are set out above. The evaluation of weights of evidence does not depend on prior beliefs about which hypotheses are plausible. To modify this conclusion about the weight of evidence, you have either to identify additional observations which would have been predicted better by the regime chemical attack hypothesis than by the managed massacre hypothesis, or to criticize and revise my assessments of the conditional probabilities of the observations listed above given each of these two hypotheses. I’ve suggested above some ways in which additional information could be used to revise these conditional probabilities. If you believe that either the managed massacre hypothesis or the regime attack hypothesis is implausible, I am not disagreeing with you: priors are subjective. However, for your beliefs to be logically consistent, your priors must be updated by the weight of evidence according to Bayes’ theorem.
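
The updating step can be written out as a short calculation: on a log-odds scale, Bayes’ theorem reduces to adding the weight of evidence to the prior. The sketch below is a minimal illustration; the function name is made up for this example, and the prior probabilities are hypothetical values chosen only to show how different starting points combine with the same weight of evidence.

```python
import math

def posterior_probability(prior_prob: float, weight_bits: float) -> float:
    """Update a prior probability for a hypothesis by a weight of evidence in bits.

    Posterior log-odds (base 2) = prior log-odds (base 2) + weight of evidence.
    """
    prior_log_odds = math.log2(prior_prob / (1.0 - prior_prob))
    posterior_odds = 2.0 ** (prior_log_odds + weight_bits)
    return posterior_odds / (1.0 + posterior_odds)

weight = 20.0  # bits, the lower bound quoted in the text

# Hypothetical priors for the hypothesis favoured by the evidence.
for prior in (0.5, 1e-3, 1e-6):
    posterior = posterior_probability(prior, weight)
    print(f"prior {prior:g} -> posterior {posterior:.6f}")
```

Two readers who agree on the weight of evidence but start from very different priors will still end up with different posteriors, which is why the argument above separates the likelihood ratios from the priors.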

The strength of the evidence favouring the managed massacre hypothesis over the regime chemical attack hypothesis has quite radical implications for the credibility of western media, western governments and international agencies such as OPCW; you may reasonably ask “how could they have got it so wrong?”.

3 replies on “Who is Responsible for Chemical Attacks in Syria? Guest Blog by Professor Paul McKeigue (Part 2)”

Hi Paul,

This seems like a long-winded demonstration of confirmation bias. Has your “probability calculus” method been validated with test studies? Has it been peer reviewed in discussions of current events?

As far as I can tell, this method–like so much modeling–is only as good as the assumptions, controls, and weighting you put into it.

Maybe you should apply this method to the Moon landing, 9/11, the Kennedy assassination, or any other established case to test its validity in the context of current events.

Aside from a general case of selection bias for the evidence you discuss, here are specific criticisms highlighting flaws in your interpretation of evidence:

Example 1: “The OPCW’s map of the area in which casualties allegedly occurred, based on reports from eyewitnesses, shows that this area is southwest – i.e. upwind – of the alleged impact crater. Under H1, this is difficult to explain: we have to postulate some unusual local reversal of wind direction at ground level”

You’re assuming that wind is the dominating factor in the dispersion of Sarin–not the explosive bursting charge or gravity (sarin is denser than air). See? Bad assumptions lead to ridiculous conclusions.

Example 2: “Several of the children seen dead in in improvised morgues have obvious and recent head injuries.”

Why do you find this suspicious? Sarin and Diisopropyl Fluorophosphate (DIFP) are known to cause rapid “knock out.” Head injuries are very common after fainting.

Example 3: “The flight track of the Syrian jet shown at the Pentagon’s press conference shows only a single east-west pass just south of the town, passing no closer than 2 km from the crater that was the alleged impact site of the chemical munition.”

You are interpolating a flight path from radar blips. Different paths can be interpolated–how do you evaluate which one is more true than the other? Curve fitting and smoothing over a non-linear cluster of points projected from four dimensions (x,y,z, time) onto two is a meaningless exercise. You also do not know the point at which the bomb was released. Anywhere within 10 km is reasonably within range. A discussion of specific maneuvers and points of release is unknowable from the available data.
