Measuring Evidence to Strengthen Belief (Bayes’ Theorem)
Reverend Thomas Bayes of Tunbridge Wells in Kent, England, was a Georgian gentleman scientist. These wealthy, independent, male citizens practised science as a passionate hobby, not a profession. Many were priests: the Church provided an easy and respectable means to an ample income, and vast spare time in which to indulge scientific curiosity. By most accounts, Bayes was a hopeless preacher but an ingenious mathematician. At some point in his life, exactly when is not known, he devised a formula to work out the probabilities of an uncertain event. And then he forgot about it. Richard Price, Bayes’ friend, submitted the formula to the Royal Society in 1763, two years after Bayes’ death, under the title ‘An Essay towards Solving a Problem in the Doctrine of Chances’.
Bayes’ theorem, as the formula came to be called, had no utility in Bayes’ lifetime. Today it is considered a landmark in the history of probability science and is used for spam filtering, weather forecasting, DNA testing, medical diagnosis and the testing of new treatment modalities, financial forecasting, fault diagnosis in engineering, the training of AI models, and more.
Data is the reigning attribute of the universe today. It is being generated incessantly, at an astronomical rate: about 400 × 10^12 bytes daily. Raw data is meaningless. Statistical analysis imbues it with meaning and turns it into assertions with varying degrees of evidence. Data is ever changing, and so is the evidence. Belief in a rational world demands that one change one’s opinion with changing data.
Bayes’ rule or Bayes’ theorem
tells us how to revise our beliefs consistently, continuously, and rationally,
considering new data and the consequent evidence.
The most distinguishing feature of modern medicine is its reliance on evidence; hence the name EBM, Evidence Based Medicine. Bayes’ rule is widely used to interpret evidence in EBM. The expanse of its applicability in modern medicine is surpassed only by the widespread ignorance of it among medical professionals.
Bayesian reasoning finds its
paradigm application in medical diagnosis.
Presume mammography is used to
screen asymptomatic women for breast cancer in a population. Suppose the
probability that a woman has breast cancer is 1 percent (the prevalence of
breast cancer in the population). If a woman has breast cancer, the probability
that she tests positive is 90 percent (the sensitivity of mammography). If a woman does not have breast cancer, the probability that she nevertheless tests positive is 9 percent (the false positive rate of the test). A woman tests positive. What is the chance that she has breast cancer?
In a study by Gerd Gigerenzer, 160 gynaecologists were asked this question. The majority (60 percent) believed the chance was between 80 and 90 percent, and 19 percent believed it would be 1 percent. The correct answer, about 9 percent, can be derived from Bayes’ rule. Thus, around 80 percent of the specialists who are routinely called upon to pronounce judgement on such tests were roughly ten times off the correct answer. The most common treatment of breast cancer is major surgery. In this example, about nine out of ten women who test positive do not have breast cancer; offered a diagnosis of cancer on the mammogram, they may undergo a disfiguring surgery that was pointless.
Bayes’ theorem can be stated as:

P(A/B) = P(A) × P(B/A) / P(B)
Our minds abhor symbols. I was foxed when I first read this formula. Stated in words, it is quite simple.

● P is the probability of an event. A and B are events. In the above example, A is a patient suffering from breast cancer; B is a positive mammogram for breast cancer.

● P(A/B) is the probability of A if B is true, or the probability that a woman has the disease if her mammogram is positive. This is the figure the doctor, and the patient, are interested in.

● P(B/A) is the probability of B if A is true: in the above case, the probability of the test being positive if the woman tested suffers from cancer. This is the sensitivity of the test.

● P(A) and P(B) are the independent probabilities of events A and B.
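Plugging the screening numbers into the formula can be sketched in a few lines of Python. The figures (1 percent prevalence, 90 percent sensitivity, 9 percent false positive rate) are the ones assumed above; the function name is mine:

```python
# A sketch of Bayes' rule applied to the screening example in the text.
# Figures as assumed above: 1% prevalence, 90% sensitivity, 9% false positives.

def posterior(prevalence, sensitivity, false_positive_rate):
    """P(disease / positive test) via Bayes' theorem."""
    # P(B): overall probability of a positive test,
    # summed over diseased and healthy women alike.
    p_positive = (prevalence * sensitivity
                  + (1 - prevalence) * false_positive_rate)
    # P(A/B) = P(A) * P(B/A) / P(B)
    return prevalence * sensitivity / p_positive

p = posterior(0.01, 0.90, 0.09)
print(f"{p:.1%}")  # about 9.2%, not 80-90%
```

The denominator P(B) is computed from the other three figures, which is why the formula needs only the prevalence, the sensitivity, and the false positive rate.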
Rendered in lay language, the theorem asks: now that you have the evidence, i.e., the result of her mammography, how would you like to revise the probability of the woman suffering from breast cancer? Without the evidence, i.e., the mammography result, you could only have said that she had a one percent chance of suffering from breast cancer, i.e., the prevalence of breast cancer we assumed in the population.
Calculation of probability using percentages is confusing. We tend to take recourse to easier, though fallacious, shortcuts, as most of the gynaecologists in Gigerenzer’s study did.
Gerd Gigerenzer, a psychologist and a long-time director at the Max Planck Institute for Human Development, is a world-renowned expert in risk literacy and decision-making. He has authored many books and papers on risk. He has suggested that, instead of percentages as in the above example, data be presented as ‘natural frequencies’, which foster understanding.
Data in the above example can be rewritten in natural frequencies as:

Ten in every 1000 women have breast cancer. Of these 10 women with breast cancer, nine will test positive. Of the 990 women without breast cancer, about 89 will nevertheless test positive. What are the chances that a woman who tests positive has breast cancer? In all, 98 women (9 + 89) will test positive. Of these, nine have breast cancer. The chances that a woman who tests positive has breast cancer are therefore 9/98, or roughly one in ten. Not eight in ten, as 60 percent of the gynaecologists believed.
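The natural-frequency count can be mirrored in a short sketch that counts women rather than multiplying percentages (the figures are those of the example; the variable names are mine):

```python
# The same calculation with natural frequencies: count women, not percentages.
women = 1000
with_cancer = 10                                        # 1% prevalence
true_positives = round(with_cancer * 0.90)              # 9 of the 10
false_positives = round((women - with_cancer) * 0.09)   # 89 of the 990 healthy
all_positives = true_positives + false_positives        # 98 positives in all

print(true_positives, "in", all_positives)  # 9 in 98, roughly one in ten
```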
There are three important
deductions from Bayes’ formula.
First, the chance of the disease is high if its prevalence is high. This is the first term in the numerator. As it is taught to medical students, if you hear hoofbeats outside, it is likely to be a horse, not a zebra. If a patient presented with cough and fever in May 2021, when Covid raged, the chances that she had a coronavirus infection were high. If a patient reports similar complaints today, the chances of Covid infection are much lower.
This is the reason claims of supernatural phenomena are pooh-poohed by rationalists. Such phenomena are exceedingly rare. Any evidence claiming to prove their veracity has to be so good that its falsehood is more unlikely than the phenomenon it is trying to prove. Carl Sagan stated this pithily: ‘Extraordinary claims require extraordinary evidence’.
Second, one should believe the idea (the individual is suffering from the disease) more if the evidence (the positive test) is likely to occur when the idea is true, implying high sensitivity of the test. This is the second term in the numerator. A child from the Nilgiri Hills in South India presenting with swollen, painful fingers and long-standing anaemia is likely to be suffering from sickle cell disease. If a person complains of fever of a few days’ duration and bleeding gums in August, dengue fever must be ruled out.
Third, believe the idea less if the evidence is commonplace, i.e., if it occurs frequently even in healthy individuals. This is the figure in the denominator of Bayes’ formula. Followers of a particular belief system may be the dominant suspects in specific crimes in a locality. But the said belief system cannot be used to identify the troublemakers, because most of its followers are not troublemakers. Belief is a poor touchstone here, as it throws up many false positives.
Daniel Kahneman and Amos Tversky, both psychologists, singled out a major fault in our Bayesian reasoning: base rate neglect. In our example, the base rate is the prevalence of breast cancer in the population. The probability of a positive test given cancer grabs our attention, and we forget how rare the disease is in the population.
The duo further suggested that the human mind doesn’t engage in Bayesian reasoning at all. Instead, we judge the probability that an instance belongs to a category by how representative it is: how similar it is to the prototype of the category, which we nebulously represent mentally as a family with its criss-crossing resemblances. In our minds, breast cancer typically produces a positive mammogram. Therefore, any woman with a positive mammogram is deemed to be suffering from breast cancer. How common breast cancer is, is conveniently forgotten.
One of the manifestations of base rate neglect is hypochondria. A headache evokes fear of a dreaded brain tumour. We do not pause to consider that brain tumours are rare, while headache has many commonplace causes, such as a cold or lack of sleep.
Base rate neglect also drives thinking in stereotypes. An undergraduate at a university is concerned about environmental protection, human rights and freedom of speech, attends literature festivals, and writes poetry. Is she more likely to be studying history or science? History, of course! No: science courses have 10–15 times more students than history in most universities. A dull, shabbily dressed nerd is more likely to be a clerk in one of the capital’s many government offices than a research scholar at the university.
Bayesian reasoning has many applications in law. A judge must decide whether the evidence presented by the prosecution overwhelmingly proves the defendant’s culpability. The prevalence of the crime in the population (the base rate), the false positive signals given by the test, and its sensitivity should all influence the decision.
In a criminal case, DNA evidence matches the suspect. The probability of a random match is 1 in 100,000. The prosecutor argues that there is only a 1 in 100,000 chance that the suspect is innocent. Is this correct? No, it is false: 1 in 100,000 is the probability of a DNA match given innocence, not the probability of innocence given the match.
Rahul drives a blue taxi. The proportion of blue taxis in the city is one in five. A blue taxi is reported to have been involved in an accident. The chance that Rahul’s taxi was involved is 20 percent. Correct? No, not by a long shot. Twenty percent is the proportion of blue taxis in the city, not the chance that the particular blue taxi spotted was Rahul’s. That chance depends on the number of blue taxis in the city and on the reliability of the witness.
To calculate the probability of innocence in a DNA test, one must know the size of the potential suspect pool. Assume the population of the city where the crime happened is 1 million. About 10 people would show false positive matches. Only one person is guilty, so there is only one true positive. Thus, out of the 11 people who match, there is a 10 in 11 chance that a matching suspect is innocent.
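The suspect-pool arithmetic can be sketched as follows (the figures are from the example; the variable names are mine):

```python
# A sketch of the suspect-pool arithmetic in the DNA example above.
# Assumed figures from the text: a 1-in-100,000 random match probability
# and a city of 1 million people containing exactly one guilty person.

population = 1_000_000
random_match_prob = 1 / 100_000
guilty = 1

# Innocent people expected to match by chance alone.
false_matches = (population - guilty) * random_match_prob  # about 10

total_matches = false_matches + guilty                     # about 11
p_innocent_given_match = false_matches / total_matches
print(f"{p_innocent_given_match:.0%}")  # roughly 91%, i.e. 10 in 11
```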
Not all evidence presented in a court is discrete like DNA matching. Most, I believe, is fuzzy; it does not easily lend itself to statistical interpretation. The state is always interested in fobbing off woolly evidence as unassailable. Laws are enacted to give these ambiguous allegations the sanctity of unimpeachable evidence.
One such law is the Unlawful Activities (Prevention) Act, the UAPA. People can be arrested if they are suspected of having committed an offence as defined under the act. Bail has been made extremely difficult for the accused arrested under UAPA. Courts are required to deny bail if they find reasonable grounds to believe that the case against the accused is prima facie true. Courts are mandated to assess guilt only by examining the chargesheet prepared by the investigating agency. The accused cannot produce any evidence outside the chargesheet. During the bail hearing, they cannot examine or cross-examine witnesses to challenge the chargesheet or the evidence it contains.
The conviction rate of those accused under UAPA is only 2%. But not all UAPA cases have been tried to date. Between 2014 and 2020, of the UAPA cases in which trials were completed, an average of 72.4% of the accused were discharged or acquitted and 27.5% were convicted.
27.5% is mind-bogglingly poor sensitivity for a test. Were it a diagnostic test for a disease, it would not even see the light of publication in the most obscure medical literature. Yet judges in most courts appear to treat UAPA chargesheets as faultless, and deny bail to the accused, thus incarcerating them for years before the trial is completed.
We have seen the extravagant folly of applying a medical test with more than 90 percent sensitivity as a screening test for a rare disease: false positives would overwhelm the true positives. If misdiagnosis in medicine can usher in debilitating surgery, a false positive in the judiciary can be more fatal: incarceration for years, or even a death sentence. This is the reason that nowhere in the world can terrorists be profiled and detained preventively. The base rate, the proportion of the population that is terrorist, is an infinitesimal fraction. Any test that is less than perfect would throw up a huge cache of innocents.
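A hypothetical sketch makes the point concrete. The base rate of one offender per 100,000 people and the 99 percent accuracy of the profiler are assumed purely for illustration; neither figure comes from the text:

```python
# A hypothetical illustration: even a very accurate profiling test drowns
# in false positives when the base rate is tiny. The 99% accuracy and the
# 1-in-100,000 base rate are assumed for illustration only.

population = 10_000_000
base_rate = 1 / 100_000          # actual offenders in the population
sensitivity = 0.99               # flags 99% of real offenders
false_positive_rate = 0.01       # wrongly flags 1% of innocents

offenders = population * base_rate                                  # 100
flagged_guilty = offenders * sensitivity                            # 99
flagged_innocent = (population - offenders) * false_positive_rate   # ~99,999

share_innocent = flagged_innocent / (flagged_guilty + flagged_innocent)
print(f"{share_innocent:.1%} of those flagged are innocent")
```

Under these assumptions, well over 99 percent of the people flagged by the profiler are innocent, which is the cache of innocents the text describes.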
Humans did not evolve to intuitively understand chance and probability. Lord Balfour, a onetime prime minister of the UK, once remarked that ‘the human mind is no more a truth finding apparatus than the snout of a pig’. True, we did not evolve to discern truth hidden behind the veil of chance. But we also did not evolve to treat cancer, or to kill people because they differed from us in ideology.
Millennia of thought have equipped us to make sense of a life that is uncertain at its core. For a professional, trained and trusted to separate the grain of truth from the chaff, surrender of reason in understanding the world is a gross dereliction of duty at best. At worst, it is an unpardonable sin, condemning fellow humans to a fate worse than what blind chance had in store for them.