RCTs vs. Observational Trials

In randomized controlled trials, the efficacy of a potential treatment is determined by using a computer to randomly assign patients into two groups: an experimental group which receives the drug, and a control group which receives either the standard treatment (where one exists) or a placebo. When these trials are “double-blinded,” neither the prescribing doctor nor the patients know whether they are receiving the experimental drug or the placebo. This limits the potential for physician bias in deciding who receives the drug and their subsequent care.

In observational trials, which dominated medical science for most of its history, doctors observe outcomes of patients taking a particular drug, then try to judge efficacy by comparing them with a control group of people who didn’t receive the drug. There are various methods for defining control groups. In some, patients are provided with informed consent and then choose whether or not to take the drugs. In others, doctors recommend which treatment they think will be best for each patient.

Because drug assignment is not randomized, observational studies can be more open to bias than RCTs. One such bias is called “confounding by indication,” which means the reasons why certain patients did or did not take the drug may influence the study outcome as much as, or more than, the effects of the drug itself. For example, a doctor may unconsciously choose to give the drug to patients who are less sick, yielding more positive results. Conversely, a doctor worried about potential side effects might hold off until patients are already very sick, producing a negative bias.

However, it’s important to note that medical researchers have developed ways to account for possible biases in observational studies by controlling for other factors that may explain a particular result. In the case of drug efficacy studies, these “confounding factors” can include a range of patient characteristics and treatment modalities. For example, when comparing treated versus untreated patients, researchers make sure that the patients in each group are as close to each other as possible in terms of baseline characteristics such as age, sex, and severity of illness. One such technique, propensity score matching, estimates each patient’s probability of receiving the treatment from these baseline characteristics, then pairs treated and untreated patients whose estimated probabilities are similar.
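As a rough illustration of propensity score matching, the sketch below fits a small logistic regression of treatment on two baseline covariates, then pairs each treated patient with the untreated patient whose estimated score is closest. The patient records, the function names (`fit_propensity`, `match_nearest`) and the caliper of 0.2 are all hypothetical and chosen only for illustration; published studies rely on established statistical packages and many more covariates.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def propensity(weights, x):
    """Estimated probability of treatment for one covariate vector."""
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))

def fit_propensity(covariates, treated, steps=2000, lr=0.1):
    """Fit a logistic regression of treatment on covariates by gradient descent."""
    n, k = len(covariates), len(covariates[0])
    weights = [0.0] * (k + 1)  # weights[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for x, t in zip(covariates, treated):
            err = propensity(weights, x) - t
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * x[j]
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights

def match_nearest(scores, treated, caliper=0.2):
    """Greedy 1:1 matching without replacement on the propensity score."""
    available = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i, t in enumerate(treated):
        if t != 1 or not available:
            continue
        best = min(available, key=lambda c: abs(scores[c] - scores[i]))
        # Only accept a match if the two scores are within the caliper.
        if abs(scores[best] - scores[i]) <= caliper:
            pairs.append((i, best))
            available.remove(best)
    return pairs

# Hypothetical patients: (age in decades relative to 50, severe illness 0/1)
covariates = [(-1.0, 0), (-0.5, 0), (0.0, 0), (0.5, 1),
              (-0.8, 0), (0.2, 1), (1.0, 1), (1.5, 1), (0.1, 0)]
treated = [1, 1, 1, 1, 0, 0, 0, 0, 0]

weights = fit_propensity(covariates, treated)
scores = [propensity(weights, x) for x in covariates]
pairs = match_nearest(scores, treated)
for t_idx, c_idx in pairs:
    print(f"treated #{t_idx} (score {scores[t_idx]:.2f}) "
          f"matched to control #{c_idx} (score {scores[c_idx]:.2f})")
```

Outcomes would then be compared only within the matched pairs, so that the treated and untreated groups resemble each other on the measured baseline characteristics.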


Observational Trials Can Anticipate RCT Results

Armed with these techniques, careful researchers can derive meaningful results through observational studies; as noted, hundreds of treatments have been approved on the basis of observational trials and without RCTs, including the tetanus vaccine and insulin. In fact, meta-analyses – studies which review a large number of previous studies with statistical methods – have shown that in many cases observational studies can anticipate the results of RCTs years in advance. This is especially true when large numbers of observational studies are performed and analyzed in the aggregate, as errors and biases are less likely to distort the results of large-scale comparisons.
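To make the idea of analyzing studies in the aggregate concrete, here is a minimal sketch of fixed-effect (inverse-variance) pooling, one common way a meta-analysis combines effect estimates. The five log risk ratios and standard errors are hypothetical, and the function name `pool_fixed_effect` is an illustrative choice; the sketch shows how statistical uncertainty shrinks as studies are combined, under the assumption that the studies estimate the same underlying effect.

```python
import math

def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    weights = [1.0 / s ** 2 for s in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical log risk ratios and standard errors from five studies
log_rr = [-0.40, -0.10, -0.35, -0.20, -0.30]
se = [0.30, 0.25, 0.40, 0.20, 0.35]

pooled, pooled_se = pool_fixed_effect(log_rr, se)
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled risk ratio {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(low):.2f} to {math.exp(high):.2f})")
```

The pooled standard error is always smaller than that of the most precise single study, which is why a large collection of studies can yield a tighter estimate than any one of them.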

One sweeping meta-analysis, drawing data from seven databases, compared more than 10,000 pairs of observational studies and RCTs across 228 medical conditions. It concluded that “on average, there is little evidence for significant effect estimate differences between observational studies and RCTs, regardless of specific observational study design, heterogeneity, or inclusion of studies of pharmacological interventions.” Another group of authors found that “the average results of the observational studies were remarkably similar to those of the randomized, controlled trials.” A third meta-analysis, reviewing 136 studies of 19 treatments, stated: “We found little evidence that estimates of treatment effects in observational studies…are either consistently larger than or qualitatively different from those obtained in randomized, controlled trials.”

In short, while any individual observational study may be conducted well or poorly, it is highly misleading to portray observational studies generally as merely “anecdotal.” Again, many of today’s commonly used treatments were discovered or validated through careful observational trials and only later, if ever, subjected to RCTs. It should also be emphasized that this practice continues. A review of cardiac drug approvals and treatment recommendations from 2008-2018, covering 51 current guidelines, noted “the proportion supported by evidence from RCTs remains small,” including just 8.5% of American College of Cardiology/American Heart Association (ACC/AHA) guidelines and 14.3% of European Society of Cardiology (ESC) guidelines. On that note, another review of observational studies and RCTs argues: “We do not believe that dependable clinical evidence only comes from RCTs…. If RCTs were required for proof of efficacy of a given treatment, the practice of clinical medicine would indeed be reduced to a relatively few verified treatments.”

Government regulatory agencies in the U.S. and across the world routinely use observational study evidence to make inferences about causal outcomes. The foundational method for such work was laid out in 1965 by Sir Austin Bradford Hill. He identified nine “aspects” of evidence that, taken together, provide a rationale for inferring causation. To this day the aspects identified by Hill remain the most widely used framework for general causal reasoning across medicine, science and law. One of these aspects is “experiment,” which includes RCTs, but these comprise only a small component of causal evidence. Hill’s main point is that all forms of scientific evidence must be considered and weighed, with no single type treated as definitive on its own.

For several reasons, it is particularly unrealistic to demand RCTs confirming efficacy against COVID-19 before using a drug like hydroxychloroquine for that purpose. RCTs face logistical challenges: high-quality randomized studies require intensive preparation, controls and oversight, and these expenses are often beyond the limited resources available in many healthcare settings. Moreover, poorly designed and executed RCTs can just as easily produce results that are meaningless or misleading, as has been the case with a number of COVID-19 treatments. In fact, some authors have argued that “in the end, an observational study with credible corrections and a more relevant and much larger study sample…may provide a better estimate [than small or flawed RCTs].” Indeed, this is precisely the case with the small group of poorly designed and executed RCTs cited by critics seeking to discredit hydroxychloroquine as a treatment for COVID-19.


Problematic RCTs of Hydroxychloroquine for COVID-19

All the evidence used to cast doubt on hydroxychloroquine as a treatment for COVID-19 is drawn from a handful of deeply flawed studies, which are nonetheless presented as trustworthy “gold standard” findings simply because they meet the technical criteria for RCTs. The following is a list of these troubled studies and the fundamental errors that render their findings invalid.


  1. Boulware, et al. “A Randomized Trial of Hydroxychloroquine as Postexposure Prophylaxis for Covid-19.” N Engl J Med August 6 2020; 383:517-525. Doi: 10.1056/NEJMoa2016638
  • Virtually total lack of PCR testing for COVID-19, with just 2.6% receiving standard tests, forcing researchers to rely on subjective self-reporting of symptoms
  • Treatment started an average of four days after COVID-19 exposure, rather than no later than two days as recommended
  • Included mostly low-risk individuals who generally do well without treatment
  • Not blinded: healthcare workers received identifiable pills
  • Study stopped prematurely, before reaching statistical significance
  • Reanalysis shows the statistical significance of the large benefit of early treatment, contrary to the authors’ claims. After re-analysis, there was a reduced incidence of Covid-19 associated with HCQ compared with placebo (9.6% vs. 16.5%) when received up to 3 days (Early) after exposure.
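The size of the difference in the reanalysis figures quoted above (9.6% vs. 16.5%) can be expressed as a relative and an absolute risk reduction. The short sketch below only does that arithmetic on the two quoted percentages; the function name `relative_risk_reduction` is an illustrative choice, and the calculation says nothing about statistical significance, which depends on group sizes not given here.

```python
def relative_risk_reduction(treated_rate, control_rate):
    """Proportional drop in risk in the treated group relative to control."""
    return 1.0 - treated_rate / control_rate

hcq_rate = 0.096      # incidence with HCQ, from the reanalysis quoted above
placebo_rate = 0.165  # incidence with placebo, from the same reanalysis

rrr = relative_risk_reduction(hcq_rate, placebo_rate)
arr = placebo_rate - hcq_rate

print(f"relative risk reduction: {rrr:.1%}")  # about 41.8%
print(f"absolute risk reduction: {arr:.1%}")  # about 6.9%
```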


  2. Skipper, et al. “Hydroxychloroquine in Nonhospitalized Adults With Early COVID-19.” Annals of Internal Medicine, July 16, 2020. Doi: https://doi.org/10.7326/M20-4207
  • Underpowered, with 491 subjects recruited over the Internet versus a design target of 6,000
  • Lack of testing, leading to inclusion of patients with “probable COVID-19”
  • Changing metric, beginning with hospitalizations but transitioning to symptomatic endpoints
  • Study was not blinded to the participants
  • Study used an active placebo medication (folate)
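To illustrate what “underpowered” means for the enrollment figures in the first bullet above, the following sketch approximates the power of a two-sided z-test comparing two proportions at the enrolled size (491) and the size the bullet cites as the design target (6,000). The event rates of 10% and 7% are purely hypothetical, as are the equal split between arms and the function name `power_two_proportions`; only the two sample sizes come from the text.

```python
from statistics import NormalDist

def power_two_proportions(p_control, p_treated, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions."""
    norm = NormalDist()
    se = (p_control * (1 - p_control) / n_per_arm
          + p_treated * (1 - p_treated) / n_per_arm) ** 0.5
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Normal approximation; ignores the negligible opposite-tail probability.
    return norm.cdf(abs(p_control - p_treated) / se - z_crit)

# Hypothetical event rates; only the sample sizes come from the text above.
p_control, p_treated = 0.10, 0.07
power_enrolled = power_two_proportions(p_control, p_treated, 491 // 2)
power_designed = power_two_proportions(p_control, p_treated, 6000 // 2)

print(f"power with 491 subjects:   {power_enrolled:.2f}")
print(f"power with 6,000 subjects: {power_designed:.2f}")
```

Under these assumed rates, the smaller sample would miss a real difference most of the time, while the larger one would detect it in nearly every trial.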


  3. Cavalcanti, Alexandre, et al. “Hydroxychloroquine with or without Azithromycin in Mild-to-Moderate Covid-19.” N Engl J Med July 23, 2020; DOI: 10.1056/NEJMoa2019014
  • Changed endpoint from viral load at day three to viral load at day seven
  • Changed endpoint to symptomatic rather than PCR test
  • Patient pre-trial medications not controlled
  • Median time from symptom onset to randomization 7 days, too late for HCQ to have early treatment benefit


  4. “RECOVERY” Trial. Horby, Peter, et al. “Effect of Hydroxychloroquine in Hospitalized Patients with COVID-19: Preliminary results from a multi-centre, randomized, controlled trial.” medRxiv, July 15, 2020. Doi: https://doi.org/10.1101/2020.07.15.20151852
  • Study was not a randomized trial; instead, the allocation of the drug was randomized, and the timing of drug administration varied widely
  • Median number of days from symptom onset [to treatment] was 9 days – far too late for early treatment effect
  • Study suffered from “confounding by indication”: patients who received HCQ were already sicker than those who didn’t
  • Used dosage far exceeding recommended 600 mg per day


  5. “PATCH” Trial. Abella, Benjamin, et al. “Efficacy and Safety of Hydroxychloroquine vs Placebo for Pre-exposure SARS-CoV-2 Prophylaxis Among Health Care Workers. A Randomized Clinical Trial.” JAMA Intern Med. September 30, 2020. Doi: 10.1001/jamainternmed.2020.6319
  • Small and terminated early, prompting authors to warn it “may have been underpowered to detect a clinically important difference”
  • HCQ arm results include an early positive test likely representing infection before study started
  • Low adherence (81%) relying on self-reporting rather than HCQ blood levels


  6. Rajasingham, Radha, et al. “Hydroxychloroquine as pre-exposure prophylaxis for COVID-19 in healthcare workers: a randomized trial.” medRxiv. September 18, 2020. Doi: 10.1101/2020.09.18.20197327
  • Underpowered, with 1,483 healthcare workers enrolled versus total target 3,150
  • Used low dose of HCQ, 400 mg once or twice weekly
  • Study relied on symptom-based reporting and diagnosis due to limited availability of PCR testing, but failed to investigate other possible causes of symptoms
  • Irregular reporting characterized by wide variation in timing of matching symptoms and PCR tests (where available). Study counted PCR+ tests within 14 days before/after symptoms, but PCR- tests within just four days of symptoms. Results suggest symptoms-based diagnosis is highly inaccurate
  • Despite these shortcomings, study actually suggests positive effect with 28% relative risk reduction of infection by giving HCQ weekly for 6-8 weeks


  7. WHO Solidarity Trial (halted early). WHO Solidarity Trial Consortium. “Repurposed Antiviral Drugs for Covid-19 — Interim WHO Solidarity Trial Results.” NEJM, Dec. 2, 2020. Doi: 10.1056/NEJMoa2023184.
  • Most patients started on HCQ two days or more after admission, majority already receiving oxygen or ventilation, suggesting advanced disease with little likelihood of early treatment benefit
  • No attempt to screen patients for markers of increased inflammation (D-dimer, LDH, high-sensitivity troponin) for indication of likely benefit from HCQ
  • Intention to treat measurement only; no per protocol analysis, measurement of cumulative HCQ dose or days on therapy to determine dose response
  • Hydroxychloroquine was often included in the standard of care comparator in many countries.
  • Dosages far exceeding 600 mg per day


  8. Ulrich et al. “Treating Covid-19 With Hydroxychloroquine (TEACH): A Multicenter, Double-Blind, Randomized Controlled Trial in Hospitalized Patients.” Open Forum Infectious Diseases. Doi: 10.1093/ofid/ofaa446
  • Very small study, with 67 HCQ patients versus 61 control
  • Suffered confounding by indication: patients receiving HCQ were 82% more likely to have very severe symptoms at the beginning of the study. HCQ recipients included 32% more males, who are known to fare worse on average
  • HCQ recipients were also more likely to suffer cerebrovascular disease, cardiovascular disease (non-hypertension), renal disease (non-dialysis), and have a history of organ transplants


  9. Fiolet et al., “Effect of hydroxychloroquine with or without azithromycin on the mortality of coronavirus disease 2019 (COVID-19) patients: a systematic review and meta-analysis.” Clinical Microbiology and Infection. August 26, 2020. DOI: 10.1016/j.cmi.2020.08.022
  • Conflated hospital and outpatient studies
  • Inclusion criteria required RT-PCR confirmed cases, but a number of studies included had negligible testing rates
  • Authors do not consider different treatment delays, risk level of patients, differences in dosage, or usage of Zinc


  10. Magagnoli, Joseph, et al. “Outcomes of hydroxychloroquine usage in United States veterans hospitalized with COVID-19.” Med, June 5, 2020. Doi: 10.1016/j.medj.2020.06.001
  • Not a randomized controlled trial at all. Suffered confounding by indication: choice of providing patients with HCQ was left to physicians, and cohort of 90 patients receiving HCQ prior to intubation were much sicker than the group of 177 patients not receiving HCQ prior to intubation
  • Timing of treatment was apparently left up to the physicians as well, and the number of patients dying with and without ventilation indicates heavy “cross-over” to HCQ after patients were put on ventilators, and therefore much sicker; 75% of the patients not initially receiving HCQ prior to intubation were subsequently started on HCQ late in the clinical course, after they had deteriorated and required intubation
  • Despite the supposedly negative conclusions of the VA study, just 7.8% of the initial HCQ patients later had to be intubated, compared to 14.2% of the 177 patients not on HCQ. In short, HCQ actually appeared to reduce the risk of intubation by roughly 45% (1 − 7.8/14.2), even with bias favoring the non-intervention group


  11. Barnabas, Ruanne V., et al. “Hydroxychloroquine as Postexposure Prophylaxis to Prevent Severe Acute Respiratory Syndrome Coronavirus 2 Infection – A Randomized Trial.” Ann Int Med, December 8, 2020. Doi: 10.7326/M20-6519
  • Used high cycle threshold (CT) measurements, 38+, detecting inactive virus fragments, suggesting many infections occurred weeks earlier
  • Average time from exposure to PCR+ was seven days. Study results suggest ongoing or repeat exposures, with most infections occurring before enrollment or HCQ PEP reached therapeutic levels
  • Failed to confirm medication adherence with HCQ blood levels
  • Very low dose of HCQ, 400 mg for three days followed by 200 mg for 11 days
  • Long lag to PEP initiation, average of two days after exposure, not reaching therapeutic levels for several weeks
  • Study design changed during the study period from being driven by the initial sample-size determination to being endpoint driven