The FDA Should Better Regulate Medical Algorithms
Medical algorithms are used across the health care spectrum to diagnose disease, offer prognosis, monitor patients’ health and assist with administrative tasks such as scheduling patients. But recent news in the U.S. is filled with stories of these technologies running amok. From sexual trauma victims being unfairly labeled as “high-risk” by substance-abuse-scoring algorithms to diagnostic algorithms failing to detect sepsis cases in more than 100 health systems nationwide to clinical decision support (CDS) software systematically discriminating against millions of Black patients by discouraging necessary referrals to complex care, the problem abounds. And it extends to the pandemic as well. In a review of 232 machine-learning algorithms designed to detect COVID-19, none were of clinical use.
The kicker: most of these algorithms did not require FDA approval, and the ones that did often were not required to undergo clinical trials.
To understand why, let’s take a brief dive into history. In 1976, after the Dalkon Shield, a faulty intrauterine contraceptive device, was implicated in several reported deaths and numerous hospitalizations, Congress amended the Federal Food, Drug, and Cosmetic Act to mandate that medical devices demonstrate safety and effectiveness through clinical trials. Because thousands of medical devices were already marketed at the time, a provision known as 510(k) was created to fast-track the approval process of devices. Under the 510(k)-clearance pathway, a manufacturer would not need to submit clinical trial data if it could demonstrate its device’s “substantial equivalence” in materials, purpose and mechanism of action to another device that was already on the market. At the time, this was a perfectly reasonable compromise for policy makers. Why make a surgical glove manufacturer undergo rigorous clinical trials if surgical gloves have already been approved? A glove is a glove, right?
Of course, medical devices became more complex with time. Surgical gloves were overshadowed by surgical robots; knee braces were trumped by prosthetic knees. Eventually it became too hard to determine whether any two devices were truly equivalent. This complexity led the George W. Bush administration to make a controversial decision in 2002: broaden the definition of “substantial equivalence” to include devices with substantially different mechanisms and designs if they had similar safety profiles. The goal was to encourage innovation, but over time, this change led to a greater proportion of unsafe and ineffective devices.
To make matters more complicated, a device approved via 510(k) could remain on the market even if its predicate device was later recalled for quality and safety issues. This has led to a “collapsing building” phenomenon, where devices that are currently in use within hospitals are based on failed predecessors. Of 5,362 device recalls during 2008–2017, 97 percent had received 510(k) clearance.
Under current law, medical algorithms are classified as medical devices and can be cleared through the 510(k) pathway. The primary difference is that medical algorithms are less transparent, far more complex, more likely to reflect preexisting human bias and more apt to evolve (and fail) over time, compared with medical devices of the past. Additionally, Congress excluded certain health-related software from the definition of a medical device in the 21st Century Cures Act of 2016. Therefore, some medical algorithms, such as CDS, can evade FDA oversight altogether. This is particularly concerning, given the ubiquity of these algorithms in health care: in 2017, 90 percent of hospitals in the U.S. (roughly 5,580 hospitals) had CDS in place.
Ultimately, regulation needs to evolve with innovation. Given the threats of unregulated medical algorithms for patients and communities, we believe the U.S. must improve regulations and oversight on these new-age devices. There are three specific action items Congress ought to pursue.
The first is to lower the threshold for FDA evaluation. For medical algorithms, the definition of equivalency under 510(k) should be narrowed to consider whether the data sets or machine learning tactics used by the new device and its predicate are similar. This would prevent a network of algorithms, such as kidney disease risk tools, from being approved simply because they all predict kidney disease. Furthermore, CDS systems that are ubiquitous amongst hospitals in the U.S. should not receive exemption from FDA review. Although CDS algorithms are not intended as the sole determinant of care plans, health care workers often rely on them heavily for clinical decision-making, meaning they often affect patient outcomes.
The second action is to dismantle systems that foster overreliance on medical algorithms by health care workers. For example, prescription drug monitoring program (PDMP) mandates, which require prescribers to consult substance abuse scoring algorithms prior to prescribing opioids, should include comprehensive exemptions in all states, such as for cancer patients, emergency department visits and hospice care. And in general, unless doctors’ decisions lead to patient harm, they should not face significant penalties for using their own clinical judgment instead of following recommendations from medical algorithms. An algorithm may label a patient as high-risk for drug abuse, but a doctor’s understanding of that patient’s history of trauma adds critical nuance to the interpretation.
The third action is to establish systems of algorithmic accountability for technologies that can evolve over time. There is already some momentum towards this goal in Congress. A few years ago, Democratic Representative Yvette Clarke of New York State introduced the Algorithmic Accountability Act of 2019. This bill would require device companies that create “high-risk automated decision systems” that involve personal information to conduct impact assessments reviewed by the Federal Trade Commission (FTC) as frequently as deemed necessary. For medical algorithms that function in health care settings, the FTC could require more frequent assessments to monitor for any changes over time. This bill and several similar ones that were introduced have yet to reach the president’s desk. Hopefully, there will be added momentum in the months ahead.
We know algorithms in health care can often be biased or ineffective. But it is time for America to pay more attention to the regulatory system that lets these algorithms enter the public domain to begin with. For in health care, if your decisions affect patient lives, “do no harm” must apply—even to computer algorithms.