Natural language processing (NLP) can be a useful way to extract meaningful information from unstructured data, such as text and tables from electronic health records (EHRs), journals, and social media, but it isn’t ready for full-scale use, according to speakers at the FDA’s June workshop Use of Natural Language Processing to Extract Information from Clinical Text.
"The FDA’s goal is to personalize NLP capabilities to make our medical officers more effective when reviewing adverse events,” Mark Walderhaug, Ph.D., CBER (Center for Biologics Evaluation and Research), said. Workshop speakers suggested NLP may be used to support evidence generation and to improve the scientific validity of efficacy, safety, and post-marketing submissions. It also may find applications in IND (investigational new drug) safety reports, NDA (new drug application) and BLA (biologic license application) submissions, labels, adverse event reports, pharmacoepidemiological studies, and social media and internet queries.
NLP-derived clinical evidence hasn’t yet been included in regulatory submission documents, however. Doing so today would be disruptive to assumptions of data integrity and could jeopardize acceptances, speakers cautioned. To minimize that risk (if used with regulatory submissions), they recommend ensuring the NLP extraction process is transparent and leaves an audit trail.
NLP’S VALUE FOR PHARMA
For drug developers, mining free text with NLP offers three main benefits: filling in knowledge gaps left by coded data, extracting adverse event information, and improving sample sets by accessing much more data.
Many details of a patient’s reaction to a drug or events affecting outcome — like major bleeding or smoking — often aren’t coded. Yet, they are important in understanding adverse events. By mining EHRs’ text, NLP can fill many of those gaps.
Regarding heart disease, for example, Russ Altman, M.D., Ph.D., of Stanford University, cited a study in which only 46 percent of 101 charts contained structured wound information. NLP mined the unstructured notes, extracting terms like “venous stasis and RLE ulcers” that indicated wounds.
Nigam Shah, Ph.D., associate professor of medicine and of biomedical data science at Stanford University, found similar inadequacies in the coding for patients who underwent prostate surgery. “The coding for urinary incontinence was practically nonexistent, but some of the records mentioned urinary incontinence in the text. By mining the notes, there is about a 100-fold increase in the things you find, and you can get negative information — such as ‘no urinary incontinence’ — which you can’t get from the codes.”
Shah also found NLP helpful in searching for events to aid in phenotyping for Alzheimer’s, Parkinson’s, and multiple sclerosis.
HOW NLP IS BEING USED NOW
NLP has been used by companies such as Shire and Lilly for drug discovery and clinical trials, as well as by European regulators to compare codes in submission and quality assurance documents.
Lixia Yao, Ph.D., assistant professor at the Mayo Clinic, and colleagues mined social media and EHRs for off-label drug uses, identifying several opportunities for drug repurposing. The search also showed that, among social media, “YouTube and Twitter had larger followings, but WebMD and PatientsLikeMe had better-quality information,” Yao said. “In general, people post if they are disappointed or angry. Therefore, individual data sources have inherent biases and may only provide one piece of the puzzle.”
At the FDA, CDER is piloting a study with Veterans Affairs to determine how patient behaviors affect outcomes among smokers with lung cancer. CDER is especially interested in learning whether NLP can identify variables missing from Big Data or spot biases that may color the findings.
Another FDA project uses NLP to mine the FDA Adverse Event Reporting System (FAERS). This approach has identified causal relationships between products and adverse events and helps reviewers more accurately summarize the cases.
The FDA also is mining ARIA, the FDA’s active risk identification and analysis system, to identify signals of serious, unexpected risk related to certain medications. “The algorithms lack the judgment of human experts and need to improve,” cautions Robert Ball, M.D., deputy director, office of surveillance and epidemiology, CDER. “Nonetheless, we can use NLP for a deeper look at what’s happening in a case.”
HIGH ACCURACY RATES
Accuracy rates for NLP have ranged as high as the upper 90s, but a literature review indicates ranges of about 85 to 95 percent are more typical. That’s accurate enough to use NLP to mine data for probable causes of disease, refine case definitions, find adverse events, support decisions, and identify changes in patients’ conditions.
“Linking data sources improves overall performance,” Altman said. Speakers also advised using multiple versions of search terms — weight, weighing, wt., etc. — and allowing for typographical errors. Additional steps include letting the system infer information — that simvastatin is a statin or that certain weights indicate obesity, for example — differentiating between positive and negative mentions (like “I have cancer” versus “I don’t have cancer”), and avoiding confusion between family histories and present conditions.
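The steps above can be sketched in code. The following is a minimal Python illustration; the regex patterns and negation cues are hypothetical, and real systems typically draw term variants from terminologies such as UMLS and use more robust negation logic than this crude sentence-window check:

```python
import re

# Hypothetical variant list for the concept "weight" (weight, weighing, wt., etc.).
WEIGHT = re.compile(r"\b(weight|weighing|weighs?|wt)\b", re.IGNORECASE)

# Crude negation cues checked in the text just before a mention.
NEGATION = re.compile(r"\b(no|denies|without|negative for)\b", re.IGNORECASE)

def find_mentions(text, concept, window=40):
    """Return (matched_term, is_negated) for each concept mention,
    looking for negation cues earlier in the same sentence."""
    results = []
    for match in concept.finditer(text):
        preceding = text[max(0, match.start() - window):match.start()]
        # Restrict the negation check to the current sentence.
        preceding = re.split(r"[.!?]", preceding)[-1]
        results.append((match.group(0), bool(NEGATION.search(preceding))))
    return results

note = "Patient denies weight gain. Current wt. 82 kg."
print(find_mentions(note, WEIGHT))  # [('weight', True), ('wt', False)]
```

The same pattern-plus-context idea extends to the other advice above: a dictionary mapping drug names to classes handles inference like “simvastatin is a statin,” and separate cue lists can flag family-history phrasing.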
ADAPTING NLP FOR MULTISITE USE
Today NLP is most effective within single sites. Multisite use for product and safety surveillance, where many outcomes are captured only in unstructured narratives, is feasible but not easy.
One of the greatest challenges in adapting NLP to multisite applications is assembling a complete and representative clinical corpus, because different sites, even within the same healthcare system, have their own cultures and customs that affect information. “Applying NLP in a multisite setting requires forethought and attention to detail because of idiosyncrasies in language usage, document structure, and content,” emphasized David Carrell, Ph.D., Kaiser Permanente Washington Health Research Institute.
Additional challenges include incomplete clinical text, differences in interpretation, and the differing needs of researchers and clinicians. For example, inconsistencies such as polyps recorded before a colonoscopy can be resolved easily for one patient. Resolving similar issues for hundreds or thousands of patients isn’t so simple.
Duplicate records also must be reconciled to ensure the most current EHRs are mined. In a study of 2 million EHRs, many had more than 30 versions and one had 53 different versions.
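Keeping only the most current version of each record before mining can be sketched simply. This is an illustration over hypothetical (record_id, version, text) tuples, not the actual reconciliation logic of any site in the study:

```python
def latest_versions(records):
    """Keep only the highest-numbered version of each record."""
    latest = {}
    for record_id, version, text in records:
        if record_id not in latest or version > latest[record_id][0]:
            latest[record_id] = (version, text)
    return {rid: text for rid, (version, text) in latest.items()}

records = [
    ("pt-001", 1, "initial note"),
    ("pt-001", 53, "most recent note"),
    ("pt-002", 2, "follow-up note"),
]
print(latest_versions(records))
# {'pt-001': 'most recent note', 'pt-002': 'follow-up note'}
```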
WHAT’S NEXT FOR NLP?
Refinements in algorithms and search techniques are ongoing. IBM is developing a network of connected knowledge built from de-identified patient records from the Cleveland Clinic. That network was developed by parsing and linking terms from medical concepts, medical dictionaries, and EHRs, as well as identifying the clinicians who authorized treatments and authored the notes, according to Murthy Devarakonda, Ph.D., principal investigator of the Watson Medical Analytics project at IBM Research. The goal is to gain insights into what individual physicians were thinking when ordering tests and making diagnoses.
“At the Mayo Clinic, we’re looking at language patterns using regular expressions and extracting sentences using a decision tree,” Yao said. The goal is to capture the context of patient- and clinician-generated data.
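As a toy illustration of that combination — regular expressions as features feeding a decision tree — consider the following sketch. The patterns and the hand-built tree are hypothetical; in practice the tree would be learned from labeled sentences rather than hard-coded:

```python
import re

# Hypothetical feature patterns.
DRUG = re.compile(r"\b(simvastatin|metformin|aspirin)\b", re.IGNORECASE)
SYMPTOM = re.compile(r"\b(rash|nausea|dizziness|bleeding)\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(no|denies|without)\b", re.IGNORECASE)

def classify(sentence):
    """Hand-built decision tree over regex features."""
    if DRUG.search(sentence):
        if SYMPTOM.search(sentence):
            if NEGATION.search(sentence):
                return "negated adverse event"
            return "possible adverse event"
        return "drug mention only"
    return "irrelevant"

print(classify("Started simvastatin; now reports rash."))  # possible adverse event
print(classify("Denies rash while on simvastatin."))       # negated adverse event
```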
Several of the presenters expressed interest in deep learning, a form of machine learning. Mimicking the way humans learn, deep-learning algorithms read the same information multiple times, increasing their accuracy with each reading. Devarakonda and his team also are investigating a deep-learning approach that goes beyond extracting keywords to understanding the meaning of passages and sentences, and thereby linking problems to solutions.
KEEP YOUR EYE ON NLP
“Text is only one of many sources of information,” Shah stressed. “Rather than focusing on one or the other, look at the synergy between text and coded data. You don’t have to get your information using just one technology.”
Although NLP isn’t expected to be ready for routine use anytime soon, the FDA is seriously evaluating the technology to determine how it can be used by drug developers and the agency to support and evaluate regulatory submissions.