Data Analytics in Pharma: How Pfizer, Moderna, and Others Innovate Drug Development

Advanced analytics holds considerable promise for pharma companies. According to McKinsey's analysis, wider adoption of data-driven technologies could increase business operating performance by 15 to 30 percent of EBITDA (earnings before interest, taxes, depreciation, and amortization) in just five years. Over a decade, the improvement is expected to range from 45 to 75 percent.

Improvements that data analytics promises to bring to pharma. Source: McKinsey & Company

Many pharma businesses — especially big ones — have already launched ambitious AI projects covering different phases of drug production. In this article, we’ll review the most popular use cases of machine learning and AI in pharma and back them with real-life examples from industry leaders.

How and where analytics is used in pharma

As the McKinsey analysis shows, analytics could significantly enhance each stage of pharma production, from research and early development to marketing. Here, we’ll highlight only the most common and most successful ML applications that are already changing medicine and pharma.

Early research and drug discovery. Drug discovery is the first phase in the drug development process when researchers identify new substances that can potentially cure a targeted disease. It involves filtering hundreds of thousands or even millions of active molecules and numerous tests.

How drug discovery unfolds.

This phase takes five to six years, and only ten of ten thousand drug candidates finally make it to clinical trials. No wonder pharma companies seek ways to make the journey shorter — and pin their hopes on data analytics.

Read our article on AI in drug discovery and repurposing to learn more.

Clinical trials. This phase involves numerous clinical trial systems and largely relies on clinical data management practices to organize information generated during medical research. Overall, it lasts from five to seven years, with no more than 10 percent of molecules chosen during drug discovery getting FDA approval.
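The attrition figures quoted above can be combined into a quick back-of-the-envelope calculation. The sketch below is illustrative only: it uses just the numbers given in this article (ten of ten thousand candidates reaching trials, at most 10 percent of those approved, five to six years of discovery plus five to seven years of trials), and the function name pipeline_funnel is our own.

```python
from fractions import Fraction


def pipeline_funnel(screened=10_000, reach_trials=10,
                    max_approval_rate=Fraction(1, 10)):
    """Combine the article's attrition figures into upper-bound estimates."""
    # Upper bound on approved drugs among the candidates that reach trials.
    max_approved = reach_trials * max_approval_rate
    # Upper bound on the share of all screened candidates that get approved.
    max_overall_rate = max_approved / screened
    # Discovery (5-6 years) plus clinical trials (5-7 years).
    total_years = (5 + 5, 6 + 7)
    return max_approved, max_overall_rate, total_years


max_approved, max_overall_rate, total_years = pipeline_funnel()
print(f"Approved drugs per 10,000 candidates: at most {max_approved}")
print(f"Overall success rate: at most {float(max_overall_rate):.2%}")
print(f"Discovery to approval: {total_years[0]} to {total_years[1]} years")
```

Under these figures, roughly one candidate in ten thousand reaches the market after a decade or more, which is why even modest gains in screening accuracy matter.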

How could data analytics boost this process? For example, researchers may employ ML to analyze demographics, medical histories, genetic makeup, and other data to find and choose trial participants. AI models might also predict side effects, drug interactions, and efficacy before proceeding to another trial stage.

Precision medicine. Also known as personalized medicine, precision medicine is a data-driven approach to disease prevention and treatment that takes into account a person’s medical history, genome, lifestyle, and other information. It supports the development of unique drugs that target specific patients. Obviously, precision medicine requires a large amount of data and is enabled by advanced ML models.

To dive deeper into the subject, read our article on precision medicine.

Regulatory compliance. The pharma industry deals with large volumes of diverse medical data, which may include protected health information (PHI). Because it contains details that can be used to identify a person, PHI is subject to strict HIPAA regulations. To stay HIPAA-compliant and avoid penalties, businesses need to identify sensitive information promptly. Data analytics tools may automate PHI detection and anonymization.
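To give a feel for what automated PHI detection can look like at its simplest, here is a minimal rule-based sketch. It is an assumption-laden illustration, not a compliant tool: the four patterns and the redact_phi function are hypothetical, cover only a few US-style identifier formats, and would miss names, addresses, and most other identifiers. Production systems typically combine rules like these with trained models and human review.

```python
import re

# Hypothetical patterns for illustration only; real PHI covers many more
# identifier types (names, addresses, record numbers, and so on).
PHI_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}


def redact_phi(text):
    """Return the text with matches replaced by tags, plus what was found."""
    found = []
    for label, pattern in PHI_PATTERNS.items():
        # Record each match before masking it so the findings can be audited.
        for match in pattern.finditer(text):
            found.append((label, match.group()))
        text = pattern.sub(f"[{label}]", text)
    return text, found


note = ("Patient seen on 03/14/2021, SSN 123-45-6789, "
        "call 555-123-4567 or jane.doe@example.com")
redacted, found = redact_phi(note)
print(redacted)
for label, value in found:
    print(label, "->", value)
```

Returning the list of matches alongside the redacted text is a deliberate choice here: it lets a reviewer check what was masked and spot patterns the rules fail to catch.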

Marketing and sales. Efficient analysis of data from multiple sources helps pharma businesses identify market trends and develop targeted marketing strategies. Machine learning algorithms can be used to predict future sales of particular drugs or spot growth opportunities.

Daily operations. ML tools are able to automate routine tasks, classify documents, and help with filling out medical forms throughout all the pharma processes. This way, pharma companies may not only save time but also dramatically reduce the number of human errors and enhance the accuracy of daily procedures.

Now that you know key areas where data analytics can deliver additional value, let's move to real-life success stories.

Identifying symptoms of rare diseases: Pfizer

In 2021, Pfizer, a global pharmaceutical company, revealed a novel prediction model that could identify the wild-type form of transthyretin amyloid cardiomyopathy (wtATTR-CM), a rare, life-threatening condition that causes heart failure. Around 100,000 people in the US suffer from ATTR-CM, and the vast majority of them remain undiagnosed. If not detected and treated in time, the disease progresses quickly, leading to a decline in quality of life and, in the case of elderly people, to death within 2.5 to 3.5 years.

The model by Pfizer showed an accuracy of 87 percent when predicting patients with ATTR-CM. The data for evaluation was extracted from electronic health records and medical claims.

The technology enables doctors to detect a condition at an early stage and initiate treatment with proper drugs. The results give hope to millions that more people will be correctly diagnosed and that more analytics tools will appear to spot other rare diseases over time.

Currently, US healthcare providers can take advantage of a web-based educational tool that employs data from the Pfizer model to estimate the probability of a patient with heart disease having wtATTR-CM.

Building a model of the immune system to boost drug development: Pfizer and CytoReason

Another AI initiative by Pfizer was launched in partnership with CytoReason, an Israel-based developer of tissue- and cell-specific models that mimic diseases. Pfizer has already used CytoReason’s biological models to better understand how the immune system works and to apply these insights in the development of new drugs for immune-mediated and immuno-oncology conditions.

Now, the two companies are building a simulated model of the entire immune system. As Pfizer explains, it can be compared to the weather models that draw on large amounts of data, such as air pressure, wind speed, and moisture, to help meteorologists predict the weather. Likewise, the immune system model takes information from various sources, including public research papers, Pfizer’s clinical studies, omics datasets, and more.

Due to the complexity of human biology, the immune system model is still at an early stage. However, in the future, it’s expected to help discover new medicines and identify who will benefit from them most. Potentially, the technology will reduce the cost and time to market of novel drugs.

Increasing diversity in clinical trials: Moderna

Data has always been a critical asset for a research-driven biopharmaceutical manufacturer like Moderna, especially when it comes to increasing the efficacy of clinical trials.

In 2016, Moderna chose Google Cloud’s Looker as its preferred data platform for enhancing organization-wide access to reliable and secure metrics. With the help of this platform, Moderna is able to conduct analysis of in-house data (clinical operations, gender, risk groups, etc.) as well as make use of external medical datasets. This allows the company to obtain a more complete view of the study and identify potential trends, increasing the diversity of its clinical trials.

Moderna’s data science team employs advanced analytics for other tasks as well. For example, they use Looker and other AI tools provided by Google to identify cost-saving opportunities in logistics (the company manages 60,000+ shipments per year) and run sentiment analysis to better address clients’ problems.

Discovering a drug for dry eye disease: Aramis Biosciences and Exscalate

According to McKinsey estimates, there are about 270 companies that employ AI for drug discovery purposes. Aramis Biosciences is one of them. A Harvard Medical School spin-off founded in 2018, the biopharmaceutical company focuses on the treatment of dry eye disease. While about 344 million people suffer from it worldwide, the syndrome remains under-treated.

As Aramis had an in vivo model of the target for identifying dry eye, it employed Exscalate, a supercomputing drug discovery platform, to detect lead candidate molecules for the treatment.

The Exscalate system comprises three major components:
  • a ligand library that holds more than 2 trillion organic molecules along with the data required for their synthesis, and uses AI to analyze the relationships between the molecules' structure and activity;
  • a database of therapeutic targets; and
  • a ligand generator to run simulations on supercomputer clusters.

Aramis employed the virtual ligand library and the ligand generator to create a library of peptides ready to be used in screening. Once the lead candidate was discovered, Aramis started in vivo testing. The discovery process was completed in 14 months, compared to the years it takes with more traditional methods. Currently, Aramis has completed enrollment for Phase II clinical trials of its agent for treating dry eye syndrome.

Revolutionizing cardiovascular trials: AstraZeneca

AstraZeneca, a global pharmaceutical manufacturer and biotechnology company, wants to revolutionize clinical trials with AI. The cardiovascular (CV) study Automating Identification Detection Adjudication (AIDA) is part of a bigger AI initiative. It contains three components: event sniffer, event harmonizer, and event classifier. An event here is a change from a participant’s baseline condition monitored during the trial to track the impact of a given medication or treatment.

An innovative CV clinical trial run by AstraZeneca. Source: AstraZeneca

The event sniffer explores multiple ways to detect events faster by using home monitoring, geofencing, and patient self-reporting.

The event harmonizer automates data collection and processing. It integrates both structured and unstructured data from various sources and applies deep learning algorithms to extract meaning from documents.

The third component, the event classifier, uses machine learning to identify and classify events. Currently, human experts from independent adjudication committees verify that patients did experience a specific event. It’s a manual, time-consuming process that can take more than a couple of months. AstraZeneca claims that the results of its event classifier are comparable with those of human experts and that it has the potential to reduce the time it takes to get new treatments to patients.

Tracking and forecasting pandemic hotspots: Johnson & Johnson

Johnson & Johnson, another pharmaceutical giant, implemented data analytics to help frontline healthcare workers during the COVID-19 pandemic. The company developed a model called COVID Lens that considers factors like coronavirus test sensitivity, the prevalence of the virus in the community, and the number of employees working on-site. This helps the company understand how much testing it must do in a specific area and, consequently, how many employees it needs to allocate there.