Employing advanced analytics and artificial intelligence in rare-disease pharmacovigilance

Estimated reading time: 7 minutes

Lucy Fulford-Smith, TMC

The limitations of traditional pharmacovigilance (PV) are becoming more apparent in rare diseases. Established safety frameworks and signal-detection methodologies that work in large populations become difficult to implement when a global patient population may number only in the dozens. Every single case report carries disproportionate weight, and PV teams must make safety decisions with limited context and an evolving understanding of the disease's natural history.

As a result of these challenges, sponsors and marketing authorisation holders (MAHs) are searching for novel solutions that artificial intelligence (AI) can provide, such as advanced analytics and machine-learning tools, to fill the gaps. These technologies allow sponsors and MAHs to find patterns in sparse, fragmented datasets and extract meaningful insights where conventional systems fall short.

However, the application of AI in rare disease PV must be handled with care. There are several technical and ethical pitfalls to watch out for when embedding these tools into existing safety operations.

So, where can AI add real value in rare disease PV? And why is human oversight still essential?

Conventional disproportionality statistics and fixed thresholds assume scale. In rare settings, those thresholds are difficult to apply because the denominator is so small that a single additional case can swing the statistic.
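To make the scale problem concrete, here is a minimal sketch of a proportional reporting ratio (PRR) computed on a standard 2x2 contingency table, using hypothetical counts. It is an illustration of the instability, not a production signal-detection method.

```python
# Minimal sketch: proportional reporting ratio (PRR) on a 2x2 table.
# Counts are hypothetical, chosen to show small-denominator instability.
import math

def prr_with_ci(a, b, c, d):
    """a: drug + event, b: drug + other events,
    c: other drugs + event, d: other drugs + other events."""
    prr = (a / (a + b)) / (c / (c + d))
    # Approximate 95% confidence interval on the log scale
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(prr) - 1.96 * se)
    upper = math.exp(math.log(prr) + 1.96 * se)
    return prr, lower, upper

# A single extra case shifts the point estimate by roughly half again,
# and the interval spans an order of magnitude: thresholds tuned for
# large populations say little here.
print(prr_with_ci(2, 10, 30, 5000))  # ~ (27.9, 7.5, 104)
print(prr_with_ci(3, 9, 30, 5000))   # ~ (41.9, 14.8, 119)
```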

One way in which AI excels is in pattern recognition across disparate inputs, identifying subtle correlations that wouldn’t trigger standard rules-based systems. In practice, this often looks like models that flag unusual temporal patterns, unexpected symptom clusters across registries and clinic notes, or early shifts in severity distribution that warrant human review.
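As a toy illustration of that kind of temporal pattern flagging, the sketch below marks weeks whose case volume for an event term is a statistical outlier against its own recent history. The counts, window and threshold are assumptions for the example, and anything flagged goes to clinician review, not automated action.

```python
# Illustrative sketch, not a validated detector: flag weeks whose case
# count deviates sharply from the prior window for one event term.
from statistics import mean, stdev

def flag_unusual_weeks(weekly_counts, window=8, z_threshold=2.5):
    """Return indices of weeks that are z-score outliers vs the prior window."""
    flags = []
    for i in range(window, len(weekly_counts)):
        history = weekly_counts[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and (weekly_counts[i] - mu) / sigma > z_threshold:
            flags.append(i)  # queue for human review
    return flags

# Hypothetical weekly report counts for one event term
counts = [1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 6, 2]
print(flag_unusual_weeks(counts))  # -> [10]: the week with six cases
```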

AI systems can also continuously ingest and normalise feeds from electronic health records, safety databases, registries and even patient-reported outcomes to provide real-time situational awareness—enabling earlier detection of an emerging safety profile than periodic manual review alone. For products with limited post-marketing exposure, this is invaluable; it can shorten the time from an unusual pattern emerging to a clinician-led assessment and, if needed, regulatory escalation.
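For illustration, a minimal version of that normalisation step maps each feed onto a shared case schema through small per-source adapters. The field names below are assumptions for the sketch, not an exchange standard such as ICH E2B.

```python
# Sketch: adapters that normalise heterogeneous feeds into one case shape.
# Source field names are illustrative assumptions.
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class SafetyCase:
    source: str
    patient_ref: str
    event_term: str
    onset: Optional[date]

def _parse_date(value: Optional[str]) -> Optional[date]:
    return date.fromisoformat(value) if value else None

def from_registry(rec: dict) -> SafetyCase:
    return SafetyCase("registry", rec["subject_id"], rec["ae_term"],
                      _parse_date(rec.get("onset_date")))

def from_ehr(rec: dict) -> SafetyCase:
    return SafetyCase("ehr", rec["mrn"], rec["diagnosis_text"],
                      _parse_date(rec.get("noted_on")))

# Each new feed gets its own small adapter; everything downstream,
# including any model, consumes only SafetyCase records.
```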

Generative AI-based large language models (LLMs) can extract timing, suspected causality statements, comorbidities and treatment changes from free-text case narratives, making unstructured reports usable. As a result, those data become searchable and comparable without manual re-keying, considerably speeding up triage and aggregation.
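A hedged sketch of what that extraction step can look like is below. The `call_llm` function is a placeholder for whatever contracted model endpoint a team uses, and the prompt and output schema are illustrative rather than a standard.

```python
# Sketch of LLM-based narrative extraction. `call_llm` is a placeholder;
# the prompt and field names are illustrative assumptions.
import json

EXTRACTION_PROMPT = """Extract the following from the case narrative as JSON:
- onset_timing: time from first dose to event onset, verbatim if stated
- causality_statement: any reporter statement on suspected causality
- comorbidities: list of co-existing conditions mentioned
- treatment_changes: dose changes, interruptions or discontinuations
Use null for anything not stated. Narrative:
"""

def extract_case_fields(narrative: str, call_llm) -> dict:
    raw = call_llm(EXTRACTION_PROMPT + narrative)
    fields = json.loads(raw)
    # Extracted fields feed a reviewer queue; a human verifies them
    # against the source narrative before they enter the safety database.
    return fields
```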

Finally, where condition-specific data are too scant to train from scratch, sponsors and MAHs can train models on broader datasets (such as related disease areas or general clinical texts) and then adapt them to the specific rare condition using carefully curated real-world data.

This approach reduces the small-sample risk but must be done responsibly, with stringent clinical oversight, to avoid inappropriate generalisations.
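The sketch below shows the general shape of that adaptation step in PyTorch: an encoder pretrained on broader data is frozen, and only a small task head is fine-tuned on curated rare-disease examples. The layer sizes, labels and data are stand-ins, not a validated PV pipeline.

```python
# Transfer-learning sketch: freeze a pretrained encoder, fine-tune a head.
# All shapes and data here are illustrative assumptions.
import torch
import torch.nn as nn

# Stand-in for an encoder pretrained on related disease areas or
# general clinical text (assumed to already exist).
encoder = nn.Sequential(nn.Linear(768, 256), nn.ReLU())
classifier_head = nn.Linear(256, 2)  # e.g. case-of-interest vs. not

# Freezing the encoder limits overfitting to a handful of rare-disease cases.
for param in encoder.parameters():
    param.requires_grad = False

optimizer = torch.optim.Adam(classifier_head.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

def adaptation_step(features: torch.Tensor, labels: torch.Tensor) -> float:
    """One fine-tuning step on curated rare-disease examples."""
    logits = classifier_head(encoder(features))
    loss = loss_fn(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical batch: four cases with 768-dimensional text embeddings
print(adaptation_step(torch.randn(4, 768), torch.tensor([0, 1, 0, 1])))
```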

For all its benefits, it would be reckless to leave PV unsupervised in the hands of AI. Specialist input remains essential.

The statistical frameworks used in common PV workflows don’t translate cleanly to ultra-rare settings, so expert clinical adjudication of every case is often required. That clinical input must be embedded into any AI or AI-augmented workflow, not tacked on afterwards; models should be designed to prioritise and summarise evidence for clinical reviewers rather than make final safety determinations.

Then, of course, there is the issue of data protection.

Off-the-shelf AI tools often have complex data-use policies. Some commercial LLMs retain inputs to improve their models unless explicitly contracted not to; others offer paid tiers with different retention terms. PV teams must, therefore, evaluate whether a chosen tool’s data handling practices meet their obligations and whether that vendor risk is acceptable for sensitive patient safety information.

Regulators are still developing their positions on the use of AI in PV activities, often sharing knowledge and experiences across regions. While the agencies establish those positions, sponsors and MAHs can mirror that collaborative approach, working with regulators and technologists to design AI solutions that both add clinical value and stand up to regulatory scrutiny.

Existing regulatory frameworks and guidance stress that acceptability depends on a tool’s intended use, validation evidence and the presence of a ‘human-in-the-loop’. Traceability, documented validation and robust data-protection arrangements are essential when AI outputs will inform regulatory decisions.

To give AI the best chance of being useful and defensible to regulators, I’d recommend following these practical deployment principles:

- Treat AI tools as decision-supporters, not decision-makers.
- Define narrow, accountable use cases.
- Curate rational, patient-centred data flows.
- Embed clinicians and patients throughout model design.
- Invest in explainability and human oversight.
- Document everything clearly for regulators.

When sponsors and MAHs apply these principles, AI can materially improve rare-disease pharmacovigilance.

Done well, AI integration does not replace human judgement in PV—it makes that judgement faster, better informed and more focused on the safety questions that matter most to patients.
