Employing advanced analytics and artificial intelligence in rare-disease pharmacovigilance
Estimated reading time: 7 minutes


The limitations of traditional pharmacovigilance (PV) are becoming more apparent in rare diseases. Established safety frameworks and signal-detection methodologies that work in large populations become difficult to implement when a global patient population may number only in the dozens. Every single case report carries disproportionate weight, and PV teams must make safety decisions with limited context and a shifting natural history.
As a result of these challenges, sponsors and marketing authorisation holders (MAHs) are searching for novel solutions, turning to artificial intelligence (AI), advanced analytics and machine learning tools to fill the gaps. These technologies allow sponsors and MAHs to find patterns in sparse, fragmented datasets and extract meaningful insights where conventional systems fall short.
However, the application of AI in rare disease PV must be handled with care. There are several technical and ethical pitfalls to watch out for when embedding these tools into existing safety operations.
AI tools are not a magic bullet. But when applied thoughtfully with clear purpose, rigorous governance and expert human oversight, they offer pragmatic ways to sharpen detection, speed triage and support better regulatory and clinical decisions—even in ultra-rare indications.
So, where can AI add real value in rare disease PV? And why is human oversight still essential?
Where AI can help in rare-disease PV
Conventional disproportionality statistics and fixed thresholds assume scale. But in rare settings, those thresholds are difficult to apply because the denominator is so tiny.
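To make the scale problem concrete, here is a minimal sketch of the proportional reporting ratio (PRR), one of the standard disproportionality statistics, using invented counts. At rare-disease denominators, a single additional report can double the result:

```python
def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional reporting ratio: [a / (a + b)] / [c / (c + d)].

    a: reports of the event of interest for the drug of interest
    b: reports of all other events for the drug
    c: reports of the event for all other drugs
    d: reports of all other events for all other drugs
    """
    return (a / (a + b)) / (c / (c + d))

# With only a handful of cases, one extra report doubles the statistic:
print(prr(a=1, b=499, c=200, d=99_800))  # 1.0
print(prr(a=2, b=498, c=200, d=99_800))  # 2.0
```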
One way in which AI excels is in pattern recognition across disparate inputs, identifying subtle correlations that wouldn’t trigger standard rules-based systems. In practice, this often looks like models that flag unusual temporal patterns, unexpected symptom clusters across registries and clinic notes, or early shifts in severity distribution that warrant human review.
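As a rough illustration of that triage step, the sketch below runs an off-the-shelf anomaly detector over simple per-case features. The features and data are invented; a real system would use richer, clinically validated inputs.

```python
# Illustrative only: flagging atypical cases for human review.
import numpy as np
from sklearn.ensemble import IsolationForest

# Per-case features: [days from dose to onset, severity grade,
# number of co-reported symptoms]. All values are invented.
cases = np.array([
    [14, 1, 2], [12, 1, 1], [15, 2, 2], [13, 1, 2],
    [16, 1, 1], [14, 2, 3], [11, 1, 2],
    [2, 4, 6],   # early onset, severe, unusually clustered symptoms
])

detector = IsolationForest(random_state=0).fit(cases)
flags = detector.predict(cases)  # -1 = atypical, 1 = typical

for case, flag in zip(cases, flags):
    if flag == -1:
        print("Route to clinical review:", case)
```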
AI systems can also continuously ingest and normalise feeds from electronic health records, safety databases, registries and even patient-reported outcomes to provide real-time situational awareness—enabling earlier detection of an emerging safety profile than periodic manual review alone. For products with limited post-marketing exposure, this is invaluable; it can shorten the time from an unusual pattern emerging to a clinician-led assessment and, if needed, regulatory escalation.
It’s also important to note that a great deal of the clinically relevant detail in rare disease lives in free text, such as specialist letters, case narratives and family-reported histories.
Generative-AI-based large language models (LLMs) can extract timing, suspected causality statements, comorbidities and treatment changes from this free text to make unstructured narratives usable. As a result, those data become searchable and comparable without manual re-keying, considerably speeding up triage and aggregation.
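As a sketch of what that extraction might look like, assuming the OpenAI Python SDK and a placeholder model name: the field list and prompt below are illustrative, not a validated PV workflow, and any output would still need human verification.

```python
# Illustrative sketch; prompt, field list and model name are assumptions.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

narrative = (
    "Specialist letter: patient commenced therapy in March; two weeks "
    "later the family reported worsening fatigue and a new rash. The "
    "treating physician considered the rash possibly drug-related and "
    "reduced the dose."
)

prompt = (
    "Extract the following from the case narrative and reply with JSON "
    "only, using null where information is absent: onset_timing, "
    "suspected_causality_statement, comorbidities, treatment_changes.\n\n"
    + narrative
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model choice
    messages=[{"role": "user", "content": prompt}],
    response_format={"type": "json_object"},
)

fields = json.loads(response.choices[0].message.content)
print(fields)  # structured and searchable, but still needs human review
```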
Finally, where condition-specific data are too scant to train from scratch, sponsors and MAHs can train models on broader datasets (such as related disease areas or general clinical texts) and then adapt them to the specific rare condition using carefully curated real-world data.
This approach reduces the small-sample risk but must be done responsibly, with stringent clinical oversight, to avoid inappropriate generalisations.
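One plausible shape for that adapt-then-specialise step, sketched with Hugging Face Transformers: start from a general pretrained encoder, freeze it, and train only a small classification head on the curated rare-disease examples. The base model, labels, data and hyperparameters below are all placeholders.

```python
# Sketch of transfer learning; everything here is a placeholder.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

base = "bert-base-uncased"  # in practice, a clinical/biomedical checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

# Freeze the pretrained encoder so the tiny rare-disease dataset only
# trains the new classification head, limiting small-sample overfitting.
for param in model.base_model.parameters():
    param.requires_grad = False

class TinyCaseSet(torch.utils.data.Dataset):
    """A deliberately tiny, invented dataset standing in for curated RWD."""
    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

train_set = TinyCaseSet(
    ["new rash two weeks after dose increase", "routine repeat prescription"],
    [1, 0],  # 1 = needs clinical review, 0 = routine
)

args = TrainingArguments(output_dir="out", num_train_epochs=1,
                         per_device_train_batch_size=2)
Trainer(model=model, args=args, train_dataset=train_set).train()
```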

Why human oversight must remain central
For all its benefits, it would be reckless to leave PV in the hands of AI without human supervision.
There is one key practical reason why clinicians and safety teams must stay in the loop: models trained on small datasets are at clear risk of overfitting or producing misleading signals. Standard AI thrives on scale, so a model that appears to ‘work’ on a tiny dataset may simply be memorising idiosyncratic noise. Models can also inherit biases from training data or create misleading outputs (‘hallucinations’) if prompted incorrectly. Generative AI systems, in particular, may produce plausible-sounding but incorrect inferences.
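That small-sample failure mode is easy to demonstrate. In the sketch below, a classifier is fitted to pure noise for twenty ‘cases’: it looks near-perfect on its own training data, while leave-one-out cross-validation exposes chance-level performance.

```python
# Demonstration: apparent "signal" from pure noise in a tiny dataset.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(42)
X = rng.normal(size=(20, 50))      # 20 "cases", 50 random features
y = rng.integers(0, 2, size=20)    # labels with no relationship to X

model = RandomForestClassifier(random_state=0).fit(X, y)
print("Training accuracy:", model.score(X, y))    # ~1.0: memorised noise

loo = cross_val_score(RandomForestClassifier(random_state=0),
                      X, y, cv=LeaveOneOut())
print("Leave-one-out accuracy:", loo.mean())      # ~0.5: chance level
```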
For these reasons, specialist input is essential.
The statistical frameworks used in common PV workflows don’t translate cleanly to ultra-rare settings, so expert clinical adjudication of every case is often required. That clinical input must be embedded into any AI or AI-augmented workflow, not tacked on afterwards; models should be designed to prioritise and summarise evidence for clinical reviewers rather than make final safety determinations.
Investing in explainable models and forcing transparency around model provenance and decision pathways also help teams understand why a flag was raised, not just that it was.
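For tree-based flagging models, one widely used route is SHAP-style feature attribution, which shows which inputs pushed a given case towards a flag. The sketch below uses synthetic data purely to illustrate the mechanics.

```python
# Minimal sketch: per-case explanations for a tree-based flagging model.
# Data and features are synthetic; shown only to illustrate the idea.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))      # invented features, e.g. onset lag
y = (X[:, 0] > 0.5).astype(int)    # synthetic "flag" rule to learn

model = RandomForestClassifier(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
contributions = explainer.shap_values(X[:1])  # attributions for one case
print(contributions)  # which features pushed this case towards a flag
```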
Then, of course, there is the issue of data protection.
Off-the-shelf AI tools often have complex data-use policies. Some commercial LLMs retain inputs to improve their models unless explicitly contracted not to; others offer paid tiers with different retention terms. PV teams must, therefore, evaluate whether a chosen tool’s data handling practices meet their obligations and whether that vendor risk is acceptable for sensitive patient safety information.
How to align AI deployment with evolving regulatory expectations
Regulators are still developing their positions on the use of AI in PV activities, often sharing knowledge and experiences across regions. While the agencies establish their positions, sponsors and MAHs can mirror that collaborative approach, working with regulators and technologists to design AI solutions that both add clinical value and stand up to regulatory scrutiny.
Existing regulatory frameworks and guidance stress that acceptability depends on a tool’s intended use, validation evidence and the presence of a ‘human-in-the-loop’. Traceability, documented validation and robust data-protection arrangements are essential when AI outputs will inform regulatory decisions.
To give AI the best chance of being useful and defensible to regulators, I’d recommend following these practical deployment principles:
- Be specific about the context of use. Define what the model is meant to do—triage, summarise literature, compare external data sources or flag emerging patterns. Validation, governance and acceptance criteria all flow from that decision.
- Embed domain experts from day one. The most valuable models are built where domain knowledge guides feature selection, labelling and interpretation. So, don’t leave clinicians, patient representatives and statisticians as downstream reviewers—embed them in the model design.
- Curate and connect real-world data (RWD). Registries, pharmacy touchpoints, physiotherapy records or school health contacts can provide workable, high-value data points that patients themselves may be unable or unwilling to report frequently. Designing data capture that matches patient capability and burden is a sensible way to increase signal quality.
- Keep workflows human-centric. Automation should reduce repetitive work, such as pulling together comparative RWD tables, while leaving interpretation and regulatory judgement to trained humans. In many cases, a model will convert a time-consuming task into a concise briefing for a clinician to review.
- Be pragmatic with thresholds and triggers. Conventional statistical thresholds often fail in rare disease settings. Design bespoke decision rules, and document the rationale, so that regulators can see why a particular alarm level or follow-up cadence was chosen (a minimal sketch of such a rule follows this list). If using AI in this context, evidence to support its implementation must be available for regulatory review.
- Validate conservatively and transparently. Use conservative holdouts, simulated scenarios and clinician adjudication to ensure models don’t overstate performance. Record limitations clearly; regulators expect an honest account of what a model can and cannot do.
- Train and upskill safety teams. AI changes the skillset required in PV. Teams should be trained to interpret outputs, interrogate model provenance and understand failure modes—reducing the risk of misinterpretation and ensuring the human-in-the-loop adds value.
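To illustrate the bespoke-threshold point above, here is a minimal sketch of a pre-specified, documented decision rule of the kind a team might adopt for an ultra-rare product. The threshold, window and rationale are invented and would need clinical and regulatory justification.

```python
# Illustrative pre-specified rule for an ultra-rare product; the
# threshold, window and rationale below are invented placeholders.
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class CaseReport:
    event_term: str   # e.g. a MedDRA preferred term
    onset: date

RULE_DOC = ("Escalate to clinical review if >= 2 reports of the same "
            "event occur within 90 days, given < 100 exposed patients. "
            "Rationale: conventional disproportionality statistics are "
            "uninformative at this denominator.")

def escalate(cases: list[CaseReport], term: str,
             window: timedelta = timedelta(days=90)) -> bool:
    hits = sorted(c.onset for c in cases if c.event_term == term)
    return any(b - a <= window for a, b in zip(hits, hits[1:]))

reports = [CaseReport("hepatic enzyme increased", date(2024, 3, 1)),
           CaseReport("hepatic enzyme increased", date(2024, 4, 15))]
print(escalate(reports, "hepatic enzyme increased"))  # True -> review
```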
When sponsors and MAHs treat these tools as decision-supporters, not decision-makers, AI can materially improve rare-disease pharmacovigilance.
In practice, that means defining narrow, accountable use cases; curating rational, patient-centred data flows; embedding clinicians and patients throughout model design; investing in explainability and human oversight; and documenting everything clearly for regulators.
Done well, AI integration does not replace human judgement in PV—it makes that judgement faster, better informed and more focused on the safety questions that matter most to patients.