OntoPharma brings Semantic Web technology to the world of regulatory affairs. IDMP (Identification of Medicinal Products) readiness begins with the realisation that IDMP is not about regulatory affairs at all: it is about opening up the cosmos of medicinal product data to all fields of science. That is why Semantic Web technology is the cornerstone of any healthy IDMP strategy.

VocabularyConnect

Reference data are crucial for semantic interoperability. Our solution for managing these is OntoPharma’s VocabularyConnect reference data management platform.

VocabularyConnect is based on state-of-the-art reference data management tooling. It supports the Data Steward in all management tasks. Vocabularies designated by the EMA, such as the ATC vocabulary for substances, are kept up to date automatically through the RMS API. Updates arrive in autogenerated Working Copies, which the Data Steward can review and release to production at a suitable moment.
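The Working Copy idea can be illustrated with a minimal Python sketch. It is not VocabularyConnect's implementation: the `Vocabulary` class, its methods and the sample entries are hypothetical, and the call to the RMS API is replaced by a plain dictionary. The sketch only shows the principle that an incoming update is staged separately and reaches production when the Data Steward releases it.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Vocabulary:
    """Hypothetical model of a vocabulary with a staged working copy."""
    name: str
    released: dict = field(default_factory=dict)  # code -> label, in production
    working_copy: Optional[dict] = None           # staged update, not yet live
    version: int = 0

    def apply_update(self, incoming: dict) -> None:
        """Stage an upstream update; the released content is left untouched."""
        self.working_copy = dict(incoming)

    def pending_changes(self) -> dict:
        """Return code -> (released label, staged label) for every difference."""
        if self.working_copy is None:
            return {}
        codes = self.released.keys() | self.working_copy.keys()
        return {
            code: (self.released.get(code), self.working_copy.get(code))
            for code in codes
            if self.released.get(code) != self.working_copy.get(code)
        }

    def release(self) -> int:
        """Promote the working copy to production and bump the version."""
        if self.working_copy is None:
            raise ValueError("no working copy to release")
        self.released = self.working_copy
        self.working_copy = None
        self.version += 1
        return self.version


# Illustrative entries only; a real update would come from the upstream service.
atc = Vocabulary("ATC", released={"N02BE01": "paracetamol"})
atc.apply_update({"N02BE01": "paracetamol", "M01AE01": "ibuprofen"})
print(atc.pending_changes())  # what the Data Steward would review
atc.release()
```

Keeping the staged copy apart from the released content means that consuming systems never see a half-reviewed update.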

Other vocabularies in the same terminology group, such as DrugBank codes (which may be needed by systems inside or outside your enterprise), also need to be versioned and kept up to date. The same is true of the crosswalks that translate between items in different vocabularies inside the same terminology group. Managing these vocabularies in a disciplined way is the very essence of IDMP readiness and the starting point for Web-scale semantic interoperability.

CortexExtract

CortexExtract takes Summaries of Product Characteristics (SmPCs) and other documents as input and creates IDMP-compliant datasets as output. This is an important capability, since about 80% of all the information that needs to be submitted as part of the IDMP-compliant dataset is found in documents. In the longer run, this information will be “born” in the form of data rather than text, but for the hundreds of thousands of existing registered products, the data has to be produced either by manual data entry or by semi-automated extraction.

CortexExtract makes semi-automated extraction possible. It uses the vocabularies in VocabularyConnect, enriched with term labels in different languages. Of the approximately 75 IDMP Phase 1 attributes, the values of some 55 can be extracted from the SmPC alone, with very high accuracy in 28 languages and with due attention to the cases that need machine-supported expert decisions. CortexExtract also keeps track of the versions of the input documents and of the datasets generated from them.
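As a rough illustration of how vocabulary labels in several languages can drive extraction, here is a minimal Python sketch. It is not how CortexExtract works internally, which is presumably far more elaborate: the `LABELS` table, the function names and the review rule are all assumptions made for this example, and the entries are illustrative. The sketch looks up labels for a given language in a text and flags the result for expert review unless exactly one code is found.

```python
from __future__ import annotations

import re

# Hypothetical label table: vocabulary code -> language -> term labels.
LABELS = {
    "N02BE01": {
        "en": ["paracetamol", "acetaminophen"],
        "nl": ["paracetamol"],
        "de": ["Paracetamol"],
    },
    "M01AE01": {
        "en": ["ibuprofen"],
        "nl": ["ibuprofen"],
        "de": ["Ibuprofen"],
    },
}


def find_codes(text: str, lang: str) -> list:
    """Return every code that has a label for `lang` occurring in `text`."""
    found = []
    for code, labels_by_lang in LABELS.items():
        for label in labels_by_lang.get(lang, []):
            # Whole-word, case-insensitive match on the label.
            pattern = r"\b" + re.escape(label) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.append(code)
                break  # one matching label per code is enough
    return found


def extract_substance(text: str, lang: str) -> dict:
    """Propose codes and flag anything but a single match for expert review."""
    codes = find_codes(text, lang)
    return {"codes": codes, "needs_review": len(codes) != 1}


print(extract_substance("Elk tablet bevat 500 mg paracetamol.", "nl"))
print(extract_substance("Contains paracetamol and ibuprofen.", "en"))
```

The `needs_review` flag stands in for the machine-supported expert decision: the software proposes, and a person decides whenever the proposal is missing or ambiguous.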

Check out the demo

Attributes supported and accuracy achieved with our extractor. Product still in development.
Attribute	Accuracy
ATC Code	98.2%
Therapeutic indication	100.0%
Medicinal product name	100.0%
Dose form name part	81.2%
Scientific name part	92.0%
Invented name part	98.0%
Company name part	100.0%
Strength name part	79.6%
Container name part	76.0%
Time/period name part	---

VerityMIS: The medicinal product data hub

OntoPharma’s VerityMIS Medicinal Product Data Hub is a Master Database containing versioned product data. It is the single source of truth for all systems in your enterprise. The bulk of the legacy data are extracted from textual documents (such as SmPCs). Some data are published to the hub by systems designated as the source for those data. All systems that need product data obtain these from the hub.

VerityMIS is built on Semantic Web technology from the ground up. It links versions of data in the hub to versions of extraction results in CortexExtract, and, hence, to versions of their textual sources.

Because the hub uses Semantic Web standards, connecting your systems to it is relatively inexpensive and straightforward. The crosswalks in VocabularyConnect support the translations needed for semantic interoperability. When the target system expects DrugBank codes for substances, the ATC codes in VerityMIS are translated on the fly.
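An on-the-fly translation through a crosswalk might look like the following Python sketch. The `Crosswalk` class, the `translate` and `export_for_target` functions and the record layout are hypothetical and do not describe the actual VerityMIS or VocabularyConnect interfaces. The two mapping entries are illustrative and should be checked against the authoritative vocabularies before any real use.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Crosswalk:
    """Hypothetical versioned mapping between two vocabularies."""
    source: str
    target: str
    version: str
    mapping: dict  # source code -> list of target codes


# Illustrative entries only.
ATC_TO_DRUGBANK = Crosswalk(
    source="ATC",
    target="DrugBank",
    version="example-1",
    mapping={"N02BE01": ["DB00316"], "M01AE01": ["DB01050"]},
)


def translate(code: str, crosswalk: Crosswalk) -> list:
    """Return the target codes for `code`; raise KeyError if it is unmapped."""
    targets = crosswalk.mapping.get(code)
    if not targets:
        raise KeyError(f"{code} has no {crosswalk.target} mapping "
                       f"in crosswalk version {crosswalk.version}")
    return list(targets)


def export_for_target(record: dict, crosswalk: Crosswalk) -> dict:
    """Copy a product record, adding the translated codes and their provenance."""
    exported = dict(record)
    exported["drugbank_ids"] = translate(record["atc_code"], crosswalk)
    exported["crosswalk_version"] = crosswalk.version
    return exported


product = {"name": "Example 500 mg tablets", "atc_code": "N02BE01"}
print(export_for_target(product, ATC_TO_DRUGBANK))
```

Recording the crosswalk version alongside the translated codes keeps the translation traceable, in line with the versioning discipline described for the vocabularies themselves. A crosswalk entry maps to a list because one code in the source vocabulary can correspond to several codes in the target.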