Jon Patrick http://www.it.usyd.edu.au/~jonpat Centre for Health Informatics Research & Development, School of Information Technologies, University of Sydney Australia
Jon Patrick holds the Chair of Language Technology at the University of Sydney. He obtained his PhD in Computer Science from Monash University in 1977. He has four other degrees and is a registered psychologist with an interest in language-based therapies. He is the author of the only substantial reference grammar for the teaching of the Basque language in English. In 2005 he was awarded the national Eureka Prize for the development of the Scamseek system for ASIC: a natural language processing system that detects financial scams on the internet, which reputedly saves the public tens of millions of dollars per annum. His research is now focused on language technologies for enhancing clinical information systems and concomitant topics.
Yefeng Wang http://www.it.usyd.edu.au School of Information Technologies, University of Sydney Australia
Graeme Miller
Family Medicine Research Centre, University of Sydney Australia
Julie O'Halloran
Family Medicine Research Centre, University of Sydney Australia
Automatic mapping of ICPC-2 PLUS terms to the SNOMED CT terminology
Jon Patrick, Yefeng Wang, Graeme Miller, Julie O'Halloran
Abstract
Achieving interoperability in sharing and exchanging data between health information systems requires the support of standard medical terminology. To integrate standardised terminology into information systems, legacy interface terminology must be mapped to a reference terminology. In this study, we mapped ICPC-2 PLUS, the interface terminology developed in Australia and classified to the International Classification of Primary Care Version 2, to the SNOMED CT terminology. We developed a series of automated mapping algorithms to assist humans in performing the mapping. The Unified Medical Language System (UMLS) metathesaurus mapping, which utilises the links between ICPC-2 PLUS and SNOMED CT terms in the UMLS library, mapped 46.5% of ICPC-2 PLUS terms to SNOMED CT. Lexical mapping exploited the lexical similarities between terms in the two terminologies and mapped 60.3% of ICPC-2 PLUS terms overall. Post-coordination of the remaining unmapped terms was then performed, allowing one ICPC-2 PLUS term to be mapped to a composition of two SNOMED CT terms, which increased the proportion of mapped terms by about 20%. Overall, we mapped 80.58% of ICPC-2 PLUS terms. A manual review of the mapping shows that about 90% of string-based mappings are accurate. Unmapped and mismatched terms are due to structural differences between the two terminologies. In addition, terms contained in ICPC-2 PLUS but absent from SNOMED CT caused a large proportion of the mapping failures.
Keywords
Medical Terminology; Terminology Planning; ICPC-2 PLUS; SNOMED CT
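As a rough illustration of the staged approach summarised in the abstract, the following minimal Python sketch tries a lexical match first and falls back to post-coordination with two target terms. It is not the authors' implementation: the normalisation rule (lowercasing, dropping punctuation, ignoring word order), the function names, the identifiers T1 to T3 and the sample terms are all assumptions made for illustration, and the UMLS metathesaurus stage is omitted.

```python
import re


def normalise(term):
    """Lowercase, drop punctuation, and sort the words so word order is ignored."""
    words = re.findall(r"[a-z0-9]+", term.lower())
    return " ".join(sorted(words))


def build_index(target_terms):
    """Map each normalised target description to its (placeholder) identifier."""
    return {normalise(desc): ident for ident, desc in target_terms.items()}


def lexical_match(term, index):
    """Return the identifier of a target term with the same normalised form, or None."""
    return index.get(normalise(term))


def post_coordinate(term, index):
    """Split the term into two contiguous parts that each match a target term."""
    words = re.findall(r"[a-z0-9]+", term.lower())
    for i in range(1, len(words)):
        left = index.get(normalise(" ".join(words[:i])))
        right = index.get(normalise(" ".join(words[i:])))
        if left is not None and right is not None:
            return (left, right)
    return None


def map_term(term, index):
    """Try a single-term lexical match first, then a two-term composition."""
    single = lexical_match(term, index)
    if single is not None:
        return ("lexical", single)
    pair = post_coordinate(term, index)
    if pair is not None:
        return ("post-coordinated", pair)
    return ("unmapped", None)


# Hypothetical target terminology: identifiers and descriptions are placeholders,
# not real SNOMED CT content.
target = {"T1": "Fracture", "T2": "Left arm", "T3": "Pain, chest"}
index = build_index(target)

# Hypothetical source terms standing in for interface terminology entries.
results = {t: map_term(t, index) for t in ["chest pain", "fracture left arm", "check-up"]}
for term, outcome in results.items():
    print(term, "->", outcome)
```

In this toy setting, a term that has no counterpart in the target terminology stays unmapped, which mirrors the failure case the abstract attributes to terms present in ICPC-2 PLUS but not in SNOMED CT.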