Frequency-based Rare Events Mining in Administrative Health Data

Jie Chen, Huidong Jin, Hongxing He, Christine M O'Keefe, Ross Sparks, Graham Williams, Damien McAullay, Chris Kelman


The low occurrence rate of adverse drug reactions makes it difficult to identify risk factors by straightforwardly applying association pattern discovery to large databases. In this paper, we develop a data mining approach that uses information about rare events in sequence data to measure the multiple occurrences of patterns over the whole period covered by target and non-target data. To this end, we define an interestingness measure that exploits the difference between the frequency of patterns in target and non-target sequence data. The proposed approach allows candidate patterns to be generated easily from the target sequence data by applying existing association mining algorithms. These patterns can then be evaluated by comparing their frequency in the target and non-target data. We also propose a ranking algorithm that takes into account both the rank of each pattern as determined by the interestingness measure and its support in the target population. This algorithm can prune the candidate patterns substantially and highlight the more interesting results. Experimental results from a case study on angioedema show the usefulness of the proposed approach.
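The general idea of contrasting pattern frequencies and then ranking can be illustrated with a minimal sketch. The abstract does not give the paper's actual definitions, so everything below is an assumption made for illustration: the interestingness measure is taken to be a simple ratio of mean per-patient frequencies, the ranking is a plain sum of the two ranks, and the event codes and function names are hypothetical.

```python
from collections import Counter

def occurrences(pattern, sequence):
    """Number of times every code in `pattern` can be matched in one sequence."""
    counts = Counter(sequence)
    return min(counts[code] for code in pattern)

def mean_frequency(pattern, sequences):
    """Average number of occurrences of `pattern` per sequence."""
    if not sequences:
        return 0.0
    return sum(occurrences(pattern, s) for s in sequences) / len(sequences)

def support(pattern, sequences):
    """Fraction of sequences that contain `pattern` at least once."""
    if not sequences:
        return 0.0
    return sum(occurrences(pattern, s) > 0 for s in sequences) / len(sequences)

def interestingness(pattern, target, non_target, eps=1e-9):
    # ASSUMPTION: a frequency ratio stands in for the paper's measure,
    # which is not specified in the abstract.
    return mean_frequency(pattern, target) / (mean_frequency(pattern, non_target) + eps)

def rank_patterns(patterns, target, non_target):
    """Order patterns by the sum of their interestingness rank and support rank."""
    # ASSUMPTION: summing the two ranks is one simple way to combine them.
    by_interest = sorted(patterns,
                         key=lambda p: interestingness(p, target, non_target),
                         reverse=True)
    by_support = sorted(patterns, key=lambda p: support(p, target), reverse=True)
    interest_rank = {p: i for i, p in enumerate(by_interest)}
    support_rank = {p: i for i, p in enumerate(by_support)}
    return sorted(patterns, key=lambda p: interest_rank[p] + support_rank[p])

# Hypothetical toy data: one list of event codes per patient.
target = [["A", "B", "A"], ["A", "C"], ["A", "B"]]
non_target = [["B", "C"], ["A", "B"], ["C"], ["B"]]
patterns = [("A",), ("B",), ("C",), ("A", "B")]

ranked = rank_patterns(patterns, target, non_target)
```

In this toy setting, candidate patterns would in practice come from running an existing association mining algorithm on the target sequences; here they are listed by hand to keep the sketch self-contained.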


Keywords: Adverse Drug Reaction; Temporal Pattern Mining; Administrative Health

= = = eJHI - electronic Journal of Health Informatics - ISSN 1446-4381 = = =