Using Data Mining Techniques for Finding Cardiac Outlier Patients
Abstract: In this paper we use data mining techniques to
identify outlier patients who use large amounts of drugs over a
long period of time. Any healthcare or health insurance system
must deal with the quantities of drugs used by patients with chronic
diseases. In the Kingdom of Bahrain, about 20% of the health budget is
spent on medications. Managers of healthcare systems do not have
enough information about how patients with chronic diseases use drugs,
whether there is any misuse, or whether there are outlier patients. In
this work, done in cooperation with the information
department of the Bahrain Defence Force hospital, we select the data
for cardiac patients in the period from 1/1/2008 to
31/12/2008 as the data for the model. We
apply three techniques to characterize drug utilization among cardiac
patients. First we apply a clustering technique, then we
measure clustering validity, and finally we apply a decision
tree as the classification algorithm. The clustering partitions the
1603 patients, who received 15,806 prescriptions during this period,
into three groups according to drug utilization; 23 patients (2.59%),
who received 1316 prescriptions (8.32%), are classified as outliers.
The classification algorithm shows that average drug utilization, age,
and gender of the patient are the main
predictive factors in the induced model.
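
The first two steps described above (clustering, then a validity check) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm or the validity measure, so plain k-means on a single feature and the Dunn index are assumed here, and the prescription counts are made-up illustrative values, not hospital data. The decision-tree step is omitted for brevity.

```python
# Hypothetical prescriptions-per-patient counts (illustrative only).
counts = [2, 3, 3, 4, 5, 5, 6, 7, 8, 4, 15, 17, 18, 20, 22, 24, 55, 60, 68]


def kmeans_1d(values, k=3, iters=100):
    """Lloyd's k-means on scalar values with a deterministic, evenly spaced init."""
    s = sorted(values)
    n = len(s)
    # Start centroids at evenly spaced order statistics (min ... max); needs k >= 2.
    centroids = [float(s[(i * (n - 1)) // (k - 1)]) for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        new = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new.append(sum(members) / len(members) if members else centroids[j])
        if new == centroids:
            break
        centroids = new
    return centroids, labels


def dunn_index(values, labels, k):
    """Smallest between-cluster gap divided by largest within-cluster diameter.

    Higher values indicate compact, well-separated clusters.
    Assumes at least one cluster contains two distinct values.
    """
    clusters = [[v for v, lab in zip(values, labels) if lab == j] for j in range(k)]
    clusters = [c for c in clusters if c]
    max_diameter = max(max(c) - min(c) for c in clusters)
    min_separation = min(
        abs(a - b)
        for i, ci in enumerate(clusters)
        for cj in clusters[i + 1:]
        for a in ci
        for b in cj
    )
    return min_separation / max_diameter


centroids, labels = kmeans_1d(counts, k=3)
dunn = dunn_index(counts, labels, k=3)

# Treat the cluster with the highest mean utilization as the outlier group.
outlier_cluster = max(range(3), key=lambda j: centroids[j])
outliers = [v for v, lab in zip(counts, labels) if lab == outlier_cluster]

patient_share = 100 * len(outliers) / len(counts)
prescription_share = 100 * sum(outliers) / sum(counts)
print(f"Dunn index: {dunn:.3f}")
print(f"Outliers: {len(outliers)} patients ({patient_share:.2f}%), "
      f"{sum(outliers)} prescriptions ({prescription_share:.2f}%)")
```

The final two lines report the same kind of summary the abstract gives: the share of patients in the high-utilization cluster and the share of prescriptions they account for. In practice the clustering would use more features than a single count, and the validity score would be compared across several candidate values of k.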