A survey of data mining algorithms used in cardiovascular disease diagnosis from multi-lead ECG data


  • DIANA MOSES, Department of Computer Science and Engineering, Thiagarajar College of Engineering, Madurai – 625015, India


Keywords: Cardiovascular disease, data mining algorithms review, ECG analysis, telecardiology.


Remote diagnosis of cardiovascular disease (CVD) from ECG plays an important role in healthcare. Data mining, the major step in knowledge extraction, uses descriptive and predictive algorithms that aid in making proactive decisions, and it has also been used for CVD diagnosis. Recently, diverse techniques have been developed for analyzing ECG signals. However, the diversity of techniques, terminologies, and performance measures used across studies makes analyzing and comparing results difficult. The aim of this work is to explore and analyze the data mining algorithms proposed in the literature for CVD diagnosis, along with their advantages and limitations. This paper presents techniques for CVD diagnosis using data mining on an ECG signal under four major phases: ECG acquisition, ECG compression, ECG feature extraction, and ECG diagnosis. The primary aim is to categorize the research done in this area, to provide a glossary for interested researchers, and to help them identify potential research directions.

Author Biography

DIANA MOSES, Department of Computer Science and Engineering, Thiagarajar College of Engineering, Madurai – 625015, India

Research Scholar, Department of Computer Science & Engineering

