Wilfrid Laurier University
CP 421
Exploring Data Mining Techniques in COVID-19 Research
Alexandros Ioannou#1, Cameron Anderson$2, Lubna Al Rifaie*3, Luke Aikman^4
[email protected], [email protected], [email protected], [email protected]
Abstract—The COVID-19 pandemic posed unprecedented and
unpredictable challenges across the world, requiring the
deployment of complex data mining techniques to manage and
mitigate the impact of the virus. This paper reviews recent
research works that apply various data mining methods, such as
Natural Language Processing (NLP), supervised learning
techniques, clustering algorithms, frequent itemsets, and
association rules, to address the big data challenges within the
healthcare sector during the COVID-19 crisis. By analyzing how
these methods are applied to tracking the spread of the virus,
predicting outcomes, and enhancing healthcare responses, we aim
to identify existing data mining problems and propose viable
solutions. The insights from this study show the potential of
data mining to revolutionize healthcare monitoring and
information technology, particularly in managing pandemic situations.
Keywords—Data mining (DM), COVID-19, Healthcare
monitoring, Clustering algorithms, Frequent itemsets,
Association rules, Big Data, Predictive analytics.
I. INTRODUCTION
The emergence of the COVID-19 pandemic has not only
brought about a global health crisis but also led to an increased
dependence on data science and informatics in public health
responses. The vast amounts of data produced by health
monitoring systems, contact tracing, and case reporting offer
both opportunities and challenges. Understanding the
complexity of COVID-19 data has become more dependent on
data mining techniques, which have historically been essential
in extracting relevant information from huge datasets. To
support well-informed healthcare decisions and public health
policies, this study explores the various applications of these
techniques, ranging from NLP and supervised learning to
association rules, in sifting through pandemic-related
data.
One method used in the reviewed papers is Natural Language
Processing, which served as a key tool for tracking
misinformation, analyzing sentiment on social media, and
compiling clinical data to understand the dynamics of the
pandemic. Similarly, supervised learning algorithms proved
effective in forecasting infection rates and patient outcomes
and in identifying high-risk groups, which allowed for more
focused interventions. Clustering algorithms, in turn, have
made it possible to stratify patient groups and implement
individualized treatment plans by providing insights into
patient symptomatology and disease progression.
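To make the idea of stratifying patients by clustering concrete, the following is a minimal sketch of k-means written in plain Python. It is an illustration only and is not taken from any of the reviewed papers: the function name `kmeans`, the two features (body temperature and oxygen saturation), and the six toy records are all hypothetical, and the first-k-points initialization is chosen purely to keep the example deterministic.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means; returns (centroids, labels). Illustrative sketch only."""
    # Deterministic initialization (an assumption made for reproducibility):
    # use the first k points as the starting centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical toy records, not real patient data:
# (body temperature in degrees Celsius, oxygen saturation in percent).
patients = [
    (36.8, 98.0), (39.2, 90.0), (37.0, 97.0),
    (38.9, 91.0), (36.9, 99.0), (39.5, 89.0),
]

centroids, labels = kmeans(patients, k=2)
for record, label in zip(patients, labels):
    print(record, "-> group", label)
```

On this toy input the records separate into a low-fever, high-saturation group and a high-fever, low-saturation group. In practice, features would normally be standardized before clustering so that no single measurement dominates the distance, and a library implementation with a more robust initialization would usually be preferred.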
However, the deployment of data mining in healthcare,
particularly in a crisis of this magnitude, is fraught with challenges.
Data quality and availability remain significant obstacles, with
the heterogeneity of data sources and formats complicating