pISSN: 2723-6609 e-ISSN: 2745-5254
Vol. 5, No. 5 Mei 2024 http://jist.publikasiindonesia.id/
Jurnal Indonesia Sosial Teknologi, Vol. 5, No. 5, Mei 2024 2311
Classification Of Malaria Types Using Naïve Bayes Classification
Hadin La Ariandi1*, Arief Setyanto2, Sudarmawan3
Universitas Amikom Yogyakarta, Indonesia
*Correspondence
Keywords: Naive Bayes Classification; Malaria Type Classification; Expert System for Malaria Diagnosis.
ABSTRACT
This study was conducted to determine the accuracy of the naïve Bayes
classification method in determining the type of malaria. The method
predicts the malaria category from the symptoms presented. The dataset
was divided into 60% for training and 40% for testing. The results
showed that the naïve Bayes algorithm achieved an accuracy of 99.8% in
predicting malaria categories. Model performance evaluation using a
confusion matrix and ROC curve also showed promising results, with a
classification accuracy of 0.998, an error of 0.002, and an AUC of
0.999. The classification report shows that the Quartana, Tertiana, and
Tropica categories are more dominant than the Ovale category in terms of
precision, recall, and F1-score. These results indicate that the naïve
Bayes classification method is effective in classifying types of malaria
and can be used to diagnose malaria.
Introduction
Malaria is a disease caused by infection with protozoa of the genus Plasmodium
and is easily recognised by signs of fever, cold, and persistent chills (Dinata, 2018).
Malaria is one of the most widespread mosquito-borne diseases (Madhusudan, 2020).
The disease is transmitted through mosquito vectors of the genus Anopheles
(Alviyanil’Izzah et al., 2021). Malaria remains a threat to public health, especially
for people living in remote areas. This is reflected in the issuance of Presidential
Regulation Number 2 of 2015 concerning the National Medium-Term Development Plan
(RPJMN) for 2015-2019, in which malaria is a priority disease that needs to be
overcome. RPJMN IV for 2020-2024 also states that the prevalence of major infectious
diseases, one of which is malaria, is still high, accompanied by the threat of emerging
diseases due to high population mobility, which affects the degree of public health
(Ramadhan & Khoirunnisa, 2021). This commitment to malaria control is expected to be
a shared concern nationally, regionally, and globally, as agreed at the 60th World
Health Assembly (WHA) meeting in Geneva in 2007 on malaria elimination
(Prajarini, 2016).
According to the World Health Organization (WHO), malaria can be classified into five
types by causative parasite: Plasmodium falciparum, which causes tropical malaria;
Plasmodium vivax, which causes tertian malaria; Plasmodium ovale, which causes ovale
malaria; Plasmodium malariae, which causes quartan malaria; and Plasmodium knowlesi,
which also causes malaria (Madhusudan, 2020). Malaria is categorised as a disease with
considerable effects and a reasonably large mortality rate. The WHO recorded 229
million malaria cases and 409,000 deaths in 2019. Areas at risk are mainly in Africa,
but Southeast Asia, the Western Pacific, and the Mediterranean are also listed as areas
at risk. Each country strives to overcome malaria cases by referring to the commitment
made at the 60th World Health Assembly (WHA) in 2007 regarding malaria elimination
(Jiang et al., 2021).
The objectives of this study are:
1. To determine the level of accuracy of the naïve Bayes classification method in
determining the type of malaria.
2. To measure how accurate the results are and how well the naïve Bayes algorithm
performs in classifying malaria types.
3. To establish whether the naïve Bayes classification method classifies malaria types
effectively.
Research Benefits
The results of this research are expected to be useful and to play an essential role
in adding to scientific insight. The expected benefits of this research are as follows:
1. Supporting and assisting medical professionals in classifying types of malaria.
2. Providing information on the level of accuracy achieved in classifying malaria.
3. Adding insight for readers who want to learn naïve Bayes classification.
Research Methods
This study uses quantitative research, which relies on mathematical calculations to
achieve the desired results. In this case, the dataset is processed with the Naïve Bayes
algorithm to find the most significant malaria-related impacts in each Puskesmas
(community health centre) in Irian Jaya.
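As an illustration of the calculation behind Naïve Bayes, the following is a minimal sketch of a categorical Naïve Bayes classifier written with the Python standard library only. It is not the implementation used in this study. The feature names, feature values, and six toy records are hypothetical and serve only to show how class priors and per-feature likelihoods are combined, here with add-one (Laplace) smoothing as an assumed choice.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-feature value frequencies per class."""
    class_counts = Counter(labels)
    # value_counts[label][feature_index][value] -> number of occurrences
    value_counts = defaultdict(lambda: defaultdict(Counter))
    # distinct values seen for each feature, needed for Laplace smoothing
    feature_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[label][i][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def predict(model, row):
    """Return the class with the highest log-posterior for one record."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior of the class
        for i, value in enumerate(row):
            # add-one smoothing keeps unseen values from zeroing the product
            numerator = value_counts[label][i][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy records: (fever pattern, chills) -> malaria type
rows = [("every_48h", "yes"), ("every_48h", "yes"),
        ("every_72h", "yes"), ("every_72h", "no"),
        ("irregular", "yes"), ("irregular", "yes")]
labels = ["Tertiana", "Tertiana", "Quartana", "Quartana", "Tropica", "Tropica"]

model = train_naive_bayes(rows, labels)
prediction = predict(model, ("every_72h", "yes"))
print(prediction)
```

The classifier works in log space so that multiplying many small probabilities does not underflow; the predicted class is the one whose prior and smoothed likelihoods give the largest sum.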
Nature of Research
The research is experimental: experiments are run to obtain accuracy results or
parameters from the Naïve Bayes algorithm. The accuracy results obtained from the
experiment can be used to make decisions about determining the feasibility of lending.
Research Approach
The research approach is quantitative, and the research is conducted according to the
predefined stages of the research flow.
Data Collection Methods
The data used in this study were obtained directly from the Darun Nahdla Capita Sharia
Cooperative and are private data that have not been used in previous studies. The
dataset consists of cooperative customer data from 2020 to 2022, totalling 166 records
with 10 variables: gender, marital status, occupation, dependents, income, loan amount,
term, interest, instalments, and categories.
Data Analysis Methods
The data analysis in this study is quantitative and follows the stages of the
Knowledge Discovery in Databases (KDD) process, using Excel and Orange as software
tools, as follows:
Research Flow
Figure 1 Research Flow
Results and Discussion
Preprocessing Data
The data preprocessing stage cleans duplicate data, missing values, and outliers from
the dataset so that the data are valid for processing. At this stage, data
transformation is also carried out by identifying variables that contribute no
information to the prediction and by converting object-type data into integer form to
facilitate processing. The preprocessing described below uses Jupyter Notebook with the
Python programming language (Lestari et al., 2018).
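The cleaning and transformation operations named above can be sketched with pandas as follows. This is a hedged illustration on a small hypothetical frame, not the study's actual pipeline: the column names and values are invented, and dropping incomplete rows is only one possible way to handle missing values.

```python
import pandas as pd

# Hypothetical toy records; column names and values are illustrative only
df = pd.DataFrame({
    "Gender": ["M", "F", "F", "M", "F"],
    "Symptom1": ["fever", "chills", "chills", None, "fever"],
    "Temperature": [27.5, 28.0, 28.0, 26.9, 27.1],
})

# 1. Remove exact duplicate rows (the third row repeats the second)
df = df.drop_duplicates()

# 2. Drop rows that still contain missing values, then renumber the index
df = df.dropna().reset_index(drop=True)

# 3. Encode every non-numeric column as integer category codes
for col in df.columns:
    if not pd.api.types.is_numeric_dtype(df[col]):
        df[col] = df[col].astype("category").cat.codes

print(df)
```

Category codes are assigned in sorted order of the distinct values, so the mapping is reproducible but carries no ordinal meaning.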
The first step is to import the libraries that will be used to load and display the
dataset, including NumPy and pandas, as shown in the code below.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
The second step is to load the CSV-format dataset into a data frame with the
pd.read_csv function and display it. The code and output are shown in Figure 2 below.
# path to the semicolon-delimited dataset file
filecsv = 'Dataset_Patient_Malaria.CSV'
# the first row holds the column names
teks = pd.read_csv(filecsv, header=0, delimiter=';', encoding='utf-8')
df = pd.DataFrame(teks)
print(df)
df.head()
output:
Figure 2 Import Research Dataset
Figure 2 shows the 37 variables of the dataset used in this study. Several of them are
unnecessary, such as No, province, district, health facility, and patient name.
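The loading step can be tried without access to the original file. The sketch below reads a semicolon-delimited sample from an in-memory string; the two records and the reduced set of four columns are hypothetical stand-ins for the real dataset.

```python
import io

import pandas as pd

# Hypothetical two-record sample in a semicolon-delimited layout
sample = io.StringIO(
    "No.;Nama Pasien;Gender;Diagnosa_Malaria\n"
    "1;A;M;Tertiana\n"
    "2;B;F;Tropica\n"
)

# read_csv already returns a DataFrame, so no extra pd.DataFrame() wrap is needed
df = pd.read_csv(sample, header=0, delimiter=";")

print(df.shape)             # (number of rows, number of columns)
print(df.columns.tolist())  # column names taken from the header row
```

Checking the shape and the column list right after loading is a quick way to confirm that the delimiter was interpreted correctly; a wrong delimiter typically yields a single wide column.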
The third step deletes the columns that are not needed for the next process. The
columns to be deleted are listed in the code below.
# columns that carry no predictive information
columns = ['No.', 'Provinsi ', 'Kabupaten', 'Fasyankes', 'Nama Pasien']
# drop() returns a new DataFrame; reassign so later steps use the reduced df
df = df.drop(columns, axis=1)
list(df.columns)
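A common pitfall in this step is combining inplace=True with an assignment of the result. The following sketch, on a hypothetical three-column frame, contrasts the two usual ways of calling drop; the variable names df_clean and df_copy are illustrative.

```python
import pandas as pd

# Hypothetical miniature frame: two identifier columns and one feature
df = pd.DataFrame({
    "No.": [1, 2],
    "Nama Pasien": ["A", "B"],
    "Gender": ["M", "F"],
})
to_drop = ["No.", "Nama Pasien"]

# Option 1: without inplace, drop() returns a new frame and leaves df untouched
df_clean = df.drop(columns=to_drop)

# Option 2: with inplace=True, the frame itself is modified, so the call's
# return value should not be assigned to a variable and used as a DataFrame
df_copy = df.copy()
df_copy.drop(columns=to_drop, inplace=True)

print(list(df.columns))        # original frame still has all columns
print(list(df_clean.columns))  # reduced frame from option 1
print(list(df_copy.columns))   # reduced frame from option 2
```

Option 1 is usually easier to reason about, because the original frame remains available if a later step needs the removed columns.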
After deleting the columns that are not needed, the following columns are retained
for the subsequent steps: type of discovery, number, month/year, gender, pregnant / not
pregnant, hamlet address, village/kelurahan, type of parasite, symptoms1, symptoms2,
symptoms3, symptoms4, symptoms5, symptoms6, symptoms7, symptoms8, symptoms9,
symptoms10, livestock sheds, leaving the house at night, use of mosquito repellent,
ventilation gauze, puddles, history of living in endemic areas, use of mosquito nets,
walls, house ceiling condition, mosquito breeding grounds, air temperature (°C),
humidity (%), rainfall (mm), and malaria diagnosis (Shofia, Putri, & Arwan, 2017).
The fourth step separates the variables into categorical and numerical variables using
the following code:
# define the categorical variables (columns with object dtype)
categorical = [var for var in df.columns if df[var].dtype == 'O']
Output:
['Discovery Type', 'Month/Year', 'Gender', 'Pregnant/Not Pregnant', 'Dusun_Alamat',
'Village Village', 'Parasite Type', 'Symptoms1', 'Symptoms2', 'Symptoms3', 'Symptoms4',
'Symptoms5', 'Symptoms6', 'Symptoms7', 'Symptoms8', 'Symptoms9', 'Symptoms10',
'Kandang_Ternak', 'Night rumah_pada Exit', 'Mosquito Obat_Anti Use',
'Kassa_Ventilasi', 'Genangan_Air', 'History of tinggal_di endemic areas',
'Penggunaan_Kelambu', 'Walls', 'House sky conditions', 'Mosquito Breeding Sites',
'Diagnosa_Malaria']
# define the numerical variables (columns with non-object dtype)
numerical = [var for var in df.columns if df[var].dtype != 'O']
output:
['Number', 'Air Temperature (°C)', 'Humidity (%)', 'Rainfall (mm)']
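The same separation can also be written with select_dtypes, which selects columns by type family instead of comparing each dtype to 'O'. This is an alternative sketch on a hypothetical four-column frame, not the code used in the study.

```python
import pandas as pd

# Hypothetical frame mixing text and numeric columns
df = pd.DataFrame({
    "Gender": ["M", "F"],
    "Diagnosa_Malaria": ["Tertiana", "Tropica"],
    "Humidity (%)": [80.5, 75.0],
    "Rainfall (mm)": [12, 30],
})

# numeric columns (integers and floats)
numerical = df.select_dtypes(include="number").columns.tolist()
# every remaining column is treated as categorical
categorical = [col for col in df.columns if col not in numerical]

print(categorical)
print(numerical)
```

Selecting the numeric columns first and treating the remainder as categorical avoids depending on how a given pandas version labels text columns.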
Next, data cleaning is performed to remove duplicate data and unused variables and to
handle missing values and outliers. The code and output are shown in Figure 3 below.
df[categorical].isnull().sum()
df[numerical].isnull().sum()