Feasible Prediction of Multiple Diseases using Machine Learning

. Automated Multiple Disease Prediction System using Machine Learning is an advanced healthcare application that utilizes machine learning algorithms to accurately predict the likelihood of a patient having multiple diseases based on their medical history and symptoms. The system employs a comprehensive dataset of medical records and symptoms of various diseases, which are then analysed using machine learning techniques such as decision trees, support vector machines, and random forests. The system's predictions are highly accurate, and it can assist medical professionals in making more informed decisions and providing better treatment plans for patients. Ultimately, the viable Multiple Disease Prediction System using Machine Learning has the potential to improve healthcare outcomes and reduce healthcare costs by predicting and preventing disease early.


Introduction
Machine Learning (ML) gives computers the capability to learn without explicitly programmed.ML is the technology that no one come across.It is apparent from its name that makes the computer to work more similar to humans -the learning ability.
Arthur Samuel coined the term machine learning was coined in 1959.The term 'machine learning' discusses gathering information that may be measured or evaluated and then used to train a machine learning model.How effectively a machine-learning model works is significantly influenced by the quantity and quality of data utilized for training and testing.Data can take many different formats, including numerical, category, or time-series data, and can originate from a range of sources, including databases, spreadsheets, and APIs.Machine learning algorithms use data to find relationships and patterns between input parameters.Applications of ML are as follows: 1. Google Maps 2. Linkedin 3. Facebook 4. Medical Imaging 5. Art Restoration 6. Forensic Science 7. Robotics 8. Satellite Imaging The advantages of ML are as follows: 1. Ability to learn complex representations 2. End-to-end learning 3. Improved accuracy 4. Improved performance 5. Scalability and parallel processing 6.Cost computation Developing a Machine learning implementation comes with its fair share of challenges and pitfalls.Firstly, acquiring and preparing a high-quality dataset can be time-consuming and expensive, especially for niche domains.Secondly, choosing the right architecture and hyper parameters requires extensive experimentation and can be computationally demanding.Thirdly, over fitting is a common pitfall, where the model performs well on training data but fails to generalize to unseen examples.Fourthly, training deep learning models often requires significant computational resources.Lastly, deploying and scaling learning models in production can be complex and may require additional infrastructure and maintenance efforts.

Existing methods
The automated Multiple Disease Prediction System using Machine Learning is an advanced healthcare application that utilizes machine learning algorithms to accurately predict the likelihood of a patient having multiple diseases based on their medical history and symptoms.The system employs a comprehensive dataset of medical records and symptoms of various diseases, which are then analysed using machine learning techniques such as decision trees, support vector machines, and random forests.The system's predictions are highly accurate, and it can assist medical professionals in making more informed decisions and providing better treatment plans for patients.Ultimately, the Multiple Disease Prediction System using Machine Learning has the potential to improve healthcare outcomes and reduce healthcare costs by predicting and preventing disease early.
Authors [16] highlighted the significance of ML in prediction, pattern recognition and error reduction across diverse fields, emphasizing the impact of AI in broad domain.Authors [17] presented text classification algorithms for various applications and explores the use of machine learning in detecting phishing attacks.Authors [18] discussed the use of machine learning and neural networks, especially CNN, for recognizing handwriting patterns, with a focus on Telugu film industry names, achieving high accuracy (98.3%).The paper [19] discussed the role of Intelligent Decision Support Systems (IDSS) in Healthcare Monitoring, especially for heart disease.Results claimed that IDSS enhances decision-making functionalities in uncertain healthcare scenarios, there by significantly improving the monitoring and remedial activities.Authors [20] suggested data mining techniques to predict disease-prevalence based on symptoms in healthcare data.The appropriate prediction helps healthcare organizations avoid drug shortages and further ensures timely treatment of patients.The paper [21] explores the distinct ML applications in predicting heart attacks using patient health records.It compares Random Forest and CNN methods, and findings showed that Random Forest's better performance in terms of accuracy.The DT, LR and MLP algorithms showed maximum precision and minimum errors among all algorithms [14] Data collected from Hospitals, this data consists of both structured and unstructured data Decision Tree, Regression model has the best prediction accuracy in the models generated [15] Datasets available in "Kaggle" and "University of California, Irvine (UCI) database" In the model generated SVM gave high precision and accuracy 3 Problem statement and objectives

Problem Statement
The Multiple Disease Prediction System using Machine Learning is an advanced healthcare application that utilizes machine learning algorithms to accurately predict the likelihood of a patient having multiple diseases based on their medical history and symptoms.The system employs a comprehensive dataset of medical records and symptoms of various diseases, which are then analysed using machine learning techniques such as decision trees, support vector machines, and random forests.The system's predictions are highly accurate, and it can assist medical professionals in making more informed decisions and providing better treatment plans for patients.Ultimately, the Multiple Disease Prediction System using Machine Learning has the potential to improve healthcare outcomes and reduce healthcare costs by predicting and preventing disease early.

Objective
The objective of this paper is to investigate how supervised Machine Learning (ML) algorithms can enhance healthcare by enabling more precise and early detection of diseases.In order to achieve this, we will evaluate research studies that employ multiple supervised ML models for each disease recognition task.By using a variety of algorithms in our analysis, we can obtain more comprehensive and accurate results.This approach helps to mitigate biases that may arise from evaluating a single algorithm across different research scenarios, which can lead to misleading conclusions.

Models used in proposed method
There are various methodologies that can be adapted to satisfy the objective of disease prediction.Here are some of them: • Machine learning algorithms Machine learning algorithms can be trained on a dataset of features such as demographic information, medical history, lifestyle factors, and biomarkers to predict the likelihood of an individual having a particular disease.There are various machine learning algorithms such as logistic regression, random forests, and neural networks that can be used for disease prediction.• Risk scores Risk scores are widely used to predict the likelihood of developing a particular disease.These scores are usually calculated based on a set of risk factors such as age, sex, family history, and lifestyle factors.For example, the Gail model is used to predict the risk of breast cancer, and the Framingham risk score is used to predict the risk of cardiovascular disease.• Decision trees Decision trees are a type of algorithm that can be used to predict the likelihood of a particular disease based on a set of symptoms and risk factors.Decision trees are particularly useful when the data is structured and can help identify the most important factors for disease prediction.• Bayesian networks Bayesian networks are a probabilistic graphical model that can be used to represent the relationships between different diseases and risk factors.Bayesian networks can be used for disease prediction by incorporating prior knowledge about the relationships between diseases and risk factors and then predicting the probability of an individual developing a particular disease.

Proposed method
The Multiple Disease Prediction System using Machine Learning is an advanced healthcare application that utilizes machine learning algorithms to accurately predict the likelihood of a patient having multiple diseases based on their medical history and symptoms.The system employs a comprehensive dataset of medical records and symptoms of various diseases, which are then analysed using machine learning techniques such as decision trees, support vector machines, and random forests.The system's predictions are highly accurate, and it can assist medical professionals in making more informed decisions and providing better treatment plans for patients.Ultimately, the Multiple Disease Prediction System using Machine Learning has the potential to improve healthcare outcomes and reduce healthcare costs by predicting and preventing disease early.

Architecture diagram of the proposed work
The process of disease prediction involves several steps that are designed to identify the likelihood of an individual developing a particular disease.It starts with data collection, where relevant information about the individual is collected, including their medical history, lifestyle factors, demographic information, and biomarkers.Once the data is collected, it is preprocessed to ensure that it is suitable for analysis.This may include cleaning the data, handling missing values, and normalizing the data.Feature selection is then performed to identify the most important features relevant to disease prediction.This helps to reduce the dimensionality of the data and improve the accuracy of the prediction.Next, an appropriate model is selected for disease prediction.This can involve choosing a machine learning algorithm, decision tree, or Bayesian network based on the type of data and disease being predicted.Once the model is selected, it is trained on the data to identify patterns and relationships between the features and the disease being predicted.Model evaluation is then performed to test the accuracy and generalization performance of the trained model on a separate dataset.Finally, the trained model is used to predict the likelihood of an individual developing the disease.Overall, the disease prediction process requires expertise in data science, machine learning, and medical domain knowledge to ensure accurate disease prediction.

Description about dataset
In this paper we are using multiple datasets for training our model, these multiple datasets are diabetes, heart, kidney, liver, and cancer and are downloaded from the web.These multiple datasets contain multiple attributes among those attributes only attributes that help in predicting disease are considered, and which help in increasing accuracy are considered.The datasets used in this paper are: Bilirubin, Alkaline Phosphate, Aspartate Aminotranferase, Albumin, Gender, Conjugated Bilirubin, Alamin Aminotransferase, Total Protein, Albumin and Globulin Ratio.

Experimental results
The accuracy of Diabetes prediction model performed is given by ROC curve, among all the models Random Forest gave an accuracy of 92.5%.The actual web site that shows details about multiple disease, the web page include details about the disease and symptoms related to disease.The patient now can choose his preferred disease at top right and start entering values in the text boxes, diabetes web page where user have to enter values in the text boxes.The breast cancer web page where user     Figure 5 represents results web page where user will get the results if the user is facing any problem then the result page will display positive, the patient may be facing disease else negative the patient is safe and also displays home button below.The figure 6 represents the result page for diabetes.In this way for every page, the result page will be displayed as shown above.

Significance of proposed method
Predicting multiple diseases using dedicated models offers several advantages: 1.Comprehensive risk assessment multiple disease prediction models consider a wide range of risk factors and indicators associated with different diseases.By combining these factors, they provide a more comprehensive evaluation of an individual's overall disease risk.This allows for a more accurate assessment compared to individual disease-specific models.2. Holistic approach to healthcare multiple disease prediction models promote a holistic approach to healthcare by considering the interplay between different diseases.They In summary, multiple disease prediction models offer a holistic and integrated approach to disease risk assessment, enabling early detection, personalized interventions, and efficient resource allocation.They provide valuable insights into disease interrelationships, promoting better healthcare decision-making and improving health outcomes for individuals at risk for multiple diseases.

Conclusion
The primary objective of this paper is to automatically predict diseases accurately based on patient-reported symptoms by implementing Machine Learning algorithms.In this study, four Machine Learning algorithms were utilized, achieving a mean accuracy of over 95%.This signifies significant improvement and higher accuracy compared to previous works, making the system more reliable and satisfying for users.
Overall, the integration of Machine Learning in disease prediction has the potential to revolutionize healthcare by improving prediction accuracy, enabling early interventions, facilitating personalized medicine, optimizing resource allocation, and generating datadriven insights.Continued research, collaboration between healthcare professionals and data scientists, and ethical considerations are essential for harnessing the full potential of Machine Learning in disease prediction and providing enhanced healthcare solutions for patients.Additionally, this system is user-friendly and accessible to a wide range of users without any specific threshold.
/doi.org/10.1051/e3sconf/20234300105151 430 /doi.org/10.1051/e3sconf/20234300105151 430 have to enter values in the text boxes.The user has to enter detailed values in the text boxes, the user then clicked predict to get the results.It will redirect to result page.

Table 1 .
Summary of existing approach.

•
Deep learning Deep learning is a subset of machine learning that uses neural networks to extract features from the data.Deep learning algorithms can be used for disease prediction by training on large datasets of features such as medical images, electronic health records, and genomic data.•These methodologies can be adapted to different diseases and datasets to predict the likelihood of an individual developing a particular disease.It is important to note that these methodologies require large datasets and expert knowledge to ensure accurate disease prediction.
• Diabetes Dataset Commonly used diabetes attributes are No.Of pregnancies, Glucose, Blood Pressure, Skin Thickness, Insulin, BMI, Diabetes Pedigree Function, and Age.• Breast Cancer Dataset Attributes helped in breast cancer detection are ID, Diagnosis, Radius, Texture, Perimeter, Area, Smoothness, Compactness, Concavity, Concave Points, Symmetry, Fractal Dimension, area worst, and symmetry worst.• Heart Disease Dataset Commonly used attributes are Age, Sex, Chest Pain Type, Resting Blood Pressure, Cholesterol, Fasting Blood Sugar, Resting Electrocardiographic Results, Maximum Heart Rate Achieved, Exercise Induced Angina, ST Depression Induced by Exercise, Slope of the Peak Exercise ST Segment, Number of Major Vessels (0-3) Colored by Fluoroscopy, Thalassemia, Target Variable.• Kidney Disease Prediction Attributes helped in kidney disease detection are Age, Blood Pressure, Specific Gravity, Albumin, Sugar, Red Blood Cells, Pus Cell, Pus Cell Clumps, Bacteria, Blood Glucose Random, Blood Urea, Serum Creatinine, Sodium, Potassium, Hemoglobin, Packed Cell Volume, White Blood Cell Count, Red Blood Cell Count, Hypertension, Diabetes Mellitus, Coronary Artery Disease, Appetite.• Liver Dataset Commonly used attributes in liver disease prediction are Age, Total ://doi.org/10.1051/e3sconf/20234300105151 430 take into account the potential correlations, shared risk factors, and interactions among diseases, providing a more accurate representation of an individual's health status.3. Early detection of interconnected diseases often share common risk factors and may co-occur in individuals.Multiple disease prediction models can identify such interrelationships, enabling early detection of associated diseases.This early detection facilitates timely intervention, leading to improved outcomes and potentially preventing the progression of multiple diseases.4. Efficient use of data and resources Developing multiple disease prediction models allows for the integration and utilization of diverse datasets and resources.By leveraging shared data and risk factors, researchers can optimize the use of available resources, such as large-scale population health data or electronic health records, to develop accurate and efficient models.5. Personalized prevention and intervention strategies Multiple disease prediction models enable the development of personalized prevention and intervention strategies.They can identify individuals at high risk for multiple diseases and provide tailored recommendations for risk reduction, lifestyle modifications, and targeted interventions.This personalized approach enhances the effectiveness of preventive measures and interventions.6. Improved decision-making for healthcare providers Multiple disease prediction models provide healthcare providers with comprehensive information to make informed decisions.By considering the risks of multiple diseases, healthcare professionals can prioritize preventive measures, screenings, and interventions based on an individual's overall disease risk.This aids in optimizing healthcare management and resource allocation.7. Insights into disease interactions and comorbidities Multiple disease prediction models contribute to a better understanding of disease interactions, comorbidities, and their underlying mechanisms.They help identify common pathways, shared risk factors, and potential synergistic effects between diseases.This knowledge can lead to advancements in disease prevention, treatment, and the development of targeted therapies. https