Exploratory Data Analysis (EDA)

By: A. Mishra

Exploratory data analysis (EDA) is an approach to analyzing data that relies on statistical summaries and graphical representations. It is used to discover trends and patterns and to check assumptions and hypotheses. It also helps to identify obvious errors, to detect outliers or unexpected events, and to find interesting relationships between variables [1-2].

Types of EDA

Types of EDA are as follows [3]:

• Univariate analysis

Univariate analysis is the simplest form of data analysis because it involves only one variable. It does not deal with causes or relationships between variables; its main purpose is to describe the data and to find patterns within that single variable.
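As a rough illustration of describing a single variable, the sketch below computes a few common summary statistics with Python's standard `statistics` module. The `summarize` helper and the `ages` sample are hypothetical and are not taken from the cited sources; in practice a library such as pandas is often used for the same purpose.

```python
import statistics


def summarize(values):
    """Return basic descriptive statistics for one numeric variable."""
    ordered = sorted(values)
    # quantiles(n=4) returns the three cut points Q1, Q2 and Q3.
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    return {
        "count": len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "stdev": statistics.stdev(ordered),  # sample standard deviation
        "iqr": q3 - q1,                      # spread of the middle half
    }


# Hypothetical sample: one variable (age) observed for eight people.
ages = [23, 25, 31, 35, 35, 41, 44, 52]
summary = summarize(ages)

for name, value in summary.items():
    print(f"{name}: {value}")
```

The centre (mean, median) and spread (standard deviation, interquartile range) together give a first picture of how the variable is distributed, which is the stated goal of univariate analysis.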

• Bivariate analysis

Bivariate analysis examines two variables together. It is concerned with causes and relationships, and it is carried out to determine how the two variables under consideration are related.
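One common way to quantify the relationship between two numeric variables is the Pearson correlation coefficient. The sketch below is a minimal illustration under that assumption; the `pearson_r` helper and the `hours` and `scores` samples are hypothetical. A strong correlation indicates that the variables move together, but it does not by itself establish a cause.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sample: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]
r = pearson_r(hours, scores)
print(f"Pearson r = {r:.3f}")
```

Values of `r` close to 1 or -1 suggest a strong linear relationship, while values near 0 suggest little or no linear relationship.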

• Multivariate analysis

Multivariate analysis applies when the data contains three or more variables that are analyzed together.
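A simple starting point with three or more variables is to compute the correlation for every pair of variables. The sketch below assumes numeric columns stored in a dictionary; the `correlation_pairs` helper, the column names, and the values are all hypothetical.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_pairs(columns):
    """Map each pair of column names to the correlation of the two columns."""
    return {
        (a, b): pearson_r(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Hypothetical data set with three variables measured on five people.
data = {
    "height_cm": [150, 160, 170, 180, 190],
    "weight_kg": [52, 58, 69, 77, 88],
    "shoe_size": [36, 38, 41, 43, 45],
}
pairs = correlation_pairs(data)

for (a, b), r in pairs.items():
    print(f"{a} vs {b}: r = {r:.3f}")
```

Pairwise correlations are only one view of multivariate data, but they give a quick indication of which variables are worth examining together.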

Exploratory Data Analysis Tools

Python is an interpreted, object-oriented programming language with dynamic semantics, and it can be used to carry out EDA [4].

Its high-level built-in data structures, combined with dynamic typing and dynamic binding, make Python attractive for rapid application development and for use as a scripting or glue language that ties existing components together. Using Python for EDA also makes it possible to discover missing values in a data set, which matters because it lets you decide how missing values are handled before the data is used in machine learning applications.
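The missing-value check mentioned above can be sketched in plain Python as follows. The record layout, the `is_missing` and `missing_counts` helpers, and the choice to treat `None`, NaN, and blank strings as missing are assumptions made for illustration only.

```python
import math


def is_missing(value):
    """Treat None, NaN and blank strings as missing (an assumption)."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    if isinstance(value, str) and value.strip() == "":
        return True
    return False


def missing_counts(rows):
    """Count missing values per column in a list of dict records."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    # A column absent from a record counts as missing for that record.
    return {
        col: sum(1 for row in rows if is_missing(row.get(col)))
        for col in columns
    }


# Hypothetical records with a few gaps.
records = [
    {"name": "Ana", "age": 34, "city": "Lisbon"},
    {"name": "Ben", "age": None, "city": ""},
    {"name": "Caro", "age": float("nan"), "city": "Oslo"},
]
counts = missing_counts(records)
print(counts)
```

Once the gaps per column are known, you can decide whether to drop the affected records, fill in a substitute value, or leave them as they are, depending on the model that will consume the data.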

R is an open-source programming language and free software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing. Statisticians and data scientists commonly use it to produce statistical observations and to analyze large volumes of data.

EDA is applied in:

• Data Science

• Data Analytics

• Business Analysis

• Data Mining

References:

[1] Sahoo, S. R., et al. Classification of various attacks and their defence mechanism in online social networks: a survey. Enterprise Information Systems, 13(6), 832-864.

[2] Nguyen, G. N., et al. (2021). Secure blockchain enabled Cyber–physical systems in healthcare using deep belief network with ResNet model. Journal of Parallel and Distributed Computing, 153, 150-160.

[3] García-Peñalvo, et al. (2021). A Survey on Data mining classification approaches.

[4] Gudivada, A., et al. (2020). Developing concept enriched models for big data processing within the medical domain. International Journal of Software Science and Computational Intelligence (IJSSCI), 12(3), 55-71.
