The Data Science Method (DSM) - Exploratory Data Analysis
This is the third article in a series about taking your data science projects to the next level with a methodological approach, similar to the scientific method, called the Data Science Method. This article focuses on step three, Exploratory Data Analysis. If you missed the previous articles in this series, you can go to the beginning here, or click on each step title below to read a specific step in the process.
- Problem Identification
- Data Wrangling
- Exploratory Data Analysis
- Pre-processing and Training Data Development
- Modeling
- Documentation
EXPLORATORY DATA ANALYSIS (EDA)
Step three of the Data Science Method (DSM) assumes that steps one and two are complete. At this point in your data science project, you have a well-structured, clearly defined hypothesis or problem description. The model development data set is ready to be explored, and your early data cleaning steps are done. At a minimum, you have one column per variable and a clear understanding of your response variable.
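Those minimum conditions can be verified before any exploration starts. The following is a minimal, dependency-free Python sketch of such a check, not a prescribed part of the DSM. The function name `check_ready_for_eda`, the specific checks, and the tiny housing-style sample are all assumptions made for illustration.

```python
import csv
import io


def check_ready_for_eda(csv_text, response):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper: the checks below are one reasonable reading of
    "one column per variable" and "a known response variable".
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return ["data set is empty"]
    rows = list(reader)

    problems = []
    # One column per variable: names should be non-empty and unique.
    if any(not name.strip() for name in header):
        problems.append("blank column name")
    if len(set(header)) != len(header):
        problems.append("duplicate column names")
    # Every record should carry exactly one value per column.
    if any(len(row) != len(header) for row in rows):
        problems.append("rows with a different number of fields than the header")
    # The response variable should exist and be populated in every row.
    if response not in header:
        problems.append(f"response variable {response!r} not found")
    else:
        idx = header.index(response)
        missing = sum(1 for row in rows if idx >= len(row) or not row[idx].strip())
        if missing:
            problems.append(f"rows missing the response variable: {missing}")
    return problems


# Made-up sample data: the second record has no value for the response.
sample = "sqft,bedrooms,price\n1400,3,250000\n900,2,\n"
print(check_ready_for_eda(sample, response="price"))
```

An empty result only means these basic structural checks passed. It says nothing about data quality beyond them, which is what the exploration itself is for. In a real project you would more likely run equivalent checks with a data frame library.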