Exploratory Data Analysis (EDA) of worldwide health effects, based on recent data on people's food and beverage consumption. This is a milestone project developed as part of an internal training on Data Engineering and Analysis during my junior position at Adastra Bulgaria.
As society has developed over the last century, our standard of living has improved, our choice of foods and drinks has become wider than ever, and our supply has become more abundant and affordable. We expect these developments to have had health effects, whether positive or negative.
In this study, we explore how food, sugar, and alcohol consumption have influenced key health parameters such as blood pressure, BMI, total cholesterol, and life expectancy. We will look at the effect on a global scale over the last ~100 years.
This is a Jupyter Notebook project and this section will show you how to set up your environment and run the notebook yourself.
Prerequisites:
- Python 3.8+
Clone the repo:
git clone https://github.com/Pejo-306/bd-python-milestone-project
cd bd-python-milestone-project/
(Recommended) Set up a virtual environment (venv) and install the project requirements:
python3 -m venv ./venv
source ./venv/bin/activate
pip install -r requirements.txt
Launch Jupyter and open the notebook:
jupyter notebook "Effects on health from choices in foods and drinks.ipynb"
Jupyter will initialize a local server and open the notebook in your default browser.
Datasets are stored as Excel spreadsheets in the project's /datasets directory. They contain the following data for each country:
- "0. life_expectancy_at_birth.xlsx": average life expectancy in years from 1800 to 2016
- "1. food_consumption.xlsx": average daily food consumption in kcals from 1961 to 2007
- "2. sugar_consumption.xlsx": average daily sugar consumption in grams from 1961 to 2004
- "3. alcohol_consumption.xlsx": average daily alcohol consumption in grams of pure alcohol from 1985 to 2008
- "4.1. bmi_male.xlsx": mean male BMI from 1980 to 2008
- "4.2. bmi_female.xlsx": mean female BMI from 1980 to 2008
- "5.1. blood_pressure_male.xlsx": mean male SBP from 1980 to 2008
- "5.2. blood_pressure_female.xlsx": mean female SBP from 1980 to 2008
- "6.1. cholesterol_male.xlsx": mean male TC from 1980 to 2008
- "6.2. cholesterol_female.xlsx": mean female TC from 1980 to 2008
Each dataset has individual years as columns and individual countries as rows.
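To make this layout concrete, here is a minimal sketch of reshaping one such table from wide (one column per year) to long (one row per country and year), which is often easier to plot and merge. It assumes pandas is available and that the country names sit in a column called `country`; that column name, the `to_long` helper, and the toy values are illustrative and not taken from the project.

```python
import pandas as pd

# Toy stand-in for one dataset: countries as rows, years as columns.
# The values are made up; a real file could be loaded with
# pd.read_excel("datasets/1. food_consumption.xlsx") instead.
wide = pd.DataFrame(
    {
        "country": ["Country A", "Country B"],  # assumed column name
        1961: [2900.0, 2500.0],
        1962: [2950.0, 2550.0],
    }
)


def to_long(df, value_name):
    """Reshape a wide country-by-year table into (country, year, value) rows."""
    long_df = df.melt(id_vars="country", var_name="year", value_name=value_name)
    long_df["year"] = long_df["year"].astype(int)
    return long_df.sort_values(["country", "year"]).reset_index(drop=True)


food_long = to_long(wide, "kcal_per_day")
print(food_long)
```

The long form has one observation per row, so datasets covering different year ranges can be joined on `country` and `year` without aligning columns by hand.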
The raw datasets have numerous problems, such as missing data points, unknown countries, and inconsistent formats. Before the analysis begins, a standard cleansing procedure is applied to fill missing values, standardize formatting, and so on. Inspect the code for more details.
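As a rough illustration of what such a cleansing step might look like, the sketch below drops rows without a country name, normalizes the spelling of country names, and fills missing yearly values by linear interpolation. This is not the notebook's actual procedure: the `clean` function, the `country` column name, the choice of interpolation, and the toy values are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Toy raw table with made-up values: untidy names, an unknown country, gaps.
raw = pd.DataFrame(
    {
        "country": ["  country a ", "Country B", None],
        "1980": [25.0, np.nan, 24.0],
        "1981": [np.nan, 26.0, 24.5],
        "1982": [26.0, 27.0, np.nan],
    }
)


def clean(df):
    """Return a country-indexed table with integer year columns and no gaps."""
    out = df.dropna(subset=["country"]).copy()  # drop rows with no country name
    out["country"] = out["country"].str.strip().str.title()  # uniform spelling
    out = out.set_index("country")
    out.columns = out.columns.astype(int)  # year labels as integers
    # Fill gaps along the year axis; gaps at the edges take the nearest value.
    return out.interpolate(axis=1, limit_direction="both")


cleaned = clean(raw)
print(cleaned)
```

Interpolating along the year axis is only one option; depending on the dataset, dropping sparse countries or leaving gaps unfilled may be more appropriate.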
Below are the key takeaways from correlating and visualizing the datasets in different ways:
- Increased food intake is strongly correlated with increased sugar intake worldwide
- Increased food intake has led to a worldwide increase in BMI
- Increased sugar intake has led to increased BMI only in the Western Hemisphere
- Countries in the Southern Hemisphere have experienced increased cholesterol due to more sugar consumption
- Africa, Oceania, and India have seen increased blood pressure due to increased sugar intake
- Life expectancy at birth from 1800 to 2016
- After correlating the datasets in numerous ways and analyzing the results, we primarily find that the increase in food and sugar intake has led to increased BMI, especially in the Western Hemisphere.
- Countries in Africa and Oceania, as well as India, have seen their blood pressure and cholesterol increase due to increased sugar intake.
- In the Western Hemisphere, a large part of the increased food intake comes from consuming more sugar. In the rest of the world, increased food intake is not strongly driven by increased sugar intake.
- Life expectancy has increased rapidly in the last 50 or so years, but not as a result of changes in food and drink choices.
- We conclude that other factors, not explored in this study, have drastically improved life expectancy. The choices of foods, sugar, and alcohol intake have either had a negligible effect or are just one part of a complex web of worldwide environmental changes over the last 50 years.
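The findings above rest on correlating pairs of datasets. One way such a per-country correlation could be computed is sketched below, assuming pandas and two cleaned tables indexed by country with years as columns. The `per_country_correlation` helper and the toy values are hypothetical and are not the notebook's own code.

```python
import pandas as pd

years = [2000, 2001, 2002, 2003]
countries = ["Country A", "Country B"]

# Made-up values, for illustration only.
food = pd.DataFrame(
    [[2000.0, 2100.0, 2200.0, 2300.0], [3000.0, 2900.0, 3100.0, 3000.0]],
    index=countries,
    columns=years,
)
bmi = pd.DataFrame(
    [[22.0, 22.5, 23.0, 23.5], [26.0, 26.2, 25.9, 26.1]],
    index=countries,
    columns=years,
)


def per_country_correlation(left, right):
    """Pearson correlation across years, computed separately for each country."""
    # Keep only the countries and years that both tables cover.
    shared_countries = left.index.intersection(right.index)
    shared_years = left.columns.intersection(right.columns)
    left_part = left.loc[shared_countries, shared_years]
    right_part = right.loc[shared_countries, shared_years]
    return left_part.corrwith(right_part, axis=1)


r = per_country_correlation(food, bmi)
print(r)
```

Restricting both tables to their shared years matters here because the datasets cover different periods, for example 1961 to 2007 for food consumption and 1980 to 2008 for BMI. The coefficient measures linear association between the two series for each country.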
This project is distributed under the MIT license.









