COVID-19 Global Data Analysis Case Study
Objective:
To analyze global COVID-19 data to identify the most affected countries, mortality patterns, recovery trends, testing efficiency, and risk indicators using Python, PostgreSQL, and interactive dashboards.
Tools Used:
- Python
  - Pandas
  - NumPy
  - Plotly
- SQL
- Power BI (optional dashboard integration)
Dataset Source:
Global COVID-19 public datasets, with fields including:
- Total Cases
- Total Deaths
- Total Recovered
- Active Cases
- Serious/Critical Cases
- Total Tests
- Population
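The mortality, recovery, and testing indicators named in the objective can be derived from these fields. The following is a minimal sketch, not the project's actual code: the column names, the `add_rate_columns` helper, and the sample rows are all hypothetical, and the numbers are made up for illustration.

```python
import pandas as pd

# Illustrative rows only; these values are invented, not taken from the dataset.
snapshot = pd.DataFrame({
    "country": ["A", "B"],
    "total_cases": [1000, 500],
    "total_deaths": [20, 5],
    "total_recovered": [900, 400],
    "total_tests": [10000, 2000],
    "population": [1_000_000, 500_000],
})

def add_rate_columns(df):
    """Return a copy of df with per-country ratio columns (hypothetical names)."""
    out = df.copy()
    # Deaths as a percentage of reported cases.
    out["case_fatality_pct"] = 100 * out["total_deaths"] / out["total_cases"]
    # Recoveries as a percentage of reported cases.
    out["recovery_pct"] = 100 * out["total_recovered"] / out["total_cases"]
    # Testing volume scaled by population size.
    out["tests_per_million"] = 1e6 * out["total_tests"] / out["population"]
    # Share of tests that correspond to a reported case.
    out["positivity_pct"] = 100 * out["total_cases"] / out["total_tests"]
    return out

rates = add_rate_columns(snapshot)
print(rates[["country", "case_fatality_pct", "recovery_pct", "tests_per_million"]])
```

Rows with zero cases, tests, or population would produce `inf` or `NaN` here, so a real pipeline would likely filter or mask them first.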
Goal
To analyze global COVID-19 case trends, recovery rates, and mortality patterns in order to identify infection waves, regional impact differences, and data-driven public health insights.
Process
- Cleaned and transformed raw time-series datasets using Python
- Performed exploratory data analysis (EDA)
- Built SQL queries to extract country-level trends
- Developed Power BI dashboards for confirmed, active, recovered, and death cases
- Visualized growth rates and moving averages
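The growth-rate and moving-average step can be sketched in Pandas as follows. This assumes a long-format table with one row per country and date holding cumulative confirmed counts. The column names, the `add_trend_columns` helper, the 7-day window, and the sample values are assumptions for illustration, not details from the project.

```python
import pandas as pd

# Made-up cumulative counts for one hypothetical country.
ts = pd.DataFrame({
    "date": pd.date_range("2020-03-01", periods=10, freq="D"),
    "country": "X",
    "confirmed": [10, 20, 40, 70, 100, 150, 210, 280, 360, 450],
})

def add_trend_columns(df, window=7):
    """Add daily new cases, daily growth %, and a moving average per country."""
    out = df.sort_values(["country", "date"]).copy()
    # Previous day's cumulative count within each country.
    prev = out.groupby("country")["confirmed"].shift()
    # Daily new cases; negative corrections in the source data are clipped to 0.
    out["new_cases"] = (out["confirmed"] - prev).clip(lower=0)
    # Day-over-day growth relative to the previous cumulative count.
    out["growth_pct"] = 100 * out["new_cases"] / prev
    # Trailing moving average of new cases, left as NaN until the window is full.
    out["new_cases_ma"] = (
        out.groupby("country")["new_cases"]
        .transform(lambda s: s.rolling(window, min_periods=window).mean())
    )
    return out

trend = add_trend_columns(ts)
print(trend[["date", "new_cases", "growth_pct", "new_cases_ma"]])
```

Grouping by country before shifting and rolling keeps one country's figures from leaking into the next when several countries share a table.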
Highlight
Created an interactive dashboard that allowed filtering by country and time period, revealing infection peaks and recovery acceleration phases.
Figure 1. This graph shows how COVID-19 cases grew and changed over several months. Confirmed cases steadily increased, while recoveries also rose as more people got better, and deaths increased at a slower pace. The chart helps visualize how the pandemic evolved over time.
Figure 2. This chart shows how COVID-19 deaths were distributed among the 10 most affected countries. The United States accounts for the largest share, followed by Brazil, with the remaining countries contributing smaller portions. It highlights how a large share of these deaths was concentrated in just a few nations.
Figure 3. This chart shows how COVID-19 cases developed in Switzerland over time. Confirmed cases and recoveries rose sharply in early 2020 before gradually stabilizing, while deaths increased at a slower but steady pace. The trends highlight the rapid initial spread followed by a period of controlled growth.
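A top-10 breakdown like the one in Figure 2 can be produced with a single country-level SQL query. The sketch below uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without a database server. The `covid_summary` table, its columns, the `top_death_shares` helper, and the sample rows are hypothetical, and the query may need small adjustments (for example, the parameter placeholder style) for a PostgreSQL driver.

```python
import sqlite3

# Made-up (country, total_deaths) rows for illustration only.
rows = [("A", 500), ("B", 300), ("C", 200), ("D", 50)]

def top_death_shares(rows, n=10):
    """Return (country, total_deaths, share_pct) for the n countries with most deaths."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE covid_summary (country TEXT, total_deaths INTEGER)")
    conn.executemany("INSERT INTO covid_summary VALUES (?, ?)", rows)
    query = """
        WITH top AS (
            SELECT country, total_deaths
            FROM covid_summary
            ORDER BY total_deaths DESC
            LIMIT ?
        )
        SELECT country,
               total_deaths,
               -- Share is relative to the top-n total, not the global total.
               ROUND(100.0 * total_deaths / (SELECT SUM(total_deaths) FROM top), 1)
                   AS share_pct
        FROM top
        ORDER BY total_deaths DESC
    """
    result = conn.execute(query, (n,)).fetchall()
    conn.close()
    return result

shares = top_death_shares(rows, n=3)
for country, deaths, share in shares:
    print(country, deaths, share)
```

Note that the percentages are shares of the selected group's total, which matches a chart of the top 10 but not a share of worldwide deaths.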
Summary
The analysis revealed distinct epidemiological waves characterized by exponential acceleration, peak saturation, and gradual normalization phases. Clear regional disparities emerged, with infection velocity and recovery ratios varying significantly across geographies. Temporal correlations between case surges, recovery lag, and mortality trends highlighted the dynamic interplay between transmission intensity and healthcare response capacity.
Conclusion & Analysis
The COVID-19 dataset provided a real-world case study in exponential growth dynamics, nonlinear trend behavior, and time-lagged correlations under crisis conditions. By applying time-series modeling, growth-rate analysis, and comparative scaling (linear vs. logarithmic), the project translated raw case counts into interpretable patterns of contagion momentum and structural inflection points.
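One reason logarithmic scaling helps with exponential growth is that an exponential curve becomes a straight line on a log axis, and the slope of that line gives the doubling time. The sketch below illustrates that relationship with NumPy. It is an assumed example of the technique, not a calculation taken from the project, and the `doubling_time_days` helper and sample series are hypothetical.

```python
import numpy as np

def doubling_time_days(cumulative):
    """Estimate doubling time from a least-squares line fitted to log(counts).

    Assumes one observation per day and strictly positive counts.
    """
    counts = np.asarray(cumulative, dtype=float)
    days = np.arange(len(counts))
    # On a log scale, exponential growth is linear: log(c) = intercept + slope * day.
    slope, _intercept = np.polyfit(days, np.log(counts), 1)
    # Doubling means log(c) rises by log(2), which takes log(2) / slope days.
    return np.log(2) / slope

# Made-up series that doubles every day.
series = [100, 200, 400, 800, 1600]
print(doubling_time_days(series))
```

On real data the fit would normally be applied to a short window at a time, since the growth rate changes between waves.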
Beyond visualization, the work strengthened my ability to manage high-volume, multi-regional datasets, detect structural breaks, and extract signal from volatility. It reinforced a disciplined analytical workflow, from data cleaning and transformation through trend modeling to interpretation, that turns complex epidemiological data into decision-oriented insights.