
Data Analytics:
Premier League Deep-Dive

A deep dive into the English Premier League backed by real-life statistics.

Written and developed in Python with the pandas library.


Background

​

This project was completed as part of a Data Analytics class. After spending a semester learning the fundamentals of pandas and data analysis in Python, we were tasked with finding a suitable dataset and running a range of relevant analyses on it.

​

The project spanned the entire semester and gave us the opportunity to strengthen our programming skills while producing tangible evidence to support any conclusions we reached.

​

For each analysis, we began by identifying a clear and relevant question. In my project, for example, this might have been: "Which age group scores the most goals?" We would then follow the question with the appropriate code and a chart to help us explore and answer it.
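As a rough illustration of that question-then-code workflow (not the project's actual code), the age-group question could be approached along these lines. The sample rows, the column names `Age` and `Goals`, and the age bins are all assumptions made up for this sketch.

```python
import pandas as pd

# Hypothetical sample data; the real dataset's columns may be named differently.
players = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F"],
    "Age": [19, 23, 24, 28, 31, 35],
    "Goals": [2, 10, 7, 12, 5, 1],
})

# Assumed age bins; right=False means each bin includes its lower edge.
bins = [15, 22, 26, 30, 45]
labels = ["under 22", "22-25", "26-29", "30+"]
players["AgeGroup"] = pd.cut(players["Age"], bins=bins, labels=labels, right=False)

# Total goals scored by each age group.
goals_by_group = players.groupby("AgeGroup", observed=False)["Goals"].sum()

# The age group with the highest goal total.
top_group = goals_by_group.idxmax()
print(goals_by_group)
print("Top-scoring age group:", top_group)
```

With matplotlib installed, `goals_by_group.plot.bar()` would turn the same result into a chart.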

​

Below are some selected samples from my work on the project, along with a link to my GitHub where you can find the full version.

Pie Chart

​

In this analysis, I calculated the total number of English players in the league and broke it down by club to see how many each team had on their roster.
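A minimal sketch of this kind of count is shown here. The sample rows, the column names `Club` and `Nationality`, and the `"ENG"` country code are assumptions for illustration and may not match the actual dataset.

```python
import pandas as pd

# Hypothetical sample data; club names and nationality codes are made up.
players = pd.DataFrame({
    "Name": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "Club": ["Club A", "Club A", "Club B", "Club B", "Club B", "Club C"],
    "Nationality": ["ENG", "FRA", "ENG", "ENG", "BRA", "ENG"],
})

# Keep only the English players.
english = players[players["Nationality"] == "ENG"]

# Total number of English players in the league.
total_english = len(english)

# Number of English players at each club, largest first.
english_by_club = english["Club"].value_counts()

# Each club's share of all English players, as a percentage.
share_by_club = english_by_club / total_english * 100

print("English players in the league:", total_english)
print(english_by_club)
```

With matplotlib installed, `english_by_club.plot.pie()` would draw the breakdown as a pie chart.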

​

I then took the question a step further: why do some clubs have significantly more English (homegrown) players than others? What I found was that teams with higher numbers of domestic players were often those battling relegation. These clubs tended to rely more heavily on players from their youth academies or local transfers, likely due to financial constraints; domestic transfers typically cost less.

​

However, this strategy didn’t appear to be particularly effective. In fact, two of the five clubs with the most English players were relegated that season.

Pie Chart.png
Line Plot.png

Line Plot

​

In this analysis, I calculated the average save percentage for goalkeepers by age.
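One way such a calculation might look in pandas is sketched here. It assumes the data holds raw counts in hypothetical columns `Saves` and `ShotsOnTargetAgainst`. If the dataset already provides a save percentage column, the first step would not be needed.

```python
import pandas as pd

# Hypothetical goalkeeper data; names, ages, and counts are made up.
keepers = pd.DataFrame({
    "Name": ["GK1", "GK2", "GK3", "GK4"],
    "Age": [24, 24, 33, 37],
    "Saves": [60, 80, 90, 75],
    "ShotsOnTargetAgainst": [100, 100, 120, 100],
})

# Save percentage per goalkeeper: saves divided by shots on target faced.
keepers["SavePct"] = keepers["Saves"] / keepers["ShotsOnTargetAgainst"] * 100

# Average save percentage for each age, ordered by age for plotting.
avg_save_pct_by_age = keepers.groupby("Age")["SavePct"].mean().sort_index()
print(avg_save_pct_by_age)
```

With matplotlib installed, `avg_save_pct_by_age.plot.line()` would produce the line plot. Note that this averages the per-keeper percentages, so a keeper who faced few shots counts as much as one who faced many.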

​

The goal was to explore whether younger, more agile goalkeepers outperformed their older, more experienced counterparts, or whether experience alone was enough to keep the older keepers competitive.

​

From the plot, I observed a general trend: goalkeepers tend to improve with age, with many in their late 30s still performing at a high level.

 

Note: My dataset had a noticeable gap in goalkeepers around the age of 30. However, based on the overall trend in the graph, it's reasonable to assume that keepers in that age range performed similarly to those just above or below them.
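That assumption can be made explicit with linear interpolation across the missing ages. The sketch below is one possible way to do it, not something the original analysis necessarily did, and the values in it are invented for illustration.

```python
import pandas as pd

# Hypothetical average save percentages by age, with no keepers aged 29 to 31.
avg_save_pct_by_age = pd.Series({27: 68.0, 28: 70.0, 32: 74.0, 33: 75.0})

# Build the full run of ages from youngest to oldest observed keeper.
first_age = int(avg_save_pct_by_age.index.min())
last_age = int(avg_save_pct_by_age.index.max())
full_ages = range(first_age, last_age + 1)

# Reindex to expose the missing ages as NaN, then fill them linearly.
filled = avg_save_pct_by_age.reindex(full_ages).interpolate(method="linear")

# Ages whose values are estimates rather than observations.
estimated_ages = filled.index.difference(avg_save_pct_by_age.index)

print(filled)
print("Estimated ages:", list(estimated_ages))
```

Keeping track of `estimated_ages` makes it possible to mark the filled-in points differently on a plot, so readers can tell estimates from observed values.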

Screenshots

Full Project
