For this challenge we will investigate how to retrieve and filter data from a CSV file to perform a goals analysis of the 2018 Football World Cup.
We have collated all the goals scored during this World Cup in a CSV file, using the semicolon as a separator. Each line holds three fields: the player, the country and the time.
In this data, time represents the minute of the game in which the goal was scored.
Our aim is to write a collection of subroutines that extract and display relevant information from this text file. We will then display a menu so that the user can select which goals analysis to perform. The menu will offer the following options:
- A: Total number of goals scored by a given country
- B: Total number of goals scored by a given player
- C: List the names of all the players who scored for a given country
- D: Total number of goals by all countries
- E: Total number of goals scored during the first half (45 minutes)
- F: Total number of goals scored during the second half (45 minutes to 90 minutes)
- G: Total number of goals scored during extra time (after 90 minutes of play)
- X: Exit?
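Options E, F and G all depend on which period of the game a goal falls in. Here is a minimal sketch of that classification. It assumes that a goal recorded at minute 45 counts as first half and a goal at minute 90 counts as second half; the challenge does not say how stoppage time is recorded, so these boundaries are an assumption. The helper name `period_of` is hypothetical.

```python
def period_of(minutes):
    """Return which period of the game a goal belongs to.

    Assumed boundaries: up to 45 is first half, 46 to 90 is second half,
    anything above 90 is extra time.
    """
    if minutes <= 45:
        return "first half"
    elif minutes <= 90:
        return "second half"
    else:
        return "extra time"


# Made-up minute values, used only to show the three outcomes
for m in [12, 45, 46, 90, 105]:
    print(m, period_of(m))
```

Counting the goals for option E, F or G then comes down to adding one to a total each time `period_of` returns the matching period.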
File Handling in Python
When using a text file in Python, you should always start by opening it with the right file access mode. In this case we will open the file in read mode ("r").
file = open("goals.txt","r")
We can then access each line of the file one at a time using a for loop:
for line in file:
    print(line)
Because our file is a CSV file, each line of the file contains several fields (in our case three fields: player, country, minutes). We can extract this data by splitting the line using the separator ";".
for line in file:
    data = line.split(";")
    player = data[0]
    country = data[1]
    minutes = int(data[2])
    print(player + " from " + country + " scored a goal at the " + str(minutes) + "th minute")
You can filter the data by using an if statement. This code will perform a linear search to find all the goals scored by Spain:
for line in file:
    data = line.split(";")
    player = data[0]
    country = data[1]
    minutes = int(data[2])
    if country == "Spain":
        print(player + " from " + country + " scored a goal at the " + str(minutes) + "th minute")
Once you have completed your analysis, do not forget to close the text file:
file.close()
We have started the code for you, but you will need to add more subroutines to implement all the menu options.
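As an illustration of what one such subroutine could look like, here is a minimal sketch for option A. The function name `count_goals_by_country` and its parameters are hypothetical, and the sketch assumes every line of the file follows the player;country;minutes layout with no header row. It opens the file with a `with` statement, which closes the file automatically when the block ends.

```python
def count_goals_by_country(filename, country):
    """Linear search: count the lines whose country field matches."""
    total = 0
    with open(filename, "r") as file:  # the file is closed automatically
        for line in file:
            data = line.strip().split(";")
            if len(data) < 3:
                continue  # skip blank or incomplete lines
            if data[1] == country:
                total = total + 1
    return total
```

A handler for option A could then ask the user for a country with `input()` and print the value returned by `count_goals_by_country("goals.txt", country)`. Note that the comparison is case-sensitive, so "spain" would not match "Spain".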
Extension Task 1: World Cup Quiz
Write a Python program that randomly picks two countries from the list of countries which took part in the 2018 World Cup.
Ask the user to guess which of the two countries scored the most goals during this World Cup. Check their answer to see if they guessed right.
Here is some code to get you started. First we will generate a list of all the countries which scored goals during the 2018 World Cup:
countries = []
file = open("goals.txt","r")
for line in file:
    data = line.split(";")
    country = data[1]
    if country not in countries:
        countries.append(country)
file.close()
print(countries)
Then to randomly select a country from this list, we will use the random library:
import random
randomCountry = random.choice(countries)
Note that you should add the import statement at the very top of your code.
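Calling `random.choice()` twice could return the same country both times. One way to avoid this is `random.sample()`, which picks distinct items from a list. The sketch below assumes the goal totals are already stored in a dictionary. The helpers `pick_two` and `is_correct` are hypothetical, and the totals are invented for illustration rather than taken from goals.txt.

```python
import random


def pick_two(countries):
    """Return two different countries chosen at random from the list."""
    first, second = random.sample(countries, 2)
    return first, second


def is_correct(guess, first, second, totals):
    """Check whether the guessed country scored at least as many goals as the other."""
    other = second if guess == first else first
    return totals[guess] >= totals[other]


# Invented totals, for demonstration only
totals = {"Country X": 5, "Country Y": 2, "Country Z": 7}
first, second = pick_two(list(totals))
print("Which country scored more goals:", first, "or", second + "?")
```

In this sketch a tie counts as a correct guess for either country; you may prefer to handle ties differently.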
Extension Task 2: Top Scorers
Complete your code by adding three options to your main menu as follows:
- H: List the total number of goals scored for each country
- I: Which country scored the most goals?
- J: Which player scored the most goals?
Implementing these subroutines is more advanced than performing a linear search through the text file. Our tip is to use a hash table (dictionary data structure) to store the total number of goals scored by each country, and then use it to find out which country scored the most goals.
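The dictionary approach could be sketched as follows. This is one possible implementation under stated assumptions, not the only way to do it: the function names `goals_per_country` and `top_scoring_country` are hypothetical, the sample lines are invented, and each line is assumed to follow the player;country;minutes layout.

```python
def goals_per_country(lines):
    """Build a dictionary mapping each country to its number of goals."""
    totals = {}
    for line in lines:
        data = line.strip().split(";")
        if len(data) < 3:
            continue  # skip blank or incomplete lines
        country = data[1]
        # get() returns 0 the first time a country is seen
        totals[country] = totals.get(country, 0) + 1
    return totals


def top_scoring_country(totals):
    """Return the country with the highest total (the first one found on a tie)."""
    return max(totals, key=totals.get)


# Invented sample lines, for demonstration only
sample = [
    "Player A;Country X;12\n",
    "Player B;Country Y;50\n",
    "Player C;Country X;91\n",
]
totals = goals_per_country(sample)
print(totals)
print(top_scoring_country(totals))
```

Because `goals_per_country` accepts any sequence of lines, it can be given the open file object for goals.txt instead of the sample list. Using `data[0]` as the key instead of `data[1]` gives the totals per player needed for option J.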