London Underground – Journey Planner

london-underground-signThe aim of this Python challenge is to investigate how graphs can be used in Computer Science and investigate the key algorithms used when manipulating graphs in Python such as an algorithm to find the shortest path between two nodes of a graph.

For this challenge we will focus on a graph used to represent the London Underground Map (Zone 1).
tubemap

Access London Tube Map

Most programming languages do provide direct support for graphs as a data type. Python for instance does not have a graph data structure. However, graphs can be built out of lists and dictionaries. For instance, here is a graph to represent some of the connections between a selection of London tube stations:

tubemap = {"Aldgate" : ["Liverpool Street","Tower Hill"],
"Aldgate East" : ["Tower Hill","Liverpool Street"],
"Angel" : ["King's Cross St. Pancras","Old Street"],
"Baker Street" : ["Marylebone","Regent's Park","Edgware Road (C)","Great Portland Street","Bond Street"],
"Bank" : ["Liverpool Street","St. Paul's","London Bridge","Moorgate","Waterloo"],
"Barbican" : ["Farringdon","Moorgate"]
#...
}

This graph is a dictionary where the keys are the nodes of the graph. For each key, the corresponding value is a list containing the nodes that are connected by a direct arc from this node.

The full dictionary with all the stations for London Underground can be found in the Python trinkets below.

Find Path Algorithm


Our first algorithm is used to determine if there is a path between two nodes of a graph, and if so it returns one path that can be used to connect both nodes. A path is a list of nodes which are connected to each others.

This algorithm uses a backtracking approach to explore different paths. It does stop as soon as a path has been found. Note that the path that will be found first may not be the shortest path though.

def find_path(graph, start, end, path=[]):
  path = path + [start]
  if start == end:
    return path
  if not graph.has_key(start):
    return None
  for node in graph[start]:
    if node not in path:
      newpath = find_path(graph, node, end, path)
      if newpath: return newpath
  return None

Find All Paths Algorithm


This second algorithm is very similar to the find_path() algorithm. The difference is that this time the backtracking algorithm will explore and return all the possible paths between the two nodes as a list of paths. (List of lists).

def find_all_paths(graph, start, end, path=[]):
  path = path + [start]
  if start == end:
    return [path]
  if not graph.has_key(start):
    return []
  paths = []
  for node in graph[start]:
    if node not in path:
      newpaths = find_all_paths(graph, node, end, path)
      for newpath in newpaths:
        paths.append(newpath)
  return paths

Find Shortest Path Algorithm


Our final algorithm is probably the most useful algorithm as it will identify amongst all the paths that have been found, which one is the shortest path (The path that has the smallest number of nodes).

To make this algorithm more effective we also have a stopping condition to stop exploring a path that would be longer than the shortest path found so far.

def find_shortest_path(graph, start, end, shortestLength=-1, path=[]):
  path = path + [start]
  if start == end:
    return path
  if not graph.has_key(start):
    return None
  shortest = None
  for node in graph[start]:
    if node not in path:
      if shortestLength==-1 or len(path)<(shortestLength-1):
        newpath = find_shortest_path(graph, node, end, shortestLength, path)
        if newpath:
          if not shortest or len(newpath) < len(shortest):
            shortest = newpath
            shortestLength = len(newpath)  
  return shortest

Python Code


We will use the find_shortest_path() algorithm to find the shortest path between two tube stations.

Zone 1 OnlyFull Map (Zone 1 to 10)

With a more complex graph, this algorithm will take longer to find the shortest path.

Extension


A backtracking algorithm can sometimes take a very long time to explore all possible paths. This is due to the complexity of the graph (e.g. tube map) which contains many nodes and connections between nodes, hence the high number of potential routes to explore using a backtracking algorithm.

More advanced algorithms can be implemented to overcome this issue. You may want to explore the following algorithms:

Did you like this challenge?

Click on a star to rate it!

Average rating 3.9 / 5. Vote count: 7

No votes so far! Be the first to rate this post.

As you found this challenge interesting...

Follow us on social media!

Tagged with: , ,