PyCity

PyCity

A python app that automatically collects data from various cities’ open data portals for benchmarking. The City of Vancouver’s financial department manually collects data to benchmark and compare Vancouver’s metrics to other cities like New York City, Toronto, Seattle and Calgary. The process of data collection is labour intensive and time-consuming. Our simple app automated this task by leveraging open data portals and their APIs to extract relevant data.

School:

BCIT

Department:

School of Computing & Academic Studies

Course:

ISSP 3900

Instructors:

  • Jeremy Holman
  • Michal Aibin
  • Neda Changizi

City of Vancouver:

  • Connie McGinty
  • Stephanie Petsinis
  • Tadhg Healy
  • Ada Lam
  • Rene Cravioto

Student Team:

  • Rica De Vera
  • Sagarika
  • Paul Chen
  • Sajjin Nijjar
  • Sean Lim
  • Ryan Cho
  • Cedrick Abenes

Strategy:

  • Corporate Plan 2020

Term:

Spring 2021

Summary

The major challenge the Financial Planning and Analysis department at the City of Vancouver (COV) was looking to solve was their labour intensive data collection process. COV’s financial department refers to a list of benchmarking metrics that are used to compare Vancouver’s performance in every department with that of other cities, to ensure that the City’s financial resources are effectively allocated. This process was entirely manual, i.e. a person would search the web for the metric they were looking for and find the exact data point they needed for that specific metric. This can be an extremely time-consuming process.

Our team devised a solution wherein we could pull the latest data from the Open Data Portals that almost every city has. Our process would start by looking for the metric’s dataset in the Open Data Portal and comprising that data to put it in our app. So the input data provided to the app is the API Endpoint – the source link to the data, parse code – code that extracts the data and other relevant information to COV. This information is stored in a separate file in the app which is in a JSON (JavaScript Object Notation) format. When the app runs, it creates excel sheets specific to the city and every JSON object in the file is entered as a separate row in the excel sheet. Thus, the manual process of looking through tens of websites for a single data point was limited to 1 (the Open Data Portal) and the process of manually entering everything into excel sheets was eliminated altogether.

The process of automation wasn’t as easy as we thought it would be as there wasn’t a one-size-fits-all solution. We had to eliminate some cities like Montreal. The open data portal for Montreal was primarily in French, and our team members are not proficient in the language. Therefore, we had to skip on collecting data from Montreal as important details can get lost in translation along with the integrity of data. For the future of this project, our recommendation would be to provide all the data the cities have on their open data portal as the input and not just the selected metrics and organize it well. There is a potential of creating a user-friendly web app that can make the navigation through the app a lot easier and faster.