
End-to-end data pipeline that scrapes Toronto real estate prices from realtor.ca, enriches the data with postal codes (scraped from Google Maps search results), and creates area-based visualizations.


princeatul/real_estate_prices_toronto


real_estate_prices_toronto

Objective

The goal of this project was to create an end-to-end data pipeline that collects real estate price points in the Toronto area and visualizes them.

The steps involved in this project are:

  1. Scraping realtor.ca for the starting data points
  2. Data enrichment by adding a postal code for each address. I did this by scraping search results on Google Maps.
  3. Data transformation and analysis
  4. Visualizing the price points
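Step 2 depends on pulling a postal code out of scraped text. Here is a minimal sketch of that extraction, assuming the scraped Google Maps result is already available as a plain string. The function name `extract_postal_code` and the sample address are hypothetical and not taken from this project's code. It uses only the standard library `re` module and matches the Canadian `A1A 1A1` format.

```python
import re

# Canadian postal codes: letter-digit-letter, optional separator, digit-letter-digit.
# D, F, I, O, Q and U are never used; W and Z are not used in the first position.
POSTAL_CODE_RE = re.compile(
    r"\b([ABCEGHJ-NPRSTVXY]\d[ABCEGHJ-NPRSTV-Z])[ -]?(\d[ABCEGHJ-NPRSTV-Z]\d)\b",
    re.IGNORECASE,
)


def extract_postal_code(text):
    """Return the first postal code found in text as 'A1A 1A1', or None."""
    match = POSTAL_CODE_RE.search(text)
    if match is None:
        return None
    # Normalize to upper case with a single space between the two halves.
    return f"{match.group(1)} {match.group(2)}".upper()


# Hypothetical scraped search result, for illustration only.
sample = "100 Queen St W, Toronto, ON M5H 2N2, Canada"
print(extract_postal_code(sample))
```

Normalizing the spacing and case at extraction time keeps codes such as `m5h2n2` and `M5H 2N2` from being counted as different postal codes later in the analysis.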

Parts of this project can be used for the following exercises:

  1. Scraping realtor.ca using Python (Selenium and BeautifulSoup)
  2. Scraping search results for postal codes on Google Maps using Python (Selenium and BeautifulSoup)

List of the most relevant packages:

  1. Pandas
  2. Selenium
  3. BeautifulSoup
  4. Seaborn
  5. Folium

References

I used code from this repo to draw the choropleth map of Toronto. A GeoJSON file for Toronto is not readily available, so huge thanks to A Gordon. Relevant links:

  1. Medium post explaining this - medium post
  2. GitHub repo - github_choropleth

Please check the links above.

Some of the graphs created

no of listings by bedroom

Number of listings by number of bedrooms. One-bedroom apartments are the most common in the Toronto area.

no of listings by bathrooms

Number of listings by number of bathrooms. One-bathroom apartments are the most common.

Real estate price and number of listings

A histogram of price points. The distribution is right-skewed. One reason for this shape is that the data mixes all types of homes, e.g. 2-bedroom, 3-bedroom and 4-bedroom listings.
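A quick way to see right skew without a plot is to compare the mean with the median: a few very expensive listings pull the mean upward. The sketch below uses made-up prices and a hypothetical helper `is_right_skewed`; it is a rough heuristic, not a formal skewness test, and it is not the project's own code.

```python
from statistics import mean, median

# Made-up listing prices in CAD, for illustration only (not project data).
prices = [450_000, 520_000, 560_000, 610_000, 650_000, 720_000, 1_900_000, 3_200_000]


def is_right_skewed(values):
    """Rough check: in a right-skewed sample the mean sits above the median."""
    return mean(values) > median(values)


print(mean(prices), median(prices), is_right_skewed(prices))
```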

Postal Code wise Listings

This graph shows the number of listings per postal code. I kept only postal codes with more than 2 listings to focus on the high-frequency ones. The filter also helped highlight some errors in the postal codes.
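The "more than 2 listings" filter can be sketched with the standard library as below. The function name `frequent_postal_codes`, the `min_listings` parameter and the sample codes are assumptions made for illustration; the project itself lists Pandas among its packages, so its actual implementation may look different.

```python
from collections import Counter


def frequent_postal_codes(postal_codes, min_listings=3):
    """Count listings per postal code and keep codes seen at least min_listings times."""
    counts = Counter(postal_codes)
    return {code: n for code, n in counts.items() if n >= min_listings}


# Made-up sample: one code appears three times, the others fall below the threshold.
sample_codes = ["M5V 2T6", "M5V 2T6", "M5V 2T6", "M4Y 1A1", "M4Y 1A1", "M5V2T6"]
print(frequent_postal_codes(sample_codes))
```

In this sample the unspaced variant `M5V2T6` is counted separately and drops out, which is one way inconsistently formatted codes can become visible.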

City_area_mean_price

Mean price for different areas of Toronto, broken down by number of bedrooms.
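The aggregation behind this kind of chart is a mean grouped by two keys, area and bedroom count. A minimal standard-library sketch follows, assuming each listing is an `(area, bedrooms, price)` tuple; the function name, the tuple layout and the sample rows are hypothetical. With Pandas, which this project uses, a `groupby` on the two columns would be the natural equivalent.

```python
from collections import defaultdict
from statistics import mean


def mean_price_by_area_and_bedrooms(listings):
    """Group (area, bedrooms, price) tuples and return the mean price per (area, bedrooms)."""
    groups = defaultdict(list)
    for area, bedrooms, price in listings:
        groups[(area, bedrooms)].append(price)
    return {key: mean(prices) for key, prices in groups.items()}


# Made-up listings, for illustration only (not project data).
sample_listings = [
    ("Downtown", 2, 700_000),
    ("Downtown", 2, 800_000),
    ("Scarborough", 3, 900_000),
]
print(mean_price_by_area_and_bedrooms(sample_listings))
```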

geo_json toronto all

Choropleth map of Toronto for all listings.

geo_json toronto 2 bhk

Choropleth map of Toronto for all 2-bedroom real estate listings.

geo_json toronto 3bhk

Choropleth map of Toronto for all 3-bedroom real estate listings.

I have also posted this project here
