
Course: DATA 201 - Digital Visualizations

  

DATA 201 | Data Wrangling | Digital Visualizations

Assignment 4: Digital Visualization | Examining Rental Properties in Calgary


Displaying rental prices in descending order.

Visualization of how many rentals there are in each neighbourhood.

The average going prices for rentals in each neighbourhood.


1.

What are some distinctive observations between the attributes of neighbourhood and price?

Which neighbourhood, on average, houses the most expensive rentals?

Are there common attributes among the most expensive neighbourhoods?


Looking at the raw data in Excel, one can infer that the neighbourhood “WEST HILLHURST” had the highest sum of prices of any neighbourhood, a whopping $485. (I created a table, aggregated the sum of prices, and sorted it from largest to smallest.) I wanted to test whether the MEAN prices tell the same story in my analysis.

First, I looked at the rental count in each neighbourhood by using a Calculated Field with the Count function in Tableau. My analysis shows that “BELTLINE” accounts for a significant share of the rentals.

With the Count function, I could calculate each neighbourhood's MEAN price. I did this by taking the sum of the prices and dividing it by the count. With further analysis and the help of the borough attribute, I ultimately deduced that the three neighbourhoods with the most expensive rentals are all located in the “CENTRE” borough: “WINSTON HEIGHTS/MOUNTVIEW,” “WEST HILLHURST,” and “LOWER MOUNT ROYAL.”
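The Sum-divided-by-Count calculation described above can be sketched outside Tableau as well. This is only a minimal illustration in Python: the neighbourhood names and prices are made-up placeholders, not values from the assignment's data set, and the function name is hypothetical.

```python
from collections import defaultdict

# Hypothetical rows of (neighbourhood, price); NOT the assignment's data.
listings = [
    ("AREA A", 80),
    ("AREA A", 100),
    ("AREA A", 120),
    ("AREA B", 300),
    ("AREA C", 150),
    ("AREA C", 250),
]


def mean_price_by_neighbourhood(rows):
    """Return {neighbourhood: sum(price) / count(price)}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for neighbourhood, price in rows:
        totals[neighbourhood] += price
        counts[neighbourhood] += 1
    return {name: totals[name] / counts[name] for name in totals}


mean_prices = mean_price_by_neighbourhood(listings)

# Sort from most to least expensive, mirroring the sorted bar chart.
ranked = sorted(mean_prices.items(), key=lambda item: item[1], reverse=True)
for name, price in ranked:
    print(f"{name}: {price:.2f}")
```

Note that the neighbourhood with the most listings ("AREA A" here) is not necessarily the one with the highest mean, which is why the sum and the mean can tell different stories.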


I conclude that the least desirable neighbourhood is “WINSTON HEIGHTS/MOUNTVIEW.” First, it offers little selection (reflected in the Count analysis). Second, its average rental price is ridiculously high when other great alternatives exist.


Bar charts with varying orientations and orders are most effective here because prices were the focus of this analysis, specifically which rentals were most expensive and which neighbourhoods have the highest average rental prices. These visualizations are useful for someone wary about their budget. They address three major concerns of a rental consumer: one, the price at a specific location; two, selection (how frequently rentals occur in each neighbourhood); and three, which neighbourhoods to avoid. Bar charts are one of the most effective ways to display quantity.


Visualization of the counts for each of the borough categories.

The borough counts shown on a pie chart (left).

A screenshot of the distribution of boroughs, displayed as percentages.


2.

What is the distribution of the boroughs?

First, I determined the total number of entries. Then, I quantified the data by counting the entries in each category and recording the results. I determined that there are 500 entries in total: 269 in “CENTRE,” 19 in “EAST,” 38 in “NORTH,” 22 in “NORTHEAST,” 61 in “NORTHWEST,” 45 in “SOUTH,” 15 in “SOUTHEAST,” and lastly 31 in “WEST.”
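As a quick check on these counts, the short Python sketch below converts them to percentages of the total. The counts are the ones reported above; the variable names are my own.

```python
# Borough counts as reported in the paragraph above.
borough_counts = {
    "CENTRE": 269,
    "EAST": 19,
    "NORTH": 38,
    "NORTHEAST": 22,
    "NORTHWEST": 61,
    "SOUTH": 45,
    "SOUTHEAST": 15,
    "WEST": 31,
}

total = sum(borough_counts.values())  # should match the 500 entries

# Share of each borough as a percentage of all entries.
shares = {name: 100 * count / total for name, count in borough_counts.items()}

# Print from largest to smallest slice.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {share:.1f}%")
```

The counts do add up to 500, and “CENTRE” works out to 53.8% of all entries.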


It’s interesting to see that, proportionally, most rentals are located in the “CENTRE” borough. This means that the rental market in that location is very saturated. For an aspiring landlord, it would serve as a good indicator of where competition is strongest.


A pie chart best represents proportion because the whole pie represents 100%, and it works for any total quantity. Moreover, it’s an excellent visualization of dominance within a category, just as the “CENTRE” borough contains most of the rentals.


3.

Is there a relationship between room ID and the number of accommodations?

What is the proportion of the number of accommodations to the number of reviews?


Displaying the average number of reviews in each neighbourhood from highest to lowest.

Displaying the average number of accommodations in each neighbourhood.

Displaying the varying ratios between Accommodations and Reviews.


3.

Does the number of reviews directly relate to the number of accommodations?

Is there a relationship between accommodation and reviews?

Is there a uniform range in the ratio of accommodations to reviews?


Before I compared these two attributes, I first needed to analyze the MEAN of each one. I wanted to see if the number of accommodations reflects the level of engagement in the reviews, that is, whether housing more people results in a higher volume of reviews. By analyzing these attributes, I could see the level of engagement within each neighbourhood.

As I previously mentioned, the first task I had to tackle was identifying the MEAN of accommodations and reviews in each neighbourhood. I hit a standstill because I failed to realize that Tableau has a built-in average aggregation, so I ended up taking the Counts and Sums of each attribute and creating my own MEAN formula: Sum(attribute x) / Count(attribute x). It’s great how flexible Tableau is as software.


Looking solely at the average number of reviews in each neighbourhood, it's clear that the engagement level in “FOREST LAWN” is the highest. It does not, however, give insight into the quality of time/satisfaction you will have in that rental/neighbourhood.

Moving on to average accommodations, I found that “SPRINGBANK” has the highest average. This leads me to deduce that this neighbourhood provides the most flexibility regarding space and number of people. However, once again, it does not indicate the quality of time/satisfaction you can have in that neighbourhood.

My conclusion for this question is that there is no relationship between the number of reviews and the number of accommodations available within a neighbourhood. There is no uniform ratio between the average number of accommodations and the average number of reviews. The range extends from 1.4% to 225%. Moreover, when I sort by the mean of one attribute, the other doesn’t follow a similar pattern.
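The ratio check can be sketched as follows. This is a rough Python illustration under stated assumptions: the per-neighbourhood means are invented placeholders, and I assume the ratio is mean accommodations divided by mean reviews, expressed as a percentage. The function name is hypothetical.

```python
# Hypothetical per-neighbourhood means: (mean accommodations, mean reviews).
# These are placeholder values, NOT the assignment's data.
neighbourhood_means = {
    "AREA A": (2.0, 40.0),
    "AREA B": (4.0, 8.0),
    "AREA C": (6.0, 3.0),
}


def accommodation_review_ratios(means):
    """Return {neighbourhood: mean accommodations / mean reviews, in percent}."""
    ratios = {}
    for name, (mean_accommodations, mean_reviews) in means.items():
        if mean_reviews == 0:
            continue  # ratio is undefined when there are no reviews
        ratios[name] = 100 * mean_accommodations / mean_reviews
    return ratios


ratios = accommodation_review_ratios(neighbourhood_means)
lowest, highest = min(ratios.values()), max(ratios.values())
print(f"ratio range: {lowest:.1f}% to {highest:.1f}%")
```

If the two attributes moved together, the ratios would cluster in a narrow band. A wide spread, like the one in this toy example, is what suggests there is no uniform relationship.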


A layered bar chart best represents whether two attributes follow a trend because you can easily identify similarities and differences. Plus, adding colour allows you to easily distinguish neighbourhoods and avoid confusion. With three bar charts stacked on top of each other, each chart serves as a confirmatory measure. If I were to compare just the two Means and look at the overall shape of my bar chart, I wouldn't know for sure whether to reject or accept that there’s a trend between the two (looks can be misleading). By including the ratio, I can confidently reject the idea that the two Means have a relationship.

Displaying the filter for budget price.

Displaying the filter for desired satisfaction score.

Excluding borough areas from the visualization.

Including every neighbourhood in the analysis.

Screenshot of the final visualization.


4.

What is the average number of rooms in the “CENTRE” borough’s neighbourhoods with a satisfaction rating of over four and a budget of $100?

This question was very specific, so I decided that a pivot chart could best help me arrange the attributes in an organized fashion. I had to read the context of the question to assign each attribute to a specific parameter. The parameters within a pivot chart include “Filters,” “Legend(Series),” “Axis(Categories),” and “Values.” What I needed to identify was the “average number of rooms,” so I deduced that the attribute “bedrooms” would be my “Values.” I disregarded the parameter “Legend(Series)” as I didn’t have use for it. For “Axis(Categories),” I went with “neighbourhood” since I wanted to separate the locations within the “CENTRE” borough. Lastly, for filters, the criteria were that rentals needed to be within the “CENTRE” borough, have a satisfaction score of 4 and above, and fit a budget of $100.
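The same filter-then-average logic can be sketched in Python. The rows, field names, and function name below are assumptions made for illustration, not the data set's actual columns. I also assume that "a budget of $100" means a price of at most $100, and I follow the "4 and above" wording for the satisfaction filter.

```python
from collections import defaultdict

# Hypothetical rows; field names and values are placeholders.
listings = [
    {"borough": "CENTRE", "neighbourhood": "AREA A", "bedrooms": 3, "satisfaction": 4.5, "price": 95},
    {"borough": "CENTRE", "neighbourhood": "AREA A", "bedrooms": 1, "satisfaction": 5.0, "price": 60},
    {"borough": "CENTRE", "neighbourhood": "AREA B", "bedrooms": 2, "satisfaction": 3.5, "price": 80},   # fails satisfaction filter
    {"borough": "CENTRE", "neighbourhood": "AREA B", "bedrooms": 1, "satisfaction": 4.0, "price": 100},
    {"borough": "NORTH", "neighbourhood": "AREA C", "bedrooms": 4, "satisfaction": 5.0, "price": 90},    # wrong borough
    {"borough": "CENTRE", "neighbourhood": "AREA A", "bedrooms": 4, "satisfaction": 4.5, "price": 150},  # over budget
]


def average_bedrooms(rows, borough="CENTRE", min_satisfaction=4.0, max_price=100):
    """Apply the three filters, then average bedrooms per neighbourhood."""
    bedrooms = defaultdict(list)
    for row in rows:
        if (
            row["borough"] == borough
            and row["satisfaction"] >= min_satisfaction
            and row["price"] <= max_price
        ):
            bedrooms[row["neighbourhood"]].append(row["bedrooms"])
    return {name: sum(values) / len(values) for name, values in bedrooms.items()}


result = average_bedrooms(listings)
print(result)
```

Here the filters play the role of the pivot chart's “Filters,” the dictionary key plays the role of “Axis(Categories),” and the averaged bedrooms are the “Values.”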


A pivot chart is a good visualization for very specific questions as it is flexible and easy to manipulate. Moreover, using filters helps with the analysis process as it narrows the data set to just the required information. A horizontally oriented bar chart best represents the information because it makes quantities easy to compare. From the image above, I can immediately say that if I were looking for a place to stay with many people, my best choices would be “PARKHILL” and “HOUSEFIELD HEIGHT/BRIARHILL.” One way I could improve the visualization is to sort it from highest to lowest. However, I chose not to do that as the neighbourhoods with the highest averages are already easy to spot. My conclusion to this question is that with a $100 budget, one can have a satisfying stay at a rental and have at most three bedrooms.


Visualization of the parallel between the two attributes.


5.

Is there a correlation between the host ID and the number of reviews?

Looking at the host ID information alone, it’s hard to tell how many times each person hosted a room, especially with 500 or so entries. I could have used sorting and observation to work this out. However, that would have been too time-consuming and tedious. With the help of Tableau, and by changing the aggregation accordingly, I could see how often each host hosted a room. I then compared it to the number of reviews and found interesting results.
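Counting how often each host ID appears is the kind of aggregation that is tedious by hand but short in code. Below is a small Python sketch with invented host IDs and review counts (placeholders, not the assignment's data). It tallies listings and total reviews per host so the two can be compared side by side.

```python
from collections import Counter, defaultdict

# Hypothetical rows of (host_id, number_of_reviews); NOT the assignment's data.
rows = [(101, 12), (101, 3), (202, 7), (303, 0), (101, 5), (202, 1)]

# How many times each host ID appears, i.e. how many rooms each host listed.
listings_per_host = Counter(host_id for host_id, _ in rows)

# Total reviews attached to each host's listings.
reviews_per_host = defaultdict(int)
for host_id, reviews in rows:
    reviews_per_host[host_id] += reviews

# Print hosts from most to fewest listings, with their review totals.
for host_id, n_listings in listings_per_host.most_common():
    print(host_id, n_listings, reviews_per_host[host_id])
```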


From this information, I inferred that the number of reviews has a direct correlation with the number of hostings for every host. From this, it also seems valid to assume that reviews were left only by the hosts, not the guests. My previous analysis of the number of accommodations and reviews also supports this.


It’s interesting to see that reviews and hosts depend on one another. It’s unusual, considering that with rentals, guests usually review the location, not the host. Without the analysis, I would not have been able to uncover this fact, and I would have kept the assumption that guests made the reviews.


This visualization best represents frequency because it clearly displays how often each attribute (hosts and reviews) occurs. Moreover, by incorporating colour, I can display how saturated each host is in the hosting market. I think this graph can be improved by incorporating a date attribute (how many times each attribute occurs in a specific period). With this additional attribute, I could see specific times when the rental market is most profitable.