CS 360 Homework #2 - Multivariate Data

@chunheisiu


About the Author

Charles Chun Hei Siu

As someone passionate about both photography and computer science, I enjoy taking part in a variety of activities, taking on challenges, and meeting new people. My organizational and communication skills help me collaborate with others to achieve shared goals and objectives. I rarely hesitate to share what I know with people in need, or with the public through volunteer work.

Description

This visualization shows the pairwise relationships between the kid income quintile groups in a scatterplot matrix. The data comes from the Equality of Opportunity Project, specifically Table 2: Baseline Cross-Sectional Estimates by College, from the study Mobility Report Cards: The Role of Colleges in Intergenerational Mobility by Chetty, Friedman, Saez, Turner, and Yagan (2017). The table contains the authors' baseline estimates of the income distributions of parents and children by college.

I focus on colleges located in California and visualize the pairwise relationships between the kid income quintile groups at those colleges. The data is further split into two geographic regions within California: schools in Northern California are colored green, while schools in Southern California are colored red. This makes it possible to see whether kids' income is correlated with the geographic region of the college they attend, and whether schools in different regions prefer students from certain income groups.
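The filtering, coloring, and pairing described above can be sketched in plain JavaScript. This is only an illustrative sketch, not the code behind the visualization: the field names `state`, `region`, and `k_q1` through `k_q5` are assumptions about how the rows might be shaped, and the helper names are hypothetical. How each school is assigned to Northern or Southern California is not specified here, so the sketch assumes a `region` value is already attached to each row.

```javascript
// Hypothetical field names for the five kid income quintiles;
// the actual column names in the source table may differ.
const QUINTILE_FIELDS = ["k_q1", "k_q2", "k_q3", "k_q4", "k_q5"];

// Colors follow the description: green for Northern California,
// red for Southern California.
const REGION_COLORS = { north: "green", south: "red" };

// Keep only California colleges. Assumes each row has a `state` field.
function californiaRows(rows) {
  return rows.filter((row) => row.state === "CA");
}

// Map a row to its dot color. Assumes a `region` field of "north" or
// "south" was attached beforehand; anything else falls back to gray.
function colorFor(row) {
  return REGION_COLORS[row.region] || "gray";
}

// Enumerate every (x, y) pairing of fields. A scatterplot matrix draws
// one cell per pairing, so five fields yield 25 cells.
function matrixCells(fields) {
  const cells = [];
  fields.forEach((x, i) => {
    fields.forEach((y, j) => {
      cells.push({ x, y, column: i, row: j });
    });
  });
  return cells;
}
```

Each cell returned by `matrixCells(QUINTILE_FIELDS)` names the two fields to plot against each other, and `colorFor` can serve as the fill accessor for the dots in every cell.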

The scatter plot below shows a relatively strong negative correlation between the fifth quintile and each of the first, second, and third quintiles. This suggests that kids in the top income quintile tend to attend the same schools as other kids in the same high-income group rather than schools with many lower-income kids. It may also indicate that schools that already have a large population of high-income students tend to prefer students of higher socioeconomic status. Most of the schools that prefer high-income students are located in Northern California, which might coincide with the stronger economy in the north. On the other hand, there is a relatively strong positive correlation between the first and second quintiles, which suggests that kids in the bottom 40% of the income distribution tend to attend the same schools. Most of those schools are located in Southern California.
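The correlations above are read off the plot visually. One way to put a number on them would be the Pearson correlation coefficient between two quintile columns. The function below is a minimal sketch under that assumption; `pearson` is a hypothetical helper, not part of the original implementation, and no values from the actual dataset are reproduced here.

```javascript
// Pearson correlation coefficient between two equal-length numeric arrays.
// Returns a value in [-1, 1], or NaN when either array has zero variance.
function pearson(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("pearson needs two equal-length arrays with at least 2 values");
  }
  const n = xs.length;
  const meanX = xs.reduce((sum, v) => sum + v, 0) / n;
  const meanY = ys.reduce((sum, v) => sum + v, 0) / n;

  let covariance = 0;
  let varianceX = 0;
  let varianceY = 0;
  for (let k = 0; k < n; k++) {
    const dx = xs[k] - meanX;
    const dy = ys[k] - meanY;
    covariance += dx * dy;
    varianceX += dx * dx;
    varianceY += dy * dy;
  }
  // The 1/n factors cancel, so the raw sums can be used directly.
  return covariance / Math.sqrt(varianceX * varianceY);
}
```

Passing the fifth-quintile values as `xs` and the first-quintile values as `ys` for the same list of colleges would give a coefficient near -1 for a strong negative relationship and near 1 for a strong positive one. Computing it separately for the Northern and Southern California subsets would allow the two regions to be compared.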


Tableau Prototype

D3 Implementation

Based on modified code from https://bl.ocks.org/mbostock/3213173.


