Yuanyuan Ruan
I am a senior at USF majoring in data science. I love the data visualization class because it shows me the power of a variety of graphs to visualize data, which is fun and impressive.
In my free time, I love attending data science meetup events and teaching myself new skills. I am currently looking for a data science, analytics, or engineering job or internship, and I am learning and exploring machine learning skills.
This visualization shows the baseline estimates of parents’ and children’s income distributions by college in a scatterplot matrix. The data come from the Equality of Opportunity Project, Table 2: Baseline Cross-Sectional Estimates by College, from the study Mobility Report Cards: The Role of Colleges in Intergenerational Mobility by Chetty, Friedman, Saez, Turner, and Yagan (2017).
I focus on two sets of variables. First, I compare the mean and median incomes of parents and kids for colleges in California, filtered by three types of colleges: Selective public, Nonselective four-year private not-for-profit, and Two-year for-profit. Using the same filter, I then visualize the pairwise relationships among the fraction female and fraction married variables for kids from colleges in California.
What I find interesting is the correlation in the scatterplot matrix. From the matrix, we can easily see a very strong positive correlation between mean kid income and median kid income, with a correlation nearly equal to one. Another strong positive correlation is between mean parental income and median parental income, which is close to 0.8. Moreover, parental mean and parental median incomes have some positive correlation with kid mean income, which might imply that the higher the parental income, the more likely the kids are to have higher incomes as well, possibly because high-income parents can offer their kids a better quality of education.
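For readers who want to check correlations like the ones described above, here is a minimal Python sketch of the filter-then-correlate step. It is not the code behind the visualization. The column names (`state`, `tier`, `par_mean`, `par_median`, `k_mean`, `k_median`) and the helper names are assumptions for illustration, and the rows are made-up placeholder values, not figures from Table 2.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def filter_rows(rows, state, tiers):
    """Keep rows for one state whose college tier is in the given set."""
    return [r for r in rows if r["state"] == state and r["tier"] in tiers]

def correlation_matrix(rows, columns):
    """Pairwise Pearson correlations between the named columns."""
    series = {c: [float(r[c]) for r in rows] for c in columns}
    return {a: {b: pearson(series[a], series[b]) for b in columns}
            for a in columns}

# Placeholder rows with invented values (NOT real data from the study).
# Column names are assumed for illustration only.
rows = [
    {"state": "CA", "tier": "Selective public",
     "par_mean": 120.0, "par_median": 95.0, "k_mean": 60.0, "k_median": 52.0},
    {"state": "CA", "tier": "Two-year for-profit",
     "par_mean": 55.0, "par_median": 48.0, "k_mean": 28.0, "k_median": 25.0},
    {"state": "CA", "tier": "Nonselective four-year private not-for-profit",
     "par_mean": 80.0, "par_median": 60.0, "k_mean": 40.0, "k_median": 36.0},
    {"state": "NY", "tier": "Selective public",
     "par_mean": 110.0, "par_median": 90.0, "k_mean": 58.0, "k_median": 50.0},
]

tiers = {
    "Selective public",
    "Nonselective four-year private not-for-profit",
    "Two-year for-profit",
}
columns = ["par_mean", "par_median", "k_mean", "k_median"]

california = filter_rows(rows, "CA", tiers)
matrix = correlation_matrix(california, columns)

# Print each pairwise correlation, rounded for readability.
for a in columns:
    print(a, {b: round(matrix[a][b], 3) for b in columns})
```

With the real table loaded in place of the placeholder rows, each cell of `matrix` would correspond to one panel of the scatterplot matrix.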
Email: yruan2@dons.usfca.edu
LinkedIn: linkedin.com/in/tracyruan