Exploring a Dataset on Video Game Sales

Jordan Maulino
Final Project
CS360
Prof. Sophie Engle


Source: The Odyssey Online

Motivation

When deciding on a dataset for my final project, I wanted to choose something that I thought would be interesting to explore. As someone who spent a lot of time playing video games growing up, I was excited to stumble upon this dataset while perusing Kaggle. After looking at the dataset and seeing what kind of features it contained, I thought it would be fun to make some visualizations that provide insights into different types of video games, their ratings, and their sales.

The Data

I found this dataset on Kaggle.com, a website that hosts predictive modeling and analytics competitions. It contains close to 17,000 video game observations, each described by 16 columns.


Processing the data

I processed this dataset using Trifacta Wrangler to remove all incomplete entries (any entry with an empty or null value). This left me with ~6,900 complete observations to work with. The data didn't need much further processing after that. Although I decided to keep all of the columns, my visualizations only use the columns for a video game's name, platform, genre, sales, and user/critic scores. In addition, I took subsets of this processed data (using Excel) for my Parallel Coordinates and Multiline Time-Series visualizations to make the data a little easier to manipulate in D3.
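For anyone who would rather script the cleaning step than use Trifacta Wrangler, here is a minimal Python sketch of the same idea: keep only the rows where every field has a value. It is an illustration, not the exact process used for this project. The sample rows, the column names, and the set of tokens treated as "missing" are all assumptions made up for the example.

```python
import csv
import io

# Hypothetical sample rows for illustration only; the column names are
# assumptions and may not match the real Kaggle file's header.
SAMPLE_CSV = """Name,Platform,Genre,Global_Sales,Critic_Score,User_Score
Game A,Wii,Sports,10.5,80,7.9
Game B,PS2,Action,3.2,,8.1
Game C,X360,Shooter,5.0,88,N/A
Game D,DS,Puzzle,1.1,75,6.4
"""

# Tokens treated as missing values (an assumption; adjust for the real data).
MISSING = {"", "n/a", "na", "null", "nan"}


def is_complete(row):
    """Return True when every field is a non-empty, non-null-like string."""
    return all(
        isinstance(value, str) and value.strip().lower() not in MISSING
        for value in row.values()
    )


def drop_incomplete(rows):
    """Keep only the rows that have a value in every column."""
    return [row for row in rows if is_complete(row)]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
complete = drop_incomplete(rows)
print(f"{len(complete)} of {len(rows)} rows are complete")
```

To run this against a real file, the `io.StringIO(SAMPLE_CSV)` argument could be swapped for an opened CSV file, and the filtered rows written back out with `csv.DictWriter`.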

You can use the toolbar above to view each visualization.


About



Jordan Maulino


Hi, I'm Jordan, a fourth-year data science student at the University of San Francisco. At USF, I am president of Chi Upsilon Zeta (XYZ), a multicultural and social justice fraternity, and an active member of USF Kasamahan, our Filipino student org. In my free time, I enjoy playing guitar and boxing.
Here is my LinkedIn Profile if you'd like to get to know me a little more.