Work
Some things I've worked on
- decision trees
- logistic regression
- stochastic gradient descent
- backpropagation
- forward-backward algorithm
- Q-learning
Statistics Capstone
As part of the course 36-490 Undergraduate Research, I worked on a capstone project in collaboration with the Office of the Dean of Students. The project focused on identifying “high-impact” courses that are associated with increased student success, and then improving the college’s understanding of the quality of these experiences and how they align with learning targets in the General Education Program.
The project poster (above) was chosen as the winner of the 2021-2022 CMU Department of Statistics and Data Science Poster Competition.
Causal Inference Research
As part of the course 36-318 Introduction to Causal Inference, I worked on a final project researching the method of overlap weights for causal effect estimation and conducted a simulation study to compare the performance of overlap weighting to other standard weighting methods such as inverse propensity weighting.
The performance of these methods was assessed by computing bias, standard deviation, and confidence interval coverage on simulated datasets, and the results aligned with those found in prior published work on overlap weights. I also extended the project by checking via simulation that the sandwich-based variance estimator for the estimated treatment effect is consistent.
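For readers unfamiliar with this kind of comparison, here is a minimal sketch of a simulation study in Python with numpy. It is not the project's code: the data-generating process, the sample sizes, the function names, and the use of the true propensity score (rather than an estimated one) are all assumptions made for illustration. It also covers only bias and standard deviation, not confidence interval coverage.

```python
import numpy as np

def simulate(n, rng, tau=1.0):
    """Draw one dataset with a single confounder and a constant treatment effect.

    The data-generating process here is an illustrative assumption.
    """
    x = rng.normal(size=n)
    e = 1.0 / (1.0 + np.exp(-1.5 * x))    # true propensity score
    z = rng.binomial(1, e)                 # treatment indicator
    y = tau * z + x + rng.normal(size=n)   # outcome, confounded by x
    return y, z, e

def weighted_diff(y, z, w):
    """Normalized weighted difference in means between treated and control."""
    treated = np.sum(w * z * y) / np.sum(w * z)
    control = np.sum(w * (1 - z) * y) / np.sum(w * (1 - z))
    return treated - control

def ipw_weights(z, e):
    # Inverse propensity weights: 1/e for treated units, 1/(1-e) for controls.
    return z / e + (1 - z) / (1 - e)

def overlap_weights(z, e):
    # Overlap weights: 1-e for treated units, e for controls.
    return z * (1 - e) + (1 - z) * e

def run_study(n_reps=500, n=1000, tau=1.0, seed=0):
    """Repeat the simulation and summarize each estimator's bias and SD."""
    rng = np.random.default_rng(seed)
    estimates = {"ipw": [], "overlap": []}
    for _ in range(n_reps):
        y, z, e = simulate(n, rng, tau)
        estimates["ipw"].append(weighted_diff(y, z, ipw_weights(z, e)))
        estimates["overlap"].append(weighted_diff(y, z, overlap_weights(z, e)))
    return {
        name: {"bias": float(np.mean(v) - tau), "sd": float(np.std(v, ddof=1))}
        for name, v in estimates.items()
    }

results = run_study()
for name, summary in results.items():
    print(name, summary)
```

Because the treatment effect in this sketch is constant, both weighting schemes target the same value, so their bias and standard deviation can be compared directly. With heterogeneous effects the two methods estimate different quantities, and the comparison would need more care.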
Tableau Iron Viz 2020
I submitted a visualization to the 2020 Tableau Iron Viz Qualifier!
I gathered and compiled data about diets and created a long-form interactive visualization in Tableau for the qualifier competition. You can find the link to the full visualization here!
Update: I was chosen as a Runner-Up! I placed 6th overall out of over 370 entrants. Check out the Tableau blog post here.
ML Algorithm Implementation
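As part of the course 10-301 Introduction to Machine Learning, I implemented several ML algorithms from scratch in Python with numpy, including decision trees, logistic regression, stochastic gradient descent, backpropagation, the forward-backward algorithm, and Q-learning.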
These were used to implement end-to-end systems that were able to learn several interesting tasks, such as sentiment polarity analysis, handwritten letter classification, named entity recognition, and solving the OpenAI Gym Mountain Car environment.
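To give a flavor of what "from scratch" means here, the following is a minimal sketch of binary logistic regression trained with stochastic gradient descent using only numpy. It is not the coursework code: the function names, learning rate, epoch count, and toy dataset are all assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def train_logreg_sgd(X, y, lr=0.1, epochs=20, seed=0):
    """Fit binary logistic regression with per-example SGD.

    Minimizes the negative log-likelihood; lr and epochs are
    illustrative defaults, not tuned values.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])   # fold the bias into the features
    theta = np.zeros(d + 1)                # last entry is the bias term
    for _ in range(epochs):
        for i in rng.permutation(n):       # visit examples in random order
            # Gradient of the per-example loss: (prediction - label) * features
            grad = (sigmoid(Xb[i] @ theta) - y[i]) * Xb[i]
            theta -= lr * grad
    return theta

def predict(theta, X):
    """Return hard 0/1 predictions by thresholding the probability at 0.5."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return (sigmoid(Xb @ theta) >= 0.5).astype(int)

# Toy data (an assumption for the demo): two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(-2.0, 1.0, size=(100, 2)),
    rng.normal(2.0, 1.0, size=(100, 2)),
])
y = np.concatenate([np.zeros(100), np.ones(100)])

theta = train_logreg_sgd(X, y)
accuracy = float(np.mean(predict(theta, X) == y))
print("training accuracy:", accuracy)
```

The same update loop carries over to other models by swapping in a different per-example gradient.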
Big Data Bowl
With my partner Peter Wu, I participated in the 2019 Big Data Bowl, a sports analytics competition in which contestants used NFL player tracking data to analyze and rethink player performance.
Our team was selected as one of four college finalists, and we presented our findings to members of the NFL league office, team executives, and league sponsors at the 2019 NFL Combine in Indianapolis, which was an awesome experience.
We used statistical modeling in R to create an expected catch probability model and used this model to consider new standards for pass interference; in our paper we proposed a novel two-level system for interference calls.
You can find the paper here, and find all the code, data, and figures for the project on Peter’s GitHub.