I'm an MS Business Analytics and Information Management graduate student with five years of experience in the financial industry and a passion for analytics. Proficient in Python, Tableau, MySQL, and AWS, among other tools, I use quantitative methods to solve business challenges and drive innovation. With hands-on experience in data-driven projects, I bring a versatile skill set to any analytics team.
April 2024
This project tackled inadequate product descriptions in e-commerce using Large Language Models and image-to-text technology. Working with a national retailer, we proposed a novel solution to improve the quality of product descriptions for 111,000 unique items. The solution enhanced 73% of the products that had insufficient descriptions or only images, improving the customer experience.
April 2024
Our team represented Purdue University's Daniels School of Business in the case competition. We analyzed the business of a leading sustainable aluminum packaging solutions firm and proposed actionable recommendations. We performed data analysis, drew on external sources, and employed advanced modeling techniques such as PCA, clustering, and LLMs. With a focus on operational excellence and HR strategies, our presentation engaged the panel, leading to lively discussions and valuable feedback from the judges.
February 2024
This project aimed to predict ticket purchases and identify whether tickets would be bought on the primary or secondary market for NCAA Division I Women’s Basketball. The methodology included data exploration, feature engineering, and model building. Our final model was Gradient Boosting, which secured 8th place out of 54 teams from 4 universities. Results were visualized using Tableau to highlight key trends and the model's effectiveness in predicting ticket purchases.
February 2024
In this project, our team predicted Walmart's sales using the dataset from a Kaggle competition. We applied tree-based models, ensemble methods, Deep Neural Networks, Long Short-Term Memory (LSTM) networks, and Transformers. We achieved a weighted root mean squared scaled error (WRMSSE) of 0.686.
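For readers unfamiliar with the metric, here is a minimal sketch of how a weighted root mean squared scaled error can be computed. It is an illustration under simplifying assumptions, not the competition's scoring code: the function names `rmsse` and `weighted_rmsse` are hypothetical, and the weights are assumed to be supplied by the caller and to sum to 1.

```python
from math import sqrt


def rmsse(train, actual, forecast):
    """Root mean squared scaled error for a single series."""
    # Scale: mean squared error of the one-step naive forecast on the training data.
    naive_sq = [(train[t] - train[t - 1]) ** 2 for t in range(1, len(train))]
    scale = sum(naive_sq) / len(naive_sq)
    # Mean squared error of the forecast over the evaluation horizon.
    mse = sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    return sqrt(mse / scale)


def weighted_rmsse(series, weights):
    """Weighted sum of per-series RMSSE values.

    `series` is a list of (train, actual, forecast) tuples; `weights` are
    assumed to sum to 1 (for example, each series' share of revenue).
    """
    return sum(w * rmsse(*s) for s, w in zip(series, weights))
```

A value below 1 means the forecast beats, on average, a naive "repeat the last observation" forecast measured on the training period.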
December 2023
Our team developed a model to detect inappropriate content in posts on Craigslist, a classified advertisements website. We scraped the website to obtain our training data and used a large language model to label the data. We then selected logistic regression as the final model from a range of candidates, including SVM, Naive Bayes, Gradient Boosting, and Deep Neural Networks. The final model achieved an AUC score of 0.88 and can be applied to provide a better customer experience.
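As a reminder of what the AUC score measures, the sketch below computes it directly from its pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The function name `auc_score` and the sample data are hypothetical; this is not the project's evaluation code.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = inappropriate post) and model scores.
print(auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations rather than large evaluation sets.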
November 2023
I achieved 4th place among 290 student teams nationwide on the Kaggle Leaderboard by leveraging Large Language Models (LLMs) to automate information extraction from medical documentation. Through careful model selection and prompt engineering, I optimized Named Entity Recognition (NER) outputs for precise extraction of vital patient data. With post-processing steps such as filling in missing values, reformatting, and error checking, I attained a Word Error Rate of 0.54.
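For context, Word Error Rate is commonly defined as the word-level edit distance (substitutions, deletions, and insertions) between a reference and a hypothesis, divided by the number of reference words. The sketch below follows that common definition; the competition's exact scoring may differ, and the function name `word_error_rate` and the sample strings are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
print(word_error_rate("patient takes aspirin daily", "patient takes ibuprofen"))
```

Lower is better, and because insertions count as errors the value can exceed 1.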
November 2023
This project proposed a data-driven business strategy for Airbnb. First, we showed that superhosts earn higher revenue than regular hosts using difference-in-differences estimation. Second, we built a gradient-boosting model to predict potential superhosts. Lastly, based on our revenue-prediction model, we recommended that Airbnb offer customized incentives to these potential superhosts to drive its profit.
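The core idea of the estimation can be illustrated with the simplest two-group, two-period form: the change in average revenue for hosts who became superhosts minus the change for hosts who did not. This is a minimal sketch with made-up numbers, not the project's actual specification, which may have used a regression with controls; the function name `did_estimate` is hypothetical.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two difference-in-differences estimate of the treatment effect."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # Subtracting the control group's change removes the shared time trend.
    return treated_change - control_change


# Hypothetical monthly revenue before and after superhost status is gained.
effect = did_estimate(
    treated_pre=[100, 120], treated_post=[150, 170],
    control_pre=[90, 110], control_post=[100, 120],
)
print(effect)
```

The estimate can be read as causal only if both groups would have followed parallel revenue trends without the status change.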