Hi, I’m Renee, and I’m a senior at the University of Washington double majoring in Informatics (Data Science) and Economics, graduating in Spring 2026. You can find documentation of all my past projects here.
Developed and tuned deepfake detection models using a Support Vector Classifier and a Multi-Layer Perceptron, selecting hyperparameters through grid search with cross-validation, and achieved 82% accuracy on held-out test data. Preprocessed a dataset of 2,000 real and AI-generated facial images, applying grayscale conversion and eigenface principal component analysis to extract key visual features, such as texture irregularities, color distortions, and high-frequency noise, that serve as indicators of AI-generated content.
Libraries: pandas, numpy, scikit-learn, matplotlib
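As a rough illustration of the approach described above (not the project's actual code), the sketch below chains PCA, the eigenface step when applied to flattened face images, with a Support Vector Classifier and tunes both with scikit-learn's GridSearchCV. The synthetic 16x16 "images", the parameter grid, and all variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for flattened grayscale face images (assumption: the real
# project loaded 2,000 image files; each row here is a fake 16x16 image).
rng = np.random.default_rng(0)
n_per_class, n_pixels = 100, 16 * 16
real = rng.normal(0.0, 1.0, size=(n_per_class, n_pixels))
fake = rng.normal(0.6, 1.0, size=(n_per_class, n_pixels))
X = np.vstack([real, fake])
y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = real, 1 = AI-generated

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale pixels, project onto principal components, then classify with an SVC.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(whiten=True, random_state=0)),
    ("svc", SVC()),
])

# Hypothetical search grid; the values actually searched in the project are unknown.
param_grid = {
    "pca__n_components": [10, 20],
    "svc__C": [0.1, 1, 10],
    "svc__kernel": ["linear", "rbf"],
}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

# Accuracy of the best cross-validated configuration on the held-out split.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

Putting PCA inside the pipeline means the components are refit on each cross-validation fold, so the held-out fold never influences the projection.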
Developed an interactive web application, MobySearch, that lets users search and explore video game data from MobyGames. Crawled and pre-processed 1,026 documents using wget and BeautifulSoup, then indexed the data using python-terrier without stemming or stopword removal. Retrieval with HiemstraLM achieved a mean average precision of 21%, R-precision of 10%, and reciprocal rank of 23%. Logged user searches to surface weekly trending queries and personalized recommendations, complementing the search features with a popularity-based recommendation system.
Libraries: pandas, numpy, python-terrier, BeautifulSoup
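For readers unfamiliar with the three evaluation metrics reported above, here is a small, library-free sketch of how each is computed for a single query. The document IDs and relevance judgments are invented for illustration; the project's own evaluation may have used python-terrier's built-in tooling instead.

```python
def average_precision(ranked, relevant):
    """Average of precision@k over the ranks k where a relevant doc appears."""
    hits, total = 0, 0.0
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0


def r_precision(ranked, relevant):
    """Precision within the top R results, where R is the number of relevant docs."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for doc in ranked[:r] if doc in relevant) / r


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant result, or 0 if none is retrieved."""
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / k
    return 0.0


# Hypothetical ranking and judgments for one query.
ranked = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d9"}

ap = average_precision(ranked, relevant)
rp = r_precision(ranked, relevant)
rr = reciprocal_rank(ranked, relevant)
print(round(ap, 4), round(rp, 4), rr)
```

Mean average precision is the mean of `average_precision` across all evaluation queries; the other two metrics are averaged over queries in the same way.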
Developed a comprehensive relational database system for Furry Friends Shelter, a conceptual no-kill shelter for cats and dogs, to streamline the adoption process, enhance health monitoring, and improve overall animal care. Utilized relational database design principles to create a schema with 10 entities. Implemented critical SQL queries for tracking adoptions, managing health records, and optimizing animal care, resulting in over 200 health logs, 100+ vaccinations, and 150+ medication administrations.
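To give a flavor of this kind of schema and query, the sketch below builds three tables in an in-memory SQLite database and lists animals still awaiting adoption with their health log counts. The table names, columns, and sample rows are hypothetical; the actual project schema has 10 entities that are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce REFERENCES constraints

# Hypothetical subset of a shelter schema (three of the entities only).
conn.executescript("""
CREATE TABLE animal (
    animal_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    species   TEXT NOT NULL CHECK (species IN ('cat', 'dog'))
);
CREATE TABLE health_log (
    log_id    INTEGER PRIMARY KEY,
    animal_id INTEGER NOT NULL REFERENCES animal(animal_id),
    log_date  TEXT NOT NULL,
    note      TEXT
);
CREATE TABLE adoption (
    adoption_id   INTEGER PRIMARY KEY,
    animal_id     INTEGER NOT NULL UNIQUE REFERENCES animal(animal_id),
    adopter_name  TEXT NOT NULL,
    adoption_date TEXT NOT NULL
);
""")

# Made-up sample data.
conn.executemany("INSERT INTO animal VALUES (?, ?, ?)", [
    (1, "Miso", "cat"), (2, "Biscuit", "dog"), (3, "Pepper", "cat"),
])
conn.executemany(
    "INSERT INTO health_log (animal_id, log_date, note) VALUES (?, ?, ?)", [
        (1, "2024-01-05", "intake exam"),
        (1, "2024-02-01", "follow-up"),
        (2, "2024-01-10", "intake exam"),
    ])
conn.execute(
    "INSERT INTO adoption (animal_id, adopter_name, adoption_date) VALUES (?, ?, ?)",
    (2, "Sample Adopter", "2024-03-01"),
)

# Animals not yet adopted, with the number of health logs recorded for each.
rows = conn.execute("""
    SELECT a.name, COUNT(h.log_id) AS n_logs
    FROM animal AS a
    LEFT JOIN health_log AS h ON h.animal_id = a.animal_id
    LEFT JOIN adoption AS ad ON ad.animal_id = a.animal_id
    WHERE ad.adoption_id IS NULL
    GROUP BY a.animal_id, a.name
    ORDER BY a.name
""").fetchall()
print(rows)
```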
Designed and implemented a scalable MongoDB health metrics database, optimizing schema design with nested arrays and indexing strategies to enhance query performance, scalability, and data retrieval efficiency. Leveraged Python to implement ETL processes that extracted relational data, transformed it into a non-relational format, and loaded it into MongoDB, streamlining the migration of ~250k rows of country health data and optimizing data integration workflows. Developed and optimized complex aggregation queries to extract insights, improving query execution runtimes by 30%.
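The transform step of such an ETL process can be sketched as follows: flat relational rows are grouped into one document per country, with yearly metrics stored in a nested array. The field names (`country`, `year`, `metric`, `value`, `records`) and the sample values are placeholders, not the project's real schema or data.

```python
from collections import defaultdict


def rows_to_documents(rows):
    """Group flat (country, year, metric, value) rows into nested documents."""
    grouped = defaultdict(lambda: defaultdict(dict))
    for row in rows:
        grouped[row["country"]][row["year"]][row["metric"]] = row["value"]
    return [
        {
            "country": country,
            # One nested entry per year, holding every metric for that year.
            "records": [
                {"year": year, **metrics} for year, metrics in sorted(years.items())
            ],
        }
        for country, years in sorted(grouped.items())
    ]


# Placeholder rows standing in for the extracted relational data.
rows = [
    {"country": "Country A", "year": 2019, "metric": "life_expectancy", "value": 70.0},
    {"country": "Country A", "year": 2019, "metric": "infant_mortality", "value": 20.0},
    {"country": "Country A", "year": 2020, "metric": "life_expectancy", "value": 70.5},
    {"country": "Country B", "year": 2019, "metric": "life_expectancy", "value": 80.0},
]

documents = rows_to_documents(rows)
print(documents)
# With pymongo, a list like this could then be loaded via collection.insert_many(documents).
```

Nesting the yearly records under each country lets a single document read return a country's full history, at the cost of larger documents as years accumulate.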
Designed the high-level software architecture for a scalable, event-driven data platform for smart home security devices, leveraging AWS, Azure, and NoSQL databases to define real-time data validation processes and facilitate system communication. Clarified ambiguity in system requirements by defining MVP features, identifying dependencies, and prioritizing features. Developed REST API contracts to ensure seamless data exchange between components, focusing on reliability and low-latency communication. Outlined comprehensive logging, monitoring, and alerting schemas to improve system observability and guide fault-tolerant design.
Wrote a 10-page analysis of 5 Vanguard ETFs from the Wealthfront Classic Portfolio. Used R to calculate returns, sample statistics, and Value-at-Risk, then constructed and visualized efficient portfolios for different expected return targets to determine the optimal investment.
Libraries: tidyverse, zoo, xts, PerformanceAnalytics
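The analysis itself was done in R; purely for illustration, here is a Python sketch of two of the calculations mentioned: simple returns from a price series and a historical (empirical-quantile) Value-at-Risk. The quantile convention and the sample numbers are assumptions and may differ from what PerformanceAnalytics computes by default.

```python
import math


def simple_returns(prices):
    """Period-over-period simple returns: p_t / p_(t-1) - 1."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]


def historical_var(returns, alpha=0.05):
    """Historical VaR as a positive loss: the negated empirical alpha-quantile.

    Uses one simple convention: the ceil(alpha * n)-th smallest return.
    """
    if not returns:
        raise ValueError("returns must not be empty")
    ordered = sorted(returns)
    index = max(0, math.ceil(alpha * len(ordered)) - 1)
    return -ordered[index]


# Made-up returns, not data from the ETFs in the report.
sample_returns = [-0.04, -0.01, 0.0, 0.02, 0.03, -0.02, 0.01, 0.05]
var_25 = historical_var(sample_returns, alpha=0.25)
print(var_25)
```

Under this convention, `var_25` is the loss exceeded in roughly a quarter of the sampled periods.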
Leading development of a research-based misinformation game directory to promote information and media literacy resources to educators and students. Collected metadata on 45 misinformation games, and developed a search and filter system based on target age, number of players, genre, and format.