Projects

Here are some of the key projects I've worked on that demonstrate my skills in data science, machine learning, and (lightly) web development.

Rust LLM Web Server

Llama on Rust

Developed a lightweight web interface, written in Rust, for interacting with open-source large language models (LLMs). Built on a clean client-server architecture that uses mistral.rs for the backend and supports several model formats, including GGUF and Hugging Face models. Sampling is configurable through temperature, top-p, and max-token parameters.
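
To illustrate what the temperature, top-p, and max-token settings control, here is a minimal Python sketch. It is not the project's Rust code and does not use the mistral.rs API; `sample_token`, `generate`, and the `next_logits` callback are hypothetical names chosen for this example.

```python
import math
import random


def sample_token(logits, temperature=0.7, top_p=0.9, rng=None):
    """Sample one token index from raw logits using temperature and top-p."""
    rng = rng or random.Random()
    if temperature <= 0:
        # Assumption: treat a non-positive temperature as greedy decoding.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the smallest set of most likely tokens whose mass reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # random.choices renormalizes the weights of the kept tokens.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


def generate(next_logits, max_tokens=8, temperature=0.7, top_p=0.9, rng=None):
    """Call next_logits(tokens) repeatedly, stopping after max_tokens tokens."""
    tokens = []
    for _ in range(max_tokens):
        logits = next_logits(tokens)
        tokens.append(sample_token(logits, temperature, top_p, rng))
    return tokens
```

Lower temperatures and smaller top-p values concentrate sampling on the most likely tokens, while max tokens simply caps the length of the response.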

Python XGBoost Data Visualization

Airbnb Pricing Prediction Model

Developed a machine learning model to predict Airbnb rental prices in NYC using a dataset of 48,000+ listings. Implemented feature engineering, data preprocessing, and model selection; the final XGBoost model explained 64% of the variance in price. Identified key price determinants such as room type, location, and occupancy rate.
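
For readers unfamiliar with the metric, "variance explained" is the coefficient of determination, R² = 1 − SS_res / SS_tot. The sketch below shows how it is computed; the function name and the four prices are illustrative only and are not taken from the project's data.

```python
def r_squared(actual, predicted):
    """Return R^2, the share of variance in `actual` explained by `predicted`."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(actual) / len(actual)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


# Made-up nightly prices (USD) and predictions, purely for illustration.
prices = [100, 150, 200, 250]
preds = [110, 140, 210, 240]
score = r_squared(prices, preds)
```

A model that always predicts the mean price scores 0, and a perfect model scores 1, so a score of 0.64 means the model removes 64% of the squared error of that mean-only baseline.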

Python MS Azure SQL Machine Learning Data Visualization

MBTA System Analysis

Developed a data analytics system to identify and address bottlenecks in Boston's transit system. Implemented a cloud-based architecture on MS Azure that integrates real-time MBTA operational data with weather information to predict headway and dwell times. Used gradient-boosted trees to achieve up to 77% prediction accuracy on both headway and dwell times, revealing significant correlations between weather conditions and transit delays.
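
As a rough sketch of the feature preparation such a system needs, the Python below derives dwell time (departure minus arrival at a stop) and headway (gap between consecutive arrivals at the same stop), then attaches the weather observed in the hour of arrival. The record layout, field names, and function names are assumptions made for this example, not the project's actual schema.

```python
from datetime import datetime


def add_headway_and_dwell(events):
    """events: dicts with 'stop', 'arrival', 'departure' (datetime values)."""
    last_arrival = {}
    rows = []
    for event in sorted(events, key=lambda e: e["arrival"]):
        previous = last_arrival.get(event["stop"])
        dwell = (event["departure"] - event["arrival"]).total_seconds()
        # The first vehicle seen at a stop has no headway.
        headway = None
        if previous is not None:
            headway = (event["arrival"] - previous).total_seconds()
        rows.append({**event, "dwell_s": dwell, "headway_s": headway})
        last_arrival[event["stop"]] = event["arrival"]
    return rows


def join_hourly_weather(rows, weather_by_hour):
    """Attach the weather record keyed by the hour of each arrival."""
    joined = []
    for row in rows:
        hour = row["arrival"].replace(minute=0, second=0, microsecond=0)
        joined.append({**row, **weather_by_hour.get(hour, {})})
    return joined
```

The joined rows can then be fed to any regression model, with headway or dwell time as the target and the weather fields as features.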

Python Machine Learning Data Visualization

MLB Player Analysis

Analyzed MLB player data to determine the ideal height and weight for successful players. Used PCA and K-means clustering to identify player types, and linear regression to quantify the relationship between performance metrics (Wins Above Replacement, Runs Created) and player salaries. Also searched for the height and weight combination that maximizes a player's average WAR.
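
One simple way to approach the last question is to bin players by height and weight and compare the average WAR per bin. The sketch below does exactly that; it is an illustrative approach, not necessarily the method used in the project, and the function name, bin sizes, and sample players are all hypothetical.

```python
from collections import defaultdict


def best_height_weight_bin(players, height_step=2, weight_step=10, min_players=2):
    """players: (height_in, weight_lb, war) tuples. Returns (bin, average WAR)."""
    totals = defaultdict(lambda: [0.0, 0])
    for height_in, weight_lb, war in players:
        # Round each measurement down to the start of its bin.
        key = (
            height_in // height_step * height_step,
            weight_lb // weight_step * weight_step,
        )
        totals[key][0] += war
        totals[key][1] += 1

    # Ignore sparsely populated bins so one outlier cannot win on its own.
    averages = {
        key: war_sum / count
        for key, (war_sum, count) in totals.items()
        if count >= min_players
    }
    if not averages:
        raise ValueError("no bin has enough players")
    best = max(averages, key=averages.get)
    return best, averages[best]


# Made-up players: (height in inches, weight in pounds, WAR).
sample = [(72, 195, 2.0), (73, 198, 4.0), (75, 220, 1.0), (74, 225, 2.0), (70, 180, 6.0)]
best_bin, best_avg = best_height_weight_bin(sample)
```

The `min_players` threshold matters: without it, a single standout player in an otherwise empty bin would define the "ideal" body type.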