Specializing in production LLM systems, cloud-native data platforms, and MLOps for large-scale environments. Experienced in end-to-end system design spanning modeling, optimization, governance, and deployment.
Agentic AI & LLM Systems
Architecting production-grade knowledge systems and autonomous workflows using RAG pipelines, prompt engineering, and LLM orchestration.
Distributed Data Engineering
Designing scalable cloud-native architectures and high-throughput ETL/ELT pipelines to manage complex, high-dimensional datasets.
Algorithmic Optimization & Decision Logic
Translating multi-dimensional constraints into scalable decision-making logic using graph-based algorithms and mathematical optimization frameworks.
Production MLOps & Cloud Infrastructure
Managing the end-to-end machine learning lifecycle through containerized microservices, Infrastructure as Code, and automated CI/CD pipelines.
Advanced Predictive Modeling
Developing statistical and neural architectures for high-fidelity pattern recognition, time-series forecasting, and automated data synthesis.
"In God we trust, all others must bring data."
"It works on my machine... but I don't live on my machine."
Roche
Design and deliver end-to-end AI/ML, data engineering, and optimization systems supporting Roche's R&D and product pipelines
AI & LLM Engineering and Optimization
- Architect production-grade LLM knowledge systems by building RAG pipelines and benchmarking NLP models (LDA, BERT, BART) to automate R&D data synthesis
- Integrate evaluation frameworks (Ragas) to quantify retrieval accuracy and generation quality, ensuring high-fidelity outputs for sensitive scientific workflows
- Build OCR and embedding-based parsing pipelines to convert compliance documents into structured data, automating form-filling and reducing manual effort
- Apply graph-based optimization algorithms (e.g., Dijkstra's) to automate combinatorial oligonucleotide design, translating scientific constraints into scalable decision logic
- Optimize manufacturing logistics and reduce line failures using mathematical optimization to balance inventory against production targets
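The graph-based design step above can be sketched with a standard Dijkstra shortest-path search. The fragment graph, node names, and costs below are made-up stand-ins for the actual scientific constraints, not the production design logic.

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from `source` over a weighted graph.

    `graph` maps node -> list of (neighbor, weight) pairs. In a
    combinatorial-design setting (illustrative only), nodes could be
    candidate fragments and weights a cost or constraint penalty.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for nbr, w in graph.get(node, []):
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist

# Toy fragment graph: pick the cheapest assembly path from START to END.
graph = {
    "START": [("fragA", 2), ("fragB", 5)],
    "fragA": [("END", 4)],
    "fragB": [("END", 1)],
}
print(dijkstra(graph, "START")["END"])  # -> 6
```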
Data Architecture and Governance
- Engineer Data Vault 2.0 on Snowflake, integrating SAP, MES, and SQL Server sources via Qlik, Talend, and dbt to establish a governed, analytics-ready data foundation
- Build a data governance stack integrating Monte Carlo, Collibra, and Immuta to enable observability, metadata management, and secure access across AI workloads
Cloud Infrastructure and MLOps
- Modernize production infrastructure by migrating 10+ projects to AWS ECS/Fargate using GitLab CI/CD and Infrastructure as Code (IaC)
Analytics Platforms & BI Automation
- Pioneer the first Posit Connect Server deployment to host Streamlit applications, replacing manual workflows with containerized, production-ready analytics environments
- Lead multi-site Tableau rollouts by automating ingestion from Monday.com and Smartsheet into Snowflake, improving operational visibility and reporting consistency
Cisco
Empowered supply chain teams with an end-to-end analytics platform to detect market shifts and potential disruption risks
- Built ELT pipelines to aggregate semiconductor news and commodity price data via web scraping (Beautiful Soup) and external APIs, creating a unified signal layer for downstream analysis
- Leveraged fine-tuned prompt engineering with LLM APIs to automate news summarization and topic modeling
- Distilled risk signals using NLP (NER, sentiment, lemmatization) in NLTK and spaCy, improving intent and risk models
- Developed Tableau dashboards to track emerging risks and support data-driven decisions for supply chain stakeholders
H-E-B
Applied ML, AI, and optimization to solve large-scale supply chain challenges, improving efficiency and driving innovation
- Engineered 100M+ record datasets with PySpark SQL, building scalable ETL pipelines and data-quality checks for forecasting and optimization
- Developed demand forecasting models using time-series methods (e.g., SARIMAX) and deep learning architectures (e.g., GRU, LSTM), incorporating feature-driven signals to capture demand dynamics
- Designed an inventory simulation framework to evaluate policy performance under demand uncertainty and operational constraints; applied Hyperopt to tune forecasting and inventory parameters
- Optimized inventory strategies with Google OR-Tools and heuristics, producing actionable supply-chain recommendations
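The simulation-plus-tuning approach above can be illustrated with a minimal (s, S) reorder policy evaluated over a demand trace, with a grid search standing in for the Hyperopt/OR-Tools tuning; all cost parameters and demand figures are hypothetical.

```python
import random

def simulate_policy(s, S, demand, holding_cost=1.0, stockout_cost=5.0):
    """Cost of an (s, S) reorder policy over a demand trace.

    When on-hand stock drops below the reorder point `s`, order up
    to `S` (orders arrive instantly here for simplicity).
    """
    on_hand, cost = S, 0.0
    for d in demand:
        if on_hand < s:
            on_hand = S  # replenish up to the order-up-to level
        shortfall = max(d - on_hand, 0)
        on_hand = max(on_hand - d, 0)
        cost += holding_cost * on_hand + stockout_cost * shortfall
    return cost

random.seed(0)
demand = [random.randint(0, 10) for _ in range(52)]  # weekly demand (toy)
# Grid-search heuristic over (s, S) pairs, mirroring parameter tuning.
best = min(
    ((s, S) for S in range(5, 21) for s in range(0, S)),
    key=lambda p: simulate_policy(p[0], p[1], demand),
)
print(best)
```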
Sabre
Developed and deployed statistical and ML solutions to support dynamic pricing and market strategy in the airline industry
- Built airline price elasticity models using Poisson Regression, XGBoost, Random Forest, and LSTM, improving demand sensitivity estimation across markets and fare granularities to inform revenue and market entry strategies
- Performed market segmentation using K-means, Agglomerative Clustering, and DBSCAN, enabling personalized prediction and differentiated pricing strategies across customer and route segments
- Architected automated ML pipelines using YAML, Docker, and Python-based UIs for model training and real-time inference, accelerating experimentation cycles and enabling production-grade predictions
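The elasticity estimation above can be reduced to a one-variable sketch: in a log-log demand model log(q) = a + b·log(p), the slope b is the own-price elasticity. This is a simplified stand-in for the Poisson/ML models named above, and the fare/booking numbers are synthetic.

```python
import math

def price_elasticity(prices, quantities):
    """Estimate own-price elasticity as the slope of a log-log regression."""
    lp = [math.log(p) for p in prices]
    lq = [math.log(q) for q in quantities]
    mp, mq = sum(lp) / len(lp), sum(lq) / len(lq)
    cov = sum((x - mp) * (y - mq) for x, y in zip(lp, lq))
    var = sum((x - mp) ** 2 for x in lp)
    return cov / var

# Synthetic fares/bookings generated with a true elasticity of -1.5.
prices = [100, 120, 150, 180, 200]
quantities = [1000 * (p / 100) ** -1.5 for p in prices]
print(round(price_elasticity(prices, quantities), 2))  # -> -1.5
```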
Refinitiv (London Stock Exchange Group)
Delivered data-driven insights on user behavior and product engagement to support analytics and platform strategy for Eikon
- Designed and ran A/B experiments on Eikon webpage layouts, driving a 16% increase in Daily Active Users (DAU)
- Built dashboards for bond credit rating monitoring using Python (Bokeh), reaching 5,000+ views in the first month
- Analyzed subscriber behavior and retention with R to inform product and marketing strategy
- Authored and delivered Python tutorials on Eikon API usage; trained 1,000+ users in programmatic data access and analysis
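A typical significance check behind layout experiments like the A/B tests above is a two-proportion z-test; the conversion counts below are invented for illustration.

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """z-statistic comparing two conversion rates with a pooled standard error."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)          # pooled rate
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Hypothetical control vs. new-layout counts.
z = two_proportion_ztest(1000, 10000, 1160, 10000)
print(z > 1.96)  # significant at the 5% level -> True
```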
Minsheng Securities
Built data pipelines, analytical models, and forecasting tools to inform trading and client-profitability analysis
- Built and maintained ETL pipelines for Private Placement transaction data using SQL, extracting key trading and client metrics; performed feature engineering and exploratory data analysis in Python to prepare modeling-ready datasets
- Developed time-series forecasting models (e.g., ARIMA) in R (the forecast package) to predict Private Placement trading volume, achieving a MAPE of approximately 15% and supporting short-term market activity planning
- Analyzed client profitability and portfolio relationships using factor-exposure analysis to identify concentration risk
- Translated analytical findings into structured insights for stakeholders, informing trading and investment decisions
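The MAPE accuracy metric cited above is straightforward to compute; the actual/forecast values below are toy numbers.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error: average of |actual - forecast| / |actual|."""
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

print(round(100 * mape([100, 200, 400], [110, 180, 440]), 1))  # -> 10.0
```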
Dongxing Securities
Applied quantitative modeling and analytics to assess portfolio risk, support equity research, and inform client investment insights
- Quantified portfolio risk by computing and comparing VaR and ES, enabling robust cross-method risk assessment
- Developed risk analysis workflows to evaluate portfolio sensitivity under different market scenarios, synthesizing results into structured risk assessment reports for internal teams and external clients
- Performed equity valuation and factor analysis, implementing Discounted Cash Flow (DCF) models and estimating equity beta via linear regression in R to support fundamental research and risk-adjusted return analysis
- Built visualizations to analyze stock-price dynamics and key indicators (RSI, MACD) for equity-research insight
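The VaR/ES comparison above can be sketched with a historical-simulation estimator: VaR is the empirical loss quantile at level alpha, and ES averages the losses at or beyond it. This is an illustrative convention, not the firm's exact methodology.

```python
def var_es(returns, alpha=0.95):
    """Historical Value-at-Risk and Expected Shortfall at level `alpha`."""
    losses = sorted(-r for r in returns)                 # positive values = losses
    idx = min(int(alpha * len(losses)), len(losses) - 1)
    var = losses[idx]                                    # empirical loss quantile
    tail = losses[idx:]                                  # losses at or beyond VaR
    return var, sum(tail) / len(tail)

# Toy return series: 20 equally spaced losses, 90% confidence level.
var, es = var_es([-i for i in range(20)], alpha=0.9)
print(var, es)  # -> 18 18.5
```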
Northwestern University
University of Wisconsin–Madison
Value effect of AI innovation zones: Green premium and cost reduction pathways in environmental disclosure
Quantifying the Environmental Impact of Electric Vehicle Adoption
The Forecast Ability of a Belief-Based Momentum Indicator in Full-Day, Daytime, and Nighttime Volatilities of Chinese Oil Futures
University of Wisconsin–Madison
Wisconsin School of Business
- Constructed a break-even analytical framework to investigate the Jevons' Paradox threshold, quantifying the energy-efficiency gains required to achieve net environmental savings in alternative energy vehicle markets
- Designed and executed a multi-source data integration architecture, harmonizing high-dimensional datasets from state and federal agencies to establish normalized energy features for longitudinal econometric analysis
- Utilized input-output analysis to model ecological footprints, establishing a quantitative framework for assessing cross-sectoral environmental impacts of emerging transportation technologies
- Authored peer-reviewed research on the rebound effect in EV adoption, presented at international conferences
University of Wisconsin–Madison
Wisconsin School of Business
- Implemented a Difference-in-Differences (DiD) model to quantify the causal effect of economic downturns on population-level health indicators across demographic cohorts
- Engineered a harmonized longitudinal database from 20+ public health surveys (including BRFSS and NHANES) using survey-weighted adjustments and cell-based weighting
- Authored the research proposal defining the technical methodology and led the initial data-engineering architecture
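The DiD estimator above reduces, in the two-group two-period case, to the treated group's pre/post change net of the control group's change; the group means below are hypothetical.

```python
def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Difference-in-Differences point estimate from four group means."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)

# Toy health-indicator means: treated cohort falls 4.0, control falls 0.5.
print(did_estimate(70.0, 66.0, 71.0, 70.5))  # -> -3.5
```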
AI, ML & LLM Systems
Programming Languages
Data Engineering & Platforms
Cloud & DevOps
Web & Application Frameworks
Data Visualization & BI
Certificate in Sustainable Investing