Zhaohui Wang

Full Stack Software Engineer / ML Engineer

zwang000@usc.edu

+1 (213) 910-9843

Los Angeles, CA

GitHub

Professional Summary

M.S. candidate in Computer Science at USC with experience in full-stack development and machine learning systems. Built high-performance backend systems, mobile applications, and AI systems at Meetfood, NSFocus, and Tencent using Python, Java, TypeScript, and Rust. Project and research experience in multi-agent reinforcement learning, knowledge distillation, recommender systems, and high-performance computing, with a focus on turning AI research into production-ready systems.

Core Competencies

Technical Skills

Languages: Python, Java, TypeScript, JavaScript, Rust, C/C++, CUDA, SQL
ML & DL: PyTorch, Transformers, PEFT, TRL, LangChain, scikit-learn
ML Specializations: Knowledge Distillation, Model Compression, Neural Architecture Search, Reinforcement Learning
NLP & LLMs: LoRA/QLoRA, RAG, SHAP, Fisher Information, Prompt Engineering
Recommender Systems: DeepFM, AutoInt, DIN, xDeepFM, DCNv2, Transformer4Rec
Web & Backend: Spring Boot, Node.js/Express, FastAPI, Flask, MongoDB, PostgreSQL, Elasticsearch, Redis
Frontend & Mobile: React, React Native, Svelte 5, Tailwind CSS, WebAssembly (Rust/WASM)
HPC & Parallel: CUDA, OpenMP, Ray, AsyncIO, ROS
Cloud & DevOps: AWS (EC2, S3, CloudFront, Lambda), Docker, Kubernetes, CI/CD (GitHub Actions)
Vector Search & Storage: FAISS, Chroma, IndexedDB

Core Strengths

Cross-functional Team Collaboration & Project Management
Full-Stack System Architecture Design
Fast Learner - Rapidly Apply New Technologies to Production
Problem Solving & Analytical Thinking
CI/CD Pipeline Optimization
Technical Documentation & Communication

Professional Experience

Full-Stack Software Engineer Intern

Meetfood | AWS, React Native, TypeScript, Node.js

May 2025 - Present | Los Angeles, CA

Engineered full-stack mobile application for restaurant discovery platform, leading development from architecture design to deployment
Built cross-platform mobile UI with React Native, TypeScript, and Jotai state management, implementing responsive design patterns and reusable component library to accelerate feature delivery by 20%
Developed RESTful backend APIs using Node.js, Express, and MongoDB, handling media uploads, user authentication, and real-time data synchronization for thousands of concurrent users
Architected scalable video processing pipeline with AWS MediaConvert and CloudFront CDN, optimizing client-side playback with adaptive bitrate streaming and reducing load latency by 20%
Deployed cloud infrastructure on AWS (EC2, S3, RDS) with automated CI/CD via GitHub Actions and CodePipeline, reducing deployment time by 25% and ensuring zero-downtime releases
Implemented real-time features including push notifications, live updates, and offline-first architecture with local caching and sync mechanisms

Full-Stack Software Engineer

NSFocus | Java, Python, React, Elasticsearch

Jun 2023 - Jun 2024 | Beijing, China

Developed full-stack network security analytics platform with React frontend and Spring Boot backend, supporting real-time threat visualization, log analysis, and incident response workflows
Built interactive dashboards with React, D3.js, and ECharts for security event visualization, implementing real-time updates via WebSocket and optimized rendering for 100K+ data points
Engineered backend microservices with Spring Boot and Python FastAPI, integrating Elasticsearch for log aggregation and reducing query latency by 35% through query optimization
Implemented authentication and authorization system with JWT tokens, role-based access control (RBAC), and session management for enterprise multi-tenant environment
Containerized application stack with Docker and established CI/CD pipelines, automating testing, building, and deployment processes

Backend Engineer Intern

Tencent | Spring Boot, MyBatis, Redis, Docker

Jul 2019 - Aug 2019 | Shenzhen, China

Developed backend services for enterprise messaging platform using Spring Boot and MyBatis, implementing RESTful APIs with Redis caching to support high-volume traffic
Optimized asynchronous processing with CompletableFuture and reactive programming patterns, achieving a 25% throughput improvement and a 30% reduction in response time
Deployed microservices to Kubernetes (TKE) with health checks, auto-scaling, and monitoring via Prometheus/Grafana, reducing deployment time by 40%

Key Projects & Achievements

LayerwiseAdapter: Multi-Teacher Fusion for Recommendation

PyTorch, CUDA | Jun 2024 - Sep 2024

Designed and implemented a novel 3-layer adaptive framework for multi-teacher knowledge fusion in recommendation systems, integrating traditional ML algorithms with LLMs through Fisher-guided knowledge distillation
Reached RMSE=0.8921 on MovieLens 1M, within 0.0011 of the best individual algorithm, AutoInt (0.8910), while maintaining a 43.8% parameter reduction
Implemented 6 SOTA recommendation algorithms (DeepFM, AutoInt, xDeepFM, DIN, DCNv2, Transformer4Rec) with CUDA optimization, achieving 25x faster inference than LLM baseline
Developed Fisher Information-guided importance analysis for intelligent layer selection and pruning-aware knowledge distillation (PAKD), validated with 75% size reduction and 400% speedup

FisherLD: Fisher-Guided Knowledge Distillation for LLM Compression

PyTorch, Transformers | Oct 2024 - Dec 2024

Researched and implemented Fisher Information-guided layerwise distillation framework for efficient LLM-based recommendation systems
Trained 12-layer Transformer baseline on Amazon Electronics reviews (86K samples) achieving 87.53% test accuracy, then compressed to 6-layer student model
Designed layer importance analysis using Fisher Information Matrix and gradient norms, identifying the top 6 critical layers (1577x higher importance) to guide a 50% layer reduction
Compressed student model outperformed the teacher (54.9% vs 54.2%) with a 43.8% parameter reduction and a 44% model size reduction

Interruptr: Game-Theoretic Multi-Agent Code Analysis

Python, OpenAI API, Ollama | Oct 2024 - Nov 2024

Architected a heterogeneous multi-agent system with a 3+1 architecture (3 GPT experts + 1 local verifier), using a game-theoretic formulation to seek a cost-quality Nash equilibrium in code analysis
Achieved 22% cost reduction ($0.094 vs $0.120) and 13% quality improvement (F1: 0.85 vs 0.75) compared to single GPT-4, with 44.6% efficiency boost
Designed adversarial verification system using lightweight Qwen3 local model (0.5B) as quality detector, detecting 20%+ LLM hallucinations with less than 5% cost overhead
Implemented concurrent agent collaboration with async execution, achieving 3x speedup over serial execution without quality degradation

Emotion-Aware Language Models with Audio Augmentation

PyTorch, Transformers, PEFT | Mar 2024 - Jul 2024

Researched methods for training language models that perceive and respond to emotional context from speech by integrating audio features into LLMs
Achieved 48% reduction in cross-speaker degradation (41.7x vs 80.2x) by training on real emotional speech (RAVDESS, 24 speakers) vs TTS-generated data
Identified a Feature-Generalization Paradox, in which better in-domain performance correlates with worse cross-speaker robustness, establishing the need for strict cross-speaker evaluation
Fine-tuned Qwen2.5-1.5B-Instruct with 4-bit quantization and LoRA on emotional speech datasets

Persona-RAG: Memory-Enhanced Multi-Persona Conversational AI

PyTorch, LangChain, FAISS | Jan 2024 - May 2024

Developed lightweight personality-driven conversational AI combining MBTI personality models with memory-enhanced retrieval (RAG)
Implemented 3 distinct MBTI personalities using QLoRA fine-tuning on Qwen3-0.6B, achieving 97.3-97.6% token accuracy with only 39MB per personality adapter
Designed persona-aware memory retrieval algorithm with weighted scoring, achieving better contextual relevance than standard RAG
Built FAISS-based vector search system with multi-user support, achieving <0.1s search time across 100+ memories

Music Style Filter PWA

Rust, WebAssembly, React, TypeScript | Sep 2024 - Dec 2024

Built production-grade Progressive Web App for real-time audio style filtering with 8 preset styles using DSP algorithms compiled to WebAssembly
Implemented audio processing engine in Rust with WASM compilation, achieving 5-10x performance improvement over JavaScript through SIMD optimizations
Architected offline-first PWA with service workers and IndexedDB caching, achieving Lighthouse score 95+ across all metrics

Browser ML Inference Engine

Rust, WASM, React, TypeScript | Jan 2024 - Dec 2024

Developed production-ready neural network inference engine running entirely in browser using Rust/WebAssembly, supporting 17+ operators
Engineered high-performance WASM runtime with SIMD acceleration, Web Workers, and INT8 quantization, achieving 10-50x speedup over JavaScript
Optimized binary size to under 1MB through aggressive optimization, tree shaking, and compression

Vector Database with Browser UI

Rust, WASM, React, IndexedDB | May 2024 - Nov 2024

Created production-grade browser-based vector database using Rust/WASM with HNSW indexing, supporting 7 distance metrics and persistent storage via IndexedDB
Implemented SIMD-accelerated vector operations achieving near-native performance in browser environment
Published npm package with full documentation and example applications

Desktop Monitoring Widget

Rust, Tauri 2.x, Svelte 5 | Aug 2024 - Dec 2024

Built cross-platform desktop application using Tauri 2.x with Svelte 5 frontend, supporting Windows, macOS, and Linux
Developed Rust backend for real-time system metrics collection with minimal performance overhead
Achieved sub-100ms launch time and under 50MB memory footprint through optimization

High-Performance Fractal Rendering Engine

C++, CUDA, OpenMP, WebAssembly | Sep 2024 - Dec 2024

Designed massively parallel fractal rendering engine achieving 2 billion pixels/sec on dual RTX 3090
Optimized CUDA kernels with memory coalescing, delivering 1,216x speedup over CPU baseline and 400x vs OpenMP for 4K resolution (6ms rendering)
Developed WebAssembly version using Emscripten for 5-20x faster browser performance than JavaScript

AgentMesh: Distributed Multi-Agent Coordination Framework

Python, Ray, AsyncIO | Aug 2024 - Nov 2024

Architected distributed async agent coordination framework using Actor model to solve race conditions for concurrent LLM agents
Implemented AST-based semantic conflict detection and three-way merge algorithm, achieving 85% automatic merge success rate
Designed priority-based task scheduling supporting 1000+ pending tasks with P99 latency < 500ms

SmartNet: Explainability-Guided Neural Architecture Search

PyTorch, Flask | May 2024 - Aug 2024

Developed first platform combining visual network construction with explainability-guided neural architecture search (NAS)
Implemented multi-objective NAS optimizing for accuracy + explainability (SHAP/Fisher scores) + efficiency + speed
Built modular block system with 6+ reusable components and web-based drag-and-drop interface with real-time code generation

Multi-Robot Communication System

PyTorch, ROS, CUDA | Sep 2020 - Apr 2023

Led design and development of multi-agent reinforcement learning system improving robot coordination through distributed control and real-time communication
Implemented custom PyTorch-based MARL algorithms (DDPG, MADDPG, QMIX) with CUDA parallelization, achieving 4x faster convergence
Integrated ROS middleware for real-time robot communication, supporting SLAM localization and distributed control across heterogeneous platforms
Validated system across simulated (Gazebo) and physical robots (TurtleBot), improving task success rate by 30%

Education

M.S. in Computer Science

University of Southern California | Jul 2024 - Dec 2025

Los Angeles, CA

Ph.D. Candidate in Computer Science

University of Chinese Academy of Sciences | Aug 2020 - Mar 2023

Beijing, China

B.S. in Computer Science

Central South University | Sep 2016 - Jun 2020

Changsha, China

Additional Information

Work Authorization: F-1 Student Visa with OPT eligibility

Availability: Available for full-time positions starting January 2026

Languages: English (Fluent), Mandarin Chinese (Native)
