> Infer-GC
Garbage collection-style memory management for LLM inference, achieving a 37-52% reduction in GPU memory
Building the future, one commit at a time. Here are some of my open-source projects and research work.
Interactive Vulkan learning system with 20 progressive exercises and a final renderer project
Browser-based music style filter and audio processing workstation with real-time effects
Hierarchical context memory system for multi-agent LLM collaboration with 3-layer architecture
Distributed S3-compatible object storage system with consistent hashing and 3-way replication
Enterprise-grade social media timeline feed system with sub-15ms P95 latency and 99.99% availability
High-performance browser-based ML inference engine using Rust and WebAssembly with SIMD optimization
Memory compression framework for long conversations based on compiler optimization theory
Intelligent PowerPoint presentation generator using LLMs with multi-agent architecture