// AI/ML
COMPLETED
Triton-Kernel-Optimization
Production-grade Grouped Query Attention (GQA) in Triton with composable attention patterns: 8 variants via tl.constexp…
Python
Triton 2.1+
PyTorch 2.0+
CUDA