Full-time | Posted May 31, 2026

Job Description

Job Responsibilities:
1. Distributed Training Engineering: Participate in the implementation of large-scale distributed training solutions. Lead the engineering deployment of data parallelism, model parallelism (TP/PP), and ZeRO optimization. Continuously tune GPU compute utilization and ensure the stability of ultra-large-scale training tasks.
2. Compute Scheduling Optimization: Take a deep, hands-on role in developing and optimizing AI task scheduling logic. Implement fine-grained resource management, fault self-healing, and efficient checkpoint mechanisms. Solve compute bottlenecks in complex gaming scenarios.
3. End-to-End Model Engineering: Own the full pipeline from model training to inference deployment. Participate in operator performance profiling, model quantization, and high-performance inference pipeline construction. Support rapid iteration of AI in the gaming business.
4. AI-Driven Engineering Evolution: Actively adopt AI Coding technologies to improve development efficiency. Drive...
