AWS Inferentia Customers
See how customers are using AWS Inferentia to deploy deep learning models.
NetoAI
NetoAI provides the TelcoCore suite (including TSLAM, ViNG, DigiTwin, and NAPI) to help telcos automate their complex, multi-domain operations and customer lifecycle management. A cornerstone of this is our TSLAM LLM, the first open-source, action-oriented model for this sector. To build it, we needed to fine-tune a model on our proprietary dataset of 2 billion tokens, and by using Amazon SageMaker with AWS Trainium Trn1 instances, we achieved remarkable cost savings and completed the entire fine-tuning in under three days. For production, AWS Inferentia2 and the AWS Neuron SDK give us consistently low inference latency of 300–600 ms. This end-to-end solution on purpose-built AWS AI chips is key to our mission of delivering specialized, high-performance AI to the entire telecom industry.
Ravi Kumar Palepu, Founder & CEO, NetoAI
SplashMusic
Training large audio-to-audio models for HummingLM is both compute-intensive and iteration-heavy. By migrating our training workloads to AWS Trainium and orchestrating them with Amazon SageMaker HyperPod, we achieved 54 percent lower training costs and 50 percent faster training cycles while maintaining model accuracy. We also migrated over 2 PB of data to Amazon S3 in just one week, using Amazon FSx for Lustre for high-throughput, low-latency access to training data and checkpoints. With AWS Inferentia2-powered Inf2 instances, inference latency can be reduced by up to 10×, enabling faster, more responsive real-time music generation.