AWS for Industries
Opportunities for telecoms with small language models: Insights from AWS and Meta
The telecommunications (telecom) industry is undergoing a transformative shift driven by advancements in artificial intelligence (AI) and machine learning (ML). Small Language Models (SLMs), which can run efficiently on Internet of Things (IoT) and edge devices, are at the forefront of this revolution. SLMs are scaled-down versions of large language models (LLMs) that deliver comparable AI capabilities while needing significantly less computational power and memory. Running directly on the device, or offloading only the heaviest processing to the cloud, SLMs enable resource-constrained devices to perform complex functions such as natural language processing, user profiling, and predictive automation. Companies such as Amazon Web Services (AWS) and Meta are pioneering the integration of SLMs into telecom services for edge devices, which enables hyper-personalized customer experiences and autonomous environments.
This post explores how telecom operators can use SLMs to unlock new opportunities, reduce operational costs, and create innovative services that meet the evolving demands of their customers. By transitioning from rigid, rule-based systems to autonomous, learning-based systems, operators can drive customer satisfaction and open new revenue streams. By running SLMs locally, autonomous systems such as smart homes can automatically adjust settings in real-time based on user behavior, thus improving responsiveness. This approach enhances data privacy because sensitive information can be retained locally, thereby reducing exposure to security threats. When optimization is completed in the cloud, services such as AWS IoT Greengrass can be used to deploy the resulting models to edge devices. By deploying SLMs on-device rather than relying on cloud-based models, telecom operators benefit from reduced latency, thus enabling faster responses and immediate decision-making. On-device deployment also makes sure of offline functionality, allowing systems to operate seamlessly without an internet connection. Finally, the approach is cost-efficient, reducing the need for constant cloud interactions and lowering bandwidth usage.
Introduction
The rapid evolution of AI and ML technologies presents telecom operators with unprecedented opportunities to enhance their services and customer interactions. SLMs, with their efficiency and ability to run on IoT and edge devices, are ideal for telecom applications such as autonomous environment management, personalized customer interactions, and on-device IoT tasks such as over-the-air updates and real-time sensor data analysis. In the 1980s, the advent of personal computers introduced individualized computing. This set the stage for ubiquitous computing, where computers are integrated into everyday objects. This vision has materialized today with embedded systems in cars, factories, smart cities, home appliances, and gadgets. The emergence of embedded devices, such as microcontrollers, has enabled these systems to interact transparently through wireless technologies such as LoRaWAN, Bluetooth, Wi-Fi, and 5G. The IoT paradigm seeks to integrate information processing and communication into everyday objects. As IoT devices have evolved, so have their capabilities, leading to the Internet of Intelligent Things (IoIT). IoIT represents the convergence of embedded systems, edge computing, and ML, where devices are not only connected but also intelligent: they can process data and make decisions autonomously.
Problem statement
As of this writing, telecom operators face several challenges for IoT and edge device management, such as rigid rule-based systems, data privacy, and latency challenges to make decisions in near real-time. Today’s connected devices are resource constrained, which makes it challenging for them to support LLMs. Traditional automation relies on inflexible rule engines based on “if-else” statements, which makes the management of numerous devices and sensors cumbersome, because adding or removing devices requires manual adjustments to the rules. Processing data in the cloud offers powerful resources but also introduces latency, which is problematic for real-time applications such as IoT and autonomous environments. Furthermore, sending sensitive data to the cloud requires stringent measures to protect data and make sure of compliance. SLMs can help overcome these challenges by running on cloud and edge devices while delivering comparable AI capabilities.
Opportunities with SLM and IoIT
The following opportunities are made available by SLMs and IoIT.
Autonomous Environments
An Autonomous Environment (AE) uses sensory data and edge-deployed SLMs to automatically optimize settings based on behavior patterns, thus eliminating the need for explicit commands or predefined rules. For autonomous smart homes, telecom operators can offer services that adjust lighting, temperature, and other settings dynamically. For example, modifying lighting based on time of day or activities, implementing proactive climate control, automating household tasks such as dispatching a robotic vacuum, and performing predictive maintenance by scheduling repairs autonomously. These intelligent behaviors are learned from both general patterns and individual user behaviors, which continually adapt to provide a personalized living experience. Running these processes on edge devices makes sure of faster responses and a better user experience. By integrating automation control systems with AI agents that allow natural language interaction, users can manage home devices through voice commands and create personalized routines. This seamless and intuitive management enhances convenience and user engagement. For telecom operators, offering these advanced smart home services differentiates them in the market and fosters increased customer loyalty.
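The contrast between rule-based automation and learned behavior can be made concrete with a small sketch. The class and function names below are illustrative, not part of any AWS or Meta API: a rigid rule engine encodes a fixed schedule, while the adaptive controller derives a per-hour temperature setpoint from the user's observed manual overrides (using a simple exponential moving average as a stand-in for richer on-device learning).

```python
# Sketch: fixed if-else rules vs. a controller that learns from user behavior.
# Names (rule_engine, AdaptiveClimateController) are illustrative only.

def rule_engine(hour: int) -> float:
    """Rigid automation: every schedule change means editing the rules."""
    if 6 <= hour < 9:
        return 21.0   # morning warm-up
    elif 22 <= hour or hour < 6:
        return 17.0   # night setback
    return 19.5       # daytime default

class AdaptiveClimateController:
    """Learns a per-hour setpoint from observed overrides, so changed
    behavior needs no rule edits."""
    def __init__(self, default: float = 19.5, alpha: float = 0.3):
        self.default = default
        self.alpha = alpha            # learning rate for the moving average
        self.learned: dict[int, float] = {}

    def observe_override(self, hour: int, setpoint: float) -> None:
        prev = self.learned.get(hour, self.default)
        self.learned[hour] = (1 - self.alpha) * prev + self.alpha * setpoint

    def setpoint(self, hour: int) -> float:
        return round(self.learned.get(hour, self.default), 2)

ctrl = AdaptiveClimateController()
for _ in range(20):                   # user repeatedly warms the room at 7am
    ctrl.observe_override(7, 22.0)
print(rule_engine(7), ctrl.setpoint(7))   # prints: 21.0 22.0
```

The learned controller converges to the user's actual preference (22.0 °C) while the rule engine stays at its hard-coded 21.0 °C until someone rewrites the rule.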
Enhancing Customer Premises Device Functionality through cloud-based SLMs
As the demand for IoIT devices grows, integrating advanced AI-driven functionalities into customer premises devices (CPDs), such as smart speakers, thermostats, and security systems, has become a priority. SLMs are designed specifically for edge devices, enabling them to process complex AI tasks while using minimal computational resources. However, many legacy edge devices, such as brownfield hubs, cannot accommodate these SLMs. To maintain a unified architecture and solution across both modern and legacy devices, a viable mitigation strategy is to run SLMs in the cloud, as though they are operating on the edge, using secure tunneling. This approach enables non-capable edge devices to use the power of SLMs without needing a complete hardware overhaul. By offloading heavy AI processing tasks to the cloud, SLMs allow resource-constrained devices to perform functions such as natural language processing, user profiling, and predictive automation. These cloud-hosted SLMs can process and analyze data from multiple CPDs across a network, thus enabling real-time AI-enhanced responses. For example, cloud-based SLMs can improve smart home systems by managing voice interactions, automating tasks, and learning user preferences to optimize device settings dynamically. This could include automatic lighting adjustments, proactive climate control, and the synchronization of security systems based on user behavior. Running SLMs in the cloud extends AI capabilities to customer devices while addressing security and scalability concerns. Cloud platforms such as AWS provide robust encryption, thus making sure that data exchanged between cloud-hosted SLMs and edge devices remains secure. Furthermore, telecom operators can dynamically allocate resources within scalable cloud infrastructure to manage varying loads, thereby reducing operational costs and enhancing customer experiences. 
By using cloud-based SLMs with secure tunneling, operators can offer advanced AI-driven services that meet evolving customer expectations, even on legacy devices, without needing extensive local hardware upgrades.
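The request/response shape of this pattern can be sketched as follows. The tunnel transport and the model are stubbed here (all names are illustrative); in a real deployment the transport could be AWS IoT secure tunneling and the handler a cloud-hosted Llama 3.2 model, but the flow is the same: the legacy hub serializes a request, sends it over the tunnel, and receives the completion as if the SLM were local.

```python
# Sketch of a legacy hub reaching a cloud-hosted SLM over a secure tunnel.
# cloud_slm and TunnelClient are stand-ins, not real AWS APIs.
import json

def cloud_slm(prompt: str) -> str:
    """Stub for the cloud-hosted SLM inference endpoint."""
    if "lights" in prompt.lower():
        return "Dimming living-room lights to 30%."
    return "No action taken."

class TunnelClient:
    """Stand-in for a secure tunnel: serializes the request, hands it to
    the cloud handler, and returns the deserialized response."""
    def request(self, payload: dict) -> dict:
        wire = json.dumps(payload)          # what would cross the tunnel
        req = json.loads(wire)
        return {"device_id": req["device_id"],
                "reply": cloud_slm(req["prompt"])}

hub = TunnelClient()
resp = hub.request({"device_id": "legacy-hub-01",
                    "prompt": "It is movie time, adjust the lights."})
print(resp["reply"])   # prints: Dimming living-room lights to 30%.
```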
Network Troubleshooting with SLMs on Customer Premises Equipment
Telecom operators can deploy AI agents that resolve issues in real-time without human intervention. These agents can use SLMs to understand and process customer inquiries and troubleshoot connectivity issues. Language models can assist device operators in using or troubleshooting equipment more effectively by explaining diagnostic codes, providing tips for proper usage, or interacting with other equipment through AI agents. In cases where devices have limited cloud connectivity or there are concerns about transmitting data outside of a facility, running an SLM locally on the device is highly beneficial. This makes sure of real-time support and maintains data privacy. By integrating SLMs into customer-facing applications, operators can offer conversational interfaces for network troubleshooting. Customers can interact with AI agents through mobile apps or smart home devices to check network status, optimize settings, and resolve connectivity problems. Implementing AI-driven diagnostics offers the benefit of immediate resolution of network issues without human intervention, which leads to higher customer satisfaction and reduced call center costs. In this use case, AI agents identify and fix Wi-Fi problems through a conversational interface. Customers can ask the AI agent to check network status, optimize settings, or troubleshoot issues. If necessary, the agent can seamlessly escalate the issue to a human agent, thus providing context to reduce resolution times.
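The diagnose-then-escalate flow can be sketched as follows. The diagnostic probes are stubbed and the function names are illustrative; in a deployed agent an SLM would interpret the customer's free-text request and phrase the response, but the control flow (try cheap local fixes first, escalate with full context only when they fail) is the point of the sketch.

```python
# Sketch of an AI troubleshooting agent for CPE Wi-Fi issues.
# run_diagnostics and handle_request are illustrative stubs.

def run_diagnostics(cpe_state: dict) -> list[str]:
    """Return a list of auto-applied fixes (stubbed probes)."""
    fixes = []
    if cpe_state.get("channel_congested"):
        fixes.append("switched Wi-Fi to a less congested channel")
    if not cpe_state.get("firmware_current", True):
        fixes.append("scheduled a firmware update")
    return fixes

def handle_request(query: str, cpe_state: dict) -> dict:
    fixes = run_diagnostics(cpe_state)
    if fixes:
        return {"resolved": True,
                "message": "Done: " + "; ".join(fixes) + "."}
    # Escalate with context so the human agent starts from the diagnosis,
    # not from scratch.
    return {"resolved": False,
            "escalation": {"query": query, "diagnostics": cpe_state}}

print(handle_request("My Wi-Fi is slow",
                     {"channel_congested": True, "firmware_current": True}))
```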
Solution overview
Telecom operators can deploy SLMs by first selecting a model, such as Meta’s Llama 3.2 1B, which is optimized for resource-constrained devices. This model enables real-time processing and decision-making on customer devices such as smartphones, home equipment, and Customer Premises Equipment (CPEs), thus making sure of efficient operation within an edge computing environment. Once the model is chosen, operators can use services such as Amazon Bedrock to fine-tune the SLM for specific telecom use cases, such as network troubleshooting, customer support automation, and smart environment management. This customization improves the model’s accuracy and responsiveness to industry-specific challenges. Following model optimization, deployment and management are facilitated through platforms such as AWS IoT Greengrass, which allows SLMs to be efficiently deployed on edge devices. AWS IoT Greengrass supports real-time updates, model management, and robust security, thus making sure of low-latency processing while safeguarding data privacy. Furthermore, more complex tasks can be offloaded to the cloud, when necessary, while seamlessly integrating with existing telecom infrastructure. Using this approach, operators can effectively scale SLMs, thus enabling real-time AI-driven automation.
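As a sketch of the Amazon Bedrock step, the helper below builds a Llama 3 instruct-style prompt and shows where the `bedrock-runtime` `invoke_model` call would go. The model ID and inference parameters are illustrative (check the Amazon Bedrock documentation for the exact ID in your Region); the boto3 call is only executed when a client is supplied, so the sketch runs without AWS credentials.

```python
# Sketch: invoking an SLM (e.g., Llama 3.2 1B) through Amazon Bedrock.
# Model ID and parameters are illustrative assumptions.
import json

def build_llama_prompt(user_message: str) -> str:
    """Llama 3 instruct prompt template (special tokens per Meta's docs)."""
    return ("<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
            f"{user_message}<|eot_id|>"
            "<|start_header_id|>assistant<|end_header_id|>\n\n")

def invoke_slm(user_message: str, client=None,
               model_id: str = "us.meta.llama3-2-1b-instruct-v1:0") -> str:
    body = json.dumps({"prompt": build_llama_prompt(user_message),
                       "max_gen_len": 256, "temperature": 0.2})
    if client is None:
        # No AWS client supplied: return the request body we would send.
        return body
    resp = client.invoke_model(modelId=model_id, body=body)
    return json.loads(resp["body"].read())["generation"]

# With credentials configured, you would pass:
#   client = boto3.client("bedrock-runtime", region_name="us-east-1")
print(invoke_slm("Why does my router keep rebooting?")[:60])
```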
Example Architecture for Smart Home Automation
An autonomous home system uses sensory information to automatically optimize the environment for occupants. Key components of this system include sensors and actuators, devices that collect data such as temperature, motion, and occupancy, and perform actions such as adjusting lighting and temperature. Edge computing devices, such as local hubs or controllers running SLMs, process sensor data and make real-time decisions. Communication protocols such as Wi-Fi, Zigbee, Bluetooth, or Matter enable device communication within the system. Central to the system are AI models: SLMs trained to understand context, analyze user behavior, and autonomously control home systems.
Figure 1: Reference architecture for On-Device SLM Deployment in Autonomous Smart Home Systems
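The data flow in the architecture above (sensors feed an edge hub, which decides and drives actuators without a cloud round trip) can be wired up as a minimal sketch. All class names are illustrative, and a trivial rule stands in for the on-device SLM policy.

```python
# Minimal wiring sketch: sensor -> edge hub (decision) -> actuator.
# MotionSensor, LightActuator, and EdgeHub are illustrative names.

class MotionSensor:
    def __init__(self, room: str):
        self.room = room
    def read(self) -> dict:
        return {"type": "motion", "room": self.room, "detected": True}

class LightActuator:
    def __init__(self):
        self.state: dict = {}
    def apply(self, room: str, on: bool) -> None:
        self.state[room] = on

class EdgeHub:
    """Local hub: collects readings, decides, actuates locally."""
    def __init__(self, sensors, lights):
        self.sensors, self.lights = sensors, lights
    def decide(self, reading: dict):
        # Placeholder for the SLM-driven policy that maps context to actions.
        if reading["type"] == "motion" and reading["detected"]:
            return ("lights_on", reading["room"])
        return None
    def tick(self) -> None:
        for sensor in self.sensors:
            cmd = self.decide(sensor.read())
            if cmd and cmd[0] == "lights_on":
                self.lights.apply(cmd[1], on=True)

lights = LightActuator()
hub = EdgeHub([MotionSensor("hall")], lights)
hub.tick()
print(lights.state)   # prints: {'hall': True}
```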
Model Deployment and Management
Implementing SLMs requires an efficient deployment pipeline that includes model optimization, over-the-air updates, and continuous monitoring with feedback. Models are optimized for edge deployment using techniques such as quantization to make sure that they run efficiently on local devices. Over-the-air updates allow models to be updated remotely, thus improving performance or adding new features without the need for physical intervention. Continuous monitoring of model performance and user feedback is essential for refining and adapting the models over time to better meet user needs.
To test the deployment of an SLM using AWS services, AWS IoT Greengrass was used to deploy an SLM to a simulated AWS IoT Greengrass core device. The first step involved setting up an Amazon Elastic Compute Cloud (Amazon EC2) instance for simulation. An m5.large instance (with 2 vCPU and 8 GB of RAM) running Ubuntu 24.04 was launched and configured as a Greengrass core device. The AWS Identity and Access Management (IAM) role assigned to the core device needs read access to an Amazon Simple Storage Service (Amazon S3) bucket where model artifacts are stored. The Llama 3.2 1B model was chosen for deployment for its performance and availability in the ONNX format, which allows compatibility with various device frameworks and ML accelerators. Two Greengrass component recipes were created to support this deployment. The first recipe sets up the ONNX runtime and other dependencies on the core device, such as creating a Python virtual environment and installing the necessary modules. The second recipe downloads the model artifacts to the core device. The Llama 3.2 1B model was downloaded from the Hugging Face ecosystem, and then uploaded to the S3 bucket. After deploying the two components, we can use the SLM. On the core device, we activate the Python virtual environment and run the inference script. In a real-world scenario, the SLM could be integrated into any application running on the device. AWS IoT services can also be used to capture model input, output, and diagnostics, and send this data to an MQTT topic for auditing and cloud-based analysis. This approach demonstrates how AWS services can support the deployment and management of SLMs on edge devices.
Figure 2: Reference Architecture for Deploying Llama 3.2 1B Model on an edge device using AWS IoT Greengrass
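For reference, a Greengrass component recipe along the lines of the artifact-download recipe described above could look like the following sketch. The component name, S3 URI, and lifecycle command are illustrative placeholders; consult the AWS IoT Greengrass component recipe reference for the full set of fields.

```yaml
# Illustrative Greengrass v2 component recipe (names and URIs are placeholders)
RecipeFormatVersion: "2020-01-25"
ComponentName: com.example.SlmModelArtifacts
ComponentVersion: "1.0.0"
ComponentDescription: Downloads the Llama 3.2 1B ONNX artifacts from Amazon S3.
ComponentPublisher: Example
Manifests:
  - Platform:
      os: linux
    Lifecycle:
      Install: echo "Model artifacts available at {artifacts:path}"
    Artifacts:
      - URI: s3://amzn-s3-demo-bucket/models/llama-3.2-1b.onnx
```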
Using AWS and Meta’s Contributions
AWS and Meta are collaborating to bring smaller Llama models to on-device applications, featuring new Meta models with 1 billion and 3 billion parameters. These lightweight models enable Llama to run efficiently on mobile and other edge devices. This significantly lowers the cost of safety models such as Llama Guard and enables affordable agents for tasks such as multilingual summarization and retrieval-augmented generation (RAG). These models are expected to drive a wave of on-device and local agent innovation, offering capabilities such as data retrieval and summarization locally, with minimal latency, and while preserving user privacy. This allows telecom operators to create hyper-personalized experiences. AWS services such as Amazon Bedrock and Amazon SageMaker provide fully managed platforms that make these foundation models available through APIs, thus eliminating the need to manage infrastructure. Furthermore, Agents for Amazon Bedrock offer a turnkey solution for creating natural language-based agents. Both the Llama 3.2 1B and the Llama 3.2 3B models are supported on Amazon Bedrock and SageMaker. Making sure of interoperability and security is critical. This requires the adoption of open standards, compatibility with various devices, and robust security protocols to protect customer data on edge devices, all while making sure of compliance with data protection regulations, which are especially important for telecom operators.
Conclusion
The integration of SLMs into telecom services represents a significant opportunity for operators to explore and lead in a competitive market. By embracing SLMs, telecom companies can enhance customer satisfaction by providing personalized, responsive services that meet individual customer needs. Furthermore, they can open new revenue streams by offering autonomous environments such as advanced smart home services, and other innovative offerings.