AWS Compute Blog Note

AWS Compute Blog

aws.amazon.com/blogs/compute is the official blog of Amazon Web Services (AWS) Compute. It provides a platform for AWS to share knowledge, insights, and best practices on compute-related topics, including serverless computing and containers. The blog features articles written by AWS experts, engineers, and architects, offering in-depth technical information, tutorials, and case studies on how to use AWS compute services. The content is geared towards developers, architects, and IT professionals who want to learn about the latest trends, features, and innovations in cloud computing.

Topics covered on the blog include:

- Serverless computing with AWS Lambda
- Containerization with Amazon Elastic Container Service (Amazon ECS) and Amazon Elastic Kubernetes Service (Amazon EKS)
- Virtualization with Amazon Elastic Compute Cloud (Amazon EC2)
- High-performance computing with AWS Batch and AWS ParallelCluster
- Machine learning and artificial intelligence with Amazon SageMaker

The blog also features guest posts from AWS partners and customers, who share their experiences and use cases with AWS compute services. Overall, the AWS Compute Blog is a valuable resource for anyone looking to stay up to date with the latest developments in cloud computing and to learn from the experts at AWS.

Thread Of Notes

AWS Nitro Isolation Engine: Formally verifying the hypervisor in the AWS Nitro System

Ali Saidi is a VP and Distinguished Engineer at AWS. Millions of customers use the AWS Nitro System to protect their most sensitive workloads, and AWS is an industry leader in innovation to secure customer data. Helping our customers keep their data secure and confidential is our highest priority, and we continue to make investments […]

Build RAG-powered AI solutions at the edge with AWS Local Zones and Outposts

Organizations in regulated industries or with strict information security requirements are increasingly looking to use generative AI. However, they often face a dilemma: how to utilize powerful models while keeping data strictly on-premises or within specific geographic boundaries. The solution lies in deploying self-managed Small Language Models (SLMs) on premises with AWS Outposts or in […]

Optimize EC2 costs with AWS Compute Optimizer right sizing

One of the most impactful ways to improve the ROI on your Amazon Elastic Compute Cloud (Amazon EC2) investment is rightsizing — when you match your instance types and sizes to the actual resource demands of your workloads. However, doing this manually across hundreds or thousands of instances is time-consuming and error-prone. AWS Compute Optimizer […]

Integrating Event Source Mappings with AWS Lambda tenant isolation mode

Building event-driven multi-tenant SaaS applications typically requires compute isolation between tenants to prevent data leakage, maintain security boundaries, and ensure compliance. Traditionally, you had to choose between two approaches: sharing execution environments across tenants (risking cross-tenant contamination of in-memory state) or managing separate Lambda functions per tenant (which introduces operational overhead, increasing costs, and complicating […]

Multi-Region event-driven failover architecture with Amazon EventBridge and Route 53

Event-driven architectures enable applications to respond to events in real time, providing scalability and loose coupling between components. However, ensuring high availability across multiple AWS Regions requires careful design of failover mechanisms. This post demonstrates how to build a resilient multi-Region event-driven architecture using Amazon EventBridge, […]

Migrating your Java applications to AWS Graviton using AWS Transform custom

For Java applications, modern JVMs like Amazon Corretto and OpenJDK are highly optimized for Arm64 and modern applications that are pure Java often require zero changes to run on Graviton. In many cases, applications aren’t fully modernized or purely Java and have a range of dependencies. When you’re responsible for migrating workloads, it’s helpful to […]

Streamline your infrastructure: Automating AMI creation with Kiro CLI and EC2 Image Builder

Managing infrastructure at scale requires robust automation tools that reduce manual effort while maintaining consistency and security. The combination of Kiro CLI and EC2 Image Builder offers a powerful solution for automating the creation, testing, and deployment of Amazon Machine Images (AMIs). The challenge of manual image management: traditional approaches to creating and maintaining AMIs often involve manual […]

Sharing Capacity Blocks for ML Across Your AWS Organization

When your data science team reserves GPU instances for a two-week training job but completes it in four days, that capacity has the potential to sit unused while your computer vision team waits another week to start their project. Now you can eliminate this GPU waste and scheduling conflict by sharing Capacity Blocks for ML […]

Enhancing network observability with new AWS Outposts racks LAG metrics

When you deploy AWS Outposts racks, you can run AWS infrastructure and services in on-premises locations. Maintaining seamless connectivity, both to the AWS Region and your on-premises network, is fundamental to delivering consistent, uninterrupted service to your applications. Implementing an observability strategy that uses available network metrics is key to understanding the health of this […]

Serverless ICYMI Q1 2026

Stay current with the latest serverless innovations that can improve your applications. In this 32nd quarterly recap, discover the most impactful AWS serverless launches, features, and resources from Q1 2026 that you might have missed. In case you missed our last ICYMI, check out what happened in Q4 2025. 2026 Q1 calendar Serverless with Mama […]

AWS Outposts monitoring and reporting: A comprehensive Amazon EventBridge solution

Organizations using AWS Outposts racks commonly manage capacity from a single AWS account and share resources through AWS Resource Access Manager (AWS RAM) with other AWS accounts (consumer accounts) within AWS Organizations. In this post, we demonstrate one approach to create a multi-account serverless solution to surface costs in shared AWS Outposts environments using Amazon […]

Building Memory-Intensive Apps with AWS Lambda Managed Instances

Building memory-intensive applications with AWS Lambda just got easier. AWS Lambda Managed Instances gives you up to 32 GB of memory—3x more than standard AWS Lambda—while maintaining the serverless experience you know. Modern applications increasingly require substantial memory resources to process large datasets, perform complex analytics, and deliver real-time insights for use cases such as […]

Accelerate CPU-based AI inference workloads using Intel AMX on Amazon EC2

This post shows you how to accelerate your AI inference workloads by up to 76% using Intel Advanced Matrix Extensions (AMX) – an accelerator that uses specialized hardware and instructions to perform matrix operations directly on processor cores – on Amazon Elastic Compute Cloud (Amazon EC2) 8th generation instances. You'll learn when CPU-based inference is cost-effective, how to enable AMX with minimal code changes, and which configurations deliver optimal performance for your models.

Build high-performance apps with AWS Lambda Managed Instances

In this post, you will learn how to configure AWS Lambda Managed Instances by creating a Capacity Provider that defines your compute infrastructure, associating your Lambda function with that provider, and publishing a function version to provision the execution environments. We will conclude with production best practices including scaling strategies, thread safety, and observability for reliable performance.

Enhancing auto scaling resilience by tracking worker utilization metrics

A resilient auto scaling policy requires metrics that correlate with application utilization, which may not be tied to system resources. Traditionally, auto scaling policies track system resources such as CPU utilization. These metrics are readily available, but they only work when resource consumption correlates with worker capacity. Factors such as high variance in request processing time, mixed instance types, or natural changes in application behavior over time can break this assumption.
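
As a rough illustration of a worker-based metric, the sketch below computes the share of busy workers and a proportional capacity estimate from it. The `WorkerPoolSnapshot` structure, both function names, and the proportional formula are assumptions made for this example; they are not taken from the post and do not reproduce the exact behavior of any AWS scaling policy.

```python
import math
from dataclasses import dataclass


@dataclass
class WorkerPoolSnapshot:
    """Point-in-time view of a worker pool (hypothetical structure)."""
    busy_workers: int
    total_workers: int


def worker_utilization(snapshot: WorkerPoolSnapshot) -> float:
    """Return the percentage of workers currently handling requests."""
    if snapshot.total_workers <= 0:
        # No capacity registered yet: report 0 instead of dividing by zero.
        return 0.0
    return 100.0 * snapshot.busy_workers / snapshot.total_workers


def desired_capacity(current_instances: int, utilization: float, target: float) -> int:
    """Simplified proportional estimate that moves utilization toward the target."""
    if target <= 0:
        raise ValueError("target must be positive")
    # Round up so the fleet is never undersized, and keep at least one instance.
    return max(1, math.ceil(current_instances * utilization / target))
```

Because the metric counts workers rather than CPU cycles, it stays meaningful when request durations vary widely or when the fleet mixes instance types, provided each instance reports its own busy and total worker counts.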

Testing Step Functions workflows: a guide to the enhanced TestState API

AWS Step Functions recently announced enhancements to its local testing capabilities, introducing API-based testing that developers can use to validate workflows before deploying to AWS. As detailed in our announcement blog post, the TestState API transforms Step Functions development by enabling individual state testing in isolation or as complete workflows. This supports […]

Enabling high availability of Amazon EC2 instances on AWS Outposts servers (Part 3)

This post is part 3 of the three-part series ‘Enabling high availability of Amazon EC2 instances on AWS Outposts servers’. We provide you with code samples and considerations for implementing custom logic to automate Amazon Elastic Compute Cloud (Amazon EC2) relaunch on Outposts servers. This post focuses on guidance for using Outposts servers with third-party storage for boot […]

Optimizing Compute-Intensive Serverless Workloads with Multi-threaded Rust on AWS Lambda

Customers use AWS Lambda to build serverless applications for a wide variety of use cases, from simple API backends to complex data processing pipelines. Lambda's flexibility makes it an excellent choice for many workloads, and with support for up to 10,240 MB of memory, you can now tackle compute-intensive tasks that were previously challenging in a serverless environment. When you configure a Lambda function's memory size, you allocate RAM, and Lambda automatically provides proportional CPU power. When you configure 10,240 MB, your Lambda function has access to up to 6 vCPUs.

Amazon SageMaker AI now hosts NVIDIA Evo-2 NIM microservices

This post is co-written with Neel Patel, Abdullahi Olaoye, Kristopher Kersten, and Aniket Deshpande from NVIDIA. Today, we’re excited to announce that the NVIDIA Evo-2 NIM microservices are now listed in Amazon SageMaker JumpStart. You can use this launch to deploy accelerated and specialized NIM microservices to build, experiment, and responsibly scale your drug discovery […]

Building fault-tolerant applications with AWS Lambda durable functions

Business applications often coordinate multiple steps that need to run reliably or wait for extended periods, such as customer onboarding, payment processing, or orchestrating large language model inference. These critical processes require completion despite temporary disruptions or system failures. Developers currently spend significant time implementing mechanisms to track progress, handle failures, and manage resources when […]

More room to build: serverless services now support payloads up to 1 MB

To support cloud applications that increasingly depend on rich contextual data, AWS is raising the maximum payload size from 256 KB to 1 MB for asynchronous AWS Lambda function invocations, Amazon SQS, and Amazon EventBridge. Developers can use this enhancement to build and maintain context-rich event-driven systems and reduce the need for complex workarounds such as data chunking or external large object storage.
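
To make the limit change concrete, here is a minimal sketch of a pre-send size check that decides whether an event can travel inline or still needs a workaround. The function names are hypothetical, and the byte accounting is an assumption: the sketch treats 1 KB as 1,024 bytes and measures only the serialized JSON body, while each service may count attributes or metadata differently.

```python
import json

# Assumed byte values for the limits named in the announcement.
OLD_LIMIT_BYTES = 256 * 1024   # previous limit: 256 KB
NEW_LIMIT_BYTES = 1024 * 1024  # new limit: 1 MB


def payload_size_bytes(event: dict) -> int:
    """Size of the event once serialized to compact UTF-8 JSON."""
    body = json.dumps(event, separators=(",", ":"), ensure_ascii=False)
    return len(body.encode("utf-8"))


def fits_inline(event: dict, limit_bytes: int = NEW_LIMIT_BYTES) -> bool:
    """True when the serialized event is within the given payload limit."""
    return payload_size_bytes(event) <= limit_bytes
```

An event carrying roughly 300 KB of context fails the check against `OLD_LIMIT_BYTES` but passes against `NEW_LIMIT_BYTES`, which is the case where chunking or external object storage would no longer be needed.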

Simplify network segmentation for AWS Outposts racks with multiple local gateway routing domains

AWS now supports multiple local gateway (LGW) routing domains on AWS Outposts racks to simplify network segmentation. Network segmentation is the practice of splitting a computer network into isolated subnetworks, or network segments. This reduces the attack surface so that if a host on one network segment is compromised, the hosts on the other network segments are not affected. Many customers in regulated industries such as manufacturing, health care and life sciences, banking, and others implement network segmentation as part of their on-premises network security standards to reduce the impact of a breach and help address compliance requirements.

Optimizing storage performance for Amazon EKS on AWS Outposts

Amazon Elastic Kubernetes Service (Amazon EKS) on AWS Outposts brings the power of managed Kubernetes to your on-premises infrastructure. Use Amazon EKS on Outposts rack to create hybrid cloud deployments that maintain consistent AWS experiences across environments. As organizations increasingly adopt edge computing and hybrid architectures, storage optimization and performance tuning become critical for successful workload deployment.

.NET 10 runtime now available in AWS Lambda

AWS Lambda now supports .NET 10 as both a managed runtime and base container image. .NET is a popular platform for building serverless applications. Developers can now use the new features and enhancements in .NET when creating serverless applications on Lambda. This includes support for file-based apps to streamline your projects by implementing functions using just a single file.

Building zero trust generative AI applications in healthcare with AWS Nitro Enclaves

In healthcare, generative AI is transforming how medical professionals analyze data, summarize clinical notes, and generate insights to improve patient outcomes. From automating medical documentation to assisting in diagnostic reasoning, large language models (LLMs) have the potential to augment clinical workflows and accelerate research. However, these innovations also introduce significant privacy, security, and intellectual property challenges.

Orchestrating large-scale document processing with AWS Step Functions and Amazon Bedrock batch inference

Organizations often have large volumes of documents containing valuable information that remains locked away and unsearchable. This solution addresses the need for a scalable, automated text extraction and knowledge base pipeline that transforms static document collections into intelligent, searchable repositories for generative AI applications.

Node.js 24 runtime now available in AWS Lambda

You can now develop AWS Lambda functions using Node.js 24, either as a managed runtime or using the container base image. Node.js 24 is in active LTS status and ready for production use. It is expected to be supported with security patches and bugfixes until April 2028. The Lambda runtime for Node.js 24 includes a new implementation of the […]

Performance benefits of new Amazon EC2 R8a memory-optimized instances

Recently we announced the availability of Amazon Elastic Compute Cloud (Amazon EC2) R8a instances, the latest addition to the AMD memory-optimized instance family. These instances are powered by the 5th Generation AMD EPYC (codename Turin) processors with a maximum frequency of 4.5 GHz. In this post I take these instances for a spin and benchmark MySQL later on, but first I discuss the top things you should know about these instances.

The attendee’s guide to hybrid cloud and edge computing at AWS re:Invent 2025

AWS re:Invent 2025 returns to Las Vegas, Nevada, from December 1–5, 2025. This year, we’re offering a comprehensive lineup of sessions and booth activities to help you build resilient, performant, and scalable applications wherever you need them—in the cloud, on premises, or at the edge.

Optimize unused capacity with Amazon EC2 interruptible capacity reservations

Organizations running critical workloads on Amazon Elastic Compute Cloud (Amazon EC2) reserve compute capacity using On-Demand Capacity Reservations (ODCR) to have availability when needed. However, reserved capacity can intermittently sit idle during off-peak periods, between deployments, or when workloads scale down. This unused capacity represents a missed opportunity for cost optimization and resource efficiency across the organization.

How potential performance upside with AWS Graviton helps reduce your costs further

Amazon Web Services (AWS) provides many mechanisms to optimize the price performance of workloads running on Amazon Elastic Compute Cloud (Amazon EC2), and the selection of the optimal infrastructure to run on can be one of the most impactful levers. When we started building the AWS Graviton processor, our goal was to optimize AWS Graviton […]

Enhancing API security with Amazon API Gateway TLS security policies

In this post, you will learn how Amazon API Gateway’s new enhanced TLS security policies help you meet standards such as PCI DSS, Open Banking, and FIPS, while strengthening how your APIs handle TLS negotiation. This new capability improves your security posture without adding operational complexity, and provides you with a single, consistent way to standardize TLS configuration across your API Gateway infrastructure.

Improving throughput of serverless streaming workloads for Kafka

Event-driven applications often need to process data in real-time. When you use AWS Lambda to process records from Apache Kafka topics, you frequently encounter two typical requirements: you need to process very high volumes of records in close to real-time, and you want your consumers to have the ability to scale rapidly to handle traffic spikes. Achieving both necessitates understanding how Lambda consumes Kafka streams, where the potential bottlenecks are, and how to optimize configurations for high throughput and best performance.

Build scalable REST APIs using Amazon API Gateway private integration with Application Load Balancer

Today, we announced Amazon API Gateway REST API’s support for private integration with Application Load Balancers (ALBs). You can use this new capability to securely expose your VPC-based applications through your REST APIs without exposing your ALBs to the public internet.

Serverless strategies for streaming LLM responses

Modern generative AI applications often need to stream large language model (LLM) outputs to users in real-time. Instead of waiting for a complete response, streaming delivers partial results as they become available, which significantly improves the user experience for chat interfaces and long-running AI tasks. This post compares three serverless approaches to handle Amazon Bedrock LLM streaming on Amazon Web Services (AWS), which helps you choose the best fit for your application.

Building multi-tenant SaaS applications with AWS Lambda’s new tenant isolation mode

Today, AWS is announcing tenant isolation for AWS Lambda, enabling you to process function invocations in separate execution environments for each end-user or tenant invoking your Lambda function. This capability simplifies building secure multi-tenant SaaS applications by managing tenant-level compute environment isolation and request routing, allowing you to focus on core business logic rather than implementing tenant-aware compute environment isolation.

Building responsive APIs with Amazon API Gateway response streaming

Today, AWS announced support for response streaming in Amazon API Gateway to significantly improve the responsiveness of your REST APIs by progressively streaming response payloads back to the client. With this new capability, you can use streamed responses to enhance user experience when building LLM-driven applications (such as AI agents and chatbots), improve time-to-first-byte (TTFB) performance for web and mobile applications, stream large files, and perform long-running operations while reporting incremental progress using protocols such as server-sent events (SSE).
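
For readers unfamiliar with server-sent events, the sketch below shows the SSE wire format that a streaming backend would emit while reporting incremental progress. It covers only the text framing defined by the SSE standard; how the stream is wired into an API Gateway integration is not shown, and the function names and the `progress` and `done` event names are assumptions made for illustration.

```python
import json
from typing import Iterable, Iterator, Optional


def sse_frame(data: str, event: Optional[str] = None) -> str:
    """Format one server-sent event; a blank line terminates the frame."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of a multi-line payload needs its own "data:" field.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"


def stream_progress(steps: Iterable[str]) -> Iterator[str]:
    """Yield one progress frame per step, then a final completion frame."""
    step_list = list(steps)
    for index, step in enumerate(step_list, start=1):
        payload = json.dumps({"step": step, "done": index, "total": len(step_list)})
        yield sse_frame(payload, event="progress")
    yield sse_frame("complete", event="done")
```

A client receives each frame as soon as it is written, so a long-running operation can surface progress well before the final result is ready.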

Optimize latency-sensitive workloads with Amazon EC2 detailed NVMe statistics

Amazon Elastic Compute Cloud (Amazon EC2) instances with locally attached NVMe storage can provide the performance needed for workloads demanding ultra-low latency and high I/O throughput. High-performance workloads, from high-frequency trading applications and in-memory databases to real-time analytics engines and AI/ML inference, need comprehensive performance tracking. Operating system tools like iostat and sar provide valuable system-level insights, and Amazon CloudWatch offers important disk IOPS and throughput measurements, but high-performance workloads can benefit from even more detailed visibility into instance store performance.

Handle unpredictable processing times with operational consistency when integrating asynchronous AWS services with an AWS Step Functions state machine

In this post, we explore using AWS Step Functions state machines with asynchronous AWS services, look at some scenarios where the processing time can be unpredictable, explain when traditional solutions such as polling (periodically checking status) fall short, and demonstrate how to implement a generalized callback pattern that turns asynchronous operations into a more manageable synchronous flow.
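
One common way to build a callback in Step Functions is the task token integration: a state pauses with `.waitForTaskToken`, and an external worker later reports the outcome through `SendTaskSuccess` or `SendTaskFailure`. Whether the post uses exactly this mechanism is an assumption. The sketch below shows the worker side only; `complete_callback` and the `AsyncJobFailed` default are hypothetical names, and the client is passed in so the function can be exercised without AWS credentials.

```python
import json


def complete_callback(sfn_client, task_token: str, succeeded: bool, detail: dict) -> None:
    """Resume a paused Step Functions task with the outcome of an async job.

    sfn_client is expected to behave like boto3.client("stepfunctions").
    """
    if succeeded:
        # The output must be a JSON string; it becomes the task's result.
        sfn_client.send_task_success(taskToken=task_token, output=json.dumps(detail))
    else:
        sfn_client.send_task_failure(
            taskToken=task_token,
            error=str(detail.get("error", "AsyncJobFailed")),  # hypothetical default
            cause=json.dumps(detail),
        )
```

In a real deployment this would typically run in a handler triggered by the asynchronous service's completion event, called as `complete_callback(boto3.client("stepfunctions"), token, True, result)`, so the state machine waits without polling.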

The attendee’s guide to the AWS re:Invent 2025 Compute track

From December 1st to December 5th, Amazon Web Services (AWS) will hold its annual premier learning event: re:Invent. There are over 2,000 learning sessions that focus on specific topics at various skill levels, and the compute team has created 76 unique sessions for you to choose from. We are here to help you pick the sessions that best fit your needs. Even if you cannot join in person, you can catch up with many of the sessions on demand and even watch the keynote and innovation sessions live.

Optimizing nested JSON array processing using AWS Step Functions Distributed Map

In this post, we explore how to optimize processing array data embedded within complex JSON structures using AWS Step Functions Distributed Map. You’ll learn how to use ItemsPointer to reduce the complexity of your state machine definitions, create more flexible workflow designs, and streamline your data processing pipelines—all without writing additional transformation code or AWS Lambda functions.

Introducing AWS Lambda event source mapping tools in the AWS Serverless MCP Server

Modern serverless applications increasingly rely on event-driven architectures, where AWS Lambda functions process events from various sources like Amazon Kinesis, Amazon DynamoDB Streams, Amazon Simple Queue Service (Amazon SQS), Amazon Managed Streaming for Apache Kafka (Amazon MSK), and self-managed Apache Kafka. Although event source mappings (ESM) offer a powerful mechanism for integrating AWS Lambda with […]