AWS Architecture Blog Note

AWS Architecture Blog

The site is a blog from Amazon Web Services (AWS) focusing on architecture, cloud computing and technology. It features articles on various aspects such as using AWS services, adopting cloud best practices, troubleshooting common issues, case studies, and latest trends in technology. The blog also interacts with users by allowing them to post comments, engage with the AWS community, and get updates on upcoming AWS services. It is designed as a forum for discussing technical know-how, solving problems, and innovating for the future through technology.

Thread Of Notes

Introducing the Snowflake and AWS Custom Lens for the AWS Well-Architected Framework

The Snowflake and AWS Custom Well-Architected Framework Lens brings together AWS Well-Architected best practices and Snowflake guidance into a single review experience, with integrated recommendations that reflect how the two services compose in production. In this post, we walk through each pillar, the three access points (AWS Management Console, Kiro, and Snowflake Cortex Code), and how to run your first review.

Automate medical record digitization with Amazon Bedrock Data Automation and AWS HealthLake

In this post, you learn how to build an automated, serverless pipeline that converts scanned PDF medical records into FHIR R4-compliant data using Amazon Bedrock Data Automation and AWS HealthLake. We walk through the architecture, explain how each AWS service connects to the next, show you what the pipeline looks like when it runs, and get you deployed in under 20 minutes.

Align your architecture backlog with Tech Roadmap Prioritization (TRP)

In this post, we show you how to run a one-hour prioritization session with your stakeholders, plot competing initiatives on a shared matrix by cost and impact, and turn the result into an actionable architecture backlog, using a framework called Tech Roadmap Prioritization (TRP).

Scaling oncology patient support: How New York Cancer and Blood Specialists transformed customer experience with AWS and Pronetx, now part of Caylent

This post details how NYCBS partnered with Amazon Web Services (AWS) and AWS partner Pronetx (now part of Caylent) to migrate to Amazon Connect Customer, the AWS cloud contact center service. The migration delivered a 54 percent improvement in patient enrollment and transformed the way NYCBS connects with the patients who need them most.

Cyber resilience on AWS: A reference approach for recovery from ransomware and destructive events

Cyber resilience is the ability to recover workloads to a known-good state after an adversary has affected the environment. Prevention works to keep threat actors out and detection works to find them quickly. Cyber resilience focuses on recovery: restoring a trustworthy environment when backups, credentials, or parts of the infrastructure can no longer be assumed […]

How Synthesia optimizes generative AI video inference on Amazon EC2 G7e instances

This post introduces a video decoding optimization technique, developed in collaboration with the Synthesia Research Engineering team, which we call the Asynchronous Frame Generation Pipeline. Adopting this technique allows you to overlap GPU compute, device-to-host (D2H) data transfer, and host-side post-processing. In this post, we apply the technique to the VAE decoder of a Wan video generation model as an example, where our benchmarks on G7e show GPU kernel utilization increasing from 82% to 99.9%, in turn leading to an 8.2% decrease in latency (and a corresponding increase in throughput) for video decoding. We expect this technique to benefit any customer with a chunked video generation pipeline that transfers frames to host memory.
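
The overlap idea can be illustrated with a small CPU-only sketch. This is not the implementation from the post: it uses plain Python threads and queues as a stand-in for GPU streams, and the `decode`, `transfer`, and `postprocess` callables are hypothetical placeholders for the three stages named above. It only shows how chunk N+1 can be decoded while chunk N is being transferred and chunk N-1 is being post-processed.

```python
import queue
import threading


def run_pipeline(chunks, decode, transfer, postprocess):
    """Run three stages concurrently, one thread per stage, preserving chunk order.

    decode, transfer and postprocess are hypothetical stand-ins for GPU compute,
    device-to-host copy and host-side post-processing.
    """
    to_transfer = queue.Queue(maxsize=2)  # small bound keeps memory use flat
    to_post = queue.Queue(maxsize=2)
    results = []
    done = object()  # sentinel marking the end of the stream

    def decode_stage():
        for chunk in chunks:
            to_transfer.put(decode(chunk))
        to_transfer.put(done)

    def transfer_stage():
        while True:
            item = to_transfer.get()
            if item is done:
                to_post.put(done)
                return
            to_post.put(transfer(item))

    def post_stage():
        while True:
            item = to_post.get()
            if item is done:
                return
            results.append(postprocess(item))

    threads = [
        threading.Thread(target=stage)
        for stage in (decode_stage, transfer_stage, post_stage)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

For example, `run_pipeline(range(5), lambda x: x * 2, lambda x: x + 1, str)` pushes five chunks through all three stages while keeping them in order. A real GPU pipeline would rely on asynchronous copies and separate streams rather than Python threads.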

Building hybrid multi-tenant architecture for stateful services on AWS

In this post, we show you how to build a hybrid multi-tenant architecture that provides strong tenant isolation without requiring per-tenant AWS accounts. You learn how to configure Route 53 weighted routing to distribute traffic across multiple accounts, deploy Application Load Balancer listener rules for tenant-specific routing, create dedicated ECS clusters per tenant, and establish AWS PrivateLink connectivity to shared dependencies.

Choosing between single or multiple organizations in AWS Organizations

Organizations face critical architectural decisions that can affect their operations for years to come, such as whether to maintain a single organization or implement multiple organizations. In this post, I explain the key advantages and disadvantages of both approaches and the scenarios where each model fits best.

Modernizing KYC with AWS serverless solutions and agentic AI for financial services

This post extends IBM's approach to real-time KYC validation using generative AI, as previously discussed in the post IBM Digital KYC on AWS uses Generative AI to transform Client Onboarding and KYC Operations. The solution combines agentic AI, event-driven architecture, and AWS serverless services to address the fundamental limitations of traditional rule-based systems. It transforms compliance operations through autonomous decision-making, dynamic adaptation, and intelligent automation.

PACIFIC enables multi-tenant, sovereign product carbon footprint exchange on the Catena-X data space using AWS

This post explores how PACIFIC enables multi-tenant, sovereign PCF exchange on the Catena-X data space using Amazon Elastic Container Service (Amazon ECS) on AWS Fargate, Amazon Cognito, and AWS Identity and Access Management (IAM) to deliver measurable environmental impact and competitive advantage in a carbon-conscious marketplace.

Real-time analytics: Oldcastle integrates Infor with Amazon Aurora and Amazon Quick Sight

This post explores how Oldcastle used AWS services to transform their analytics and AI capabilities by integrating Infor ERP with Amazon Aurora and Amazon Quick Sight. We discuss how they overcame the limitations of traditional cloud ERP reporting to deploy real-time dashboards and build a scalable analytics system. This practical, enterprise-grade approach offers a blueprint that organizations can adapt when extending ERP capabilities with cloud-native analytics and AI.

Build a multi-tenant configuration system with tagged storage patterns

In this post, we demonstrate how you can build a scalable, multi-tenant configuration service using the tagged storage pattern, an architectural approach that uses key prefixes (like tenant_config_ or param_config_) to automatically route configuration requests to the most appropriate AWS storage service. This pattern maintains strict tenant isolation and supports real-time, zero-downtime configuration updates through event-driven architecture, alleviating the cache staleness problem.
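
A minimal sketch of prefix-based routing, under stated assumptions: the prefixes `tenant_config_` and `param_config_` come from the summary above, but the `ConfigRouter` class, the in-memory dictionaries standing in for storage services, and the longest-prefix rule are all hypothetical choices made for illustration. Which AWS service backs each prefix is not specified here, so the sketch leaves that open.

```python
class ConfigRouter:
    """Route configuration keys to a backend chosen by key prefix.

    Backends are plain dicts here; in a real system each would wrap
    a storage service. Entries are keyed by (tenant_id, key) so one
    tenant can never read another tenant's value.
    """

    def __init__(self):
        self._routes = {}  # prefix -> backend mapping

    def register(self, prefix, backend):
        self._routes[prefix] = backend

    def _backend_for(self, key):
        matches = [prefix for prefix in self._routes if key.startswith(prefix)]
        if not matches:
            raise KeyError(f"no backend registered for key {key!r}")
        # Assumption: when several prefixes match, the longest one wins.
        return self._routes[max(matches, key=len)]

    def put(self, tenant_id, key, value):
        self._backend_for(key)[(tenant_id, key)] = value

    def get(self, tenant_id, key):
        return self._backend_for(key)[(tenant_id, key)]


# Usage: two hypothetical backends, one per prefix.
tenant_store, param_store = {}, {}
router = ConfigRouter()
router.register("tenant_config_", tenant_store)
router.register("param_config_", param_store)
router.put("acme", "tenant_config_theme", "dark")
router.put("acme", "param_config_timeout", 30)
```

After these calls, `tenant_store` holds only the theme entry and `param_store` only the timeout entry, and a lookup under a different tenant ID raises `KeyError`.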

Unlock efficient model deployment: Simplified Inference Operator setup on Amazon SageMaker HyperPod

In this post, we walk through the new installation experience, demonstrate three deployment methods (console, CLI, and Terraform), and show how features like multi-instance-type deployment and native node affinity give you fine-grained control over inference scheduling.

Automate safety monitoring with computer vision and generative AI

This post describes a solution that uses fixed camera networks to monitor operational environments in near real-time, detecting potential safety hazards while capturing object floor projections and their relationships to floor markings. While we illustrate the approach through distribution center deployment examples, the underlying architecture applies broadly across industries. We explore the architectural decisions, strategies for scaling to hundreds of sites, reducing site onboarding time, synthetic data generation using generative AI tools like GLIGEN, and other critical technical hurdles we overcame.

Streamlining access to powerful disaster recovery capabilities of AWS

In this blog post, we take a building-blocks approach. Starting with tools like AWS Backup to protect your data, we then add protection for Amazon Elastic Compute Cloud (Amazon EC2) compute using AWS Elastic Disaster Recovery (AWS DRS). Finally, we show how to use the full capabilities of AWS to restore your entire workload (data, infrastructure, networking, and configuration) using Arpio disaster recovery automation.

How Aigen transformed agricultural robotics for sustainable farming with Amazon SageMaker AI

In this post, you will learn how Aigen modernized its machine learning (ML) pipeline with Amazon SageMaker AI to overcome industry-wide agricultural robotics challenges and scale sustainable farming. This post focuses on the strategies and architecture patterns that enabled Aigen to modernize its pipeline across hundreds of distributed edge solar robots, and it showcases the significant business outcomes unlocked through this transformation. By adopting automated data labeling and human-in-the-loop validation, Aigen increased image labeling throughput by 20x while reducing image labeling costs by 22.5x.

Architecting for agentic AI development on AWS

In this post, we demonstrate how to architect AWS systems that enable AI agents to iterate rapidly through design patterns for both system architecture and code base structure. We first examine the architectural problems that limit agentic development today. We then walk through system architecture patterns that support rapid experimentation, followed by codebase patterns that help AI agents understand, modify, and validate your applications with confidence.

The Hidden Price Tag: Uncovering Hidden Costs in Cloud Architectures with the AWS Well-Architected Framework

In this post, we discuss how following the AWS Cloud Adoption Framework (AWS CAF) and AWS Well-Architected Framework can help reduce the risk of hidden costs through proper implementation of AWS guidance and best practices, while taking into consideration the practical challenges organizations face in implementing these best practices, including resource constraints, evaluating trade-offs, and competing business priorities.

Digital Transformation at Santander: How Platform Engineering is Revolutionizing Cloud Infrastructure

Santander faced a significant technical challenge in managing an infrastructure that processes billions of daily transactions across more than 200 critical systems. The solution emerged through an innovative platform engineering initiative called Catalyst, which transformed the bank's cloud infrastructure and development management. This post analyzes the main cases, benefits, and results obtained with this initiative.

6,000 AWS accounts, three people, one platform: Lessons learned

This post describes why ProGlove chose an account-per-tenant approach for our serverless SaaS architecture and how it changes the operational model. It covers the challenges you need to anticipate around automation, observability, and cost. We also discuss how the approach can affect operational models in other environments, such as an enterprise context.

Mastering millisecond latency and millions of events: The event-driven architecture behind the Amazon Key Suite

In this post, we explore how the Amazon Key team used Amazon EventBridge to modernize their architecture, transforming a tightly coupled monolithic system into a resilient, event-driven solution. We explore the technical challenges we faced, our implementation approach, and the architectural patterns that helped us achieve improved reliability and scalability. The post covers our solutions for managing event schemas at scale, handling multiple service integrations efficiently, and building an extensible architecture that accommodates future growth.

Sovereign failover – Design for digital sovereignty using the AWS European Sovereign Cloud

This post explores the architectural patterns, challenges, and best practices for building cross-partition failover, covering network connectivity, authentication, and governance. By understanding these constraints, you can design resilient cloud-native applications that balance regulatory compliance with operational continuity.

Announcing the AWS Digital Sovereignty Well-Architected Lens

As organizations accelerate cloud adoption, meeting digital sovereignty requirements has become essential to build trust with customers and regulators worldwide. The challenge isn’t whether to adopt the cloud—it’s how to do so while meeting sovereignty requirements, using a multidisciplinary approach. Even though requirements vary by geography, organizations commonly address them through technical and operational controls […]

How Salesforce migrated from Cluster Autoscaler to Karpenter across their fleet of 1,000 EKS clusters

This blog post examines how Salesforce, operating one of the world's largest Kubernetes deployments, successfully migrated from Cluster Autoscaler to Karpenter across their fleet of 1,000 plus Amazon Elastic Kubernetes Service (Amazon EKS) clusters.

Architecting conversational observability for cloud applications

In this post, we walk through building a generative AI–powered troubleshooting assistant for Kubernetes. The goal is to give engineers a faster, self-service way to diagnose and resolve cluster issues, cut down Mean Time to Recovery (MTTR), and reduce the cycles experts spend finding the root cause of issues in complex distributed systems.

How BASF’s Agriculture Solutions drives traceability and climate action by tokenizing cotton value chains using Amazon Managed Blockchain

BASF Agricultural Solutions combines innovative products and digital tools with practical farmer knowledge. This post explores how Amazon Managed Blockchain can drive a positive change in the agricultural industry by tokenizing food and cotton value chains for traceability, climate action, and circularity.

She architects: Bringing unique perspectives to innovative solutions at AWS

Have you ever wondered what it is really like to be a woman in tech at one of the world's leading cloud companies? Or maybe you are curious about how diverse perspectives drive innovation beyond the buzzwords? Today, we are providing an insider's perspective on the role of a solutions architect (SA) at Amazon Web Services (AWS). However, this is not a typical corporate success story. We are three women who have navigated challenges, celebrated wins, and found our unique paths in the world of cloud architecture, and we want to share our real stories with you.

Secure Amazon Elastic VMware Service (Amazon EVS) with AWS Network Firewall

In this post, we demonstrate how to utilize AWS Network Firewall to secure an Amazon EVS environment, using a centralized inspection architecture across an EVS cluster, VPCs, on-premises data centers and the internet. We walk through the implementation steps to deploy this architecture using AWS Network Firewall and AWS Transit Gateway.

Building an AI gateway to Amazon Bedrock with Amazon API Gateway

In this post, we'll explore a reference architecture that helps enterprises govern their Amazon Bedrock implementations using Amazon API Gateway. This pattern enables key capabilities like authorization controls, usage quotas, and real-time response streaming. We'll examine the architecture, provide deployment steps, and discuss potential enhancements to help you implement AI governance at scale.

Architecting for AI excellence: AWS launches three Well-Architected Lenses at re:Invent 2025

At re:Invent 2025, we introduce one new lens and two significant updates to the AWS Well-Architected Lenses specifically focused on AI workloads: the Responsible AI Lens, the Machine Learning (ML) Lens, and the Generative AI Lens. Together, these lenses provide comprehensive guidance for organizations at different stages of their AI journey, whether you're just starting to experiment with machine learning or already deploying complex AI applications at scale.

Announcing the updated AWS Well-Architected Generative AI Lens

We are delighted to announce an update to the AWS Well-Architected Generative AI Lens. This update features several new sections of the Well-Architected Generative AI Lens, including new best practices, advanced scenario guidance, and improved preambles on responsible AI, data architecture, and agentic workflows.

Build priority-based message processing with Amazon MQ and AWS App Runner

In this post, we show you how to build a priority-based message processing system using Amazon MQ for priority queuing, Amazon DynamoDB for data persistence, and AWS App Runner for serverless compute. We demonstrate how to implement application-level delays that high-priority messages can bypass, create real-time UIs with WebSocket connections, and configure dual-layer retry mechanisms for maximum reliability.
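
The interaction between priority ordering and bypassable delays can be sketched in a few lines. This is an in-memory illustration only, not the Amazon MQ setup from the post: the `DelayedPriorityQueue` class, the two priority levels, and the explicit `now` argument (used instead of a real clock so the behavior is deterministic) are assumptions made for the example.

```python
import heapq
import itertools

HIGH, NORMAL = 0, 1  # lower number means higher priority


class DelayedPriorityQueue:
    """Priority queue where normal messages wait out a delay and high-priority ones do not."""

    def __init__(self, delay_seconds):
        self._delay = delay_seconds
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order within a priority

    def publish(self, body, priority, now):
        # High-priority messages are ready immediately; others become ready after the delay.
        ready_at = now if priority == HIGH else now + self._delay
        heapq.heappush(self._heap, (priority, next(self._counter), ready_at, body))

    def poll(self, now):
        """Return the highest-priority message that is ready at `now`, or None."""
        skipped = []
        result = None
        while self._heap:
            entry = heapq.heappop(self._heap)
            if entry[2] <= now:
                result = entry[3]
                break
            skipped.append(entry)  # not ready yet, put it back afterwards
        for entry in skipped:
            heapq.heappush(self._heap, entry)
        return result


# Usage: a normal message published first is overtaken by a later high-priority one.
q = DelayedPriorityQueue(delay_seconds=5)
q.publish("invoice batch", NORMAL, now=0)
q.publish("fraud alert", HIGH, now=1)
first = q.poll(now=1)
second = q.poll(now=1)
third = q.poll(now=5)
```

Here `first` is the high-priority message, `second` is `None` because the normal message is still inside its delay window, and `third` is the normal message once the delay has elapsed.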

Know before you go – AWS re:Invent 2025 guide to Well-Architected and Cloud Optimization sessions

Are you ready to maximize your Well-Architected and Cloud Optimization learning and networking time at re:Invent 2025? We have put together this comprehensive guide to help you plan your schedule and make the most of the Well-Architected and cloud optimization sessions available this year. These sessions will deliver the practical guidance your teams need to lead strategic cloud initiatives, design next-generation architectures, optimize costs, or secure AI-powered systems.

BASF Digital Farming builds a STAC-based solution on Amazon EKS

This post was co-written with Frederic Haase and Julian Blau with BASF Digital Farming GmbH. At xarvio – BASF Digital Farming, our mission is to empower farmers around the world with cutting-edge digital agronomic decision-making tools. Central to this mission is our crop optimization platform, xarvio FIELD MANAGER, which delivers actionable insights through a range […]

Modernization of real-time payment orchestration on AWS

The global real-time payments market is experiencing significant growth. According to Fortune Business Insights, the market was valued at USD 24.91 billion in 2024 and is projected to grow to USD 284.49 billion by 2032, with a CAGR of 35.4%. Similarly, Grand View Research reports that the global mobile payment market, valued at USD 88.50 […]

Build resilient generative AI agents

Generative AI agents in production environments demand resilience strategies that go beyond traditional software patterns. AI agents make autonomous decisions, consume substantial computational resources, and interact with external systems in unpredictable ways. These characteristics create failure modes that conventional resilience approaches might not address. This post presents a framework for AI agent resilience risk analysis […]

A scalable, elastic database and search solution for 1B+ vectors built on LanceDB and Amazon S3

In this post, we explore how Metagenomi built a scalable database and search solution for over 1 billion protein vectors using LanceDB and Amazon S3. The solution enables rapid enzyme discovery by transforming proteins into vector embeddings and implementing a serverless architecture that combines AWS Lambda, AWS Step Functions, and Amazon S3 for efficient nearest neighbor searches.
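
To make the nearest neighbor idea concrete, here is a brute-force sketch in plain Python. It does not use LanceDB or any AWS API, and the choice of cosine similarity, the function names, and the toy two-dimensional vectors are assumptions for illustration. A search over a billion vectors would depend on an index rather than scanning every vector as this does.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(query, vectors, k=1):
    """Return the names of the k vectors most similar to `query`, best match first."""
    ranked = sorted(
        vectors.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Usage with hypothetical protein embeddings.
embeddings = {
    "protein_a": [1.0, 0.0],
    "protein_b": [0.0, 1.0],
    "protein_c": [0.9, 0.1],
}
top_two = nearest([1.0, 0.0], embeddings, k=2)
```

With these toy embeddings, `top_two` contains `protein_a` followed by `protein_c`, since both point in nearly the same direction as the query.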

Simplify multi-tenant encryption with a cost-conscious AWS KMS key strategy

In this post, we explore an efficient approach to managing encryption keys in a multi-tenant SaaS environment through centralization, addressing challenges like key proliferation, rising costs, and operational complexity across multiple AWS accounts and services. We demonstrate how implementing a centralized key management strategy using a single AWS KMS key per tenant can maintain security and compliance while reducing operational overhead as organizations scale.