Google Cloud Blog Note

Google Cloud Blog

cloud.google.com/blog is the official blog of Google Cloud. It provides news, updates, and insights on Google Cloud's products and services, as well as trends and innovations in the cloud computing industry. The blog features articles written by Google Cloud experts, engineers, and thought leaders, covering a wide range of topics such as artificial intelligence, machine learning, data analytics, security, and more. The articles often include technical tutorials, case studies, and best practices, making the blog a valuable resource for developers, IT professionals, and business leaders who use or are interested in Google Cloud.

The blog is well-organized, with articles categorized by topic, product, and industry. Visitors can browse the latest articles, search for specific topics, or subscribe to the blog's RSS feed to stay up-to-date with the latest news and updates.

Some of the key features of the blog include:
- In-depth articles on Google Cloud products and services, such as Google Cloud Platform, Google Cloud Storage, and Google Cloud AI Platform
- Technical tutorials and guides on how to use Google Cloud services
- Case studies and success stories from Google Cloud customers
- Insights and analysis on industry trends and innovations
- News and updates on Google Cloud's partnerships and collaborations
- Interviews with Google Cloud experts and thought leaders

Overall, the Google Cloud blog is a valuable resource for anyone interested in cloud computing, artificial intelligence, and related technologies.

Thread Of Notes

What’s new in data agents: Supercharging your AI workflows

Generic AI agents often struggle with enterprise data due to a lack of contextual understanding and security concerns. Google's Agentic Data Cloud aims to solve this by integrating AI across its operational and analytical systems. This new platform offers a framework for agents to access real-time enterprise data with high accuracy and unified governance. New tools and data agents are being introduced to enhance development and usability. Conversational Analytics is being expanded across BigQuery, Lakehouse, AlloyDB, Spanner, and Cloud SQL for natural language data interaction. Looker Embedded Conversational Analytics allows agents to be integrated directly into custom applications. A suite of new data agents is available to automate tasks and provide intelligence for data engineers, scientists, and database administrators. These include agents for data engineering, data science, database observability, and database onboarding. Looker Dashboard Agent and Conversational Analytics in Gemini Enterprise offer simplified access to data insights for business users. Tools like the Data Agent Kit and Managed MCP Servers are provided to support developer integration with the agentic ecosystem. These advancements empower organizations to leverage AI agents more effectively and securely with enterprise data.

Cloud CISO Perspectives: The 4 lessons that guided AI Threat Defense

Chris Betz, the new CISO of Google Cloud, shared four key lessons learned from developing AI Threat Defense. AI is significantly accelerating vulnerability discovery, allowing defenders to find thousands of flaws in minutes. Adversaries are leveraging AI for sophisticated attacks, but defenders can use similar capabilities alongside their business context. Legacy manual defenses are no longer sufficient against machine-speed threats. Google's AI Threat Defense framework involves preparing, scanning, remediating, and monitoring. Preparation involves reducing the attack surface and establishing a robust operational framework with engineering alignment. Scanning and prioritizing requires expert-driven, AI-assisted analysis of the software supply chain. Prioritization shifts to tackling foundational code with larger blast radii first. Remediation focuses on risk-based rollout, comprehensive tracking, and building system resilience. This includes refreshing, removing, or rewriting open-source software to enhance security. Monitoring establishes a continuous feedback loop, tracks remediation health, and utilizes AI agents for future threat evolution. AI agents automate response playbooks and improve coding practices, while red teaming stress-tests infrastructure. This approach ensures a constantly evolving and secure defense.
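
The idea of fixing foundational code with the largest blast radius first can be sketched in a few lines of Python. This is only an illustration under stated assumptions: "blast radius" is approximated here as the number of packages that transitively depend on a component, and the package names and the `blast_radius` and `prioritize` helpers are hypothetical, not part of Google's AI Threat Defense tooling.

```python
from collections import defaultdict


def blast_radius(depends_on):
    """Map each package to the number of packages that transitively depend on it."""
    # Invert the dependency edges: dependency -> set of direct dependents.
    dependents = defaultdict(set)
    for pkg, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(pkg)

    radius = {}
    for pkg in set(depends_on) | set(dependents):
        # Depth-first walk over everything that is built on top of pkg.
        seen, stack = set(), [pkg]
        while stack:
            for parent in dependents.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        seen.discard(pkg)  # guard against dependency cycles
        radius[pkg] = len(seen)
    return radius


def prioritize(findings, depends_on):
    """Order vulnerable packages so the widest-reaching ones come first."""
    radius = blast_radius(depends_on)
    return sorted(findings, key=lambda pkg: (-radius.get(pkg, 0), pkg))


# Hypothetical dependency graph: package -> packages it depends on.
DEPENDS_ON = {
    "web-app": ["http-lib", "json-lib"],
    "batch-job": ["json-lib"],
    "http-lib": ["tls-lib"],
    "json-lib": [],
    "tls-lib": [],
}

if __name__ == "__main__":
    print(prioritize(["web-app", "tls-lib", "json-lib", "http-lib"], DEPENDS_ON))
```

A real prioritization would also weigh exploitability and business context; the dependent count is only one possible signal.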

Architecting a trusted agentic platform with graph technologies: A Yahoo case study

Enterprises are moving from reactive intelligence to proactive systems of action for agentic AI. Google Cloud's Agentic Data Cloud enables this shift, exemplified by Yahoo's Seller Agent digital media buying platform. Yahoo partnered with Google Cloud to build Seller Agent, transforming multi-week manual processes into rapid, governed campaigns executed in seconds. This platform demonstrates how autonomous systems can achieve speed and accountability. Traditional workflows like premium advertising campaigns were inefficient, requiring extensive human intervention and analysis. Simply integrating LLMs is insufficient without deterministic understanding of real-time data and constraints. A trusted agentic platform needs a definitive source of truth to avoid errors and ensure factual grounding. Regulators demand explainability for AI decisions involving real budgets, necessitating built-in governance and auditability. Yahoo's Seller Agent, running on Google Cloud, is a multi-agent system designed for explainability and auditability. Its architecture features a knowledge graph for grounding decisions in business reality and a context graph for auditable memory. The knowledge graph, powered by Spanner Graph, models business operations and policies, ensuring agents act on facts. The context graph, utilizing BigQuery Graph, captures every action as a traceable record for explainability. This dual-graph foundation allows for rapid execution via the knowledge graph and continuous auditing via the context graph. This architecture serves as a blueprint for industries needing trustworthy autonomous systems, grounding decisions in business reality and building auditable memory.

Public and Private Medical Community Targeted by China-Nexus Threat Actor Pursuing Artificial Intelligence, Cyber, Medical, and National Defense Research

A sophisticated China-nexus threat actor, identified as UNC6508, has been targeting North American academic, medical, and military research institutions. This actor remained undetected for over a year, compromising web applications and deploying custom malware called INFINITERED. UNC6508 aimed to steal sensitive data, including national security intelligence and advanced research. The primary method involved exploiting vulnerable REDCap servers to capture login credentials. After gaining access, the actor pivoted to internal systems and used novel techniques for data exfiltration. Google Threat Intelligence Group (GTIG) disrupted the malicious infrastructure and notified affected organizations. GTIG recommends enabling 2-Step Verification and following security best practices. INFINITERED is modular, consisting of a dropper, a credential harvester, and a backdoor. It maintains persistence by injecting code into REDCap upgrade processes. The credential harvester captures login details and stores them encrypted in a database. The backdoor allows the execution of commands for data theft and system control. The threat actor also manipulated domain content compliance rules for covert data exfiltration. GTIG collaborated with Mandiant Consulting and other Google teams to provide comprehensive threat intelligence and remediation assistance.

How I learned Go in a Day with Antigravity 2.0 and How You Can Do the Same

The author sought to replace a resource-intensive Node.js tool with a performant Go CLI. They defined architectural goals, including a zero-dependency core, speed, and a zero-trust security model. After considering alternatives like Rust, Python, Zig, and Swift, Go was chosen for its balance of features. An AI agent assisted in auditing existing code and verifying no direct Go port existed. The project began by installing a popular Go agent skill to ensure community-standard coding practices. A gap analysis followed, where architectural goals were refined, and the AI agent planned the migration. The migration involved translating TypeScript configurations to Go, mapping various agent directories. User onboarding logic was isolated into a separate file. To ensure functional parity, a Test-Driven Development (TDD) loop was implemented. The AI generated tests before writing production code, starting with frontmatter parsing. Error handling was aligned with Go best practices, ensuring explicit error checks and contextual wrapping. Unit tests were supplemented with end-to-end integration tests to cover diverse scenarios. For managing the large surface area of CLI commands, parallel subagents were utilized, with each agent focusing on a single command. This approach helped identify missing options and tests. The "Elephant and Goldfish" architectural pattern was employed, using a persistent coordinator agent and transient subagents for specific tasks. The Go package structure was finalized, supporting native installation. A CI/CD pipeline was set up using a matrix build to ensure cross-platform compatibility. Despite migrating to Go, the pipeline retained Node.js dependencies for GitHub Actions helpers. Final considerations included code signing for distributing pre-compiled binaries.

Introducing the Open Knowledge Format

Foundation models require relevant context to be effective, especially in agentic systems. Organizations face a fragmented landscape of internal knowledge scattered across various systems. To address this, the Open Knowledge Format (OKF) is introduced as an open specification. OKF formalizes the LLM-wiki pattern into a portable, interoperable, vendor-neutral format. It represents knowledge as a directory of markdown files with YAML frontmatter, using simple conventions for interoperability. This format allows different producers and agents to consume knowledge without translation. OKF is designed to be minimally opinionated, ensuring producer-consumer independence, and prioritizing format over a proprietary platform. Each concept is a markdown file, identified by its file path, with YAML frontmatter for structured fields and markdown for the body. Concepts link to each other via markdown, creating a graph of relationships. Reference implementations include an enrichment agent and a static HTML visualizer. The format is intended to evolve through community contributions. Google Cloud's Knowledge Catalog will also ingest OKF. Ultimately, OKF aims to be a lingua franca for exchanging knowledge in AI applications.
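
As a rough illustration of the layout described above, the following Python sketch walks a directory of markdown concept files, reads a simplified frontmatter block, and collects markdown links into a graph keyed by file path. The field names (`title`, `type`), the demo file names, and the flat `key: value` frontmatter parsing are assumptions made for this example, not part of the OKF specification; a real consumer would use a full YAML parser.

```python
import posixpath
import re
import tempfile
from pathlib import Path

# Matches markdown links whose target is a relative .md file, e.g. [x](a/b.md).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#\s]+\.md)\)")


def parse_concept(text):
    """Split a concept file into (frontmatter dict, markdown body)."""
    meta, body = {}, text
    if text.startswith("---\n"):
        end = text.find("\n---", 3)
        if end != -1:
            header = text[4:end]
            body = text[end + 4:].lstrip("\n")
            # Simplification: only flat "key: value" pairs are understood.
            for line in header.splitlines():
                if ":" in line:
                    key, value = line.split(":", 1)
                    meta[key.strip()] = value.strip()
    return meta, body


def load_graph(root):
    """Return (concepts, edges): metadata and outgoing links per file path."""
    root = Path(root)
    concepts, edges = {}, {}
    for path in sorted(root.rglob("*.md")):
        concept_id = path.relative_to(root).as_posix()
        meta, body = parse_concept(path.read_text(encoding="utf-8"))
        concepts[concept_id] = meta
        targets = set()
        for href in LINK_RE.findall(body):
            if "://" in href:
                continue  # skip external URLs
            joined = posixpath.join(posixpath.dirname(concept_id), href)
            targets.add(posixpath.normpath(joined))
        edges[concept_id] = sorted(targets)
    return concepts, edges


def write_demo(root):
    """Create a tiny, made-up knowledge directory for the example."""
    files = {
        "tables/orders.md": "---\ntitle: Orders\ntype: table\n---\n"
        "Joins to [customers](customers.md); see [revenue](../glossary/revenue.md).\n",
        "tables/customers.md": "---\ntitle: Customers\ntype: table\n---\n"
        "One row per customer.\n",
        "glossary/revenue.md": "---\ntitle: Revenue\ntype: metric\n---\n"
        "Computed from [orders](../tables/orders.md).\n",
    }
    for rel, text in files.items():
        target = Path(root) / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_demo(tmp)
        concepts, edges = load_graph(tmp)
        for concept_id, targets in edges.items():
            print(concept_id, "->", targets)
```

Because each concept is identified by its path, the graph needs no separate ID registry: resolving a relative link against the linking file's directory yields the target's identifier directly.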

Powering the next era of Confidential AI

Google Cloud is partnering with Apple to power its expanded Private Cloud Compute (PCC) systems. This collaboration leverages Google Cloud's advanced security and privacy technologies. At the core of this partnership is Google Cloud's Confidential Computing portfolio, which protects data throughout its lifecycle. The Titanium security architecture, featuring the Titan chip, provides a hardware root of trust for Google's infrastructure. Confidential Computing utilizes Trusted Execution Environments (TEEs) to encrypt data even while it is in use. Apple's PCC relies on these TEEs, powered by Intel TDX and NVIDIA Confidential Computing. This hardware-based isolation ensures a highly secure and private environment for sensitive AI workloads. Google Titan chips are integral to establishing the integrity of the hardware platform for PCC. The collaboration also includes an open-source host stack for enhanced transparency and independent verification of security properties. This joint effort creates a robust and verifiable system designed to meet Apple's stringent privacy and security requirements for PCC. These advancements will benefit all Google Cloud customers, especially those working with sensitive AI data.

Transform dashboards into interactive data experiences with Looker agents

Traditional dashboards, while useful, lack interactivity and prevent users from asking follow-up questions. This often disrupts workflows or requires assistance from data analysts. To address this, Looker is introducing dashboard agents in preview, allowing conversational data exploration directly within dashboards. Users can now engage with their business intelligence data using natural language. By clicking a Gemini icon, users can initiate conversations and receive contextual insights. The agent leverages applied filters, cross-filters, and curated tiles for highly relevant answers. If more information is needed, the agent can access underlying Explores for deeper insights, presented with charts and explanations. Data analysts can tailor the agent's responses by providing natural-language instructions, ensuring it interprets business logic accurately for specific audiences. This self-serve capability helps analyst teams scale their efforts. The dashboard agent also ensures trust and transparency by showing its reasoning, referenced tiles, and applied filters. It operates within Looker's existing governance model, respecting user permissions. Administrators can enable this feature in Looker version 26.08.11 and later through Gemini in Looker settings.

ShinyHunters Targets Education Sector with Oracle PeopleSoft Exploit

Mandiant and Google Threat Intelligence Group reported an active extortion campaign by UNC6240, also known as ShinyHunters, targeting Oracle PeopleSoft infrastructure. The attackers exploited CVE-2026-35273, a critical zero-day remote code execution vulnerability in the Environment Management component, between May 27 and June 9, 2026. Google notified over 100 organizations, predominantly in higher education, about the potential compromise. Open attacker directories on staging servers revealed customized MeshCentral agents disguised as cloud endpoints. These agents were used to deploy a custom lateral movement and defacement script, [victim_abbreviation]_fanout.sh. This activity directly correlates with data leaks published on the ShinyHunters Data Leak Site on June 9, 2026. The staging infrastructure hosted pre-configured Windows MeshCentral agents masquerading as Microsoft Azure services. Analysis of command history showed attackers configuring the staging environment, mapping PeopleSoft configurations, and propagating a script for lateral movement. This script sprayed SSH credentials against internal hosts and deployed a defacement marker. Organizations running Oracle PeopleSoft are advised to immediately block external network access to sensitive endpoints like /PSEMHUB/hub and /PSIGW/HttpListeningConnector. Additionally, auditing access logs for suspicious POST requests and monitoring outbound SMB traffic are recommended security measures.
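
The log audit recommended above could start with something like the following Python sketch. It assumes web server access logs in the common/combined log format, which may not match a given PeopleSoft deployment; the two endpoint paths come from the advisory summary, while the sample log lines and the `find_suspicious_posts` helper are invented for illustration.

```python
import re

# Endpoints named in the advisory summary above.
SENSITIVE_PREFIXES = ("/PSEMHUB/hub", "/PSIGW/HttpListeningConnector")

# Assumption: common/combined log format, e.g.
# 203.0.113.7 - - [02/Jun/2026:10:15:32 +0000] "POST /path HTTP/1.1" 200 512
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def find_suspicious_posts(lines):
    """Return (ip, time, path, status) for POSTs to the sensitive endpoints."""
    hits = []
    for line in lines:
        match = LOG_RE.match(line)
        if not match or match["method"] != "POST":
            continue
        path = match["path"].split("?", 1)[0]  # ignore the query string
        if path.startswith(SENSITIVE_PREFIXES):
            hits.append((match["ip"], match["time"], path, int(match["status"])))
    return hits


# Made-up sample lines using documentation-reserved IP ranges.
SAMPLE = [
    '203.0.113.7 - - [02/Jun/2026:10:15:32 +0000] "POST /PSEMHUB/hub HTTP/1.1" 200 512',
    '198.51.100.4 - - [02/Jun/2026:10:16:01 +0000] "GET /PSIGW/HttpListeningConnector HTTP/1.1" 200 128',
    '203.0.113.7 - - [02/Jun/2026:10:17:45 +0000] "POST /PSIGW/HttpListeningConnector?x=1 HTTP/1.1" 500 0',
    '192.0.2.10 - - [02/Jun/2026:10:18:00 +0000] "POST /login HTTP/1.1" 302 0',
]

if __name__ == "__main__":
    for hit in find_suspicious_posts(SAMPLE):
        print(hit)
```

A match is only a lead for investigation, not proof of compromise: legitimate integrations may also POST to these endpoints, so the source IPs and timing still need review.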

10 Indispensable Prompts Our Team Refuses to Build Without

Experienced builders don't improvise; they rely on refined, go-to prompts for consistent, high-quality work. These prompts act as tools to de-risk human assumptions and stress-test ideas before coding. Maja Bilić uses a prompt to embody a cynical architect, refining product requirements and preventing over-engineering. Andrew Brogdon leverages AI for often-neglected widget tests, reducing guilt and improving codebase reliability. Aja Hammerly uses prompts to find forgotten TODOs and clean up commit messages before submitting code for review. Rich Hyndman employs a prompt to verify app permissions and identify unused ones, crucial for Play Store compliance. Shir Meir Lador's prompt forces AI into a harsh code review, identifying critical flaws rather than offering polite suggestions. James O'Reilly uses a prompt to explore trade-offs, keeping AI focused and the developer in control of decisions. Emma Twersky generates checklists with prompts to identify vulnerabilities in AI-generated code, preventing the trust mismatch. Fred Sauer iterates through stages of discovery, proof of concept, evaluation, and refinement using a series of prompts. Remigiusz Samborski embeds a prompt in GitHub Actions for automated pull request reviews, freeing up human reviewers for higher-level tasks. Karl Weinmeister uses DAG analysis to prompt AI for structural testing, focusing on critical boundaries and prioritizing improvements. These essential prompts transform AI into an adversarial thinker, ensuring developers ship with greater confidence.

Deep dive: How Lightning Engine delivers 4.9x faster Apache Spark performance

Apache Spark is a core technology for global data processing, but scaling data volumes can lead to performance and cost trade-offs. The agentic era, with numerous concurrent queries, exacerbates these bottlenecks and impacts unit economics. Managed Service for Apache Spark now offers Lightning Engine, a performance enhancement compatible with existing Spark workloads. This engine is available in both serverless and managed cluster deployment modes, providing a unified solution to accelerate job execution. Lightning Engine has demonstrated significant performance gains, up to 4.9x faster than standard Spark, and offers superior price-performance compared to alternatives. Its core innovation lies in vectorized native execution, compiling Spark plans into optimized C++ instructions, thereby bypassing JVM overhead. It achieves this through vectorized sort, accelerated window functions, and a smart fallback mechanism for unsupported operators. Furthermore, Lightning Engine optimizes cloud storage and BigQuery connectors for faster data retrieval. It also boasts advanced query optimization features like single hash table caching and aggregation pushdown. Getting started is straightforward, with options to enable Lightning Engine via the Google Cloud console or gcloud CLI. This release signifies a new, intelligent, and faster way to utilize Spark.

Choosing your surface: Antigravity 2.0, Antigravity CLI, Antigravity IDE, or Antigravity SDK

Antigravity offers four distinct interfaces for orchestrating autonomous agents, all powered by a shared underlying harness. Antigravity 2.0 is a desktop app ideal for managing multiple tasks across independent projects simultaneously, allowing users to monitor and schedule work without disrupting their main workspace. For command-line enthusiasts and headless execution scenarios, the Antigravity CLI provides a fast, terminal-based experience. Developers who wish to work directly alongside agents and review code changes line by line will find the Antigravity IDE particularly useful, offering integrated debugging and one-click fixes. The Antigravity SDK, a Python library, empowers users to build and deploy their own custom agents. This SDK grants access to the same tools and rules that power Google's official Antigravity tools, enabling local development and seamless deployment to Google Cloud. All interfaces support plugins and skills, ensuring consistent access to core logic regardless of the chosen surface. Users can select the tool that best fits their workflow and project needs. Further guidance and documentation are available on the antigravity.google website, with downloads accessible from a dedicated page.

Claude Fable 5: Available on Google Cloud

Claude Fable 5, Anthropic’s latest frontier model, is now generally available on Google Cloud. The launch continues Google Cloud's commitment to bringing the industry's latest models to its Agent Platform. Claude Fable 5 brings the best of Anthropic's model capabilities to all customers, with strong safeguards designed to make it safe for general use. Designed for complex, multi-step reasoning, it suits demanding tasks such as advanced software development, long-horizon agents, and deep multimodal document analysis. For more information about this release, visit Anthropic’s blog. Claude Fable 5 and other Anthropic models, including Claude Opus 4.8 and Claude Sonnet 4.6, are available today on Agent Platform.

Gemini for Government: Your blueprint for mission impact

The public sector is transitioning from AI experimentation to impactful applications, demanding integrated solutions beyond individual models. Google Cloud offers a unified AI stack to achieve this transformation in the agentic era. This stack is built on the AI Hypercomputer, optimized for scale and powered by advanced infrastructure like TPUs. It delivers intelligence through Google's frontier models, including Gemini 3.5, alongside third-party options. The agentic data cloud grounds AI in trusted organizational data, enabling a "system of action" with breakthroughs like the Cross-cloud Lakehouse. Agentic defense provides zero-trust protection for the entire AI lifecycle, enhanced by AI Threat Defense. The Gemini Enterprise Agent Platform facilitates building, scaling, and governing agents, with pre-built specialized agents like Workspace Intelligence ready for immediate use. Security is paramount, featuring an AI Control Dashboard, Agent Registry, and Model Armor for comprehensive protection. Gemini for Government boasts FedRAMP High authorization and a Data Privacy Guarantee, with new tools to secure AI-generated code. Scaling agents is achieved through tools like Agent Designer, empowering non-technical users to build agents via no-code interfaces. This initiative aims to automate tasks, boost productivity, and allow personnel to focus on critical work. Public sector teams are already seeing significant productivity gains from generative AI, with many reporting at least doubled employee productivity. Gemini for Government provides a blueprint for moving beyond pilots to scalable, mission-advancing applications. The technology amplifies human capacity and accelerates decision-making.

Detecting and containing AI-powered threats with Google Security Operations agents

Organizations face an increasing threat from AI-accelerated adversaries, necessitating faster response times. Google AI Threat Defense is an automated system designed to combat these AI-powered threats. It operates on a four-step framework: prepare, scan and prioritize, remediate, and monitor. Google Security Operations works in conjunction with AI Threat Defense to monitor, detect, and respond to threats, particularly those involving unpatchable code. Exploitation of vulnerabilities is now the most common initial infection vector, with exploitation often occurring before patches are available. Google Security Operations provides the operational framework to autonomously contain active attacks across an entire environment. It offers cross-environment visibility through continuous analysis, autonomous investigation and response, and retroactive hunting. The Detection Engineering agent translates new exploitation patterns into custom detections, analyzing various input sources to identify malicious activity. The Triage and Investigation agent autonomously investigates alerts, reducing analysis time from minutes to seconds. Agentic automation combines AI agents with playbooks to contain attacks, allowing analysts to maintain control while automating workflows. The Threat Hunting agent enables proactive hunting for stealthy adversary behaviors and anomalies that bypass traditional defenses. By integrating these agents, organizations can autonomously generate detections, orchestrate containment, and hunt for threats at machine speed, significantly reducing breach risks and costs.

How to unlock true ROI in software development – a deep dive into the latest DORA research

To prove the business value of generative AI, technology and finance leaders must demonstrate clear returns on investment for ongoing funding. Success hinges on establishing the necessary organizational systems and culture for AI implementation. The DORA: ROI of AI-assisted software development report provides a practical approach for teams navigating early adoption challenges. Key findings highlight the importance of realistic expectations regarding AI value realization, which often follows a J-curve. This J-curve involves a temporary dip in productivity due to the learning curve, verification tax, and pipeline adaptation. Budgeting for this initial learning phase is crucial for sustained progress. The report also reveals a market divide in AI returns, with organizational support heavily influencing success. Calculating AI ROI is essential, focusing on areas like cost reduction, productivity boosts, and improved security across the software development lifecycle. An interactive ROI calculator is available to forecast expenses and realities of AI adoption. Leaders can download the full report and try the calculator to build a defensible business case for AI investments.

Storage Insights datasets: Enabling org-wide operational discovery with activity insights

AI workloads are transforming storage from passive repositories into active data platforms. Billions of unstructured data objects and associated actions necessitate advanced understanding of data access, movement, and modification. Google Cloud Storage Insights datasets now offer activity insights, providing visibility into operational details of Cloud Storage assets. These new views enable data-driven cost optimization and faster troubleshooting for administrators. For instance, users can determine if objects are in the correct storage classes or if bucket regions are optimally located. Identifying operational errors across the storage estate and understanding their root causes becomes more manageable. Storage Insights datasets provide daily metadata and frequent activity insights, typically within four hours. These insights are delivered as a query-ready BigQuery index, replacing manual data collection. The datasets offer object-level activity, bucket-level aggregate activity, regional traffic, and project-level aggregate activity. This allows for dynamic analysis of the data lifecycle, moving beyond static snapshots. Activity insights can be used to right-size storage by identifying underutilized data. They also aid in architecting for global performance by analyzing regional traffic patterns. Furthermore, these insights help demystify and resolve operational hotspots, such as spikes in error codes.

Report: GKE Inference Gateway delivers up to 92% faster AI responses

Generative AI's production environment demands infrastructure efficiency, and GKE Inference Gateway offers a solution by intelligently routing AI workloads. It goes beyond basic load balancing to leverage advanced features like prefix caching and model-aware routing. This ensures requests are directed to accelerators already primed for them, optimizing hardware utilization and response times. Independent benchmarks show GKE Inference Gateway significantly outperforms competitors in throughput, time to first token, and inter-token latency. Snap Inc. has also seen success, achieving high prefix cache hit rates and seamless integration with their existing infrastructure. Prefix caching works by storing activation states for repetitive prompt prefixes, eliminating redundant reprocessing. This is particularly useful for documentation and codebase Q&A using retrieval-augmented generation. Similarly, in multi-turn chat scenarios, caching system personas and business rules allows for continuous responsiveness. The technology bypasses the need for LLMs to re-evaluate static context for each query. GKE's superior performance was validated in a benchmark report comparing it to a standard managed Kubernetes service. The results highlighted GKE's substantial improvements in processing speed and latency. By minimizing latency and maximizing efficiency, GKE Inference Gateway makes generative AI applications production-ready and cost-effective.
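
To make the routing idea concrete, here is a toy Python sketch of prefix-aware routing: each request goes to the replica that already holds the longest matching prompt prefix, with ties broken by lower load. This is not GKE Inference Gateway's actual algorithm; the `PrefixAwareRouter` class, the replica names, and the character-based block size are assumptions for illustration (production systems typically track hashed token blocks of the KV cache rather than raw strings).

```python
class PrefixAwareRouter:
    """Toy router that prefers replicas with a warm cache for the prompt prefix."""

    def __init__(self, replicas, block_size=8):
        self.block_size = block_size
        # Per replica: the set of prompt prefixes assumed to be cached there.
        self.cached = {name: set() for name in replicas}
        self.load = {name: 0 for name in replicas}

    def _blocks(self, prompt):
        """Return cumulative prefixes of the prompt at block boundaries."""
        n = self.block_size
        return [prompt[:end] for end in range(n, len(prompt) + 1, n)]

    def _match_len(self, replica, blocks):
        """Count how many leading blocks the replica already has cached."""
        count = 0
        for block in blocks:
            if block not in self.cached[replica]:
                break
            count += 1
        return count

    def route(self, prompt):
        """Pick a replica, then record that it now caches this prompt's prefixes."""
        blocks = self._blocks(prompt)
        best = max(
            self.cached,
            key=lambda r: (self._match_len(r, blocks), -self.load[r]),
        )
        self.cached[best].update(blocks)
        self.load[best] += 1
        return best


if __name__ == "__main__":
    router = PrefixAwareRouter(["replica-a", "replica-b"])
    system_a = "SYSTEM-A: answer about billing. "
    system_b = "SYSTEM-B: answer about shipping. "
    print(router.route(system_a + "How do I pay?"))
    print(router.route(system_b + "Where is my order?"))
    print(router.route(system_a + "Can I get a refund?"))
```

In the demo, the third request shares its system prompt with the first, so it lands on the same replica and could reuse the cached prefix there instead of reprocessing it.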

Modernizing Healthcare: How Alcidion achieved greater stability and performance with AlloyDB

Alcidion, a smart health solutions provider, aims to reduce clinician cognitive load and present critical information at the right time. Their flagship platform, Miya Precision, helps manage patient flow and prevent adverse outcomes. Previously, Alcidion faced performance bottlenecks and operational overhead with its Microsoft SQL Server environment, particularly with complex JSON data processing and stability concerns. To address these issues, Alcidion migrated its platform to Google Cloud's AlloyDB for PostgreSQL. The migration, utilizing Google Cloud's Database Migration Service, was completed efficiently, with the core move taking just over a week. Alcidion achieved a brief 15-minute cutover by creating custom synchronization tools and leveraging managed Google Cloud services. This transition eliminated control plane tasks and administrative overhead associated with managing databases. The results have been significant, with data processing times reduced from 30 minutes to under a minute. Stability has dramatically improved, shifting the team away from constant "firefighting" of infrastructure issues. This move has also reduced the administrative burden on their SRE team, allowing them to focus on product innovation. Alcidion views this modernization as a foundational step for future growth, including exploring AlloyDB's columnar engine and integrating generative AI. This strategic move enables Alcidion to continue delivering smarter, safer healthcare solutions globally.

Seeking Counsel: Ongoing Targeted Campaign Against US Law Firms

A financially motivated data theft and extortion campaign by threat cluster UNC3573 targeted organizations in professional, legal, and financial services. UNC3573 utilizes voice phishing and social engineering to gain remote access, often impersonating IT support. They convince victims to share screens and download remote monitoring tools to locate and exfiltrate sensitive data. In some cases, threat actors have gained physical access to offices, posing as IT technicians to steal data via USB drives. The campaign lifecycle is rapid, with data theft and extortion occurring within a single business day in many instances. Initial access is typically achieved through benign, invoice-themed emails designed to raise security concerns, followed by direct vishing calls. Threat actors exploit legitimate tools like Zoom, Teams, and RMM agents, and use self-destructing notes to deliver commands and links. They also abuse BYOD environments to pivot into corporate networks and target document management systems for specific sensitive files. Data exfiltration methods include cloud storage uploads, FTP utilities, and instructing victims to email stolen files. Extortion communications are aggressive, demanding a ransom within three days and threatening to leak data publicly and contact employees and clients.

What's new for Managed Service for Apache Spark clusters

Google Cloud's Dataproc, now Managed Service for Apache Spark, offers two deployment modes: serverless for ad-hoc jobs and managed clusters for customized needs. The managed clusters have been re-imagined with a focus on speed, ease of use, and intelligence. A significant enhancement is the Lightning Engine, a native C++ execution engine that dramatically speeds up Spark DataFrame and SQL queries. This engine offers up to 4.9x faster performance and improved price-performance without requiring code changes. Flexible VMs have been introduced to improve cluster resilience by allowing ranked machine type preferences. FinOps features like zero-scale clusters and scheduled stops offer better cost control for development environments. The MCP server integrates generative AI, allowing AI assistants to interact with Spark clusters using natural language for operations. The Data Agent Kit enables data scientists to manage their entire data workload lifecycle within their preferred development environments, integrating with tools like Gemini. Next-generation Lakehouse provides seamless interoperability between Spark and BigQuery, processing data directly from cloud storage or even remote AWS datasets. Cluster Image 3.0, with Spark 4.1 and Java 21, introduces real-time streaming capabilities for continuous, low-latency processing. These updates are available now for Managed Spark clusters, accessible through the Google Cloud console and gcloud CLI.

Scaling AI Agents: A Step-by-Step Guide to Deploying ADK on GKE Autopilot

This tutorial guides you through deploying a Google Agent Development Kit (ADK) agent to Google Kubernetes Engine (GKE) Autopilot. It starts by explaining the architecture, which involves packaging the ADK agent as a Docker image stored in Artifact Registry and running it on GKE Autopilot. The agent will communicate with Vertex AI using Workload Identity for secure permission management. Prerequisites include Python, gcloud SDK, kubectl, and specific API enablement. The initial steps involve configuring your Google Cloud environment where you authenticate and set your project. Next, you provision a GKE Autopilot cluster, which runs in the background while you build the agent. The ADK agent is created using the CLI and configured to use Gemini on Vertex AI. Before deployment, the agent is tested locally through its web UI to ensure functionality. For deployment, you containerize the ADK agent by creating a Dockerfile and building the image, pushing it to Artifact Registry. Workload Identity is then implemented by creating an IAM service account and granting it necessary Vertex AI roles. You then create Kubernetes resources, including a Service Account annotated for Workload Identity, a Deployment for the agent, and a Service to expose it internally. The agent is deployed to GKE using kubectl, and its status is checked. Interaction with the deployed agent is demonstrated via `curl` commands to its API. Optionally, the agent can be exposed externally using the GKE Gateway API with a Google-managed TLS certificate for secure HTTPS access. This involves reserving a static IP, creating an SSL certificate, and defining Gateway and HTTPRoute resources.
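
As a rough illustration of the Workload Identity step, the Kubernetes Service Account carries an `iam.gke.io/gcp-service-account` annotation that names the IAM service account it maps to. The sketch below builds such a manifest as a plain dictionary; the resource names and project ID are placeholders, not values from the tutorial.

```python
import json


def workload_identity_service_account(ksa_name, namespace, gsa_name, project_id):
    """Build a Kubernetes ServiceAccount manifest annotated for Workload Identity."""
    gsa_email = f"{gsa_name}@{project_id}.iam.gserviceaccount.com"
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": ksa_name,
            "namespace": namespace,
            # This annotation links the Kubernetes account to the IAM account.
            "annotations": {"iam.gke.io/gcp-service-account": gsa_email},
        },
    }


# Placeholder names, for illustration only.
manifest = workload_identity_service_account(
    "adk-agent-ksa", "default", "adk-agent-gsa", "my-project"
)
manifest_json = json.dumps(manifest, indent=2)
```

Because kubectl accepts JSON as well as YAML, the serialized manifest could be written to a file and applied with `kubectl apply -f`.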

What’s new in serverless Managed Service for Apache Spark

Running Apache Spark at scale demands significant resources and expert management. Google Cloud's serverless Managed Service for Apache Spark, specifically its 3.0 runtime version, aims to simplify this by prioritizing speed, simplicity, and reliability. Customer usage of this service for data science has nearly doubled year over year, indicating its growing adoption. The service significantly reduces the time to start running workloads through zero-setup onboarding, automating permissions, networking, and API management. Startup times for sensitive workloads have been reduced by 75%, making serverless Spark suitable for a wider range of applications. Enhanced GPU obtainability is achieved through Dynamic Workload Scheduler Flex Start Mode, which queues requests when GPUs are unavailable. The 3.0 runtime offers first-class support for Apache Spark 4.x innovations, including Spark Connect for decoupled connectivity. Enhanced multi-zonal support is now a default feature, distributing execution nodes across zones within a region for increased availability without extra charges for cross-zonal traffic. Google Cloud is continually innovating to further enhance ease of use in areas like autotuning and autoscaling. Users can start leveraging these features by specifying runtime_version: 3.0 in their workloads.

Connecting AI agents with unstructured data using Google Cloud Storage MCP Servers

Google Cloud Storage (GCS) is essential for unstructured data in agentic systems. The focus is now on transforming this data into actionable context for agents. Smart storage in GCS makes data agent-ready by enriching passive objects. AI success hinges on seamless agent access to this intelligence for decision-making. This blog highlights three customer examples of agents powered by GCS and explains how to connect agents securely using the Model Context Protocol (MCP). GCS MCP servers, combined with smart storage features, simplify agent deployment. Palo Alto Networks' Strata Co-Pilot uses GCS as its historical memory. Airwallex's AI Assistant leverages GCS for document storage and metadata. Snap's Job Optimization Agent analyzes job data in GCS to find efficiencies. MCP is the universal agent-data standard, and GCS offers Remote and Local MCP server options. The fully-managed Remote MCP server requires no infrastructure and offers immediate access to GCS data. It integrates with major agentic frameworks and ensures robust security through IAM and audit logs. Optionally, Google Cloud Model Armor can be integrated for advanced threat protection. The self-managed Local MCP server is ideal for building custom tools and specific business logic, such as specialized data transformations. It's an open-source option for greater customization. The GCS Local MCP is now part of the MCP Toolbox for Databases, simplifying development and enhancing security. Users can explore the Remote MCP server or the Local MCP GitHub repository to get started.

Announcing Spanner Graph algorithms: Google-grade intelligence for connected data

Google Cloud has introduced Spanner Graph algorithms, enabling users to run advanced graph analysis directly within their database. This feature brings Google Research's state-of-the-art graph mining capabilities to Google Cloud customers. Enterprises can leverage these new capabilities to derive insights from complex connected data faster and more efficiently. Graph technologies are increasingly used for applications like fraud detection, social network analysis, and healthcare research. Historically, running graph algorithms at scale has been challenging, requiring complex data pipelines or impacting transactional performance. Spanner Graph algorithms are designed to handle demanding workloads without compromising the operational database's performance. They offer tight integration with GQL, allowing direct invocation of algorithms alongside standard queries, minimizing data movement. Algorithm execution occurs on dedicated compute resources, ensuring near-zero impact on live production traffic. This new engine can process massive graphs with billions of edges in minutes, optimizing performance through dense data encoding. Spanner Graph unifies relational and graph models, allowing developers to query connected data using ISO GQL. The new algorithms help uncover hidden patterns and insights in connected data, such as detecting fraud or recommending products. Key algorithms include Centrality, Community Detection, and Similarity and Path Finding. Users can invoke these algorithms directly using GQL and write results back into Spanner Graph or Cloud Storage. Examples demonstrate how these algorithms can be used to identify fraud ringleaders by combining community detection and PageRank. Industry leaders like DaVita, Yahoo!, and SoundCloud are already utilizing Spanner Graph algorithms for their complex data challenges.
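
To make the fraud-ringleader idea concrete, the sketch below combines a grouping step with PageRank in plain Python. It does not use Spanner Graph or GQL. Weakly connected components stand in for community detection as a simplifying assumption, and the account names are invented.

```python
from collections import defaultdict


def connected_components(edges):
    """Group nodes into weakly connected components (stand-in for community detection)."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    seen, components = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node not in members:
                members.add(node)
                stack.extend(neighbors[node] - members)
        seen |= members
        components.append(members)
    return components


def pagerank(edges, nodes, damping=0.85, iterations=50):
    """Power-iteration PageRank; rank held by dangling nodes is spread evenly."""
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


def ringleaders(edges):
    """Return the highest-ranked node of each component."""
    leaders = []
    for members in connected_components(edges):
        internal = [(s, d) for s, d in edges if s in members]
        rank = pagerank(internal, sorted(members))
        leaders.append(max(rank, key=rank.get))
    return leaders


# Invented transfers: money flows toward one account in each group.
transfers = [("a", "boss1"), ("b", "boss1"), ("c", "boss1"), ("x", "boss2"), ("y", "boss2")]
leaders = ringleaders(transfers)
```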

Accelerating data lakes: Optimizing Apache Iceberg and Spark with gcs-analytics-core

The gcs-analytics-core, a new open-source Java library, aims to resolve the challenges data engineers face with compatibility and performance across multiple analytics engines when working with Google Cloud Storage. This library offers flexibility in choosing analytics engines while ensuring high performance on GCS. It provides optimizations for existing GCS analytics engines like Iceberg Spark and plans to expand to others. The library consolidates and enhances performance for analytics workloads on GCS, integrating natively with Apache Iceberg from version 1.11.0. It improves read operations for columnar formats like Parquet by acting as a centralized optimization layer between analytics engines and the GCS Java SDK. Key technical optimizations include threaded vectored I/O for parallel data range fetching and smart Parquet prefetching for metadata. The first major integration is with Apache Iceberg, where engines using its GCSFileIO automatically benefit from these enhancements. The library is compatible with all Iceberg catalogs, offering consistent read improvements without infrastructure changes. TPC-DS benchmarks using Spark and Iceberg demonstrated significant scan time and execution time improvements across various dataset sizes. To get started, users need Apache Iceberg Spark runtime 1.11.0+, GCSFileIO configured, and specific optimization flags enabled. The library is open source, hosted on GitHub, and encourages community contributions and feedback.
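
The idea behind threaded vectored I/O can be sketched independently of the library: several byte ranges are requested in one call and fetched in parallel rather than one after another. The Python below is an illustration only, since gcs-analytics-core is a Java library. The `read_range` callback and the merging of nearby ranges are assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def coalesce(ranges, max_gap):
    """Merge (offset, length) ranges separated by at most max_gap bytes."""
    merged = []
    for offset, length in sorted(ranges):
        if merged and offset - (merged[-1][0] + merged[-1][1]) <= max_gap:
            start, prev_len = merged[-1]
            merged[-1] = (start, max(prev_len, offset + length - start))
        else:
            merged.append((offset, length))
    return merged


def read_vectored(read_range, ranges, max_gap=0, workers=4):
    """Fetch all requested ranges in parallel and return {(offset, length): bytes}."""
    merged = coalesce(ranges, max_gap)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = list(pool.map(lambda r: read_range(*r), merged))
    result = {}
    for offset, length in ranges:
        # Find the merged chunk that covers this request and slice it out.
        for (start, span), chunk in zip(merged, chunks):
            if start <= offset and offset + length <= start + span:
                begin = offset - start
                result[(offset, length)] = chunk[begin:begin + length]
                break
    return result


# An in-memory blob stands in for an object in Cloud Storage.
blob = bytes(range(100))
data = read_vectored(lambda off, n: blob[off:off + n], [(0, 4), (6, 4), (50, 10)], max_gap=4)
```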

Experimenting with TPUs, GKE Managed DRANET, and Multi-cluster Inference Gateway

This blog explores setting up a highly available AI inference workload on Google Kubernetes Engine. The goal is to ensure service access even if one region fails, using features like Dynamic Resource Allocation (DRA) and Inference Gateway. The experiment uses Google Kubernetes Engine (GKE) managed DRANET for resource sharing between Pods, supporting TPUs across different regions. A multi-cluster GKE Inference Gateway balances AI/ML workloads across multiple clusters, crucial for failover scenarios. Cloud Storage FUSE provides a centralized location for models and logs, speeding up deployments. A Virtual Private Cloud (VPC) ensures secure communication, while GKE Fleets unify cluster management. The setup utilizes TPU v6e accelerators for high-performance AI computation. The design involves deploying a large language model to two GKE clusters in separate regions, both using four TPU v6e chips. The model is stored in Cloud Storage, and traffic is routed globally with failover capabilities. Essential building blocks include VPC and subnet configuration, firewall rules, and static IP reservations for the gateway. Clusters are created with the Gateway API and Cloud Storage FUSE CSI driver enabled, along with dedicated TPU v6e node pools. Managed DRANET is enabled on these node pools for accelerated networking. Clusters are registered to a GKE Fleet for unified management, enabling multi-cluster service discovery and ingress. The AI workload involves downloading model weights to Cloud Storage and deploying inference servers that utilize DRANET for TPU access. The multi-cluster Inference Gateway is configured with custom resource definitions, autoscaling metrics, and an InferencePool for unified deployments. Finally, the gateway is configured to route traffic to the most available region, and testing confirms seamless failover to the secondary cluster during a simulated outage.

Modeling a digital twin of a food supply chain using BigQuery Graph

Growing a restaurant chain presents unique challenges that traditional spreadsheets cannot effectively address. These challenges include the bullwhip effect in the supply chain, the erosion of brand consistency due to SOP drift, the wide-reaching impact of food safety issues, and financial losses from maverick spending. To manage these complexities, a digital replica of the business, known as a digital twin, is essential. While relational databases have been standard, their limitations in tracing dependencies become apparent with large-scale operations. BigQuery Graph offers a solution by enabling users to build a digital twin of their supply chain within their existing data platform. This is achieved by modeling physical entities like items and locations as a network of nodes and edges. The system defines a semantic layer by creating a Graph View over existing tables, illustrating their relationships. This approach shifts operations from reactive problem-solving to proactive precision, allowing for surgical recalls and detailed risk analysis. Graph queries simplify complex data analysis, making it easier to trace relationships and gather insights quickly. For optimal results, focus on structuring relationships with graphs, ensure data integrity through clean keys, and capture metadata on the edges of the graph. Ultimately, BigQuery Graph allows businesses to move beyond managing data as simple lists and instead visualize critical inter-domain relationships in seconds.
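
A surgical recall amounts to a reachability question on the graph: starting from a suspect lot, which nodes lie downstream? The sketch below answers it with a breadth-first traversal in plain Python rather than a BigQuery Graph query. The node names and the edge metadata are invented for illustration.

```python
from collections import defaultdict, deque


def downstream(edges, start):
    """Return every node reachable from `start` by following edges forward."""
    graph = defaultdict(list)
    for src, dst, _metadata in edges:
        graph[src].append(dst)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


# Invented supply chain: each edge is (source, destination, metadata).
shipments = [
    ("lot-42", "warehouse-east", {"shipped": "2025-01-03"}),
    ("warehouse-east", "store-1", {"shipped": "2025-01-04"}),
    ("warehouse-east", "store-2", {"shipped": "2025-01-05"}),
    ("lot-77", "warehouse-west", {"shipped": "2025-01-03"}),
    ("warehouse-west", "store-3", {"shipped": "2025-01-06"}),
]
affected = downstream(shipments, "lot-42")
```

Only the locations fed by the suspect lot are returned, so stores supplied through other lots stay out of the recall.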

The fully-managed Remote MCP Server for AlloyDB is now Generally Available

AI agents require high-quality, accessible context to function reliably, often found in operational databases. To address this, Google Cloud has released the Remote Model Context Protocol (MCP) Server for AlloyDB, which is now generally available. The MCP is an open-source standard enabling LLMs to securely connect to external data sources. This integration allows AI agents to access real-time enterprise data from AlloyDB, preventing inaccuracies from stale information. AlloyDB provides a robust foundation for agentic applications with its superior vector performance, advanced search capabilities, and real-time intelligence through built-in AI Functions. It offers unified data access by allowing seamless joining of data from AlloyDB, BigQuery, and Iceberg tables. The Remote MCP Server for AlloyDB runs on managed Google Cloud infrastructure, simplifying connectivity for production workloads. It features centralized discovery, fully-managed HTTP endpoints, and fine-grained authorization using Google Cloud IAM for enhanced security. The AlloyDB toolset also empowers agents with operational instance management and Model Armor protection against data exfiltration. Audit logging provides a complete trail of all agent actions. Getting started involves API preparation, database provisioning, and agent configuration with the remote endpoint and IAM credentials. This enables agents to provide reliable, grounded answers by understanding database schemas and executing complex queries, enhancing enterprise agentic applications.

Introducing the GKE standby buffer: Improve node startup times without blowing your budget

Google Kubernetes Engine has introduced a new feature called standby buffers to address the trade-off between application performance and cost. Previously, over-provisioning guaranteed quick startups but was expensive, while minimizing costs led to slow cold starts. GKE standby buffers offer a low-cost, suspended capacity buffer for clusters. This feature builds upon GKE active buffers, which provide readily available capacity to handle traffic spikes with near-zero startup latency. Standby buffers maintain a suspended capacity that incurs minimal cost, typically in the low single-digit percentage. When compared to a cluster without standby buffers, one utilizing them experienced significantly lower latency during identical traffic loads. The cluster with standby buffers maintained single-digit second P50 latency, with P95 and P99 metrics briefly peaking at one minute. Traditional Kubernetes autoscaling is slow and requires workarounds like lowering HPA thresholds or managing balloon pods. These workarounds are often expensive and operationally complex. GKE active and standby buffers offer a declarative approach to capacity management, eliminating the need for manual configurations. Standby buffers reduce infrastructure costs by suspending compute capacity and only incurring persistent disk and IP address costs. Combined with active buffers, they enable near-instant pod scheduling with performance comparable to over-provisioning, but at a much lower price. The system prioritizes refilling the active buffer from the standby buffer, which handles extended loads and protects against slow node cold starts. Standby buffers resume 2-3 times faster than creating a fresh node, effectively bridging the gap between cold starts and always-on capacity. This feature allows businesses to dynamically balance performance and cost for various workloads.

How Trustpilot built a real-time architecture for data enrichment using Gemma

Trustpilot processes millions of user reviews daily, a complex task with strict latency and cost constraints. They are transitioning their core AI strategy to generative AI, partnering with Google to build a high-volume streaming pipeline using fine-tuned Gemma models. This move allows Trustpilot to extract deep, actionable intelligence from reviews, performing tasks like named entity recognition, categorization, sentiment scoring, and intent pinpointing. Opting for fine-tuning open-weight models like Gemma provides Trustpilot total model independence, predictable economics, and expanded MLOps capabilities. They built a suite of specialized models based on the lightweight google/gemma-2-9b. High-quality training datasets were generated using Gemini teacher models and consensus annotation over a stratified review sample. These datasets enabled fine-tuning custom models that outperformed legacy solutions and neared teacher model accuracy. The system architecture utilizes Dataflow and Gemini Enterprise Agent Platform Endpoints, decoupling business logic from raw LLM inference for scalability. Performance tuning focused on optimizing vLLM on A2 VMs with A100 GPUs and configuring the vLLM backend for high streaming volumes. Challenges included private networking limitations, deployment observability, and GPU scarcity in the EU region. Ultimately, Trustpilot achieved Gemini-like performance at a fraction of the cost, transforming millions of reviews into instant, actionable insights.
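
Consensus annotation can be illustrated with a simple majority vote over the labels that several teacher runs assign to the same review. This is a sketch of the general technique, not Trustpilot's pipeline. The agreement threshold and the label names are assumptions.

```python
from collections import Counter


def consensus_label(labels, min_agreement=2 / 3):
    """Return the majority label, or None when the annotators agree too little."""
    if not labels:
        return None
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes / len(labels) >= min_agreement else None


# Two of three hypothetical teacher outputs agree, so the label is kept.
kept = consensus_label(["positive", "positive", "negative"])
# No label reaches the threshold, so the example would be left out of training.
dropped = consensus_label(["positive", "negative", "neutral"])
```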

Cloud CISO Perspectives: How to build an AI-ready security program for the public sector

This edition of Cloud CISO Perspectives focuses on building AI-ready security programs for government and critical infrastructure. The article, by Usman Chaudhary, provides a roadmap for CISOs, highlighting actionable steps. The core of the approach involves building custom workflows, integrating commercial AI, and incorporating both into existing security programs. The recommended plan includes a 90-day focus, with tactical goals within six months and strategic objectives within six to twelve months. These initiatives are organized across five CISO workload domains for efficient resource allocation and immediate wins. The immediate actions involve executive alignment for business justification, automating context gathering, and vendor optimization. Threat intelligence, SOP integration, and talent development are the priorities within six months. The strategic goals within six to twelve months include posture elevation, advanced governance, and incident response implementation. The focus is on proactive defense, with AI used for vulnerability prioritization, threat hunting, and automated incident response, including the use of Gemini for Government.

AlloyDB Hot Standby: Faster failovers, consistent performance

AlloyDB for PostgreSQL is a fully managed database service compatible with PostgreSQL, designed for demanding enterprise needs. It offers exceptional performance, scalability, and availability, integrating seamlessly with Google's infrastructure. The core High Availability (HA) architecture involves an active node and a standby node within different zones. Traditionally, the standby node was idle, leading to potential delays during failover due to database restarts and cache warming. AlloyDB introduces Hot Standby, changing the standby node's role to continuously process write-ahead logs. This dramatically reduces failover times because the standby is already running. Hot Standby also ensures consistent performance after failover, maintaining warm memory caches for optimal speed. This improved availability and resilience are offered at no extra cost to AlloyDB users. A demonstration highlights the faster failover and immediate resumption of performance with Hot Standby. The upgrade is initially available in PostgreSQL 18 and will be rolled out to earlier versions soon. AlloyDB's 99.99% SLA is further enhanced with these advancements in High Availability. Hot Standby represents a significant step in providing a superior PostgreSQL experience.

From petabytes to predictions: Easy BigQuery insights in Google Sheets

Organizations often rely on BigQuery for their primary data source, but ad-hoc analysis frequently occurs in Google Sheets. Transferring data between these platforms typically involves inefficient methods like CSV exports, leading to data silos and security concerns. Connected Sheets addresses this by providing a direct, live connection between BigQuery and Google Sheets. This allows users to analyze massive datasets within the familiar spreadsheet interface, without requiring SQL knowledge. Data admins can control access and maintain security, while users gain agility and ease of use with features like pivot tables and charts. Connected Sheets facilitates self-service analysis, operational reporting, and hybrid data modeling, empowering users across various business functions. Business analysts can build customized reports and dashboards, utilizing live data from BigQuery. The setup is simple, requiring only a Google Workspace account and a billing-enabled Google Cloud project, accessible either from Google Sheets or the BigQuery console. The goal of Connected Sheets is to empower users to leverage the scalability of the cloud and the flexibility of spreadsheets. It makes accessing and working with large datasets easier and more efficient, ultimately putting data directly into the hands of those who need it.

Developer's guide to Gemini Enterprise and A2UI integration

The text introduces A2UI, an open protocol for creating rich, interactive user interfaces within chat applications. It addresses the limitations of chatbots that rely primarily on text, which leads to inefficient multi-turn conversations. A2UI allows agents to return UI components described in a JSON payload, enhancing the user experience. The protocol is declarative, streaming-friendly, and framework-agnostic. It works within a four-layer stack, with A2UI defining the data format, or "cargo", transported across the pipeline. The integration with Gemini Enterprise (GE) is simplified because GE has a built-in A2UI renderer. Developers build A2A agents that embed the A2UI components alongside the tools. These A2A agents are then registered with GE as endpoints. When a user interacts with an A2UI widget, GE sends the interaction back to the agent as input. The reference implementation uses an ADK backend for seamless GE integration. The agent uses a Google Maps Embed iframe for components like GoogleMap. The development process involves cloning a reference repository and configuring GE. With A2UI, chatbots can use interactive elements like date pickers for a smoother user experience.

AI in SRE: Where and how Google is deploying agentic AI to improve operations

Google's SRE team is evolving to incorporate AI, driven by increasing system complexity due to AI advancements. This "SRE AI" initiative aims to leverage AI to enhance the software development lifecycle. Opportunities for AI integration span various phases, including reliability design and incident management. AI-powered anomaly detection and alerting are being implemented to improve response times and reduce alert fatigue. AI agents are being developed to streamline incident investigation, communication, and postmortem processes. The SRE team is building AI Insights to analyze historical incidents and improve risk management. Key design principles prioritize transparency, security, and reliability in AI agent operations. Google SRE AI leverages Google's existing infrastructure, including Gemini and the Agent Development Kit. Autonomous level tracking helps assess the systems' true autonomy. The ultimate goal is to improve service reliability, reduce operational costs, and empower engineers. The whitepaper provides more detailed information on Google's approach to SRE AI.

Nano Banana 2 and Nano Banana Pro are generally available, and already powering creative workflows

Organizations are integrating AI into creative workflows for next-generation experiences. New enterprise-grade AI models, Nano Banana 2 and Nano Banana Pro, are now generally available through the Gemini Enterprise Agent Platform. These models enable high-quality image generation and editing within applications and workflows, backed by robust infrastructure and security. A new preview feature allows Nano Banana 2 to process video files as input prompts, expanding its multimodal capabilities. Customers are using these models to innovate across various industries. In marketing and creative fields, companies like Adobe and WPP are scaling tailored campaigns and enhancing brand engagement. Retailers such as Shopify and URBN are leveraging these capabilities for immersive shopping experiences and accelerated product development. Media and entertainment companies are building next-generation production workflows, ensuring directorial control. Google Cloud offers the models and tools to build enterprise-scale multimodal experiences. Developers can access these models via the Gemini API, with enterprise SLAs available for the Enterprise Agent Platform.

Evolving Dataflow to process massive datasets for machine learning

Google's data platform, Flume, evolved from MapReduce to address the massive data processing needs of the AI era. Innovations in Flume, now available in Google Cloud's Dataflow, focus on scalability, efficiency, and developer experience. To handle immense scale, Dataflow features liquid sharding for dynamic work unit rebalancing and global compute for scheduling across Google's infrastructure. Automatic pipeline optimization reduces overhead by fusing operations, while rate-limiting external API calls prevents system overload. Tandem pools enable serverless remote inference, overcoming scalability limitations. For efficiency, Dataflow offers heterogeneous worker pools to match workloads with appropriate accelerators like TPUs. TPU-aware autoscaling and duty-cycle policy enforcement optimize TPU utilization, reducing costs. TPU fungibility ensures jobs are scheduled to the most suitable TPU version and location. The developer experience is enhanced with language flexibility through a versatile SDK supporting multiple programming languages and SQL. Integration with ML frameworks like JAX and native support for LLM optimizations are provided. Unified batch and streaming processing allows users to employ the same code for both historical and live data. Observability through a monitoring UI offers comprehensive control and diagnostic data for production pipelines. Advanced workflows like sampling, dry-run, and pausing/resuming pipelines expedite development and operations. Dataflow brings these internal Google innovations to Google Cloud customers, enabling them to tackle demanding ML applications.
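
Rate-limiting calls to an external API is often done with a token bucket, shown below as a minimal sketch. The summary does not say which mechanism Dataflow uses, so this illustrates the general technique rather than Dataflow's implementation. The clock is passed in explicitly to keep the example deterministic.

```python
class TokenBucket:
    """Allow bursts up to `capacity` calls, refilled at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec, capacity, now=0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = now

    def try_acquire(self, now):
        """Consume one token if available; `now` is the current time in seconds."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Two calls fit in the initial burst, the third is rejected,
# and half a second later one token has been refilled.
bucket = TokenBucket(rate_per_sec=2, capacity=2)
decisions = [bucket.try_acquire(0.0), bucket.try_acquire(0.0), bucket.try_acquire(0.0), bucket.try_acquire(0.5)]
```

A caller that receives False would typically wait or retry later instead of sending the request.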

Go from resource-level to business-level maintenance in Google Cloud

Managing planned maintenance in a growing cloud environment can be complex and fragmented. Currently, tracking updates often requires manually linking infrastructure resources to the business services they support. Google Cloud is launching App-centric maintenance visibility within Unified Maintenance to simplify this process. This new feature shifts the focus from individual resources to a business-oriented view of maintenance. It integrates with App Hub, allowing users to see maintenance events in the context of their applications. The "application" is now the primary unit for managing maintenance. Resources registered in App Hub, such as GKE clusters or GCE VMs, will have their maintenance schedules aggregated in a single dashboard. This reduces manual effort for platform engineers by eliminating the need for mapping alerts to owners. It also enables faster problem triage by correlating performance issues with infrastructure updates. The goal is to provide predictable operations with a business-centric understanding of maintenance impacts across all services. Users with existing applications in Google Cloud can access these features in the Cloud Console and consult a guide for setup.

How the University of Central Oklahoma is using AI to streamline analysis of complex criminal cases

The University of Central Oklahoma's Forensic Science Institute (FSI) partnered with Google to develop an AI solution aimed at accelerating criminal case analysis. This collaboration leverages Google's NotebookLM to significantly reduce the time required to analyze complex documents and construct timelines. The project originated from an AI hackathon led by UCO’s CIO, Sonya Watkins, focusing on identifying impactful AI solutions. The hackathon prioritized a case timeline analysis tool using Gemini for idea generation and impact assessment. Initial trials have demonstrated a substantial reduction in the months-long burden of case analysis. The UCO team, including FSI instructors, is meticulously ensuring the AI-generated timelines meet forensic standards. They are developing a repeatable framework, ensuring AI conclusions are directly linked to original source documents. This framework aims for national adoption, standardizing evidence processing for forensic institutes and law enforcement. The project's goal is to create a scalable solution, aiding in the delivery of justice nationwide. Google is offering Gemini for Education to empower institutions with similar AI-powered research capabilities.

Announcing the newest cohort of the Google for Startups Accelerator: Middle East, North Africa & Turkey

Google aims to organize global information and support AI-driven startups, especially in the MENA-T region. The Google for Startups Accelerator program fosters innovation in this area. A new cohort of 15 companies will start the program on June 1st. These startups will benefit from mentorship, technical support, and resources for navigating regional complexities. The previous cohort, which concluded in November 2025, achieved significant milestones with Google expert guidance. These achievements included business strategy refinement and acceleration of AI/ML initiatives. The 2026 program offers additional focus on challenging geopolitical conditions. The selected companies represent diverse fields, using AI for healthcare, e-commerce, education, and more. The three-month curriculum provides technical expertise, including security and generative design training. The program integrates strategic business modeling to help startups scale their innovations. COGNNA saw substantial growth, closing a significant funding round after program improvements. Smart Bricks also secured funding and used Google's AI tools for real estate investment automation. Google is committed to supporting regional founders, providing infrastructure for continued digital growth and innovation.

A Guide to AI Cold Starts on Cloud Run

The text explores the problem of long cold start times for AI models on Cloud Run, which frustrate developers and lead some to avoid serverless GPUs. It details the four phases of an AI cold start: infrastructure provisioning, container image streaming, engine initialization, and model loading. The author provides best practices, drawn from Google Cloud documentation, to shorten these cold starts. Key optimization strategies include optimizing model format and size, choosing efficient storage options, using startup CPU boost, and configuring Direct VPC Egress. The author emphasizes concurrency tuning to maximize GPU utilization and prevent unnecessary scale-outs. Additional strategies include a single-region "always-on" service for global deployments and "wake-up calls" that mask cold starts. The author also stresses the importance of adjusting startup probes and gives examples from Elastic's approach to managing multiple service variants. These strategies aim to turn the infrastructure from a problem into a scalable and reliable AI solution. The takeaway is that optimizing cold starts is vital for moving AI from hobby projects to production-ready deployments.
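
The link between concurrency and scale-outs can be shown with a back-of-the-envelope calculation: the number of instances needed is roughly the number of in-flight requests divided by the per-instance concurrency, rounded up. This is a simplified model, not Cloud Run's actual autoscaling algorithm, and the numbers are invented.

```python
import math


def instances_needed(in_flight_requests, concurrency):
    """Estimate instances required to serve the given number of simultaneous requests."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    return math.ceil(in_flight_requests / concurrency)


# With 40 simultaneous requests, raising concurrency from 1 to 8 cuts the
# estimate from 40 instances (40 potential cold starts) to 5.
low_concurrency = instances_needed(40, 1)
tuned_concurrency = instances_needed(40, 8)
```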

Introducing Google AI Threat Defense to help you outpace the adversary

Cyber threats are evolving quickly due to the rise of AI, requiring organizations to adapt their defenses. Google is launching AI Threat Defense to combat these evolving threats. Built on Google's security expertise, the system utilizes AI to automate threat detection and response. The platform integrates Mandiant and Wiz with Gemini's capabilities for comprehensive security. AI Threat Defense focuses on preparing, scanning and prioritizing, remediating, and monitoring vulnerabilities. The core features include automated vulnerability identification, remediation, and continuous monitoring. The system uses AI agents to actively predict attack paths and prioritize threats for faster patching. By integrating CodeMender, the platform offers automated patch generation and testing to speed up remediation. It aims to reduce remediation time significantly and improve overall security posture.

How we evolved Google’s global and data center networks for the AI era

Google's network has evolved significantly, transitioning with each technological era, and now faces the demands of AI. The AI era requires enormous compute and specialized networking, exceeding the capabilities of individual data centers. Google strategically locates data centers with sustainable energy and leverages its network to create a hypercomputing resource. This necessitates a vertically integrated AI stack, from chips to applications, underpinned by the AI Hypercomputer. The network's core elements include the internal fabric, cross-site connectivity, and the global network. Within the AI Hypercomputer, Virgo Network provides massive bandwidth, low latency, and resilience for AI workloads. Autonomous reliability features and high-resolution telemetry enhance network operations and minimize downtime within the fabric. Scaling AI workloads across campuses requires WAN optimization, including multi-shard networks and AI-native Cloud Interconnect. The global network supports AI inference, providing low latency and high availability through extensive infrastructure. These network innovations are directly incorporated into Google Cloud environments, supporting AI workloads for its users.

New study: Securing AI in the browser is a top priority for IT Leaders

Generative AI has become a common tool in the workplace, presenting new security challenges for IT leaders. The browser is now the primary workspace, necessitating enhanced security measures. A recent Omdia report surveyed IT professionals, revealing that browser security is a top priority for most organizations. GenAI usage is widespread, with most organizations allowing employees to use public applications. This creates new risks for data leakage and emphasizes the importance of securing AI. Browser-based threats, including AI-powered phishing and data leakage, are major concerns for IT professionals. Securing GenAI applications is a critical use case when evaluating new security solutions. Traditional network-based security tools are often insufficient in this evolving landscape. Chrome Enterprise offers a solution by integrating AI security directly into the browser. It allows organizations to discover and govern AI usage, enforce data protection controls, and set context-aware access policies. Securing the browser is now crucial for protecting organizations in the modern, AI-driven workplace.

Exploitation of KnowledgeDeliver via ViewState Deserialization Vulnerability

In late 2025, Mandiant investigated a compromise of the KnowledgeDeliver LMS that was enabled by a critical unauthenticated remote code execution (RCE) vulnerability. The vulnerability stemmed from identical, pre-shared ASP.NET machine keys across all deployments. Attackers exploited this to inject malicious code, aiming to infect users via the website. Because the keys were the same everywhere, knowledge of the keys from one instance put every other instance at risk. The attackers deployed a .NET-based in-memory web shell called BLUEBEAM to maintain access. They modified files, including JavaScript, to display fake security alerts and load a remote script. This led to Cobalt Strike BEACON infections on user workstations. To detect such attacks, organizations should monitor application event logs (Event ID 1316), suspicious process activity from w3wp.exe, and file integrity. Monitoring for anomalous user-agent strings is also critical. Remediation involves rotating machine keys and restricting access to the LMS. The incident highlights the risks of shared secrets in deployment templates.
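As a rough illustration of the detection guidance, the sketch below filters already-parsed log records for the two signals named above: Event ID 1316 and processes spawned by w3wp.exe. The record layout (`event_id`, `parent_image`, `image`) and the list of child process names are hypothetical assumptions made for this example, not a format or list from the Mandiant report.

```python
# Hypothetical set of child processes treated as suspicious under w3wp.exe.
SUSPICIOUS_CHILDREN = {
    "cmd.exe", "powershell.exe", "pwsh.exe", "cscript.exe", "wscript.exe",
}


def flag_events(events):
    """Return (reason, record) pairs for records matching either signal.

    Each record is assumed to be a dict; the field names are illustrative.
    """
    findings = []
    for record in events:
        parent = (record.get("parent_image") or "").lower()
        image = (record.get("image") or "").lower()
        child_name = image.rsplit("\\", 1)[-1]
        if record.get("event_id") == 1316:
            # Event ID named in the note for application event log monitoring.
            findings.append(("event_id_1316", record))
        elif parent.endswith("w3wp.exe") and child_name in SUSPICIOUS_CHILDREN:
            # The IIS worker process launching a shell or script host.
            findings.append(("w3wp_child_process", record))
    return findings


if __name__ == "__main__":
    sample = [
        {"event_id": 1316, "source": "ASP.NET"},
        {"parent_image": r"C:\Windows\System32\inetsrv\w3wp.exe",
         "image": r"C:\Windows\System32\cmd.exe"},
        {"parent_image": r"C:\Windows\explorer.exe",
         "image": r"C:\Windows\System32\cmd.exe"},
    ]
    for reason, record in flag_events(sample):
        print(reason, record)
```

In practice this logic would more likely be written as a query in whatever SIEM or EDR tool holds the logs; the Python form only shows the matching conditions.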

2 PhaaS 2 Furious: The Evolution of Chinese-language Phishing Services

Chinese-language phishing-as-a-service (PhaaS) is rapidly growing, offering mature services and intricate ties to the broader criminal ecosystem. These platforms lower the barrier to entry for cybercriminals, enabling social engineering and credential theft. This ecosystem is moving from password harvesting to real-time interception of OTPs to bypass MFA. Attackers are exploiting digital wallets, turning stolen data into tokenized assets for unauthorized financial control. Unlike Russian counterparts, Chinese-language PhaaS often targets the general public and operates openly. These services offer extensive ancillary services including PII sales, hosting, and money laundering. Attacks use RCS and iMessage for encrypted delivery and rely on administration panels for real-time interception. AI is being used to automate phishing page generation, making detection harder. These platforms also support localization-as-a-service, generating localized content for diverse international markets. One case study, YY Lai Yu, focused on Japan with sophisticated targeting and anti-bot measures. The ongoing evolution of these platforms underscores the need for technical security controls beyond user education, such as FIDO2/WebAuthn.

The Blueprint: How Movix fills a gap in dental skills with specialized agentic AI

Movix is developing AI-driven solutions for dental appliance manufacturers to address the growing demand and shortage of skilled technicians. The company focuses on creating agentic workflows to improve quality control and streamline processes, reducing costly errors in aligner manufacturing. Their solution utilizes custom AI models built on Google Cloud infrastructure for tasks like defect detection in 3D scans. Cloud Run and Compute Engine provide the necessary scalability and computational power for handling large volumes of data. This allows Movix to integrate with existing dental systems through APIs, catering to a traditionally conservative market. Their agentic approach automates manual tasks, improving efficiency and reducing turnaround times for dental appliances. Orthero, a company using their technology, is already experiencing faster and more consistent quality control. Movix's architecture is built to ensure security and compliance with healthcare regulations. They aim to develop multiple AI agents to cover the entire dental appliance workflow by 2029. Movix offers a hybrid solution that incorporates older systems, expanding market reach.

The top announcements for startups from Google I/O ‘26

Google Cloud is focusing on AI solutions for startups, offering a complete AI stack and agentic capabilities. It introduced updates at Google Cloud Next '26 and I/O '26 to improve speed, efficiency, and development workflows. Key features include new generations of Gemini models with enhanced intelligence and cost-effectiveness, such as Gemini 3.5 Flash, and Omni for video creation. Google Antigravity serves as a control plane for orchestrating AI workflows and managing specialized agents. Developer workflows have been streamlined with native Android support and managed agents. Gemini Spark, a personal AI agent, assists with daily operational tasks and workflows. The advancements aim to help startups build, test, and deploy applications faster. Google Cloud is also providing a straightforward path from prototype to production for startups. Finally, Google for Startups launched an AI Agents Challenge with cloud credits and a prize pool.

Shipping features to production just got easier with new feature flags in AppLifecycle Manager

Releasing new features often causes anxiety, especially as AI-driven code generation accelerates the pace of change. Feature flags address this by separating code deployment from feature release, enabling safer rollouts. Google is launching AppLifecycle Manager Feature Flags (ALM FF) in public preview to apply this principle. By decoupling the two, ALM FF lets teams deploy code continuously while keeping feature-level control over what users see. The service supports gradual enablement via percentage-based rollouts and precise user targeting using the Common Expression Language (CEL). It allows dynamic configuration updates, such as changes to system prompts, without code changes or infrastructure modification. ALM FF is built on the OpenFeature standard, ensuring portability and industry best practices without Google-specific dependencies. The service also enables instant disabling of problematic features, acting as a quick kill switch.
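For readers unfamiliar with percentage-based rollouts, the following is a minimal sketch of the general technique, assuming a common approach of hashing the flag key and user ID into one of 100 stable buckets. It is not the ALM FF API or an OpenFeature SDK call; the function names, the flag key, and the `kill_switch` parameter are hypothetical and exist only for illustration.

```python
import hashlib


def bucket(flag_key: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(flag_key: str, user_id: str, rollout_percent: int,
               kill_switch: bool = False) -> bool:
    """Decide whether a feature is on for a user.

    The kill switch wins over everything else; otherwise the user sees the
    feature when their bucket falls below the rollout percentage.
    """
    if kill_switch:
        return False
    return bucket(flag_key, user_id) < rollout_percent


if __name__ == "__main__":
    users = [f"user-{n}" for n in range(1000)]
    for percent in (5, 25, 100):
        enabled = sum(is_enabled("new-checkout", u, percent) for u in users)
        print(f"{percent}% rollout: {enabled} of {len(users)} users enabled")
```

Because the bucket depends only on the flag key and the user ID, a user who is enabled at 10% stays enabled as the rollout widens to 50%, and setting the kill switch turns the feature off for everyone without a redeploy.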