Azure SQL Managed Instance failover groups replicate user databases between instances in different regions. In one customer scenario, failover group creation failed even though network connectivity between the instances appeared to be in place. The root cause was network path behavior in a hub-spoke topology with centralized firewalling: specifically, how required ports, routing, and traffic handling were applied between the managed instances.

Failover group replication requires bidirectional TCP connectivity on ports 5022 and 11000-11999 between the instance subnets. In hub-spoke architectures, this traffic passes through a central firewall, which can interfere with the connections. Typical problems include required ports that are not allowed correctly, firewall policies that override network security group rules, asymmetric routing, SNAT/NAT applied to the traffic, and aggressive session idle timeouts. A matching DNS zone and non-overlapping address spaces are also critical prerequisites, so a DNS zone mismatch or an address space overlap must be ruled out first.

To resolve the issue, the prerequisites were confirmed, the required ports were opened at every point along the network path, and the routed path through the firewall was stabilized. Symmetric routing and minimal firewall intervention on east-west traffic were crucial. SQL MI-aware connectivity tests were used for validation. After these changes, failover group creation, seeding, and replication succeeded. When deploying failover groups in hub-spoke architectures with centralized firewalls, network path behavior is a primary consideration.
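
The prerequisite and port requirements described above can be sketched as a small pre-flight check. This is a minimal illustration in Python and not the validation method used in the original scenario: the function names and the example address ranges are hypothetical. A generic TCP probe only shows that a handshake completed from wherever the script runs, so it does not replace the SQL MI-aware connectivity tests, and it cannot detect SNAT, asymmetric routing, or idle timeouts.

```python
import ipaddress
import socket

# Ports the text states must be reachable in both directions
# between the instance subnets: 5022 plus the range 11000-11999.
REQUIRED_PORTS = [5022, *range(11000, 12000)]


def address_spaces_overlap(cidr_a: str, cidr_b: str) -> bool:
    """Return True if the two CIDR ranges share at least one address."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Attempt a TCP handshake; True only if the connection is established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refusals, timeouts, and unreachable hosts alike.
        return False


if __name__ == "__main__":
    # Hypothetical address ranges, for illustration only.
    print(address_spaces_overlap("10.1.0.0/16", "10.2.0.0/16"))
    print(address_spaces_overlap("10.1.0.0/16", "10.1.4.0/24"))
    print(len(REQUIRED_PORTS))
```

Checking for overlap before any port testing mirrors the order in the text: prerequisites first, then the network path. `tcp_port_open` would have to be run from a host inside each instance subnet's routed path, in both directions, to say anything about bidirectional reachability.
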
techcommunity.microsoft.com
