Businesses increasingly run critical applications across multiple AWS regions to improve resilience and availability, but carrying out a region failover reliably remains difficult. In response, Amazon Web Services (AWS) has introduced the Amazon Application Recovery Controller (ARC) Region Switch. The tool lets organizations plan, practice, and execute region switches, which reduces the uncertainty around cross-region recovery operations. This article looks at how the offering works and how it can benefit enterprises operating on the AWS cloud.
Understanding the Need for Region Switching
Many enterprises deploy their applications across multiple AWS regions to meet stringent availability requirements. This multi-region setup ensures that even if one region experiences issues, operations can continue unhindered in another region. However, orchestrating a region switch involves a complex interplay of various AWS services such as compute, databases, and DNS. Traditionally, this coordination required extensive scripting, frequent testing, and manual data gathering to ensure compliance and successful recovery. The ARC Region Switch aims to simplify this process by providing a centralized, automated solution to manage recovery tasks across AWS services and accounts.
Introducing the ARC Region Switch
The ARC Region Switch is a fully managed, highly available tool for orchestrating region switches. It is built on a Regional data plane architecture: recovery plans are executed from the region being activated, so the recovery process does not depend on the impacted region that traffic is leaving.
Crafting a Recovery Plan with ARC Region Switch
The core functionality of the ARC Region Switch is the recovery plan. A plan outlines the specific steps required to switch applications between regions and is composed of execution blocks, each representing an action on AWS resources. At launch, the ARC Region Switch supports nine types of execution blocks:
- ARC Region Switch Plan Execution Block: Orchestrates the order in which multiple applications switch to the desired region by referencing other region switch plans.
- Amazon EC2 Auto Scaling Execution Block: Scales Amazon EC2 compute resources in the target region to match a specified percentage of the source region’s capacity.
- ARC Routing Controls Execution Block: Alters routing control states to redirect traffic using DNS health checks.
- Amazon Aurora Global Database Execution Block: Performs either a failover, which can involve some data loss, or a planned switchover, which does not, for Aurora Global Database.
- Manual Approval Execution Block: Introduces approval checkpoints in the recovery workflow for team review and approval.
- Custom Action AWS Lambda Execution Block: Adds custom recovery steps by executing Lambda functions in either the activating or deactivating region.
- Amazon Route 53 Health Check Execution Block: Redirects application traffic during failover based on DNS configuration.
- Amazon Elastic Kubernetes Service (Amazon EKS) Resource Scaling Execution Block: Scales Kubernetes pods in the target region to match a specified percentage of the source region’s capacity.
- Amazon Elastic Container Service (Amazon ECS) Resource Scaling Execution Block: Scales ECS tasks in the target region to match a specified percentage of the source region’s capacity.
The ARC Region Switch continuously validates these plans by checking resource configurations and AWS Identity and Access Management (IAM) permissions every 30 minutes. During execution, it monitors the progress of each step and provides detailed logs for tracking.
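To make the "plan composed of execution blocks" idea concrete, here is a minimal sketch in Python. It is only an in-memory model for illustration: the enum values, class names, and the example plan name are hypothetical and are not the identifiers used by the real service API or console.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class BlockType(Enum):
    # Hypothetical identifiers for the nine block types listed above;
    # these are NOT the names used by the real service API.
    REGION_SWITCH_PLAN = "region-switch-plan"
    EC2_AUTO_SCALING = "ec2-auto-scaling"
    ARC_ROUTING_CONTROL = "arc-routing-control"
    AURORA_GLOBAL_DATABASE = "aurora-global-database"
    MANUAL_APPROVAL = "manual-approval"
    CUSTOM_ACTION_LAMBDA = "custom-action-lambda"
    ROUTE53_HEALTH_CHECK = "route53-health-check"
    EKS_RESOURCE_SCALING = "eks-resource-scaling"
    ECS_RESOURCE_SCALING = "ecs-resource-scaling"


@dataclass
class ExecutionBlock:
    name: str
    block_type: BlockType


@dataclass
class RecoveryPlan:
    name: str
    blocks: List[ExecutionBlock] = field(default_factory=list)

    def add(self, name: str, block_type: BlockType) -> "RecoveryPlan":
        # Append a block and return the plan so calls can be chained.
        self.blocks.append(ExecutionBlock(name, block_type))
        return self

    def summary(self) -> List[str]:
        # One numbered line per block, in execution order.
        return [
            f"{i}. {b.name} ({b.block_type.value})"
            for i, b in enumerate(self.blocks, start=1)
        ]


# Hypothetical plan for an example application.
plan = (
    RecoveryPlan("orders-app-failover")
    .add("Scale compute", BlockType.EC2_AUTO_SCALING)
    .add("Fail over database", BlockType.AURORA_GLOBAL_DATABASE)
    .add("Approve traffic shift", BlockType.MANUAL_APPROVAL)
    .add("Shift traffic", BlockType.ARC_ROUTING_CONTROL)
)

for line in plan.summary():
    print(line)
```

The ordering here (scale compute, move the database, pause for approval, then shift traffic) is just one plausible sequence, not a recommendation from the source.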
Balancing Cost and Reliability
One of the standout features of the ARC Region Switch is its flexibility in preparing standby resources. Organizations can configure the desired percentage of compute capacity in the destination region during recovery, allowing for efficient resource allocation. For critical applications expecting a surge in traffic during recovery, scaling beyond 100 percent capacity might be necessary. However, it’s crucial to understand that using scaling execution blocks does not guarantee capacity, as actual resource availability depends on conditions in the destination region at the time of recovery. Regular testing of recovery plans and maintaining appropriate service quotas are recommended to achieve the best possible outcomes.
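The percentage-based scaling described above comes down to simple arithmetic. The sketch below assumes the target is rounded up to a whole unit (instance, pod, or task); the rounding rule and the function name are assumptions made for illustration, not documented service behavior.

```python
def target_capacity(source_capacity: int, percent: int) -> int:
    """Units to request in the destination region.

    Assumes rounding up to the next whole unit, which is an
    illustrative choice rather than documented behavior.
    """
    if source_capacity < 0 or percent < 0:
        raise ValueError("capacity and percent must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (source_capacity * percent + 99) // 100


# 40 instances in the source region, three different standby strategies.
print(target_capacity(40, 50))   # half capacity, cheaper standby
print(target_capacity(40, 100))  # full parity with the source region
print(target_capacity(40, 120))  # headroom for a traffic surge
```

As the article notes, the result is a request rather than a guarantee: whether the destination region can actually supply that capacity depends on conditions and service quotas at recovery time.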
Monitoring and Managing Recovery Plans
The ARC Region Switch includes a global dashboard for monitoring the status of region switch plans across the enterprise and across regions. A regional executions dashboard shows only the executions in the current console region, so it remains available during an operational event that affects another region.
Moreover, the ARC Region Switch allows resources to be hosted in an account separate from the one containing the region switch plan. This cross-account functionality is facilitated by the executionRole, which assumes the crossAccountRole to access necessary resources. Recovery plans can also be centralized and shared across multiple accounts using AWS Resource Access Manager (AWS RAM), streamlining management across the organization.
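The cross-account pattern above relies on standard IAM role assumption: the crossAccountRole must trust the executionRole, and the executionRole must be allowed to call sts:AssumeRole on it. The sketch below builds both policy documents in Python. The account IDs and role names are hypothetical, and the exact policies the service requires should be taken from its IAM documentation.

```python
import json


def cross_account_trust_policy(execution_role_arn: str) -> dict:
    # Trust policy attached to the crossAccountRole in the resource account.
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": execution_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def assume_role_permission(cross_account_role_arn: str) -> dict:
    # Permission policy attached to the executionRole in the plan account.
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": cross_account_role_arn,
            }
        ],
    }


# Hypothetical ARNs: 111111111111 owns the plan, 222222222222 owns the resources.
EXECUTION_ROLE = "arn:aws:iam::111111111111:role/RegionSwitchExecutionRole"
CROSS_ACCOUNT_ROLE = "arn:aws:iam::222222222222:role/RegionSwitchCrossAccountRole"

print(json.dumps(cross_account_trust_policy(EXECUTION_ROLE), indent=2))
print(json.dumps(assume_role_permission(CROSS_ACCOUNT_ROLE), indent=2))
```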
A Step-by-Step Guide to Implementing ARC Region Switch
To help users understand how to implement the ARC Region Switch, let’s walk through a demo involving three main steps: creating a plan, defining a workflow, and configuring triggers.
Step 1: Create a Plan
Begin by navigating to the Application Recovery Controller section of the AWS Management Console. Select "Region switch" from the navigation menu and choose "Create Region switch plan." After naming the plan, specify a multi-region recovery approach (active/passive or active/active). In active/passive mode, two application replicas are deployed in two regions, with traffic routed to the active region only. The passive region replica can be activated by executing the region switch plan. Set the primary and standby regions, and optionally enter a desired recovery time objective (RTO) to gain insights into execution times.
Assign a plan execution IAM role to allow the region switch to call AWS services during execution. Ensure that the chosen role has the necessary permissions and refer to the IAM permissions section of the documentation for guidance.
Step 2: Create a Workflow
Once the plan evaluation status notifications turn green, proceed to create a workflow. Select "Build workflows" to begin. Workflows can be constructed using execution blocks that run sequentially or in parallel, orchestrating the order in which multiple applications or resources recover into the activating region. For this demo, the graphical editor is used to design the workflow, although the workflow can also be defined in JSON for automation and storage alongside infrastructure as code (IaC) project files.
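The sequential and parallel composition described above can be illustrated with a toy runner: steps execute one after another, while the blocks inside a step run concurrently. This is a sketch of the ordering concept only, using Python's standard thread pool. The block functions and their names are hypothetical stand-ins and do not call any AWS service.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Block = Callable[[], str]


def run_workflow(steps: List[List[Block]]) -> List[List[str]]:
    """Run steps in order; run the blocks within each step in parallel."""
    results: List[List[str]] = []
    with ThreadPoolExecutor() as pool:
        for step in steps:
            futures = [pool.submit(block) for block in step]
            # result() re-raises a block's exception, which stops the
            # workflow before any later step starts.
            results.append([f.result() for f in futures])
    return results


# Hypothetical blocks that only report what they would do.
def scale_compute() -> str:
    return "scaled compute"


def fail_over_database() -> str:
    return "failed over database"


def shift_traffic() -> str:
    return "shifted traffic"


workflow = [
    [scale_compute, fail_over_database],  # step 1: both run in parallel
    [shift_traffic],                      # step 2: runs after step 1 finishes
]

print(run_workflow(workflow))
```

Keeping traffic shifting in its own later step reflects a common design choice: traffic moves only after compute and data are ready in the activating region.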
The ARC Region Switch evaluates each plan every 30 minutes. This proactive validation checks IAM permissions and resource states to catch misconfigurations before they cause a workflow action to fail. It does not replace regular testing in real-world scenarios, which is still recommended to verify effectiveness, measure recovery times, and keep the team familiar with recovery procedures.
Step 3: Create a Trigger
Triggers define the conditions for activating workflows and are expressed as a set of CloudWatch alarms. Alarm-based triggers are optional; a plan can also be executed manually. From the region switch page in the console, navigate to the "Triggers" tab and select "Add triggers." Define triggers for each region in the plan by selecting the appropriate alarms and their states.
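A trigger defined as "a set of alarms and their states" can be modeled as a simple predicate. The sketch below assumes a trigger fires only when every listed alarm is in its required state; that all-conditions rule, the class name, and the alarm names are assumptions for illustration. The three state strings are the standard CloudWatch alarm states.

```python
from dataclasses import dataclass
from typing import Dict, List

# Standard CloudWatch alarm states.
VALID_STATES = {"OK", "ALARM", "INSUFFICIENT_DATA"}


@dataclass(frozen=True)
class AlarmCondition:
    alarm_name: str
    required_state: str

    def __post_init__(self) -> None:
        if self.required_state not in VALID_STATES:
            raise ValueError(f"unknown alarm state: {self.required_state}")


def trigger_fires(conditions: List[AlarmCondition],
                  current_states: Dict[str, str]) -> bool:
    # Assumption: every condition must match for the trigger to fire.
    # A trigger with no conditions never fires.
    return bool(conditions) and all(
        current_states.get(c.alarm_name) == c.required_state
        for c in conditions
    )


# Hypothetical alarms watching the primary region.
conditions = [
    AlarmCondition("primary-5xx-rate-high", "ALARM"),
    AlarmCondition("primary-healthcheck-failed", "ALARM"),
]

print(trigger_fires(conditions, {
    "primary-5xx-rate-high": "ALARM",
    "primary-healthcheck-failed": "OK",
}))
print(trigger_fires(conditions, {
    "primary-5xx-rate-high": "ALARM",
    "primary-healthcheck-failed": "ALARM",
}))
```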
After setting up the plan, workflow, and triggers, it’s crucial to test the execution of the plan to switch regions. Execute the plan from the region being activated, utilizing the data plane in that specific region.
Pricing and Availability
The ARC Region Switch is available in all commercial AWS regions at a cost of $70 per month per plan. Each plan can include up to 100 execution blocks, or users can create parent plans to orchestrate up to 25 child plans.
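As a rough budgeting aid, the figures above translate into the following arithmetic. The sketch assumes that a parent plan and each of its child plans are billed as separate plans at the same monthly rate; the article does not say this explicitly, so check current AWS pricing before relying on it.

```python
PRICE_PER_PLAN_USD = 70   # per plan per month, from the figures above
MAX_CHILD_PLANS = 25      # child plans one parent plan can orchestrate


def monthly_cost_usd(child_plans: int) -> int:
    """Monthly cost of one parent plan plus its child plans.

    Assumes every plan, parent or child, is billed at the same rate.
    """
    if not 0 <= child_plans <= MAX_CHILD_PLANS:
        raise ValueError(f"child_plans must be between 0 and {MAX_CHILD_PLANS}")
    return (1 + child_plans) * PRICE_PER_PLAN_USD


print(monthly_cost_usd(0))   # a single standalone plan
print(monthly_cost_usd(25))  # a parent plan with the maximum 25 children
```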
In summary, the ARC Region Switch represents a significant advancement in multi-region application resilience. By providing a centralized, automated solution for orchestrating region switches, AWS empowers organizations to confidently manage cross-region recovery operations. As businesses continue to rely on the cloud for critical applications, tools like the ARC Region Switch will play an essential role in ensuring seamless operations and minimizing downtime in the face of regional disruptions. To learn more, visit the official AWS ARC Region Switch page.