4

Optimization and Management in Azure

Introduction

Cloud architectures with virtualization create new challenges for management and optimization. Instead of a traditional IT approach of dedicating specific computing resources to specific applications and overprovisioning to ensure enough compute power or storage, the cloud shares resources dynamically. This opens the door to flexibility, scalability, and infrastructure efficiency. However, it also increases the complexity of monitoring, tuning, identifying and resolving issues, and maximizing cost-effectiveness.

In this chapter, we look at management, first in Azure-only resources, then in hybrid environments that can leverage Azure management tools. We continue with a discussion about diagnosing service problems in Azure and how to get support. Finally, because the cloud and Azure specifically must also deliver on the promises of affordability compared to other architectures, we finish the chapter by presenting Azure opportunities for cost savings and optimization.

Managing and optimizing your Azure resources

When you choose Azure, you open the door to many different cloud opportunities. Scalable, pay-as-you-go virtual machines (VMs), geo-localized data storage, websites, queue management, and artificial intelligence are just some of Azure's capabilities. The control plane of Azure provides multiple tools to help you manage and optimize the resources behind these capabilities.

Azure Resource Manager

As a component of the Azure control plane, Azure Resource Manager is the deployment and management service for Azure. It lets you create, update, and delete resources in your own Azure space. Examples of Azure resources are VMs, storage accounts, web apps, databases, and virtual networks. As Azure resources proliferate, you can save time and effort by applying management actions across groups of resources, as well as to individual ones. These group-oriented actions can have multiple advantages:

  • Prevention of security issues such as connections from multiple locations or from suspicious IP addresses by applying access control to all services in a group
  • Logical organization of all the resources in a user subscription by applying tags to those resources
  • Deployment of resources in the correct sequence by defining the dependencies between them
  • Clarification of the billing at an organizational level by viewing costs for a group of resources, each with the same tag
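
The dependency-based sequencing mentioned in the list above can be illustrated with a short Python sketch. It does not show how Azure Resource Manager is implemented; it only shows how declared dependencies determine a valid deployment order. The resource names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical resources; each one maps to the resources it depends on.
dependencies = {
    "virtual-network": set(),
    "storage-account": set(),
    "network-interface": {"virtual-network"},
    "virtual-machine": {"network-interface", "storage-account"},
}

def deployment_order(deps):
    """Return one valid order in which every resource follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())

order = deployment_order(dependencies)
print(order)
```

Several orders can be valid for the same set of dependencies; what matters is that a resource never appears before something it depends on.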

You can also use declarative templates that specify what you want to happen (the result) to resources, instead of spelling out each individual step or command to detail how you want those things to happen (the method). Thus, for example, you can define a declarative template to create a storage account for VM disks faster, more easily, and more reliably than by listing all the steps needed to make and configure the account.
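
As a rough illustration of the declarative style, the following Python sketch builds a minimal template for one storage account and prints it as JSON. The account name, region, and SKU are hypothetical, and the apiVersion value is an assumption that should be checked against the current template reference before use.

```python
import json

def storage_account_template(name, location, sku="Standard_LRS"):
    """Build a minimal declarative template describing one storage account."""
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "resources": [
            {
                "type": "Microsoft.Storage/storageAccounts",
                "apiVersion": "2022-09-01",  # assumed version; verify before use
                "name": name,
                "location": location,
                "sku": {"name": sku},
                "kind": "StorageV2",
            }
        ],
    }

# Hypothetical account name and region.
template = storage_account_template("examplevmdisks01", "westeurope")
print(json.dumps(template, indent=2))
```

The template states only the desired result. The individual steps needed to reach that result are left to the deployment service.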

Azure provides four levels of scope: management groups, subscriptions, resource groups, and resources. You can apply management settings at any of these levels of granularity to determine how broad or focused the effect is. Lower levels inherit settings applied at higher levels.
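
The inheritance between scope levels can be pictured with a simplified Python sketch. The setting names are hypothetical. The sketch also lets a narrower scope override a broader one, which is an assumption: in practice, whether an override is allowed depends on the kind of setting.

```python
# Scope levels from broadest to narrowest, as listed in the text.
LEVELS = ["management_group", "subscription", "resource_group", "resource"]

def effective_settings(assigned):
    """Merge settings from broad to narrow scopes, so that narrower levels
    inherit what was applied above them and add their own settings."""
    merged = {}
    for level in LEVELS:
        merged.update(assigned.get(level, {}))
    return merged

# Hypothetical settings applied at two different levels.
assigned = {
    "management_group": {"allowed_region": "westeurope"},
    "resource_group": {"cost_center": "finance"},
}
print(effective_settings(assigned))
```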

In addition, governance functions in Azure show how IT projects contribute to meeting business objectives. These functions help your organization achieve its goals through effective and efficient use of IT.

Azure Automation

Azure Automation provides a cloud-based automation and configuration service for consistent management across Azure and non-Azure environments such as on-premises datacenters. It comprises process automation, update management, and configuration features. Azure Automation provides complete control during deployment, operations, and decommissioning of workloads and resources.

You can use Azure Automation to automate manual, long-running, error-prone, and repetitive tasks that are commonly performed in a cloud and enterprise environment. This automation lets you save time. It also increases the reliability of regular administrative tasks with the option of automatically performing them at regular intervals. You can automate processes using runbooks, or automate configuration management using Desired State Configuration (DSC) (see the following section).

Some common scenarios for automation are as follows:

  • Build/deploy resources: Deploy VMs across a hybrid environment using runbooks and Azure Resource Manager templates, with possible integration into development tools such as Jenkins and Azure DevOps
  • Configure VMs: Assess and configure Windows and Linux machines as required for the infrastructure and application
  • Monitor: Identify changes in machines that may cause problems, then remediate or escalate to management systems
  • Protect: Quarantine a VM if a security alert is raised, and set in-guest requirements
  • Govern: Implement role-based access control for teams, and recover unused resources

Configuration management

Azure Automation DSC is a cloud-based solution for PowerShell DSC that provides services for enterprise environments. You can manage your DSC resources in Azure Automation and apply configurations to virtual or physical machines from a DSC Pull Server in the Azure cloud. The solution offers a range of reports, including reports that inform you of important events such as nodes moving into non-compliance. You can monitor and automatically update machine configuration across physical and virtual machines, both Windows and Linux, in the cloud and on-premises.

The Automation gallery contains runbooks and modules that accelerate the integration and authoring of processes from the PowerShell gallery and Microsoft Script Center.

Storage management

Azure Storage Explorer lets users manage the contents of their storage accounts. With this standalone application, you can upload, download, and manage blobs, files, queues, tables, and Cosmos DB entities, and manage VM disks.

You can use Azure Storage Explorer to work with Azure Resource Manager or classic storage accounts, to work with Azure Storage data on Windows, macOS, and Linux, and to manage and configure cross-origin resource sharing (CORS) rules. Data lake management is a further possibility.

Azure Storage Explorer provides several options for connection to storage accounts, including:

  • Connection to storage accounts associated with your Azure subscriptions
  • Connection to storage accounts and services shared from other Azure subscriptions
  • Connection to and management of local storage by using the Azure Storage Emulator

Using Azure Storage Explorer with Azure File Storage

Azure File Storage provides file shares in the cloud using the standard Server Message Block (SMB) protocol (both SMB 2.1 and SMB 3.0). The Azure File Storage service enables the quick and cost-effective migration to Azure of legacy applications that rely on file shares. With Azure File Storage, you can make data available publicly or store it privately.

Delegated access to resources in a storage account is possible by using a shared access signature (SAS). With a SAS and without needing to share account access keys, you can give permission to a client for access to specific objects in your storage account for a specified period.
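
The idea behind a time-limited, signed grant of access can be sketched in Python as follows. This is not the Azure SAS token format or signing scheme; it is a generic illustration of how a holder of a secret key can issue a token for a specific object, permission, and expiry time without revealing the key itself. All names are hypothetical.

```python
import hashlib
import hmac
import time

def make_token(secret, resource, permissions, expires_at):
    """Sign the resource, permissions, and expiry time with the secret key."""
    message = f"{resource}\n{permissions}\n{expires_at}".encode()
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return {"resource": resource, "permissions": permissions,
            "expires_at": expires_at, "signature": signature}

def is_valid(secret, token, now=None):
    """Accept the token only if it is unexpired and its signature matches."""
    now = time.time() if now is None else now
    expected = make_token(secret, token["resource"], token["permissions"],
                          token["expires_at"])["signature"]
    return now < token["expires_at"] and hmac.compare_digest(
        expected, token["signature"])

# Hypothetical key and object; the key never leaves the issuing side.
secret = b"account-key-kept-private"
token = make_token(secret, "share/reports/q1.csv", "r", expires_at=1_000)
```

A client holding the token can present it until the expiry time. Changing the resource, the permissions, or the expiry invalidates the signature.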

Azure Data Studio

Azure Data Studio is a tool for the management of SQL Server databases, including Azure SQL Database and Azure SQL Data Warehouse systems. Previously called SQL Operations Studio, Azure Data Studio provides an up-to-date editor experience with IntelliSense, code snippets, source control integration (Git), an integrated terminal, and built-in charting of query result sets and customizable dashboards. In Azure Data Studio, code snippets can generate the correct SQL syntax to create databases, tables, views, stored procedures, users, logins, roles, and so on, and to update existing database objects.

Extending management beyond Azure

Through tools and technology for management, automation, and governance, Azure provides solutions for users to create new applications and recreate existing environments in the cloud to the same levels of governance and regulatory compliance as on-premises deployments. In addition, users benefit from the agility of the cloud in expanding and redeploying compute and storage resources, while Azure resource management tools help them remain efficient, effective, and compliant.

However, users often want to combine Azure and other environments in a hybrid cloud solution that suits their needs better. In the next section, we see how Azure management tools can encompass or be leveraged by non-Azure resources, including automation for monitoring, updating, and other tasks.

Working with your hybrid cloud strategy

Hybrid cloud configurations can offer the best of both worlds, combining elements such as cloud scalability and cost savings with on-premises control and security. It makes sense to combine cloud and on-premises resource management to avoid administration silos and potential waste, while ensuring suitable protection. You can achieve continuous productivity and efficiency through resilient solutions for centralized management, encompassing cloud and on-premises systems. Azure cloud and on-premises Windows servers can be managed in this way using tools such as Windows Admin Center and Hybrid Runbook Worker.

Using local and hybrid management services with Windows Admin Center

Windows Admin Center is a browser-based management toolset for on-premises Windows servers, with access to Azure services. While it can be used to manage Windows servers on private networks that are not connected to the internet, it also has multiple points of integration with Azure services such as Azure Active Directory, Azure Backup, and Azure Site Recovery. Windows Admin Center thus simplifies and extends the management of Windows systems across on-premises and Azure environments. It manages different generations of Windows Server, Windows 10 systems, and more through the Windows Admin Center gateway, which is installed on Windows Server or Windows 10.

Windows Admin Center contains many tools that should already be familiar to users managing Windows servers and clients. It works with solutions such as System Center and Azure management and security to accomplish single machine management actions. The gateway can be made available through a company firewall for secure management of these resources over the internet using Microsoft Edge or Google Chrome. Role-based access control gives fine-grained control over which administrators can access which management features. Local groups and local domain-based Active Directory are options for gateway authentication, as is cloud-based Azure Active Directory.

Azure hybrid services are available to Windows servers as individual physical servers and VMs, as well as clusters. On-premises Windows Server deployments can then benefit from cloud services that include:

  • Azure Site Recovery for the protection of VMs with cloud-based disaster recovery. Workloads running on VMs can be replicated to protect business-critical operations if disaster strikes. Windows Admin Center facilitates the setup and replication of VMs on Hyper-V servers and clusters for increased resiliency through the disaster recovery service of Azure Site Recovery.
  • Azure Monitor with advanced analytics and machine learning for tracking events across applications, infrastructure, and networks. With Azure Monitor, users can monitor server status, events, and performance; set up email alerts; and see apps, services, and systems connected to any server.
  • Azure Network Adapter for easy Azure network connectivity. On-premises servers can connect securely to an Azure Virtual Network via an Azure Network Adapter.
  • Azure Update Management for keeping VMs up to date. This tool enables update and patch management for different servers and VMs from one location, whether these servers or VMs are on-premises, on Azure, or hosted by other cloud service providers. Users can rapidly check on available updates, plan update installations, and review and verify update installation success.
  • Azure Backup for backing up Windows Server. To guard against damage through data accidents, corruption, and attacks, users can back up their Windows Server machines and VMs to Azure.
  • Azure File Sync for syncing on-premises file servers with the cloud. File syncing can obviate the need for server backup on-premises. Files can also be synced across multiple servers by using multi-site sync.
  • Azure Active Directory (Azure AD) authentication for an additional layer of security for Windows Admin Center. Azure AD authentication also enables access to Azure AD's security features, such as conditional access and multi-factor authentication.

These services offer the additional advantages of simple setup and a server-centric view for administrators, being directly integrated into Windows Admin Center. The Azure hybrid services tool in Windows Admin Center brings the integrated Azure services together into a centralized hub to facilitate service discovery in both on-premises and hybrid environments.

When users connect with the Azure hybrid services tool to a server with Azure services already enabled, they benefit from a centralized admin experience for all the services enabled on a server. Users can quickly access the tool required in the Windows Admin Center toolset, connect to the Azure portal for more extensive management of those Azure services, and consult the online documentation.

Windows Admin Center also allows management of Azure VMs, not just on-premises servers. Users can manage VMs in Azure by connecting their Windows Admin Center gateway to their Azure VNet. They can then use the simplified tools of Windows Admin Center.

While Windows Admin Center meets many common needs, it is not designed to replace all legacy Microsoft Management Console (MMC) tools. For example, Windows Admin Center is complementary to Remote Server Administration Tools (RSAT), at least until all RSAT management capabilities are surfaced in Windows Admin Center. Similarly, Windows Admin Center and System Center Virtual Machine Manager (SCVMM) are complementary. Although Windows Admin Center can replace MMC snap-ins and recreate a comparable server administration experience, it is not intended to replace SCVMM monitoring capabilities.

Automating resources on-premises and in the cloud by using Hybrid Runbook Worker

With Azure Automation, users can automate manual and repetitive tasks by using runbooks. Using Windows PowerShell or Windows PowerShell Workflow, users can program and deploy automation logic of their choice in these runbooks.

However, runbooks running on the Azure cloud platform may not have access to resources that are on-premises or in other clouds. To extend the reach of runbooks, Azure Automation provides the Hybrid Runbook Worker feature to run runbooks directly on systems that are managing local or other non-Azure resources. The runbooks are stored and managed in Azure Automation and then delivered to the designated systems, which are known as Hybrid Runbook Workers.

The structure of the runbooks in Azure Automation and in Hybrid Runbook Workers is the same. What differentiates one kind of runbook from the other is that Azure Automation runbooks manage resources in the Azure cloud, while Hybrid Runbook Worker runbooks manage resources on the Hybrid Runbook Worker itself or in the local environment where the worker is deployed.

The installation of a Hybrid Runbook Worker is often most simply and reliably done by automating the configuration of a Windows system. Manual installation and configuration are also possible. Users with Linux machines can run a Python script to install the Hybrid Runbook Worker agent on their systems.

In Azure Automation, the RunOn option allows the specification of a Hybrid Runbook Worker group. The runbook concerned is then retrieved and deployed by one of the members of this group. If the RunOn option is not used, the runbook is simply run in Azure Automation.
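
The effect of the RunOn option can be summarized in a small Python sketch. The group and worker names are hypothetical. The text does not say how a member of the group is selected, so the random choice below is an assumption made purely for illustration.

```python
import random

def choose_target(run_on, worker_groups, rng=random):
    """Return where a runbook job runs.

    run_on: name of a Hybrid Runbook Worker group, or None if not specified.
    worker_groups: mapping of group name to a list of worker names.
    """
    if run_on is None:
        return "azure-automation"      # no RunOn: the job runs in Azure Automation
    workers = worker_groups.get(run_on, [])
    if not workers:
        raise ValueError(f"no workers available in group {run_on!r}")
    return rng.choice(workers)         # assumed: any one member takes the job

# Hypothetical group with two on-premises workers.
groups = {"onprem-datacenter": ["worker-01", "worker-02"]}
print(choose_target(None, groups))
print(choose_target("onprem-datacenter", groups))
```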

Because runbooks on a Hybrid Runbook Worker access resources outside Azure, their authentication differs from that of runbooks running within and authenticating to Azure resources. Runbooks outside Azure can provide their own authentication to local resources. When running on a local Windows system, they will typically run in the context of the local system account. When the on-premises system is Linux-based, the special user account nxautomation is used.

Runbooks on a Hybrid Runbook Worker can use managed identities when configuring authentication to Azure resources. Alternatively, users can specify a RunAs account to create a context for all runbooks. However, using managed identities for Azure resources offers certain advantages over RunAs accounts. Users do not need to export or renew a RunAs certificate that must then be imported into the Hybrid Runbook Worker. They are not required to write their runbook code to handle the runbook connection object either.

Hybrid Runbook Worker for updates and monitoring

By enabling the Update Management solution, you can automatically configure a Windows computer to support runbooks for update management. This applies to Windows computers connected to your Azure Log Analytics workspace. These computers then become Hybrid Runbook Workers. This allows you to extend your existing infrastructure and apply updates from a reliable, secure, and centralized location, creating a more integrated hybrid cloud/on-premises infrastructure. Note, however, that the Windows computer concerned is not automatically registered with any existing Hybrid Worker Groups in the user's Automation account.

The Microsoft Monitoring Agent can be installed to connect computers with Azure Monitor logs. When the agent is installed on an on-premises computer and connected to the user's workspace, it will automatically download the relevant Hybrid Runbook Worker components. These components include the HybridRegistration PowerShell module, which in turn contains the Add-HybridRunbookWorker cmdlet. Running this cmdlet installs the runbook environment on the computer and registers it with Azure Automation.

Building on and extending hybrid cloud management possibilities

Automation of processes on-premises or in a non-Azure cloud environment can be copied from or modeled on successful Runbook Worker deployment in Azure. The resulting Hybrid Runbook Workers are only constrained by the limits of the resources on the Hybrid Runbook Worker itself. Thus, the Hybrid Runbook Workers are free from certain limits that are imposed on Azure sandboxes, such as disk space, memory, network sockets, or running time.

Windows Admin Center was also designed to be extensible. Microsoft makes a software development kit (SDK) available for Microsoft and third-party developers to create their own Windows Admin Center tools and solutions and build on what is currently available.

What if something goes wrong?

When business or other key workloads run on Azure, users need to know that their Azure resources are available and working properly. Conversely, if there is a problem, they must be alerted. Data must be available to verify that service level agreements (SLAs) are being respected. Scheduled maintenance on Azure resources by Azure needs to be communicated and integrated into customers' planning.

Later in this chapter, we consider the wide variety of support tools available for Azure for global, personalized, and individual resources. Resource status, service health, alerts, and integration with Azure Monitor (described earlier in this chapter) are also discussed, as well as the practical aspects of getting support directly from Microsoft.

First, however, the next section deals with the optimization of budgets and cost savings, including analysis and management of costs and Azure solutions to help you make your cloud spend go further.

Azure cost savings – visibility, accountability, and optimization

Enterprises often look to cloud services as a way of reducing IT costs. However, any cost savings will also depend on how enterprises manage their costs and optimize their cloud spending. As with other business activities, different departments will need to collaborate for effective cloud cost management. IT, finance, and different levels of management are all likely to be involved to correctly analyze costs, control them, and prepare future budgets as needed.

Azure offers a range of different tools to help enterprises manage their costs. A pragmatic business approach and common sense are also important. Finance departments should understand where cloud costs are generated and how cloud spend is trending, but so too should cloud IT teams.

  1. Visibility

    Cost analysis can help users and stakeholders to explore and break down cloud costs. Cost aggregation can show them where the largest amount of funds is being spent and what to expect in the future. Information on accumulated costs over time can let them track costs by month, quarter, or year against budgets.

  2. Accountability

    Identifying specific entities that must fund the use of given resources encourages efficiency and cost-effectiveness, as well as helping to avoid unwanted surprises in billing.

  3. Optimization

    Cost analysis can help organizations plan for better resource usage (right-sizing), highlight wasted resources, and improve plans for cost estimations.

Azure Cost Management

Azure Cost Management helps you to plan cloud resource consumption while paying attention to costs. It enables you to analyze costs effectively and optimize your cloud spend. With Azure Cost Management, you can see trends and patterns in usage and cost for your organization, with analytics to explore this data. It makes reports available to you on usage-based costs for Azure services and third-party Marketplace offerings.

The cost data for the reports takes current organizational prices and other Azure discounts into account. The reports also help you to assess possible spending anomalies, using Azure management groups, budgets, and recommendations to help you see possible opportunities for cost reduction. You can use the predictive analytics of Azure Cost Management to manage or plan for costs into the future. In addition, the Azure portal and Azure APIs let you automatically export and integrate cost data with other systems and processes with periodic reports.

Starting to optimize your cloud investment

Azure tools and a methodical approach to cost management can help you optimize your cloud spend. While Azure already facilitates the construction and deployment of cloud solutions, it is still important to ensure that those solutions are tuned for cost-effectiveness.

Good cost management starts before budgets are used. It depends on having suitable tools, assigning accountability for costs, and optimizing expenditure. These principles should be understood and can be applied by at least three groups.

  • IT teams that manage cloud resources daily must adjust their activities in a timely way to generate the most value to the business for the budgets they use for those cloud resources
  • The finance department must check requests for budget against financial goals and forecasts of cloud expenditure
  • Finally, managers must ensure that cloud spending and results are in line with the business goals of the organization

Using scopes for Azure cost management

Most Azure resources are deployed in resource groups, which are part of subscriptions. Authorized Azure users view and manage the cost aspects of Azure resources via scopes, which are nodes in the Azure resource hierarchy. For cost management, Microsoft defines two roles:

  • Billing data management (invoices and payments, for example)
  • Cloud services management (governance of policy and costs, for example)

Azure resource management is also done via scopes but uses Azure Role-Based Access Control (RBAC). The two types of scopes are called billing scopes and RBAC scopes to differentiate them. RBAC scopes do not differ according to the Azure subscription type, but billing scopes may vary to extend from single resource groups to entire billing accounts.

The following Azure roles apply per subscription for cost management by user and group:

  • Owner: Authorized to create, change, or delete subscription budgets
  • Contributor and Cost Management contributor: Authorized to create, change, or delete their own budgets, and change the budget amount for budgets of others
  • Reader and Cost Management reader: Authorized to view specific budgets

Cost management life cycle

Cost management is an iterative process encompassing the four activities below, which together form a loop or virtuous circle. This life cycle should be known and applied by all the people or teams involved in cloud cost management.

Planning

As the saying goes, "Plans are worthless, but planning is everything." Any plan for using cloud resources to meet business goals is only an approximation because circumstances can alter rapidly. However, by continually considering changes in the goals and situation of the enterprise, plans can be updated as often as needed to remain realistic. Key questions to ask regularly and frequently are:

  • What business goals and challenges must my enterprise meet?
  • How is cloud usage likely to evolve if those goals and challenges are to be met?

The answers to these questions will help you to identify the Azure resources and infrastructure that best suit your enterprise.

Monitoring

As you implement your plan, you need to know how much your enterprise is spending and on which cloud resources. Underused resources should be used better or canceled, waste must be eliminated, and opportunities to save money without impacting business goals must be maximized.

Accountability

Financial accountability is the responsibility for the way the budget is used and managed. This responsibility must be clearly and precisely assigned. Costs incurred can then be attributed to specific projects or departments. Spending efficiency can be monitored effectively.

Overspending can be identified down to the project or resource level and appropriate measures taken to bring costs into line again or to justify new budgets corresponding to the value gained for the enterprise.

Optimization

Spending optimization can be accomplished in two ways. The first is by checking performance, goal achievement, or other relevant results against financial outlay. The smaller the outlay for a given result, the better the spending is optimized. The second way is to leverage purchase and licensing optimization and infrastructure changes (see below).

Analyzing and managing your costs

Once an Azure solution has been implemented, it is important to know how costs vary over time.

Organizing and tagging resources

Tags are an effective solution for financial accountability. They allow you to attribute cost to specific projects or teams, and group costs together for further analysis. Alternatively, Enterprise Agreement customers can define separate subscriptions for departments or projects, which also helps promote accountability and individual efforts for cost reduction. Subscriptions and resource groups can be useful ways of organizing and attributing costs to different parts of your organization.
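
To make the attribution of costs by tag concrete, the following Python sketch sums costs per project tag. The record layout, resource names, and amounts are hypothetical and do not reflect the format of any Azure cost export.

```python
from collections import defaultdict

def cost_by_tag(records, tag_key):
    """Sum cost per value of one tag; untagged resources are grouped separately."""
    totals = defaultdict(float)
    for record in records:
        value = record.get("tags", {}).get(tag_key, "(untagged)")
        totals[value] += record["cost"]
    return dict(totals)

# Hypothetical usage records.
records = [
    {"resource": "vm-web-01", "cost": 120.0, "tags": {"project": "webshop"}},
    {"resource": "sql-01", "cost": 80.0, "tags": {"project": "webshop"}},
    {"resource": "vm-test-07", "cost": 45.5, "tags": {"project": "analytics"}},
    {"resource": "disk-old", "cost": 10.0, "tags": {}},
]
print(cost_by_tag(records, "project"))
```

Grouping untagged resources under their own label makes gaps in the tagging scheme visible, which supports the accountability goal described above.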

Analysis of usage cost

Regular analysis of costs compared to usage can be useful for spotting usage trends and monitoring the evolution of costs for specific projects and teams. Key points can include:

  • Estimated costs for the month and comparison with the corresponding budget
  • Identification of spend anomalies, showing costs that fall outside a reasonable range and any other exceptional costs
  • Reconciliation of invoices to highlight any unexpected cost increase or changes in spending trends
  • Chargeback to internal consumers with a breakdown of charges per project, department, or other entity
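
One simple way to picture the identification of spend anomalies from the list above is a rule that flags days whose cost lies far from the average. The Python sketch below uses such a rule. It is a generic statistical check with an assumed threshold, not the method used by Azure Cost Management, and the figures are hypothetical.

```python
from statistics import mean, pstdev

def spend_anomalies(daily_costs, threshold=2.0):
    """Return indexes of days whose cost lies more than `threshold`
    standard deviations from the mean of the series."""
    average = mean(daily_costs)
    spread = pstdev(daily_costs)
    if spread == 0:
        return []  # all days cost the same, so nothing stands out
    return [i for i, cost in enumerate(daily_costs)
            if abs(cost - average) > threshold * spread]

# Hypothetical daily costs with one unusually expensive day.
week = [100, 102, 98, 101, 99, 250, 100]
print(spend_anomalies(week))
```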

Billing data can be automatically exported into other applications, such as a financial system or a data visualization dashboard. Instead of manually retrieving files, users can configure automated exports to Azure Storage with automatic integrations of the Azure Storage data into other systems.

Budget creation

Good estimates, spending pattern analysis, and forecasts are the ingredients for effective budgeting. With Azure Budgets, you can define budgets according to cost or usage with a wide range of limits and alerts. An action can be automatically triggered when a specific budget threshold is reached. For example, VMs can be shut down or infrastructure can be moved to a different pricing level. As budgets are used (budget burn-down), data can be reviewed, and changes made as needed.
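
The relationship between budget thresholds and triggered actions can be sketched in Python as follows. The thresholds and actions are hypothetical, and the sketch only decides which actions are due. It does not call any Azure service.

```python
def triggered_actions(budget, spent, thresholds):
    """Return the actions whose threshold (a fraction of the budget)
    has been reached by the amount spent so far."""
    used = spent / budget
    return [action for fraction, action in sorted(thresholds.items())
            if used >= fraction]

# Hypothetical thresholds and actions.
thresholds = {
    0.8: "email the budget owner",
    1.0: "shut down non-production VMs",
}
print(triggered_actions(1000, 850, thresholds))
```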

Optimizing costs

Optimization of costs comes from maximizing the efficiency of resources and removing those that are not generating value for your enterprise. This includes resources that have been deployed for a project of a fixed duration and that have not been spun down or canceled after the project finishes. An enterprise simulation or system test run over a weekend, for instance, may require considerable compute and storage resources. However, a test team may assume that the operations team will adjust resource levels on the following Monday, whereas the operations team may be unaware of the weekend exercise. Azure Cost Management can help rectify such situations.

Azure Advisor

This service offers different functionalities, including the identification of VMs that are using CPU or network resources at a low level. When you know which machines these are, you can stop them or resize them according to the cost forecast for keeping them in operation. If reserved instance purchases can help you reduce your costs, Azure Advisor can provide recommendations for such purchases based on the previous 30 days of your VM usage.
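
The kind of check described here can be approximated with a short Python sketch that lists VMs with low average utilization. The metric names, VM names, and default limits are assumptions chosen for illustration, as is the decision to require both CPU and network use to be low. They are not the criteria that Azure Advisor applies.

```python
def underused_vms(metrics, cpu_limit=5.0, network_limit_mb=7.0):
    """Return names of VMs whose average CPU and network use are both
    below the given limits. The default limits are illustrative only."""
    return sorted(
        name for name, m in metrics.items()
        if m["avg_cpu_percent"] < cpu_limit
        and m["avg_network_mb"] < network_limit_mb
    )

# Hypothetical utilization averages per VM.
metrics = {
    "vm-batch-03": {"avg_cpu_percent": 2.1, "avg_network_mb": 1.5},
    "vm-web-01": {"avg_cpu_percent": 46.0, "avg_network_mb": 320.0},
    "vm-legacy-09": {"avg_cpu_percent": 3.8, "avg_network_mb": 40.0},
}
print(underused_vms(metrics))
```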

VM right-sizing

It is important to select the correct size of VMs for your cloud workloads. VM sizing is an important factor in determining overall Azure cost. The number of VMs required in Azure may also be different to the number deployed in an on-premises datacenter. Individual VM size and overall quantity should therefore be calculated to correspond to the compute requirements for the workloads to be run in Azure.

Azure discounts

Volume discounts either in terms of quantity or usage are a common feature of business agreements. Azure takes the same approach, offering cost savings to customers accordingly.

Azure Reservations

You can receive a discount on your Azure services by purchasing Azure Reservations. Cost savings can be significant compared to pay-as-you-go prices for VMs, SQL database compute resources, and additional Azure services. You can improve budgeting with a single upfront payment, which makes it easy to calculate your investments, or you can lower your upfront cash outflow with monthly payment options at no additional cost. You can purchase one-year or three-year term Azure Reservations.

Buying an Azure Reservation can be the most cost-effective choice for customers with VMs, Azure Cosmos DB, or SQL databases that are in operation over long periods. For example, without a reservation and with a requirement for five instances of a service, a customer will pay standard pay-as-you-go rates. But by buying a reservation for those resources, the customer immediately benefits from the reservation discount, thus saving money compared to the pay-as-you-go rates.
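
The five-instance comparison can be worked through with a few lines of Python. The hourly rate and the discount are hypothetical figures, not Azure prices, and the sketch assumes that all five instances run for every hour of the year.

```python
def reservation_savings(hourly_rate, discount, instances, hours):
    """Compare pay-as-you-go cost with the cost under a reservation discount."""
    pay_as_you_go = hourly_rate * instances * hours
    reserved = pay_as_you_go * (1 - discount)
    return pay_as_you_go, reserved, pay_as_you_go - reserved

# Hypothetical figures: five instances for a full year (8,760 hours),
# a 0.10 hourly rate, and a 40% reservation discount.
payg, reserved, saved = reservation_savings(0.10, 0.40, 5, 8760)
print(f"pay-as-you-go: {payg:.2f}, reserved: {reserved:.2f}, saved: {saved:.2f}")
```

Because a reservation is paid for whether or not the resources are used, the comparison favors reservations mainly for workloads that run over long periods, as the text notes.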

Azure Hybrid Benefit

The Azure Hybrid Benefit program offers cost savings if you already have on-premises deployments with Windows Server or SQL Server licenses. The Windows Server benefit means that each license includes the operating system for up to two VMs. Users then pay only the compute costs, with the base compute rate being equal to the Linux rate for VMs. Similarly, an existing SQL Server license can bring significant savings on vCore-based SQL database options, such as SQL Server in Azure VMs and SQL Server Integration Services.

Azure Reserved VM Instances

With Azure Reserved VM Instances (RIs), you can reserve VMs in advance for cost savings, and you can increase those savings by combining RIs with Azure Hybrid Benefit. Further advantages of RIs include the exchange or cancellation of reservations and prioritized compute capacity in Azure regions.

RIs also provide a feature called instance size flexibility, which automatically applies the RI savings to any VM that you use within the same region and within the same Azure RI VM group. Instance size flexibility allows you to meet changing needs and realize applicable cost savings without being locked into a specific VM size.

Automated RI management means that Azure can automatically apply RIs to other VM sizes in the same group and region. Advantages such as these apply to both Windows and Linux VMs in Azure.
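
One way to picture instance size flexibility is to give each size in a group a ratio and to treat a reservation as a pool of ratio units. The sketch below follows that idea for a single hour of usage. The size names and ratios are hypothetical, and the model is a simplification rather than a description of how Azure applies RIs internally.

```python
# Hypothetical size group: these names and ratios are illustrative only.
SIZE_RATIOS = {"example-small": 1, "example-medium": 2, "example-large": 4}


def apply_reservation(reserved_size, reserved_quantity, running_sizes):
    """Apply a reservation to the VMs running during one hour.

    Returns the ratio units covered by the reservation, the units left
    to bill at pay-as-you-go rates, and any reserved units left unused.
    """
    reserved_units = SIZE_RATIOS[reserved_size] * reserved_quantity
    demand_units = sum(SIZE_RATIOS[size] for size in running_sizes)
    covered = min(reserved_units, demand_units)
    return {
        "covered_units": covered,
        "pay_as_you_go_units": demand_units - covered,
        "unused_reserved_units": reserved_units - covered,
    }


# Two reserved medium instances applied to a mix of other sizes in the group.
mixed = apply_reservation("example-medium", 2,
                          ["example-small", "example-small", "example-large"])
idle = apply_reservation("example-medium", 2, ["example-small"])
```

In the first call, the reservation for two medium instances is fully consumed by VMs of other sizes in the same group. In the second, most of the reserved capacity goes unused, which is the situation to watch for when reviewing reservation utilization.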

Reaping the benefits of cost management

When Azure costs are correctly managed, budgetary limits and goals can be adhered to. Financial accountability can be ensured through correct and timely cost data that allows comparison with financial goals. Insights from this data can help identify poor cost efficiencies, such as underused resources. They can help identify options that will improve efficiency or suggest changes for cost optimization. Azure enables robust cost management processes with recommendations, actions, and verification that changes made are producing the expected cost benefits.

Diagnosing service problems in Azure and getting support

Azure provides a range of tools to help users monitor and manage Azure service situations. Azure Service Health groups together three services to help users see overall Azure status, customized reports on asset groups that affect customers, and detailed information on individual assets. Issues detected by Azure Service Health tools can trigger alerts via text or voice messages, emails, automated responses using Azure- or user-created runbooks, or actions within Azure or within other preferred resource management applications.

Global level status (Azure status)

Azure status information helps users see at a glance the status or impacts on services they use. This overview of the health of all Azure services is part of Azure Service Health. It can also be consulted by any visitor to the Microsoft public Azure status page.

Personalized service status (Azure Service Health)

Naturally, users also want to know about the health status of Azure services and regions as it applies specifically to their resources. Users can access Azure Service Health to view communications about outages, scheduled maintenance actions, and other information relating to health and service impacts, according to the services and resources they currently use. Users can set up Service Health alerts to notify them over their preferred communications media when the Azure services and regions they use risk impact from service problems, scheduled maintenance, or other changes.

Individual asset status (Azure Resource Health)

As an authorized Azure user, you can obtain information from Azure Resource Health on a specific cloud resource such as an individual VM. Via Azure Monitor, you can set up alerts to warn you of changes in the availability of your cloud resources. The combination of Azure Resource Health and Azure Monitor can provide you with better information on a minute-by-minute basis, allowing you to speedily ascertain if an issue is related to an event on the Azure platform or has been caused by a problem in your own environment.

Azure status updates and history

The Azure status page is updated dynamically as changes occur in the health of Azure services. You can also set how often the page refreshes with new data, and the page shows when it was last updated. Azure status and service health change information is also available via an RSS feed. The Azure status history page shows older events up to 90 days in the past, with preliminary root cause, mitigation, and next steps information.
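
If you consume the status information as an RSS feed, a small script can pick out the entries that are still recent. The following sketch uses only the Python standard library and parses an inline sample rather than a live feed. The sample items are invented, the feed address is deliberately not shown, and the layout of the real feed should be checked before relying on these element names.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Invented sample in generic RSS 2.0 form; not actual Azure status data.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example status feed</title>
    <item>
      <title>Example event A</title>
      <pubDate>Mon, 06 Jan 2020 10:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Example event B</title>
      <pubDate>Tue, 01 Oct 2019 08:30:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def recent_titles(feed_xml, now, max_age_days=90):
    """Return titles of feed items published within `max_age_days` of `now`."""
    cutoff = now - timedelta(days=max_age_days)
    titles = []
    for item in ET.fromstring(feed_xml).iter("item"):
        published = parsedate_to_datetime(item.findtext("pubDate"))
        if published >= cutoff:
            titles.append(item.findtext("title"))
    return titles


reference_time = datetime(2020, 1, 10, tzinfo=timezone.utc)
recent = recent_titles(SAMPLE_FEED, reference_time)
```

The default of 90 days mirrors the history window mentioned above. A shorter window would suit a script that only forwards new events to a team channel.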

Overview of Azure Service Health

Azure Service Health gives users a dashboard that they can customize to track the health of their Azure services in the regions where they use them. Users can monitor active events such as ongoing service issues, scheduled maintenance for the short term, or other health advisories that concern them. Users can also use the dashboard to create and manage alerts to proactively warn them of service problems that affect them. Events that become inactive are stored in the health history for a maximum of 90 days.

Azure Service Health events

Azure Service Health monitors three kinds of health events that may affect your resources:

  1. Service issues

    These are continuing problems that may currently affect your resources. The Service issues view shows you when the problem started, and the services and regions that are affected. This view also makes available the latest update on actions from Azure to resolve the problem.

    The Potential Impact tab displays the resources that you own and that could be affected by the problem. This information is also available as a CSV list for download and sharing.

  2. Scheduled maintenance

    This is maintenance in the short term that could have an impact on the availability of your resources.

  3. Health advisories

    These concern changes in Azure services that you should know about, such as the deprecation of Azure features or a usage quota that has been exceeded.
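
The three event types, and the idea of showing only the events that touch your own services and regions, can be modeled as follows. This is a minimal sketch with invented event data and hypothetical class and function names. It does not call any Azure API.

```python
from dataclasses import dataclass
from enum import Enum


class EventType(Enum):
    SERVICE_ISSUE = "Service issue"
    SCHEDULED_MAINTENANCE = "Scheduled maintenance"
    HEALTH_ADVISORY = "Health advisory"


@dataclass(frozen=True)
class HealthEvent:
    event_type: EventType
    service: str
    region: str
    title: str


def events_affecting(events, used_services):
    """Group event titles by type, keeping only (service, region) pairs in use."""
    grouped = {event_type: [] for event_type in EventType}
    for event in events:
        if (event.service, event.region) in used_services:
            grouped[event.event_type].append(event.title)
    return grouped


# Invented events and usage, for illustration only.
all_events = [
    HealthEvent(EventType.SERVICE_ISSUE, "Virtual Machines", "West Europe", "Example outage"),
    HealthEvent(EventType.SCHEDULED_MAINTENANCE, "SQL Database", "East US", "Example maintenance"),
    HealthEvent(EventType.HEALTH_ADVISORY, "Virtual Machines", "East US", "Example deprecation"),
]
my_usage = {("Virtual Machines", "West Europe"), ("SQL Database", "East US")}
my_events = events_affecting(all_events, my_usage)
```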

A link for a given issue is available for use in a preferred problem management system. Sharing with others that do not have access to the Azure portal is possible through the download of PDF (and in some cases CSV) files. Users can also pin a personalized health map to their dashboard to show their business-critical subscriptions, regions, and resource types via a filter on Service Health.

The Service Health page provides links for Microsoft support, including cases where a resource remains in an unsatisfactory state even after an issue has been resolved.

Azure Service Health alert configuration

You can use the integration of Service Health with Azure Monitor to receive email, text message, and webhook notification alerts when changes or incidents affect your resources. To receive these alerts, configure an activity log alert for the service health event of interest to you, then use an action group to route the alert to people who need to know about the alert. An action group is a definition of the actions to be taken if an alert is triggered.
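
To make the relationship between the alert and the action group concrete, the sketch below builds the body of an activity log alert as a Python dictionary. The layout mirrors the general shape of an activity log alert resource, but treat the exact field names and values as assumptions to verify against the current Azure documentation. The subscription ID and action group ID are placeholders.

```python
def service_health_alert(subscription_id, action_group_id, incident_type=None):
    """Build an illustrative activity log alert body for Service Health events.

    `incident_type` optionally narrows the alert to one kind of event;
    the accepted values are an assumption to check against the docs.
    """
    conditions = [{"field": "category", "equals": "ServiceHealth"}]
    if incident_type is not None:
        conditions.append(
            {"field": "properties.incidentType", "equals": incident_type}
        )
    return {
        "enabled": True,
        # The alert watches the activity log of the whole subscription.
        "scopes": [f"/subscriptions/{subscription_id}"],
        "condition": {"allOf": conditions},
        # The action group decides who is notified and how.
        "actions": {"actionGroups": [{"actionGroupId": action_group_id}]},
    }


# Placeholder identifiers, not real resources.
alert = service_health_alert(
    "00000000-0000-0000-0000-000000000000",
    "example-action-group-id",
    incident_type="Maintenance",
)
```

Keeping the notification targets in the action group rather than in the alert means that several alerts can share one group, and the list of recipients can change without editing each alert.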

Azure Resource Health

Azure Resource Health reports on the existing and historical health of your resources. These reports enable you to diagnose and obtain support for service issues that impact your Azure resources. Whereas Azure status is a "broad brush" report on service problems affecting Azure users in general, Resource Health offers a personalized dashboard to show you specific resource health. For example, Resource Health makes it easy to check if SLAs have been respected by displaying all the instances of unavailability of your resources due to Azure service issues.

Resource health evaluation

Azure Resource Health uses signals from various Azure services to evaluate the health of a resource. The resource may be a VM, SQL database, web application, or any other instance of an Azure service. If Resource Health finds the resource to be unhealthy, it analyzes more information to find the cause of the problem. It also reports on actions by Microsoft to remedy the problem and suggests actions for the user to fix the problem too.

Resource health status

The health status of a resource may be shown as any of the following:

  • Available

    The Available status indicates that no events affecting the health of the resource have been detected. A Recently resolved notification is displayed up to 24 hours after a resource recovers from unscheduled downtime.

  • Unavailable

    The Unavailable status indicates that a continuing platform or non-platform event (refer to the Platform and non-platform events section) impacting the health of the resource has been detected by the service.

  • Unknown

    If Azure Resource Health has not received information about a resource within the last 10 minutes, it displays the status as Unknown. This may be an important event for subsequent troubleshooting. The status may change to Available after a few minutes, if the resource is operating as expected. Otherwise, problems using the resource may indicate that it is being impacted by an event in the platform.

  • Degraded

    The Degraded status indicates the detection of a performance loss for a resource, even if the resource can still be used. Individual resources have their own criteria for reporting a Degraded status.
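
The four statuses can be summarized as a small decision function. The sketch below is a simplified model built only from the descriptions above, with hypothetical parameter names. It is not the evaluation logic that Azure Resource Health actually runs.

```python
from datetime import datetime, timedelta
from enum import Enum


class HealthStatus(Enum):
    AVAILABLE = "Available"
    UNAVAILABLE = "Unavailable"
    UNKNOWN = "Unknown"
    DEGRADED = "Degraded"


# No information for longer than this leads to the Unknown status.
SIGNAL_TIMEOUT = timedelta(minutes=10)


def displayed_status(last_signal, now, health_event_active=False,
                     performance_loss=False):
    """Return the status a resource would show under this simplified model."""
    if now - last_signal > SIGNAL_TIMEOUT:
        return HealthStatus.UNKNOWN
    if health_event_active:
        return HealthStatus.UNAVAILABLE
    if performance_loss:
        # The resource can still be used, but not at full performance.
        return HealthStatus.DEGRADED
    return HealthStatus.AVAILABLE


checked_at = datetime(2020, 1, 10, 12, 0)
```

For example, a resource whose last signal arrived 15 minutes before `checked_at` would be shown as Unknown in this model, whatever its earlier status.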

Platform and non-platform events

Multiple components of the Azure infrastructure trigger platform events, whether as scheduled events or unplanned incidents (an unexpected host reboot, for instance). Azure Resource Health provides additional information about the event and recovery from the event. It also enables users to contact Microsoft Support, with or without an active support agreement.

Non-platform events are caused by users, for instance, halting a VM or reaching the limit for Redis connections to Azure Cache for Redis.

Reporting an incorrect status

If you consider a health status for a resource to be wrong, you can use Report incorrect health status to report this to Microsoft. You can also contact Microsoft support from Azure Resource Health if an Azure issue is impacting you.

Integration with Azure Monitor

Azure Monitor collects monitoring (telemetry) data from different on-premises and Azure sources. It also receives log data from management tools like those in Azure Security Center and Azure Automation and can receive health status information from Azure Service Health. Azure Monitor aggregates and stores the telemetry data in a log data store configured for optimal performance and cost-effectiveness.

Users can analyze data, configure alerts, and obtain end-to-end views of their applications via Azure Monitor. They can leverage insights from machine learning to accelerate the identification and resolution of issues. Azure Monitor supports .NET, Java, Node.js, and other popular languages and frameworks.

Azure Service Health can thus be integrated into the centralized, scalable management system of Azure Monitor that unifies operational telemetry and provides advanced tools for improved availability and performance.

Using Azure Monitor capabilities, Azure Service Health can then also be integrated with DevOps processes and tools such as Azure DevOps, Jira, and PagerDuty, as well as with other popular management tools such as Grafana, IBM QRadar, InfluxDB, SignalFx, and Splunk.

Getting support from Microsoft

Azure users can create and manage support requests via the Azure portal. For example, click on ? in the top-right corner and select New Support Request to create a support request.

The support request experience has been designed to be streamlined, integrated, and efficient for users. A wizard helps users by simplifying the procedure, maintaining the resource context (no need to switch to a context other than that of the resource), and collecting the key information needed for efficient issue resolution. The key information allows the wizard to route the support request to the most suitable support engineer for the issue, so that issue diagnosis and resolution can begin as soon as possible.

Based on the problem category and type selected by the user, Microsoft can also provide contextual self-help information for users to address their issues immediately by themselves. If the recommended solutions do not remedy the issue, the process continues through to the creation of a support request and its transmission to the Microsoft support team.

RBAC for support requests

Azure RBAC lets you define highly granular management access. The Azure portal at portal.azure.com uses this RBAC to authorize different levels or scopes of support request creation and management. For example, scopes may extend to a resource, a resource group, or an entire subscription. Users, groups, and applications can have access via the appropriate RBAC role to the level or scope appropriate for them.

For example, a resource group owner with read permissions at the subscription scope can manage all resources in the resource group. These resources might include VMs, web applications and sites, and subnets. However, this resource group owner cannot create a support request for a VM resource in the resource group. To do this, the resource group owner must first be granted write permission at the subscription scope. Alternatively, the role of the resource group owner could be defined to include specific Microsoft Support permissions at the subscription scope.
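
The rule in this example can be expressed as a short check. The sketch below is a simplified model of scoped permissions, not the Azure RBAC evaluation engine: it ignores exclusions and inheritance from higher scopes. The scope strings are placeholders, and the support action name is an assumption used for illustration.

```python
from fnmatch import fnmatchcase

# Assumed name of the action needed to create a support request.
SUPPORT_WRITE = "Microsoft.Support/supportTickets/write"


def can_create_support_request(assignments, subscription_scope):
    """Return True if any assignment at subscription scope allows the action.

    `assignments` is an iterable of (scope, allowed_action_patterns) pairs,
    where patterns may contain the `*` wildcard.
    """
    for scope, patterns in assignments:
        if scope.lower() != subscription_scope.lower():
            continue  # permissions at resource group scope do not count here
        if any(fnmatchcase(SUPPORT_WRITE, pattern) for pattern in patterns):
            return True
    return False


subscription = "/subscriptions/example-subscription"
resource_group = subscription + "/resourceGroups/example-group"

# Full control of the resource group, read-only at subscription scope.
owner_only = [(resource_group, ["*"]), (subscription, ["*/read"])]
# The same user after being granted support permissions on the subscription.
with_support = owner_only + [(subscription, ["Microsoft.Support/*"])]
```

In this model, `owner_only` fails the check even though the user fully controls the resource group, while `with_support` passes, matching the two options described above.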

Support effectiveness

The previous section showed how users can benefit from a range of tools in Azure to monitor and manage Azure service resources. These tools range from global, publicly available information (Azure status), through service- and resource-specific notifications (Azure Service Health and Azure Resource Health), to comprehensive integration and management possibilities via Azure Monitor and other service management systems. Users can receive notifications through a variety of channels, and responses to service health alerts can be automated using Azure Automation.

Support is user-friendly and efficient thanks to the use of support links and a support request wizard. At the same time, access to support requests is both flexible and secure using Azure RBAC to authorize user actions at the appropriate levels and within the appropriate scope.

Summary

This chapter has explored the management and optimization of resources relating to Azure from different angles, including availability, efficiency, security, and cost-effectiveness. We also discussed hybrid cloud environments, showing how Azure solutions can be used to enhance the management of resources in non-Azure installations. Azure cost-saving and budgeting optimizations were also addressed. We finished this chapter by presenting monitoring and issue resolution from different points of view, including resource health.
