Azure Service Health

Short Summary

Azure Service Health helps you see Azure platform events that could affect your subscriptions, services, and regions. You’ll learn how it differs from the public Azure Status page (global view) and Azure Monitor (your resource and app telemetry). You’ll also learn how to set up alerts so you get notified automatically when incidents, maintenance, or advisories impact you.

Learning Objectives

By the end of this lesson, you will be able to:

  • Define Azure Service Health and what “personalized impact” means
  • Identify the main event types shown in Service Health (service issues, planned maintenance, health advisories)
  • Distinguish Azure Service Health from Azure Status, Azure Resource Health, and Azure Monitor
  • Configure a high-level alerting flow using Service Health notifications and action groups

Core Concepts

What Azure Service Health tells you

Azure Service Health is designed to answer this question:

“Is Azure (the platform) having an issue or change that affects me?”

It focuses on platform events (Microsoft-side incidents and changes) and shows personalized impact for what you use (your subscriptions, services, and regions), instead of a global “everything everywhere” view.
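One way to picture "personalized impact" is as a filter over the global list of platform events. The sketch below is illustrative only: `PlatformEvent`, `personalized_view`, and the sample data are hypothetical names invented for this lesson, not part of any Azure SDK, and the real service applies its own matching logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformEvent:
    """Hypothetical, simplified stand-in for an Azure platform event."""
    title: str
    service: str
    region: str


def personalized_view(events, my_services, my_regions):
    """Keep only events that touch a service AND a region you actually use."""
    return [
        e for e in events
        if e.service in my_services and e.region in my_regions
    ]


# Made-up sample data standing in for the global "everything everywhere" view.
global_events = [
    PlatformEvent("Storage latency", "Storage", "West Europe"),
    PlatformEvent("VM connectivity issue", "Virtual Machines", "East US"),
    PlatformEvent("SQL failover delays", "SQL Database", "Japan East"),
]

mine = personalized_view(
    global_events,
    my_services={"Virtual Machines", "Storage"},
    my_regions={"East US"},
)
print([e.title for e in mine])  # ['VM connectivity issue']
```

The Storage event is dropped because the region does not match, even though the service does. That is the point of a personalized view: both dimensions have to line up with what you run.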

What you typically see in Service Health

Service Health commonly highlights these event types:

  • Service issues: incidents/outages affecting Azure services
  • Planned maintenance: scheduled platform work that could impact availability or performance
  • Health advisories: important notices that may require your attention (for example, mitigations or recommended actions)

Azure Status vs Service Health vs Resource Health

Microsoft's documentation groups three related experiences under the Azure Service Health umbrella:

  • Azure Status: the public global status page for broad outages across regions and services
  • Service Health: the personalized view of issues/maintenance/advisories that may affect what you run
  • Azure Resource Health: the resource-level view that shows the health of specific resources (for example, one Virtual Machine (VM))

A quick rule:

  • If you can’t sign in to the portal or want a broad picture, start with Azure Status.
  • If you want “does this affect my subscriptions/regions/services?”, use Service Health.
  • If you’re troubleshooting one resource, use Azure Resource Health.
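The quick rule above can be written as a small decision function. This is only a memory aid: `where_to_look` and its `scope` labels are hypothetical names chosen for this sketch.

```python
def where_to_look(can_sign_in: bool, scope: str) -> str:
    """Return the tool to start with.

    scope is a hypothetical label for the question being asked:
    'global', 'my-environment', or 'single-resource'.
    """
    # No portal access, or a broad picture wanted: use the public status page.
    if not can_sign_in or scope == "global":
        return "Azure Status"
    # One specific resource (for example, a single VM).
    if scope == "single-resource":
        return "Azure Resource Health"
    # "Does this affect my subscriptions, regions, or services?"
    if scope == "my-environment":
        return "Azure Service Health"
    raise ValueError(f"unknown scope: {scope!r}")


print(where_to_look(can_sign_in=False, scope="my-environment"))  # Azure Status
print(where_to_look(can_sign_in=True, scope="my-environment"))   # Azure Service Health
print(where_to_look(can_sign_in=True, scope="single-resource"))  # Azure Resource Health
```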

Azure Service Health vs Azure Monitor

A crucial boundary:

  • Azure Service Health is about Azure platform events that might affect you.
  • Azure Monitor is about your telemetry (metrics and logs) from resources and applications, such as VM Central Processing Unit (CPU) usage or application response time.

Service Health won’t replace your monitoring dashboards. It complements them by telling you when the platform itself is the likely cause.

Why alerts matter (and how they work at a high level)

Service Health becomes most useful when you configure alerts.

At a high level:

  1. Azure posts Service Health notifications to your subscription’s Azure Activity Log.
  2. You create Service Health alert rules for the event types you care about.
  3. You connect alerts to an action group (email, Short Message Service (SMS), webhook, and other targets).

That’s how you move from “someone should check the portal” to “the right people get notified automatically.”
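To make the three steps concrete, the sketch below builds a Python dictionary shaped like an activity log alert rule that matches Service Health events and points at an action group. Treat it as an assumption-laden illustration: the field names and the `incidentType` strings are modeled on the Azure Resource Manager activity log alert resource as commonly documented, and should be checked against current Microsoft documentation before use. The function name, subscription ID, and action group ID are placeholders. Nothing here calls Azure.

```python
import json

# Assumed incidentType strings for the three event types in this lesson.
INCIDENT_TYPES = {
    "service_issue": "Incident",
    "planned_maintenance": "Maintenance",
    "health_advisory": "Informational",
}


def build_service_health_alert(subscription_id, action_group_id, event_types):
    """Return a dict resembling an activity log alert rule (illustrative only)."""
    type_conditions = [
        {"field": "properties.incidentType", "equals": INCIDENT_TYPES[t]}
        for t in event_types
    ]
    return {
        "type": "Microsoft.Insights/activityLogAlerts",
        "location": "Global",
        "properties": {
            "enabled": True,
            # Step 1: notifications land in this subscription's Activity Log.
            "scopes": [f"/subscriptions/{subscription_id}"],
            # Step 2: the rule selects Service Health events of the chosen types.
            "condition": {
                "allOf": [
                    {"field": "category", "equals": "ServiceHealth"},
                    {"anyOf": type_conditions},
                ]
            },
            # Step 3: the action group decides who is notified and how.
            "actions": {"actionGroups": [{"actionGroupId": action_group_id}]},
        },
    }


rule = build_service_health_alert(
    subscription_id="00000000-0000-0000-0000-000000000000",  # placeholder
    action_group_id="<action-group-resource-id>",            # placeholder
    event_types=["service_issue", "planned_maintenance"],
)
print(json.dumps(rule, indent=2))
```

The split mirrors the flow above: the condition block decides which events fire the alert, while the action group reference decides where the notification goes.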

Practical Understanding

Practical Situation 1: “Is Azure having an incident that affects me?”

A workload is behaving strangely and you suspect the problem might be on the Azure platform side rather than your code or configuration.

How to think about it: Check Azure Service Health for service issues, planned maintenance, or advisories that impact the services/regions you use. This helps you separate “Azure-side impact” from “my workload telemetry.”

Common misunderstanding: “If my app is slow, Service Health will show my app metrics.” App and resource performance belongs in Azure Monitor; Service Health is about platform events.
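A simple way to connect the two views is to ask whether any platform event overlaps the time window in which your own telemetry looked wrong. The sketch below assumes you have already copied event times out of Service Health by hand; `overlapping_events` and the sample data are hypothetical and do not query Azure.

```python
from datetime import datetime


def overlapping_events(anomaly_start, anomaly_end, events):
    """Return titles of events whose time window overlaps the anomaly window.

    events is a list of (title, start, end) tuples. Two windows overlap
    when each one starts before the other one ends.
    """
    return [
        title
        for title, start, end in events
        if start <= anomaly_end and anomaly_start <= end
    ]


# Made-up platform events, as if read from Service Health.
platform_events = [
    ("Storage service issue", datetime(2024, 1, 15, 9, 45), datetime(2024, 1, 15, 11, 0)),
    ("Planned maintenance", datetime(2024, 1, 15, 22, 0), datetime(2024, 1, 15, 23, 0)),
]

# Window in which your own monitoring (for example, Azure Monitor) showed slowness.
suspects = overlapping_events(
    datetime(2024, 1, 15, 10, 0),
    datetime(2024, 1, 15, 10, 30),
    platform_events,
)
print(suspects)  # ['Storage service issue']
```

An overlap is only a hint that the platform may be involved. An empty result suggests looking harder at your own code and configuration.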

Practical Situation 2: “I want alerts when Microsoft posts maintenance or incidents”

You don’t want to rely on someone remembering to open the portal. You want email/SMS notifications when events affect your subscriptions.

How to think about it: Create a Service Health alert rule and connect it to an action group. The alert rule controls which Service Health events trigger notifications; the action group controls who gets notified and how.

Common misunderstanding: “Service Health will notify me by default.” You typically need to configure alert rules and an action group.

Practical Situation 3: “Public status page vs personalized view”

You see a public page showing a global Azure issue, but you’re not sure whether it impacts your specific environment.

How to think about it: Use the public Azure Status page to understand the broad situation, then use Azure Service Health to confirm whether your subscriptions, services, and regions are impacted.

Common misunderstanding: “The status page and Service Health are the same thing.” The status page is global; Service Health is personalized.

Practical Situation 4: “A specific resource looks unhealthy”

One VM (or another single resource) is unreachable and you need to know whether Azure considers that specific resource impacted.

How to think about it: Use Azure Resource Health to see the health history and status of that specific resource. Use Service Health to understand broader platform events that may be related.

Common misunderstanding: “Service Health shows the health of individual resources.” Resource Health is the resource-level view; Service Health is the platform-event view.

Common Pitfalls

  • Mistake: Confusing Azure Service Health with the public Azure Status page. Correction: Azure Status is global; Azure Service Health is personalized to your subscriptions, services, and regions.

  • Mistake: Expecting Azure Service Health to show application performance or resource metrics (like VM CPU). Correction: Use Azure Monitor for metrics/logs; use Service Health for Azure platform incidents, maintenance, and advisories.

  • Mistake: Assuming Service Health automatically fixes platform incidents. Correction: Service Health informs you and may provide guidance; you still decide what operational actions to take.

  • Mistake: Forgetting to configure alerts and action groups. Correction: Create Service Health alert rules and connect an action group so notifications happen automatically.

Check Your Understanding

  1. In your own words, explain the difference between “Azure platform health” and “my workload telemetry.”
  2. Describe a situation where you would check Azure Service Health before checking logs and metrics.
  3. Describe a situation where Azure Monitor is the right tool even if there’s a known Azure incident.
  4. Write a simple 3-step checklist to ensure your team gets notified about Service Health events.
  5. Explain why the Azure Status page can be useful, but still insufficient for day-to-day operational awareness.

Further Reading