AWS CloudWatch alarms

AWS::CloudWatch::Alarm

ActionsEnabled

Indicates whether actions should be executed during any changes to the alarm state. The default is TRUE.

Required: No

Type: Boolean

Update requires: No interruption

AlarmActions

The list of actions to execute when this alarm transitions into an ALARM state from any other state. Specify each action as an Amazon Resource Name (ARN). For more information about creating alarms and the actions that you can specify, see PutMetricAlarm in the Amazon CloudWatch API Reference.

Required: No

Type: List of String

Maximum:

Update requires: No interruption

AlarmDescription

The description of the alarm.

Required: No

Type: String

Minimum:

Maximum:

Update requires: No interruption

AlarmName

The name of the alarm. If you don't specify a name, AWS CloudFormation generates a unique physical ID and uses that ID for the alarm name.

Important

If you specify a name, you cannot perform updates that require replacement of this resource. You can perform updates that require no or some interruption. If you must replace the resource, specify a new name.

Required: No

Type: String

Minimum:

Maximum:

Update requires: Replacement

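ComparisonOperator

The arithmetic operation to use when comparing the specified statistic and threshold. The specified statistic value is used as the first operand.

You can specify the following values: GreaterThanThreshold, GreaterThanOrEqualToThreshold, LessThanThreshold, or LessThanOrEqualToThreshold.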

Required: Yes

Type: String

Allowed values:

Update requires: No interruption

DatapointsToAlarm

The number of data points that must be breaching to trigger the alarm. This is used only if you are setting an "M out of N" alarm. In that case, this value is the M, and the value that you set for EvaluationPeriods is the N. For more information, see Evaluating an Alarm in the Amazon CloudWatch User Guide.

If you omit this parameter, CloudWatch uses the same value here that you set for EvaluationPeriods, and the alarm goes to the ALARM state if that many consecutive periods are breaching.

Required: No

Type: Integer

Minimum:

Update requires: No interruption

Dimensions

The dimensions for the metric associated with the alarm. For an alarm based on a math expression, you can't specify Dimensions. Instead, you use Metrics.

Required: No

Type: List of Dimension

Maximum:

Update requires: No interruption

EvaluateLowSampleCountPercentile

Used only for alarms based on percentiles. If you specify ignore, the alarm state does not change during periods with too few data points to be statistically significant. If you specify evaluate or omit this parameter, the alarm is always evaluated and possibly changes state no matter how many data points are available.

Required: No

Type: String

Minimum:

Maximum:

Update requires: No interruption

EvaluationPeriods

The number of periods over which data is compared to the specified threshold. If you are setting an alarm that requires a number of consecutive data points to be breaching to trigger the alarm, this value specifies that number. If you are setting an "M out of N" alarm, this value is the N, and DatapointsToAlarm is the M.

For more information, see Evaluating an Alarm in the Amazon CloudWatch User Guide.

Required: Yes

Type: Integer

Minimum:

Update requires: No interruption

ExtendedStatistic

The percentile statistic for the metric associated with the alarm. Specify a value between p0.0 and p100.

For an alarm based on a metric, you must specify either Statistic or ExtendedStatistic but not both.

For an alarm based on a math expression, you can't specify ExtendedStatistic. Instead, you use Metrics.

Required: No

Type: String

Pattern:

Update requires: No interruption

InsufficientDataActions

The actions to execute when this alarm transitions to the INSUFFICIENT_DATA state from any other state. Each action is specified as an Amazon Resource Name (ARN).

Required: No

Type: List of String

Maximum:

Update requires: No interruption

MetricName

The name of the metric associated with the alarm. This is required for an alarm based on a metric. For an alarm based on a math expression, you use Metrics instead and you can't specify MetricName.

Required: No

Type: String

Minimum:

Maximum:

Update requires: No interruption

Metrics

An array that enables you to create an alarm based on the result of a metric math expression. Each item in the array either retrieves a metric or performs a math expression.

If you specify the Metrics parameter, you cannot specify MetricName, Dimensions, Period, Namespace, Statistic, ExtendedStatistic, or Unit.

Required: No

Type: List of MetricDataQuery

Update requires: No interruption

Namespace

The namespace of the metric associated with the alarm. This is required for an alarm based on a metric. For an alarm based on a math expression, you can't specify Namespace and you use Metrics instead.

For a list of namespaces for metrics from AWS services, see AWS Services That Publish CloudWatch Metrics.

Required: No

Type: String

Minimum:

Maximum:

Pattern:

Update requires: No interruption

OKActions

The actions to execute when this alarm transitions to the OK state from any other state. Each action is specified as an Amazon Resource Name (ARN).

Required: No

Type: List of String

Maximum:

Update requires: No interruption

Period

The period, in seconds, over which the statistic is applied. This is required for an alarm based on a metric. Valid values are 10, 30, 60, and any multiple of 60.

For an alarm based on a math expression, you can't specify Period, and instead you use the Metrics parameter.

Minimum: 10

Required: No

Type: Integer

Update requires: No interruption
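
The rule for valid period values stated above (10, 30, 60, or any multiple of 60) can be checked before a template is deployed. The following is a minimal sketch in Python; the helper name is_valid_period is hypothetical and is not part of any AWS SDK.

```python
def is_valid_period(seconds: int) -> bool:
    """Return True if `seconds` is 10, 30, or a positive multiple of 60."""
    return seconds in (10, 30) or (seconds >= 60 and seconds % 60 == 0)


# A few sample values: 45 and 90 are rejected because they are neither
# 10, 30, nor a multiple of 60.
for candidate in (10, 30, 45, 60, 90, 300):
    print(candidate, is_valid_period(candidate))
```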

Statistic

The statistic for the metric associated with the alarm, other than percentile. For percentile statistics, use ExtendedStatistic.

For an alarm based on a metric, you must specify either Statistic or ExtendedStatistic but not both.

For an alarm based on a math expression, you can't specify Statistic. Instead, you use Metrics.

Required: No

Type: String

Allowed values:

Update requires: No interruption

Threshold

The value to compare with the specified statistic.

Required: No

Type: Double

Update requires: No interruption

ThresholdMetricId

In an alarm based on an anomaly detection model, this is the ID of the function used as the threshold for the alarm.

Required: No

Type: String

Minimum:

Maximum:

Update requires: No interruption

TreatMissingData

Sets how this alarm is to handle missing data points. Valid values are breaching, notBreaching, ignore, and missing. For more information, see Configuring How CloudWatch Alarms Treat Missing Data in the Amazon CloudWatch User Guide.

If you omit this parameter, the default behavior of missing is used.

Required: No

Type: String

Minimum:

Maximum:

Update requires: No interruption

Unit

The unit of the metric associated with the alarm. Specify this only if you are creating an alarm based on a single metric. Do not specify this if you are specifying a Metrics array.

You can specify the following values: Seconds, Microseconds, Milliseconds, Bytes, Kilobytes, Megabytes, Gigabytes, Terabytes, Bits, Kilobits, Megabits, Gigabits, Terabits, Percent, Count, Bytes/Second, Kilobytes/Second, Megabytes/Second, Gigabytes/Second, Terabytes/Second, Bits/Second, Kilobits/Second, Megabits/Second, Gigabits/Second, Terabits/Second, Count/Second, or None.

Required: No

Type: String

Allowed values:

Update requires: No interruption
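
To show how these properties fit together, here is a minimal sketch that builds a CloudFormation template for a single-metric alarm as a Python dictionary and prints it as JSON. The property names follow the AWS::CloudWatch::Alarm resource type; the logical name HighCpuAlarm, the instance ID, and the SNS topic ARN are hypothetical placeholders, and the threshold and period values are chosen only for illustration.

```python
import json

# Hypothetical placeholders: logical resource name, instance ID, and topic ARN.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "HighCpuAlarm": {
            "Type": "AWS::CloudWatch::Alarm",
            "Properties": {
                "AlarmDescription": "CPU above 65% for 2 of the last 3 minutes",
                # Single-metric alarm: Namespace, MetricName, Period, and
                # Statistic are set, so Metrics must not be used.
                "Namespace": "AWS/EC2",
                "MetricName": "CPUUtilization",
                "Dimensions": [
                    {"Name": "InstanceId", "Value": "i-0123456789abcdef0"}
                ],
                "Statistic": "Average",
                "Period": 60,
                # "M out of N": DatapointsToAlarm is M, EvaluationPeriods is N.
                "EvaluationPeriods": 3,
                "DatapointsToAlarm": 2,
                "Threshold": 65,
                "ComparisonOperator": "GreaterThanThreshold",
                "TreatMissingData": "missing",
                "AlarmActions": [
                    "arn:aws:sns:us-east-1:123456789012:example-topic"
                ],
            },
        }
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```

AlarmName is left out on purpose, so CloudFormation would generate a name and updates that require replacement remain possible.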

Source: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-cw-alarm.html

Amazon CloudWatch Alarms

The new CloudWatch Alarms feature allows you to watch CloudWatch metrics and to receive notifications when the metrics fall outside of the levels (high or low thresholds) that you configure. You can attach multiple Alarms to each metric and each one can have multiple actions.

Here’s how they relate to each other:

A CloudWatch Alarm is always in one of three states: OK, ALARM, or INSUFFICIENT_DATA. When the metric is within the range that you have defined as acceptable, the alarm is in the OK state. When the metric breaches a threshold, the alarm transitions to the ALARM state. If the data needed to make the decision is missing or incomplete, the alarm transitions to the INSUFFICIENT_DATA state.

Alarms watch metrics and execute actions by publishing notifications to Amazon SNS topics or by initiating Auto Scaling actions. SNS can deliver notifications using HTTP, HTTPS, Email, or an Amazon SQS queue. Your application can receive these notifications and then act on them in any desired way.

Actions can be set for the transition into each of the three states. The actions happen only on state transitions, and will not be re-executed if the condition persists for hours or days.
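
The "only on state transitions" behavior can be sketched in a few lines of Python. This is an illustration of the rule described above, not CloudWatch code; the function name actions_to_run and the action labels are hypothetical.

```python
def actions_to_run(previous_state, new_state, actions_by_state):
    """Return the actions for one evaluation; empty unless the state changed."""
    if new_state == previous_state:
        return []
    return actions_by_state.get(new_state, [])


# Hypothetical action labels keyed by the state being entered.
actions_by_state = {
    "ALARM": ["notify-oncall"],
    "OK": ["notify-recovered"],
    "INSUFFICIENT_DATA": [],
}

# The alarm stays in ALARM for three evaluations but notifies only once.
observed_states = ["OK", "ALARM", "ALARM", "ALARM", "OK"]
previous = "INSUFFICIENT_DATA"
fired = []
for state in observed_states:
    fired.extend(actions_to_run(previous, state, actions_by_state))
    previous = state

print(fired)
```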

Because multiple actions are allowed for an alarm, you can also send an email when a threshold is breached. This lets you verify that your scaling or recovery actions are triggered when expected and are working as desired.

Next feature: Auto Scaling Suspend / Resume.

— Jeff;

Source: https://aws.amazon.com/blogs/aws/amazon-cloudwatch-alarms/

Create a CloudWatch alarm based on a static threshold

  • Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

  • In the navigation pane, choose Alarms, All alarms.

  • Choose Create alarm.

  • Choose Select Metric.

  • (Optional) If you have enabled cross-account functionality in the CloudWatch console and the current account is a monitoring account, under Search Metrics choose a different AWS account that contains the metric that you want the alarm to watch. For more information, see Cross-account cross-Region CloudWatch console.

  • Do one of the following:

    • Choose the service namespace that contains the metric that you want. Continue choosing options as they appear to narrow the choices. When a list of metrics appears, select the check box next to the metric that you want.

    • In the search box, enter the name of a metric, dimension, or resource ID and press Enter. Then choose one of the results and continue until a list of metrics appears. Select the check box next to the metric that you want.

  • Choose the Graphed metrics tab.

    1. Under Statistic, choose one of the statistics or predefined percentiles, or specify a custom percentile.

    2. Under Period, choose the evaluation period for the alarm. When evaluating the alarm, each period is aggregated into one data point.

      You can also choose whether the y-axis legend appears on the left or right while you're creating the alarm. This preference is used only while you're creating the alarm.

    3. Choose Select metric.

      The Specify metric and conditions page appears, showing a graph and other information about the metric and statistic you have selected.

  • Under Conditions, specify the following:

    1. Enter a name and description for the alarm. The name must contain only ASCII characters.

    2. For Whenever is, specify whether the metric must be greater than, less than, or equal to the threshold. Under than..., specify the threshold value.

    3. Choose Additional configuration. For Datapoints to alarm, specify how many evaluation periods (data points) must be in the ALARM state to trigger the alarm. If the two values here match, you create an alarm that goes to the ALARM state if that many consecutive periods are breaching.

      To create an M out of N alarm, specify a lower number for the first value than you specify for the second value. For more information, see Evaluating an alarm.

    4. For Missing data treatment, choose how to have the alarm behave when some data points are missing. For more information, see Configuring how CloudWatch alarms treat missing data.

    5. If the alarm uses a percentile as the monitored statistic, a Percentiles with low samples box appears. Use it to choose whether to evaluate or ignore cases with low sample rates. If you choose ignore (maintain alarm state), the current alarm state is always maintained when the sample size is too low. For more information, see Percentile-based CloudWatch alarms and low data samples.

  • Choose Next.

  • Under Notification, select an SNS topic to notify when the alarm is in ALARM state, OK state, or INSUFFICIENT_DATA state.

    To have the alarm send multiple notifications for the same alarm state or for different alarm states, choose Add notification.

    To have the alarm not send notifications, choose Remove.

  • To have the alarm perform Auto Scaling, EC2, or Systems Manager actions, choose the appropriate button and choose the alarm state and action to perform. Alarms can perform Systems Manager actions only when they go into ALARM state. For more information about Systems Manager actions, see Configuring CloudWatch to create OpsItems from alarms and Incident creation.

  • When finished, choose Next.

  • Enter a name and description for the alarm. The name must contain only ASCII characters. Then choose Next.

  • Under Preview and create, confirm that the information and conditions are what you want, then choose Create alarm.

Source: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ConsoleAlarms.html

    Create a CloudWatch alarm for an instance

    You can create a CloudWatch alarm that monitors CloudWatch metrics for one of your instances. CloudWatch will automatically send you a notification when the metric reaches a threshold you specify. You can create a CloudWatch alarm using the Amazon EC2 console, or using the more advanced options provided by the CloudWatch console.

    To create an alarm using the CloudWatch console

    For examples, see Creating Amazon CloudWatch Alarms in the Amazon CloudWatch User Guide.

    You can edit your CloudWatch alarm settings from the Amazon EC2 console or the CloudWatch console. If you want to delete your alarm, you can do so from the CloudWatch console. For more information, see Editing or Deleting a CloudWatch Alarm in the Amazon CloudWatch User Guide.

    Source: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-cloudwatch-createalarm.html

    Why did my CloudWatch alarm trigger when its metric doesn't have any breaching data points?

    My Amazon CloudWatch alarm changed to the ALARM state. When I check the alarm metric, I don't see any breaching data points. However, the event history for the alarm shows the breaching data point. Why did my CloudWatch alarm trigger when its metric doesn't have any breaching data points?

    Short description

    CloudWatch alarms evaluate metrics based on data points available at a specific moment. Each subsequent alarm evaluation might use different aggregated data points because new values continue to flow into the CloudWatch metric. You might be unable to see a breaching data point that triggered your alarm if that data hasn't flowed into the metric yet. When you review the event history later, you can see the complete set of data points, which have now flowed into the metric.

    Resolution

    Find a breaching data point

    To find a breaching data point in your CloudWatch alarm metric's graph, change the Statistic to Maximum/Minimum.

    Example alarm configuration:

    • Standard resolution alarm (evaluates the metric every minute)
    • Metric is CPUUtilization
    • Threshold is 65%
    • Statistic is Average
    • Period is 60 seconds
    • Evaluation Period is 1
    • Detailed Monitoring is enabled for the monitored Amazon Elastic Compute Cloud (Amazon EC2) instance

    When the example alarm evaluation period 12:00:00 - 12:01:00 UTC starts, the following values were received by the metric:

    The average of those values is 66.934, which breaches the threshold of 65%. This breach triggers a change to the ALARM state. The alarm's event history lists the aggregated values exceeding the threshold as the reason for the state change.

    When the alarm is evaluated again later, additional values have flowed in for the minute 12:00:00 - 12:01:00 UTC. For example:

    The average including the new values is 48.864, which doesn't breach the threshold of 65%. The alarm now changes to the OK state. The alarm's event history lists the aggregated values being below the threshold as the reason for the state change.

    You might not see the breaching data point in your CloudWatch metric's graph now, even though the alarm triggered. If you view the CPUUtilization metric's graph, the Average is listed as 48.864 (not 66.934). All relevant samples for evaluation have now flowed into the metric.

    If you change the CloudWatch metric graph's Statistic to Maximum, you can see the breaching data point 95.473 at 12:00:00 UTC.

    Note: If your alarm is configured to trigger when data falls below the threshold, change the CloudWatch metric graph's Statistic to Minimum.
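
The effect described in this section can be reproduced with a short Python sketch. The sample values below are hypothetical (they are not the values from the example, which reports averages of 66.934 and 48.864); they only show how an average computed from early samples can breach a threshold that the final average does not, while the Maximum statistic still exposes the spike.

```python
from statistics import mean

THRESHOLD = 65.0  # percent, as in the example alarm configuration

# Hypothetical samples for one 60-second period.
early_samples = [95.5, 60.0, 58.0]   # available at the first evaluation
late_samples = [30.0, 25.0, 20.0]    # arrive after the first evaluation

early_average = mean(early_samples)
final_average = mean(early_samples + late_samples)
final_maximum = max(early_samples + late_samples)

print(f"first evaluation: average={early_average:.3f}, breaching={early_average > THRESHOLD}")
print(f"later evaluation: average={final_average:.3f}, breaching={final_average > THRESHOLD}")
print(f"maximum over the period: {final_maximum}, breaching={final_maximum > THRESHOLD}")
```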

    Configure an "M out of N" alarm

    To prevent an alarm from changing to the ALARM state because of a single breaching data point, configure an "M out of N" alarm where Evaluation Period and Datapoints to Alarm have different values. With this configuration, the alarm evaluates more aggregated data points and changes state only if at least a certain number of data points (M) are breaching in a given set of data points (N). For more information, see Create a CloudWatch alarm based on a static threshold and Configuring how CloudWatch alarms treat missing data.

    Example alarm configuration:

    • Standard resolution alarm (evaluates the metric every minute)
    • Metric is CPUUtilization
    • Threshold is 65%
    • Statistic is Average
    • Period is 120 seconds
    • Evaluation Period is 2 out of 3
    • Detailed Monitoring is enabled for the monitored Amazon EC2 instance

    Note that the example alarm configuration is similar to the previous example. However, the evaluation period checks 2 out of 3 available data points before triggering the alarm. The period is also reduced because of the increased evaluation period.

    When the alarm period starts at 12:00:00 UTC, the following values were received by the metric:

    CloudWatch looks for data points that are older than 12:00:00 UTC because of the increased evaluation period:

    The aggregated data point at 12:00:00 UTC breaches the threshold. However, the alarm remains in the OK state and doesn't change to the ALARM state, because only one out of three data points breaches the threshold, whereas two out of three are required to trigger the alarm.


    Source: https://aws.amazon.com/premiumsupport/knowledge-center/cloudwatch-trigger-metric/

    Using Amazon CloudWatch alarms

    You can create both metric alarms and composite alarms in CloudWatch.

    • A metric alarm watches a single CloudWatch metric or the result of a math expression based on CloudWatch metrics. The alarm performs one or more actions based on the value of the metric or expression relative to a threshold over a number of time periods. The action can be sending a notification to an Amazon SNS topic, performing an Amazon EC2 action or an Auto Scaling action, or creating an OpsItem or incident in Systems Manager.

    • A composite alarm includes a rule expression that takes into account the alarm states of other alarms that you have created. The composite alarm goes into ALARM state only if all conditions of the rule are met. The alarms specified in a composite alarm's rule expression can include metric alarms and other composite alarms.

      Using composite alarms can reduce alarm noise. You can create multiple metric alarms, and also create a composite alarm and set up alerts only for the composite alarm. For example, a composite might go into ALARM state only when all of the underlying metric alarms are in ALARM state.

      Composite alarms can send Amazon SNS notifications when they change state, and can create Systems Manager OpsItems or incidents when they go into ALARM state, but can't perform EC2 actions or Auto Scaling actions.
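
As a rough illustration of the noise-reduction example above, the following Python sketch models a composite alarm whose rule requires every underlying alarm to be in ALARM state. It is a simplification under stated assumptions: the function composite_state and the child alarm names are hypothetical, and real rule expressions can combine alarm states in other ways that are not modeled here.

```python
def composite_state(child_states):
    """Return ALARM only if there is at least one child and all are in ALARM."""
    if child_states and all(state == "ALARM" for state in child_states.values()):
        return "ALARM"
    return "OK"


# Hypothetical underlying metric alarms.
one_breaching = {"high-cpu": "ALARM", "high-latency": "OK"}
all_breaching = {"high-cpu": "ALARM", "high-latency": "ALARM"}

print(composite_state(one_breaching))   # only one child alarm is in ALARM
print(composite_state(all_breaching))   # every child alarm is in ALARM
```

Alerting only on the composite means a single noisy metric alarm does not page anyone by itself.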

    You can add alarms to CloudWatch dashboards and monitor them visually. When an alarm is on a dashboard, it turns red when it is in the ALARM state, making it easier for you to monitor its status proactively.

    An alarm invokes actions only when the alarm changes state. The exception is for alarms with Auto Scaling actions. For Auto Scaling actions, the alarm continues to invoke the action once per minute that the alarm remains in the new state.

    An alarm can watch a metric in the same account. If you have enabled cross-account functionality in your CloudWatch console, you can also create alarms that watch metrics in other AWS accounts. Creating cross-account composite alarms is not supported. Creating cross-account alarms that use math expressions is supported, except that the ANOMALY_DETECTION_BAND, INSIGHT_RULE, and SERVICE_QUOTA functions are not supported for cross-account alarms.

    Note

    CloudWatch doesn't test or validate the actions that you specify, nor does it detect any Amazon EC2 Auto Scaling or Amazon SNS errors resulting from an attempt to invoke nonexistent actions. Make sure that your alarm actions exist.

    Metric alarm states

    A metric alarm has the following possible states:

    • OK – The metric or expression is within the defined threshold.

    • ALARM – The metric or expression is outside of the defined threshold.

    • INSUFFICIENT_DATA – The alarm has just started, the metric is not available, or not enough data is available for the metric to determine the alarm state.

    Evaluating an alarm

    When you create an alarm, you specify three settings to enable CloudWatch to evaluate when to change the alarm state:

    • Period is the length of time to evaluate the metric or expression to create each individual data point for an alarm. It is expressed in seconds. If you choose one minute as the period, the alarm evaluates the metric once per minute.

    • Evaluation Periods is the number of the most recent periods, or data points, to evaluate when determining alarm state.

    • Datapoints to Alarm is the number of data points within the Evaluation Periods that must be breaching to cause the alarm to go to the ALARM state. The breaching data points don't have to be consecutive; they must only all fall within the last number of data points equal to Evaluation Periods.

    In the following figure, the alarm threshold for a metric alarm is set to three units. Both Evaluation Periods and Datapoints to Alarm are 3. That is, when all existing data points in the most recent three consecutive periods are above the threshold, the alarm goes to ALARM state. In the figure, this happens in the third through fifth time periods. At period six, the value dips below the threshold, so one of the periods being evaluated is not breaching, and the alarm state changes back to OK. During the ninth time period, the threshold is breached again, but for only one period. Consequently, the alarm state remains OK.

    When you configure Evaluation Periods and Datapoints to Alarm as different values, you're setting an "M out of N" alarm. Datapoints to Alarm is ("M") and Evaluation Periods is ("N"). The evaluation interval is the number of data points multiplied by the period. For example, if you configure 4 out of 5 data points with a period of 1 minute, the evaluation interval is 5 minutes. If you configure 3 out of 3 data points with a period of 10 minutes, the evaluation interval is 30 minutes.
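
The arithmetic above, and the "M out of N" check itself, can be written as a minimal Python sketch. It assumes a greater-than comparison and no missing data points; the function names and the sample series are hypothetical.

```python
def evaluation_interval_seconds(evaluation_periods, period_seconds):
    """Evaluation interval = number of data points (N) multiplied by the period."""
    return evaluation_periods * period_seconds


def m_of_n_breaching(datapoints, threshold, datapoints_to_alarm, evaluation_periods):
    """True if at least M of the N most recent data points exceed the threshold.

    Assumes a greater-than comparison and that no data points are missing.
    """
    recent = datapoints[-evaluation_periods:]
    breaching = sum(1 for value in recent if value > threshold)
    return breaching >= datapoints_to_alarm


# 4 out of 5 with a 1-minute period, and 3 out of 3 with a 10-minute period.
print(evaluation_interval_seconds(5, 60))    # seconds in 5 minutes
print(evaluation_interval_seconds(3, 600))   # seconds in 30 minutes

# Hypothetical series, oldest first, with a threshold of 3 units.
series = [1, 4, 2, 5, 4]
print(m_of_n_breaching(series, 3, datapoints_to_alarm=2, evaluation_periods=3))
print(m_of_n_breaching(series, 3, datapoints_to_alarm=3, evaluation_periods=3))
```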

    Note

    If data points are missing soon after you create an alarm, and the metric was being reported to CloudWatch before you created the alarm, CloudWatch retrieves the most recent data points from before the alarm was created when evaluating the alarm.

    Alarm actions

    You can specify what actions an alarm takes when it changes state between the OK, ALARM, and INSUFFICIENT_DATA states. The most common type of alarm action is to notify one or more people by sending a message to an Amazon Simple Notification Service topic. For more information about Amazon SNS, see What is Amazon SNS?.

    Alarms based on EC2 metrics can also perform EC2 actions, such as stopping, terminating, rebooting, or recovering an EC2 instance. For more information, see Create alarms to stop, terminate, reboot, or recover an EC2 instance.

    Alarms can also perform actions to scale an Auto Scaling group. For more information, see Step and simple scaling policies for Amazon EC2 Auto Scaling.

    You can also configure alarms to create OpsItems in Systems Manager Ops Center or create incidents in AWS Systems Manager Incident Manager. These actions can be performed only when the alarm goes into ALARM state. For more information, see Configuring CloudWatch to create OpsItems from alarms and Incident creation.

    Configuring how CloudWatch alarms treat missing data

    Sometimes, not every expected data point for a metric gets reported to CloudWatch. For example, this can happen when a connection is lost, a server goes down, or when a metric reports data only intermittently by design.

    CloudWatch enables you to specify how to treat missing data points when evaluating an alarm. This helps you to configure your alarm so that it goes to ALARM state only when appropriate for the type of data being monitored. You can avoid false positives when missing data doesn't indicate a problem.

    Similar to how each alarm is always in one of three states, each specific data point reported to CloudWatch falls under one of three categories:

    • Not breaching (within the threshold)

    • Breaching (violating the threshold)

    • Missing

    For each alarm, you can specify CloudWatch to treat missing data points as any of the following:

    • notBreaching – Missing data points are treated as "good" and within the threshold.

    • breaching – Missing data points are treated as "bad" and breaching the threshold.

    • ignore – The current alarm state is maintained.

    • missing – If all data points in the alarm evaluation range are missing, the alarm transitions to INSUFFICIENT_DATA.

    The best choice depends on the type of metric. For a metric that continually reports data, such as CPUUtilization of an instance, you might want to treat missing data points as breaching, because they might indicate that something is wrong. But for a metric that generates data points only when an error occurs, such as ThrottledRequests in Amazon DynamoDB, you would want to treat missing data as notBreaching. The default behavior is missing.

    Choosing the best option for your alarm prevents unnecessary and misleading alarm condition changes, and also more accurately indicates the health of your system.

    How alarm state is evaluated when data is missing

    Whenever an alarm evaluates whether to change state, CloudWatch attempts to retrieve a higher number of data points than the number specified as Evaluation Periods. The exact number of data points it attempts to retrieve depends on the length of the alarm period and whether it is based on a metric with standard resolution or high resolution. The time frame of the data points that it attempts to retrieve is the evaluation range.

    Once CloudWatch retrieves these data points, the following happens:

    • If no data points in the evaluation range are missing, CloudWatch evaluates the alarm based on the most recent data points collected. The number of data points evaluated is equal to the Evaluation Periods for the alarm. The extra data points from farther back in the evaluation range are not needed and are ignored.

    • If some data points in the evaluation range are missing, but the total number of existing data points that were successfully retrieved from the evaluation range is equal to or more than the alarm's Evaluation Periods, CloudWatch evaluates the alarm state based on the most recent real data points that were successfully retrieved, including the necessary extra data points from farther back in the evaluation range. In this case, the value you set for how to treat missing data is not needed and is ignored.

    • If some data points in the evaluation range are missing, and the number of actual data points that were retrieved is lower than the alarm's number of Evaluation Periods, CloudWatch fills in the missing data points with the result you specified for how to treat missing data, and then evaluates the alarm. However, all real data points in the evaluation range are included in the evaluation. CloudWatch uses missing data points only as few times as possible.
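
The three cases above can be sketched in Python for the two treatments that fill missing points with a definite value. This is a simplified model, not CloudWatch's implementation: it covers only breaching and notBreaching, it does not model ignore, missing, or the premature-alarm logic, and the function name evaluate and the X / 0 / - notation for breaching, non-breaching, and missing points are assumptions made for illustration.

```python
def evaluate(points, evaluation_periods, datapoints_to_alarm, treat_missing):
    """Evaluate an alarm over an evaluation range.

    points: data points ordered oldest to newest, each "X" (breaching),
            "0" (not breaching), or "-" (missing).
    """
    if treat_missing not in ("breaching", "notBreaching"):
        raise ValueError("this sketch models only breaching and notBreaching")

    # Prefer real data points, reaching farther back in the range if needed.
    real = [p for p in points if p != "-"]
    window = real[-evaluation_periods:]

    # Fill only as many points as are still needed to reach Evaluation Periods.
    to_fill = evaluation_periods - len(window)
    fill_value = "X" if treat_missing == "breaching" else "0"
    window = window + [fill_value] * to_fill

    return "ALARM" if window.count("X") >= datapoints_to_alarm else "OK"


# Evaluation range of 5, Evaluation Periods 3, Datapoints to Alarm 3.
print(evaluate("0 X X - X".split(), 3, 3, "notBreaching"))  # enough real points
print(evaluate("0 - - - -".split(), 3, 3, "breaching"))     # two points filled
print(evaluate("- - - - -".split(), 3, 3, "breaching"))     # all three filled
```

When enough real points exist, the treat_missing setting has no effect on the result, which matches the second case in the list.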

    Note

    A particular case of this behavior is that CloudWatch alarms might repeatedly re-evaluate the last set of data points for a period of time after the metric has stopped flowing. This re-evaluation might cause the alarm to change state and re-execute actions, if it had changed state immediately prior to the metric stream stopping. To mitigate this behavior, use shorter periods.

    The following tables illustrate examples of the alarm evaluation behavior. In the first table, Datapoints to Alarm and Evaluation Periods are both 3. CloudWatch retrieves the 5 most recent data points when evaluating the alarm, in case some of the most recent 3 data points are missing. 5 is the evaluation range for the alarm.

    Column 1 shows the 5 most recent data points, because the evaluation range is 5. These data points are shown with the most recent data point on the right. 0 is a non-breaching data point, X is a breaching data point, and - is a missing data point.

    Column 2 shows how many of the 3 necessary data points are missing. Even though the most recent 5 data points are evaluated, only 3 (the setting for Evaluation Periods) are necessary to evaluate the alarm state. The number of data points in Column 2 is the number of data points that must be "filled in", using the setting for how missing data is being treated.

    In columns 3-6, the column headers are the possible values for how to treat missing data. The rows in these columns show the alarm state that is set for each of these possible ways to treat missing data.

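    Data points | # of data points that must be filled | MISSING | IGNORE | BREACHING | NOT BREACHING
    0 - X - X | 0 | OK | OK | OK | OK
    0 - - - - | 2 | OK | OK | OK | OK
    - - - - - | 3 | INSUFFICIENT_DATA | Retain current state | ALARM | OK
    0 X X - X | 0 | ALARM | ALARM | ALARM | ALARM
    - - X - - | 2 | ALARM | Retain current state | ALARM | OK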

    In the second row of the preceding table, the alarm stays OK even if missing data is treated as breaching, because the one existing data point is not breaching, and this is evaluated along with two missing data points which are treated as breaching. The next time this alarm is evaluated, if the data is still missing it will go to ALARM, as that non-breaching data point will no longer be in the evaluation range.

    The third row, where all five of the most recent data points are missing, illustrates how the various settings for how to treat missing data affect the alarm state. If missing data points are considered breaching, the alarm goes into ALARM state, while if they are considered not breaching, then the alarm goes into OK state. If missing data points are ignored, the alarm retains the current state it had before the missing data points. And if missing data points are just considered as missing, then the alarm does not have enough recent real data to make an evaluation, and goes into INSUFFICIENT_DATA.

    In the fourth row, the alarm goes to ALARM state in all cases because the three most recent data points are breaching, and the alarm's Evaluation Periods and Datapoints to Alarm are both set to 3. In this case, the missing data point is ignored and the setting for how to evaluate missing data is not needed, because there are 3 real data points to evaluate.

    Row 5 represents a special case of alarm evaluation called premature alarm state. For more information, see Avoiding premature transitions to alarm state.

    In the next table, the Period is again set to 5 minutes, and Datapoints to Alarm is only 2 while Evaluation Periods is 3. This is a 2 out of 3, M out of N alarm.

    The evaluation range is 5. This is the maximum number of recent data points that are retrieved and can be used in case some data points are missing.

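    Data points | # of missing data points | MISSING | IGNORE | BREACHING | NOT BREACHING
    0 - X - X | 0 | ALARM | ALARM | ALARM | ALARM
    0 0 X 0 X | 0 | ALARM | ALARM | ALARM | ALARM
    0 - X - - | 1 | OK | OK | ALARM | OK
    - - - - 0 | 2 | OK | OK | ALARM | OK
    - - - X - | 2 | ALARM | Retain current state | ALARM | OK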

    In rows 1 and 2, the alarm always goes to ALARM state because 2 of the 3 most recent data points are breaching. In row 2, the two oldest data points in the evaluation range are not needed because none of the 3 most recent data points are missing, so these two older data points are ignored.

    In rows 3 and 4, the alarm goes to ALARM state only if missing data is treated as breaching, in which case the two most recent missing data points are both treated as breaching. In row 4, these two missing data points that are treated as breaching provide the two necessary breaching data points to trigger the ALARM state.

    Row 5 represents a special case of alarm evaluation called premature alarm state. For more information, see the following section.

    Avoiding premature transitions to alarm state

    CloudWatch alarm evaluation includes logic to try to avoid false alarms, where the alarm goes into ALARM state prematurely when data is intermittent. The examples shown in row 5 of the tables in the previous section illustrate this logic. In those rows, and in the following examples, the Evaluation Periods is 3 and the evaluation range is 5 data points. Datapoints to Alarm is 3, except for the M out of N example, where Datapoints to Alarm is 2.

    Suppose an alarm's most recent data is - - - - X, with four missing data points and then a breaching data point as the most recent data point. Because the next data point may be non-breaching, the alarm does not go immediately into ALARM state when the data is either or and Datapoints to Alarm is 3. This way, false positives are avoided when the next data point is non-breaching and causes the data to be or .

    However, if the last few data points are , the alarm goes into ALARM state even if missing data points are treated as missing. This is because alarms are designed to always go into ALARM state when the oldest available breaching datapoint during the Evaluation Periods number of data points is at least as old as the value of Datapoints to Alarm, and all other more recent data points are breaching or missing. In this case, the alarm goes into ALARM state even if the total number of datapoints available is lower than M (Datapoints to Alarm).

    This alarm logic applies to M out of N alarms as well. If the oldest breaching data point during the evaluation range is at least as old as the value of Datapoints to Alarm, and all of the more recent data points are either breaching or missing, the alarm goes into ALARM state no matter the value of M (Datapoints to Alarm).

    High-resolution alarms

    If you set an alarm on a high-resolution metric, you can specify a high-resolution alarm with a period of 10 seconds or 30 seconds, or you can set a regular alarm with a period of any multiple of 60 seconds. There is a higher charge for high-resolution alarms. For more information about high-resolution metrics, see Publishing custom metrics.
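The period rule above can be sketched as a small validity check. The function name is hypothetical, and the assumption that alarms on standard-resolution metrics only accept multiples of 60 seconds is inferred from the sentence above rather than stated in it.

```python
def is_valid_alarm_period(period_seconds: int, high_resolution_metric: bool) -> bool:
    """Return True if the period is allowed for an alarm on this kind of metric."""
    if period_seconds <= 0:
        return False
    if period_seconds % 60 == 0:
        # Regular alarm: any multiple of 60 seconds.
        return True
    # High-resolution alarm: 10 or 30 seconds, only on high-resolution metrics.
    return high_resolution_metric and period_seconds in (10, 30)
```

For example, a 30-second period passes only when the metric is high resolution, while a 300-second period passes either way.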

    Alarms on math expressions

    You can set an alarm on the result of a math expression that is based on one or more CloudWatch metrics. A math expression used for an alarm can include as many as 10 metrics. Each metric must be using the same period.

    For an alarm based on a math expression, you can specify how you want CloudWatch to treat missing data points for the underlying metrics when evaluating the alarm.

    Alarms based on math expressions can't perform Amazon EC2 actions.

    For more information about metric math expressions and syntax, see Using metric math.

    Percentile-based CloudWatch alarms and low data samples

    When you set a percentile as the statistic for an alarm, you can specify what to do when there is not enough data for a good statistical assessment. You can choose to have the alarm evaluate the statistic anyway and possibly change the alarm state. Or, you can have the alarm ignore the metric while the sample size is low, and wait to evaluate it until there is enough data to be statistically significant.

    For percentiles between 0.5 (inclusive) and 1.00 (exclusive), this setting is used when there are fewer than 10/(1-percentile) data points during the evaluation period. For example, this setting would be used if there were fewer than 1000 samples for an alarm on a p99 percentile. For percentiles between 0 and 0.5 (exclusive), the setting is used when there are fewer than 10/percentile data points.
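As a quick check of the arithmetic, the two formulas can be wrapped in one function. This is a minimal sketch; the function name is hypothetical, and the percentile is expressed as a fraction (p99 is 0.99).

```python
def low_sample_threshold(percentile: float) -> float:
    """Number of data points below which the low-sample setting applies.

    `percentile` is a fraction strictly between 0 and 1, for example 0.99 for p99.
    """
    if not 0 < percentile < 1:
        raise ValueError("percentile must be strictly between 0 and 1")
    if percentile >= 0.5:
        return 10 / (1 - percentile)  # upper half, 0.5 inclusive
    return 10 / percentile            # lower half, 0.5 exclusive
```

For p99 this gives approximately 1000 (floating-point division may be off by a tiny fraction), which matches the example above. For p10 it gives approximately 100.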

    Common features of CloudWatch alarms

    The following features apply to all CloudWatch alarms:

    • There is no limit to the number of alarms that you can create. To create or update an alarm, you use the CloudWatch console, the PutMetricAlarm API action, or the put-metric-alarm command in the AWS CLI.

    • Alarm names must contain only ASCII characters.

    • You can list any or all of the currently configured alarms, and list any alarms in a particular state by using the CloudWatch console, the DescribeAlarms API action, or the describe-alarms command in the AWS CLI.

    • You can disable and enable alarms by using the DisableAlarmActions and EnableAlarmActions API actions, or the disable-alarm-actions and enable-alarm-actions commands in the AWS CLI.

    • You can test an alarm by setting it to any state using the SetAlarmState API action or the set-alarm-state command in the AWS CLI. This temporary state change lasts only until the next alarm comparison occurs.

    • You can create an alarm for a custom metric before you've created that custom metric. For the alarm to be valid, you must include all of the dimensions for the custom metric in addition to the metric namespace and metric name in the alarm definition. To do this, you can use the PutMetricAlarm API action, or the put-metric-alarm command in the AWS CLI.

    • You can view an alarm's history using the CloudWatch console, the DescribeAlarmHistory API action, or the describe-alarm-history command in the AWS CLI. CloudWatch preserves alarm history for two weeks. Each state transition is marked with a unique timestamp. In rare cases, your history might show more than one notification for a state change. The timestamp enables you to confirm unique state changes.

    • The number of evaluation periods for an alarm multiplied by the length of each evaluation period can't exceed one day.
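The one-day limit in the last item is easy to check before calling PutMetricAlarm. A minimal sketch, with a hypothetical function name:

```python
SECONDS_PER_DAY = 86_400


def within_one_day(evaluation_periods: int, period_seconds: int) -> bool:
    """True if evaluation periods multiplied by the period length fits in one day."""
    return evaluation_periods * period_seconds <= SECONDS_PER_DAY
```

With a 5-minute period (300 seconds), up to 288 evaluation periods fit within the limit, and 289 do not.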

    Note

    Some AWS resources don't send metric data to CloudWatch under certain conditions.

    For example, Amazon EBS might not send metric data for an available volume that is not attached to an Amazon EC2 instance, because there is no metric activity to be monitored for that volume. If you have an alarm set for such a metric, you might notice its state change to INSUFFICIENT_DATA. This might indicate that your resource is inactive, and might not necessarily mean that there is a problem. You can specify how each alarm treats missing data. For more information, see Configuring how CloudWatch alarms treat missing data.


    Source: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html


    Customize Amazon CloudWatch alarm notifications to your local time zone – Part 1

    by Ahmed Magdy Wahdan | in Amazon CloudWatch, Amazon EventBridge, Amazon Simple Notification Service (SNS), AWS CloudFormation, AWS Lambda, Expert (400), Management & Governance, Monitoring and observability

    This two-part series discusses how to customize Amazon CloudWatch alarm notifications to your local time zone. Part 1 covers customizing notifications by using a CloudWatch Events rule. Part 2 covers customizing them by using Amazon SNS.

    You can use Amazon CloudWatch to set alarms and automate actions based on predefined thresholds or machine learning algorithms that identify anomalous behavior in your metrics. For example, you can start Amazon EC2 Auto Scaling automatically or stop an instance to reduce billing overages.

    You can customize CloudWatch alarm notifications as appropriate for your scenarios. For example, if you create a metric filter on CloudWatch Logs and want to get the event cause in the notification message, you can use CloudWatch Logs Customize alarms. If you want to attach graphs for your alarm metrics where the alarm was triggered, you can use the GetMetricWidgetImage action to retrieve the metric at the alarm state change. You can also change the time zone of your alarm notifications to your local time zone.

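    There are two ways to get details about an alarm state change:

    • A CloudWatch Events rule that matches on alarm state changes.
    • An Amazon SNS subscription to the topic that the alarm notifies.

    With either method, you can parse the alarm state change details to create a customized notification message.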

    In this post, I show you how to use an AWS Lambda function to change the alarm notification to your local time zone by using a CloudWatch Events rule.

    Overview

    CloudWatch uses Coordinated Universal Time (UTC) when returning timestamps for an alarm. It can be time-consuming and confusing to convert this information to your local time zone. In this blog post, I describe a solution that includes a Python Lambda function with the Python Time Zone library. The way in which the Lambda function parses the alarm state change event depends on which method you use: a CloudWatch Events rule or an Amazon SNS subscription. You can edit the Lambda function environment variable to change the time zone based on the alarm Region. Instead of having a fixed time zone, you can get support for multiple local time zones by editing the Lambda function code to change the time zone based on the alarm Region.

    Customize CloudWatch alarms by using a CloudWatch Events rule

    You can use a CloudWatch Events rule that matches on alarm evaluation changes and then triggers a Lambda function that parses the alarm event and creates a customized notification.

    Pros:

    • Faster notification because there is no additional Amazon SNS layer to invoke the Lambda function.
    • Can match on all alarms in the AWS Region. However, you need to create a rule and a Lambda function in each Region where you have alarms.

    When the alarm state changes, a rule matches on the state transition and invokes a Lambda function that creates a customized message. The message is sent to an SNS topic.

    Customize CloudWatch alarms by using a Lambda function subscribed to an SNS topic

    You can use a Lambda function subscribed to an SNS topic, where the Lambda function parses the alarm event and creates a customized notification.

    Pros:

    • You can create one Lambda function in any AWS Region and subscribe it to the SNS topic that the alarm will trigger. Note that you will incur additional charges for using two SNS topics. For details, see Amazon SNS pricing.

    When the alarm state changes, the alarm triggers an SNS topic to which the Lambda function is subscribed. SNS invokes the Lambda function that creates the customized message. The message is sent to the SNS topic.

    For information about the Amazon SNS method, see the Customize Amazon CloudWatch alarm notifications to your local time zone – Part 2 blog post.

    Walkthrough

    To customize your alarm using a CloudWatch Events rule:

    • Create an Amazon SNS topic and use your email address to subscribe to it.
    • Install the PyTZ library and package it with custom functions as an AWS Lambda layer. With layers, you can use libraries in your function without needing to include them in your deployment package.
    • Create an AWS Lambda execution role. This is the AWS Identity and Access Management (IAM) role that AWS Lambda assumes when it runs your function.
    • Create a Lambda function with code that uses the custom functions in the layer, and add environment variables with your time zone, time zone abbreviation, and SNS topic ARN.
    • Create a CloudWatch Events rule pattern to match on your CloudWatch alarm state changes, and set the created Lambda function as the target for the rule.

    If you prefer to use a CloudFormation template to create these resources, launch the following stack.


    Prerequisites

    To follow along, make sure you have access to the AWS Management Console with the IAM permissions required to create a CloudWatch Events rule, an AWS Lambda layer, an AWS Lambda function, an AWS Lambda execution role, and an Amazon SNS topic.

    Create an SNS topic

    Create an SNS topic and use your email address to subscribe to it. Later you’ll add the ARN of this SNS topic to the Lambda function environment variable.

    To create an SNS topic

    1. Sign in to the Amazon SNS console, and from the left navigation pane, choose Topics.
    2. On the Topics page, choose Create topic.
    3. By default, the console creates a FIFO topic. Choose Standard.
    4. In the Details section, enter a name for the topic, such as NotificationSNSTopic.
    5. Scroll to the end of the form and choose Create topic.

    To create a subscription to the topic

    1. In the left navigation pane, choose Subscriptions.
    2. On the Subscriptions page, choose Create subscription.
    3. On the Create subscription page, choose the Topic ARN field to see a list of the topics in your AWS account.
    4. Choose the topic that you created in the previous step.
    5. For Protocol, choose Email.
    6. For Endpoint, enter an email address that can receive notifications.
    7. Choose Create subscription.
    8. Check your email inbox and choose Confirm subscription in the email from AWS Notifications. The Sender ID is usually no-reply@sns.amazonaws.com.
    9. Amazon SNS opens your web browser and displays a subscription confirmation with your subscription ID.
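The same two console procedures can also be scripted. The sketch below assumes the boto3 SNS client calls create_topic and subscribe; the helper name is hypothetical, and the client is passed in as a parameter so the function can be exercised without AWS credentials.

```python
def create_topic_with_email(sns_client, topic_name: str, email: str) -> str:
    """Create a standard SNS topic, subscribe an email address, return the topic ARN.

    `sns_client` is assumed to behave like boto3.client("sns").
    """
    # A topic created without FIFO attributes is a standard topic.
    topic_arn = sns_client.create_topic(Name=topic_name)["TopicArn"]
    # Email subscriptions stay pending until the recipient confirms them.
    sns_client.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=email)
    return topic_arn


# Example call (requires boto3 and AWS credentials; the address is a placeholder):
#   import boto3
#   arn = create_topic_with_email(
#       boto3.client("sns"), "NotificationSNSTopic", "you@example.com")
```

The returned ARN is the value that later goes into the Lambda function's NotificationSNSTopic environment variable. The email confirmation step still has to be completed by hand.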

    Create a Lambda layer package

    Install the PyTZ library dependencies for your Lambda function.

    1. Install the libraries in a package directory with pip's --target option.
    2. Under the python directory, create a changeAlarmToLocalTimeZone.py file that contains three functions:
    • getAllAvailableTimezones to print all available time zones.
    • searchAvailableTimezones to print time zones that match the substring.

    For example, TimeZone.searchAvailableTimezones('sy')
    Returns:
    Matched zone: Antarctica/Syowa
    Matched zone: Australia/Sydney

    • changeAlarmToLocalTimeZone to change the local time zone of the alarm.

    Create a deployment package from the installed libraries and the .lib file under the python directory.
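The contents of changeAlarmToLocalTimeZone.py are not shown here, so the following is only a sketch of what two of its helpers might look like. The snake_case function names, the optional zone_names parameter, the fixed_offset helper, and the timestamp format are all assumptions made for illustration. PyTZ is imported only when no zone list is supplied, using its all_timezones list.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Iterable, List, Optional


def search_available_timezones(
    substring: str, zone_names: Optional[Iterable[str]] = None
) -> List[str]:
    """Return the time zone names that contain `substring`, ignoring case."""
    if zone_names is None:
        import pytz  # third-party; provided by the Lambda layer

        zone_names = pytz.all_timezones
    needle = substring.lower()
    return [name for name in zone_names if needle in name.lower()]


def fixed_offset(hours: float) -> tzinfo:
    """Build a fixed UTC offset such as UTC+10, as a stand-in for a named zone."""
    return timezone(timedelta(hours=hours))


def change_to_local_time(utc_timestamp: str, local_zone: tzinfo) -> str:
    """Convert a timestamp such as 2019-10-02T17:04:40.985+0000 to local time."""
    parsed = datetime.strptime(utc_timestamp, "%Y-%m-%dT%H:%M:%S.%f%z")
    return parsed.astimezone(local_zone).strftime("%Y-%m-%d %H:%M:%S")
```

In the Lambda function, a PyTZ zone such as pytz.timezone("Australia/Sydney") could be passed as local_zone instead of a fixed offset, which also handles daylight saving time.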

    To create a Lambda layer

    1. In the AWS Lambda console, open the Layers page and choose Create layer.
    2. Enter a name and optional description for your layer.
    3. To upload your layer code, do one of the following, and then choose Create.
      • To upload a .zip file from your computer, choose Upload a .zip file, choose your .zip file, and then choose Open.
      • To upload a file from Amazon Simple Storage Service (Amazon S3), choose Upload a file from Amazon S3. For Amazon S3 link URL, enter a link to the file.
      • (Optional) For Compatible runtimes, choose up to five runtimes.
      • (Optional) For License, enter any required license information.

    Create AWS Lambda execution role

    By default, when you create a function in the console, AWS Lambda creates an execution role. You can also create an execution role in the IAM console.

    To create an execution role in the IAM console

    1. Open the IAM Roles page and choose Create role.
    2. Under Common use cases, choose Lambda.
    3. Choose Next: Permissions.
    4. Choose Next: Tags.
    5. Choose Next: Review.
    6. Enter a name and description for the role and choose Create role.
    7. Open the role and add an inline policy.
    8. On the JSON tab, add the following permission. Enter your values for Region ID, Account ID, Lambda function Name, and SNS Topic ARN.
    9. Review the policy.
    10. Enter a name for the policy and then choose Create policy.
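The policy JSON for step 8 is not reproduced here. As a rough guide, an execution role for this function plausibly needs permission to write its own CloudWatch Logs and to publish to the SNS topic. The sketch below builds such a policy document. The function name and the exact statements are assumptions, not the original policy.

```python
import json


def build_execution_policy(
    region: str, account_id: str, function_name: str, sns_topic_arn: str
) -> dict:
    """Build an inline policy allowing Lambda logging and SNS publishing."""
    logs_prefix = f"arn:aws:logs:{region}:{account_id}"
    log_group = f"{logs_prefix}:log-group:/aws/lambda/{function_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Let the function create its log group.
            {"Effect": "Allow", "Action": "logs:CreateLogGroup",
             "Resource": f"{logs_prefix}:*"},
            # Let the function write log streams and events in that group.
            {"Effect": "Allow",
             "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
             "Resource": f"{log_group}:*"},
            # Let the function publish the customized notification.
            {"Effect": "Allow", "Action": "sns:Publish",
             "Resource": sns_topic_arn},
        ],
    }


def policy_json(policy: dict) -> str:
    """Render the policy as text that can be pasted into the JSON tab."""
    return json.dumps(policy, indent=2)
```

The four arguments correspond to the Region ID, Account ID, Lambda function Name, and SNS Topic ARN values mentioned in step 8.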

    Create a Lambda function

    1. Open the AWS Lambda console and choose Create a function.
    2. Select Author from scratch.
    3. Under Basic information, enter a name for the function. For Runtime, confirm that Python 3.8 is selected.
    4. Under Change default execution role, select Use an existing role, and then choose the Lambda execution role.
    5. Choose Create function.
    6. On the Configuration tab, in Designer, choose Layers.
    7. Choose Add a layer, select Custom layers, choose the created layer, and then choose Add.
    8. On the Configuration tab, in Designer, choose the Lambda function name.
    9. In Environment variables, choose Edit and then add three variables:
      • In Key, enter NotificationSNSTopic. In Value, enter the SNS topic ARN.
      • In Key, enter TimeZoneCode. In Value, enter your time zone code (for example, Australia/Sydney).
      • In Key, enter TimezoneInitial. In Value, enter the abbreviation for your time zone (for example, AEST). Time zones are often named by how many hours they are different from UTC time, so for example, you can also enter UTC+11.
    10. Choose Save.
    11. Overwrite your function code in the embedded editor with the following code.
    12. Choose Deploy.

      The Lambda function will receive the following event JSON from the CloudWatch Events rule:
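Neither the event JSON nor the function code is reproduced here. The sketch below assumes the standard "CloudWatch Alarm State Change" event shape, with the alarm name and state under the detail key, and shows one way a handler could turn it into a localized message. SAMPLE_EVENT is trimmed and its values are illustrative. build_message and UTC_PLUS_10 are hypothetical names. The environment variable names are the ones defined in step 9.

```python
import os
from datetime import datetime, timedelta, timezone

# Fixed offset used only to try the formatter locally without PyTZ.
UTC_PLUS_10 = timezone(timedelta(hours=10))

# Illustrative event, trimmed to the fields read below (values are made up).
SAMPLE_EVENT = {
    "detail-type": "CloudWatch Alarm State Change",
    "source": "aws.cloudwatch",
    "region": "us-east-1",
    "detail": {
        "alarmName": "ServerCpuTooHigh",
        "state": {
            "value": "ALARM",
            "reason": "Threshold Crossed",
            "timestamp": "2019-10-02T17:04:40.985+0000",
        },
    },
}


def build_message(event: dict, local_zone, zone_label: str) -> str:
    """Format the state change with its timestamp converted to local_zone."""
    detail = event["detail"]
    state = detail["state"]
    changed_at = datetime.strptime(state["timestamp"], "%Y-%m-%dT%H:%M:%S.%f%z")
    local_time = changed_at.astimezone(local_zone).strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"Alarm {detail['alarmName']} changed to {state['value']} "
        f"at {local_time} {zone_label}. Reason: {state['reason']}"
    )


def lambda_handler(event, context):
    """Entry point: localize the alarm time and publish it to the SNS topic."""
    # Imported lazily: pytz comes from the layer, boto3 from the Lambda runtime.
    import boto3
    import pytz

    local_zone = pytz.timezone(os.environ["TimeZoneCode"])
    message = build_message(event, local_zone, os.environ["TimezoneInitial"])
    boto3.client("sns").publish(
        TopicArn=os.environ["NotificationSNSTopic"],
        Subject=f"Alarm state change: {event['detail']['alarmName']}"[:100],
        Message=message,
    )
    return message
```

Keeping the formatting in build_message, separate from the handler, makes it possible to check the time conversion locally before deploying.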

      Create a CloudWatch Events rule

      Create a CloudWatch Events rule with the following custom event pattern and your Lambda function as a target.

      1. Open the CloudWatch console, choose Rules, and then choose Create rule.
      2. For Event source, choose Event pattern, and then choose Custom event pattern. To match on specific alarm state transitions, add the ARNs of your alarms. To match on an alarm state, in Value, specify the state (for example, ALARM, OK, INSUFFICIENT_DATA).
      3. For Targets, choose Add target, and then choose Lambda function.
      4. For Function, choose the Lambda function you created.
      5. Choose Configure details. For Rule definition, enter a name and description for the rule. The rule name must be unique in your selected AWS Region.
      6. Choose Create rule.
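The custom event pattern referred to in step 2 is not reproduced here. A sketch of how such a pattern could be assembled is shown below, assuming the aws.cloudwatch source and the "CloudWatch Alarm State Change" detail type. The function name is hypothetical, and the optional filters mirror the two choices described in step 2 (alarm ARNs and state values).

```python
import json
from typing import Any, Dict, List, Optional


def build_alarm_event_pattern(
    alarm_arns: Optional[List[str]] = None,
    states: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build an event pattern that matches CloudWatch alarm state changes."""
    pattern: Dict[str, Any] = {
        "source": ["aws.cloudwatch"],
        "detail-type": ["CloudWatch Alarm State Change"],
    }
    if alarm_arns:
        # Restrict the rule to specific alarms.
        pattern["resources"] = list(alarm_arns)
    if states:
        # Restrict the rule to specific new states, for example ["ALARM"].
        pattern["detail"] = {"state": {"value": list(states)}}
    return pattern


def pattern_json(pattern: Dict[str, Any]) -> str:
    """Render the pattern as text for the Custom event pattern editor."""
    return json.dumps(pattern, indent=2)
```

Calling build_alarm_event_pattern() with no arguments matches every alarm state change in the Region, while passing states=["ALARM"] limits the rule to transitions into ALARM.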

      Output result

      When you use the Australia/Sydney time zone (AEST):

      Original message

      Original CloudWatch alarm notification message using UTC time zone

      Customized CloudWatch alarm notification

      Customized CloudWatch alarm notification message using local time zone (AEST)

      Clean up

      To avoid ongoing charges to your account, delete the resources you created in this walkthrough.

      If you used the AWS CloudFormation template, delete the stack.

      Conclusion

      In this blog post, I’ve shown you how to use a CloudWatch Events rule and a Lambda function to customize CloudWatch alarm notifications to the local time zone of your resources, logs, and metrics. For more information, see the Amazon CloudWatch alarms documentation.

      About the Author

      Ahmed Magdy Wahdan

      Magdy is a Cloud Support Engineer and CloudWatch SME for Amazon Web Services. He helps global customers design, deploy, and troubleshoot large-scale networks built on AWS. He specializes in CloudWatch, Elastic Load Balancing, Auto Scaling, and Amazon VPC. In his spare time, he loves to free-dive and make desserts.

       


      Source: https://aws.amazon.com/blogs/mt/customize-amazon-cloudwatch-alarm-notifications-to-your-local-time-zone-part-1/

