Monitoring Disk Performance in AWS: Key Strategies


Monitoring disk performance in AWS is crucial for maintaining reliable and high-performing solutions. Amazon EC2 and CloudWatch offer various tools and features to help users collect, view, and analyse different types of metrics and logs related to disk performance. By understanding the available tools and metrics, users can effectively monitor their disk performance, ensuring optimal resource utilisation and application performance. This includes monitoring system status and instance status checks, as well as leveraging CloudWatch alarms, events, and logs for deeper insights. Additionally, the CloudWatch agent allows for the collection of internal system-level metrics, providing a comprehensive view of disk performance in AWS.

Characteristics | Values
Monitoring tools | Amazon EC2, Amazon CloudWatch, AWS Systems Manager, CloudWatch agent, SSM Agent, AWS CloudTrail logs
Monitoring data | Disk space usage, CPU utilization, network bytes in/out, disk performance, storage performance, logs
Monitoring frequency | Amazon EC2 provides instance metrics every 5 minutes. Detailed monitoring can be enabled for 1-minute intervals.
Alerts and notifications | Amazon CloudWatch alarms, Amazon EventBridge events, Amazon Simple Notification Service
Automation | AWS Systems Manager Run Command, CloudWatch alarms, EventBridge events
Security | AWS IAM roles provide necessary permissions and access control for monitoring tools and users
Performance optimization | EBS-optimized instances provide dedicated bandwidth to EBS volumes, ensuring steady network performance
Volume types | Solid-state drives (SSD), hard disk drives (HDD), General Purpose (gp2), Provisioned IOPS (io1), Throughput Optimized (st1), Cold HDD (sc1)
Volume performance metrics | IOPS, throughput, block size, VolumeReadOps, VolumeWriteOps, VolumeReadBytes, VolumeWriteBytes, VolumeThroughputPercentage, VolumeConsumedReadWriteOps
Latency metrics | VolumeTotalReadTime, VolumeTotalWriteTime

shundigital

Using AWS Systems Manager to monitor disk usage

Monitoring disk performance in AWS is crucial for maintaining the reliability and performance of your instances and solutions. One way to do this is by using AWS Systems Manager, which provides a centralized way to collect, manage, and visualize system data. Here's a step-by-step guide on how to use AWS Systems Manager to monitor disk usage:

Step 1: Understand AWS Systems Manager

AWS Systems Manager, formerly known as Amazon EC2 Systems Manager, helps you manage and automate tasks related to your Amazon EC2 instances. It provides a centralized interface for collecting and analyzing system data, including disk usage. By using Systems Manager, you can gain deeper insights into the performance and health of your EC2 instances.

Step 2: Install and Configure the CloudWatch Agent

To monitor disk usage, you'll need to install and configure the CloudWatch agent on your EC2 instances. The CloudWatch agent is software that runs on the instance and collects system-level metrics, including disk utilization. You can use AWS Systems Manager Run Command to deploy the CloudWatch agent across all your instances. Run Command relies on the Systems Manager agent (SSM Agent), which is preinstalled on many AWS-provided AMIs; on other images it must be installed first.

Step 3: Create IAM Roles and Attach Them to EC2 Instances

Create two IAM roles: CloudWatchAgentAdminRole and CloudWatchAgentServerRole. The first role has permissions to write the CloudWatch agent configuration to the Systems Manager Parameter Store, while the second role has read-only permissions. Attach the admin role to the instance where you create the configuration and the server role to the instances that only read it, so that each instance has the permissions it needs for monitoring.

Step 4: Run the CloudWatch Agent Configuration Wizard

Once the CloudWatch agent is installed, run the CloudWatch agent configuration wizard to create the agent configuration. The wizard guides you through selecting the specific metrics you want to monitor, such as disk usage and memory utilization. The configuration is specific to the operating system type, so be sure to run the wizard on both Linux and Windows instances if you have a mixed environment.

Step 5: Review and Modify the Agent Configuration

The agent configuration is stored in the Systems Manager Parameter Store. You can review and modify this configuration to capture additional metrics if needed. Go to the Systems Manager console, click on "Parameter Store," and locate the parameter created by the CloudWatch agent configuration wizard. From here, you can review and edit the configuration to ensure you're collecting the right data.
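As a rough illustration of what such a configuration can look like, the sketch below builds a minimal Linux agent configuration that collects disk and memory usage and prints it as JSON. The function name `build_agent_config` and the chosen interval are assumptions for this example; the configuration your wizard produces will likely contain more sections.

```python
import json


def build_agent_config(interval_seconds=60, namespace="CWAgent"):
    """Return a minimal CloudWatch agent config (Linux) as a Python dict.

    Hypothetical helper: it only covers disk and memory usage.
    """
    return {
        "agent": {"metrics_collection_interval": interval_seconds},
        "metrics": {
            "namespace": namespace,
            # Tag every metric with the ID of the instance that emitted it.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                # Percentage of disk space used, for every mounted filesystem.
                "disk": {"measurement": ["used_percent"], "resources": ["*"]},
                # Percentage of memory used.
                "mem": {"measurement": ["mem_used_percent"]},
            },
        },
    }


config_json = json.dumps(build_agent_config(), indent=2)
print(config_json)
```

The printed JSON could be pasted into the Parameter Store value; Windows instances use different section names (performance counter objects), so this sketch applies to Linux only.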

Step 6: Start the CloudWatch Agent and Use the Configuration

In this step, you'll instruct the CloudWatch agent to start using the agent configuration stored in the Systems Manager Parameter Store. Open the Systems Manager console, choose "Run Command," and follow the steps to select the appropriate document name prefix, command, and configuration source. Choose your target servers, and run the command to configure the CloudWatch agent.
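The same Run Command request can also be expressed in code. The sketch below only builds the request dictionary, assuming the AmazonCloudWatch-ManageAgent document is used and the configuration lives in a Parameter Store parameter; the helper name, the parameter name, and the tag values are hypothetical placeholders.

```python
def build_manage_agent_command(parameter_name, tag_key, tag_value):
    """Build Run Command arguments that point the agent at a stored config.

    Hypothetical helper: parameter_name is the Parameter Store entry
    holding the agent configuration; instances are targeted by tag.
    """
    return {
        "DocumentName": "AmazonCloudWatch-ManageAgent",
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "Parameters": {
            "action": ["configure"],
            "mode": ["ec2"],
            "optionalConfigurationSource": ["ssm"],
            "optionalConfigurationLocation": [parameter_name],
            "optionalRestart": ["yes"],
        },
    }


# Placeholder values for illustration only.
request = build_manage_agent_command("AmazonCloudWatch-linux", "Environment", "prod")
print(request["Parameters"]["optionalConfigurationLocation"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("ssm").send_command(**request)`.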

Step 7: Review the Data Collected by the CloudWatch Agents

Finally, you can review the data collected by the CloudWatch Agents in the AWS Management Console. Go to CloudWatch, click on "Metrics," and look for the custom namespace for CWAgent. Here, you'll be able to see the metrics captured by the CloudWatch Agent, including disk usage and other performance indicators. You can also create CloudWatch Alarms to alert you if certain metrics go beyond a specified threshold.
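To make the alarm idea concrete, here is a minimal sketch that builds the arguments for a CloudWatch alarm on the agent's disk_used_percent metric. The helper name, alarm name, threshold, and dimension values are assumptions; the dimensions in particular must match exactly what your agent publishes, which depends on its configuration.

```python
def build_disk_alarm(instance_id, extra_dimensions, threshold_percent=80.0,
                     sns_topic_arn=None):
    """Build arguments for an alarm on disk usage reported by the agent.

    Hypothetical helper: extra_dimensions holds the remaining dimensions
    of the metric (for example path, device, fstype) as a dict.
    """
    dimensions = [{"Name": "InstanceId", "Value": instance_id}]
    dimensions += [{"Name": k, "Value": v} for k, v in extra_dimensions.items()]
    alarm = {
        "AlarmName": f"disk-used-high-{instance_id}",
        "Namespace": "CWAgent",
        "MetricName": "disk_used_percent",
        "Dimensions": dimensions,
        "Statistic": "Average",
        "Period": 300,            # evaluate 5-minute averages
        "EvaluationPeriods": 2,   # two consecutive breaches before alarming
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
    }
    if sns_topic_arn:
        # Notify an SNS topic when the alarm fires.
        alarm["AlarmActions"] = [sns_topic_arn]
    return alarm


# Placeholder instance ID and dimensions for illustration only.
alarm = build_disk_alarm("i-0123456789abcdef0",
                         {"path": "/", "device": "nvme0n1p1", "fstype": "xfs"})
print(alarm["AlarmName"])
```

With boto3, such a dictionary could be passed as `boto3.client("cloudwatch").put_metric_alarm(**alarm)`.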

By following these steps, you can effectively use AWS Systems Manager to monitor disk usage and gain valuable insights into the performance and health of your EC2 instances. This information can help you optimize your applications and ensure they run efficiently.


Monitoring Amazon EC2 and Amazon EBS

There are three complementary sets of data that provide visibility on storage performance available for Amazon EC2:

Instance metrics generated by EC2

By default, Amazon EC2 publishes instance metrics to Amazon CloudWatch every 5 minutes (basic monitoring), including storage performance, at no charge. Detailed monitoring can be enabled to collect metrics at 1-minute intervals; the additional metrics are billed by Amazon CloudWatch. All of these metrics come from data that EC2 collects at the hypervisor level and are published under the AWS/EC2 namespace. The full list of available metrics is in the EC2 documentation.

Custom metrics from the OS level

These can be collected at finer granularity, down to 1-second intervals, by installing the CloudWatch agent on the EC2 instance. The CloudWatch agent can collect system-level metrics as well as logs from EC2 instances and from on-premises systems. It runs on Windows, Linux, or macOS, on x86-64 or ARM64 architectures. It can also collect custom metrics from applications or services using the StatsD and collectd protocols. This data appears in Amazon CloudWatch under a custom namespace (CWAgent by default).

EBS CloudWatch metrics

This data is generated by the EBS subsystem of EC2 and is published under the AWS/EBS namespace, unlike the previous data sets, which are observed through the hypervisor or the OS. It is reported per volume, whereas the previous data is aggregated at the instance level across instance store and EBS storage. Consequently, this data set provides deeper insight into usage and issues, including latency at the volume level. The granularity of this data is 1 minute and there is no charge.

The EC2 metrics and CloudWatch agent metrics provide information on health and performance at the application and system level. The EBS CloudWatch metrics provide performance details specific to EBS volumes. Since EBS volumes are the mainstay for high-performance applications, a combined viewing of these data sets can help correlate application and storage performance. The data is especially helpful to identify IOPS bottlenecks and latencies.
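To pull per-volume data for such a comparison, a query against the AWS/EBS namespace can be sketched as follows. The helper name `build_ebs_query` and the volume ID are placeholders; the function only builds the request and does not call AWS.

```python
from datetime import datetime, timedelta, timezone


def build_ebs_query(volume_id, metric_name, minutes=60, period=60, now=None):
    """Build arguments for a per-volume metric query in the AWS/EBS namespace.

    Hypothetical helper: `now` can be injected to make the time window
    reproducible; it defaults to the current UTC time.
    """
    end = now or datetime.now(timezone.utc)
    return {
        "MetricDataQueries": [{
            "Id": "m1",  # query IDs must start with a lowercase letter
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/EBS",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "VolumeId", "Value": volume_id}],
                },
                "Period": period,  # seconds per data point
                "Stat": "Sum",
            },
            "ReturnData": True,
        }],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
    }


# Placeholder volume ID for illustration only.
query = build_ebs_query("vol-0123456789abcdef0", "VolumeReadOps")
print(query["MetricDataQueries"][0]["MetricStat"]["Metric"]["MetricName"])
```

With boto3, the result could be passed as `boto3.client("cloudwatch").get_metric_data(**query)`, and the same shape works for AWS/EC2 or CWAgent metrics by changing the namespace and dimensions.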

Amazon CloudWatch alarms can also be set with thresholds on the monitored metrics to flag issues. Alarms can be automated to send alerts and trigger remediation.


Collecting metrics with CloudWatch agent

Collecting metrics with the CloudWatch agent is a versatile process that can be done in several ways. The CloudWatch agent can be installed on Amazon EC2 instances and on-premises servers, including those in a hybrid environment or not managed by AWS. The supported operating systems for the CloudWatch agent include various versions of Ubuntu Server, CentOS, Red Hat Enterprise Linux (RHEL), Debian, SUSE Linux Enterprise Server (SLES), Oracle Linux, AlmaLinux, Rocky Linux, macOS, and Windows Server.

The CloudWatch agent allows users to collect internal system-level metrics from Amazon EC2 instances across different operating systems. It can also retrieve custom metrics from applications or services using the StatsD and collectd protocols. StatsD is supported on both Linux and Windows Server, while collectd is only supported on Linux servers. The agent can also collect logs from Amazon EC2 instances and on-premises servers running Linux or Windows Server. It is important to note that the CloudWatch agent does not support collecting logs from FIFO pipes.
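As an illustration of the StatsD path, the sketch below formats a metric in the StatsD line format and sends it over UDP. It assumes the agent's StatsD listener is enabled in its configuration and reachable on 127.0.0.1 port 8125; the metric name and helper names are hypothetical.

```python
import socket


def format_statsd(name, value, metric_type="g"):
    """Format one metric as a StatsD datagram, e.g. b'app.queue:5|g'.

    metric_type is 'g' (gauge), 'c' (counter), or 'ms' (timer).
    """
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_statsd(datagram, host="127.0.0.1", port=8125):
    """Send a datagram over UDP; assumes a StatsD listener at host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, (host, port))


# Hypothetical application metric: time spent flushing to disk, in ms.
datagram = format_statsd("app.disk.flush_ms", 12, "ms")
print(datagram)
```

Calling `send_statsd(datagram)` would hand the metric to the listener; because UDP is connectionless, the call does not confirm that anything received it.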

To collect metrics with the CloudWatch agent, follow these general steps:

  • Create IAM roles or users that enable the agent to collect metrics from the server and, optionally, to integrate with AWS Systems Manager.
  • Download the agent package.
  • Modify the CloudWatch agent configuration file and specify the metrics you want to collect. You can also specify a different namespace if desired.
  • Install and start the agent on your servers. When installing the agent on an EC2 instance, attach the IAM role created in the first step. When installing on an on-premises server, specify a named profile with the credentials of the IAM user created in the first step.

By following these steps, you can effectively collect metrics using the CloudWatch agent. For more detailed instructions and information on specific operating systems, refer to the AWS documentation.


Viewing metrics with Amazon CloudWatch

Amazon CloudWatch is the AWS monitoring service that provides metrics and visibility for Amazon EC2 instances, EBS storage, network, and even applications. It offers an at-a-glance view of the state of your Amazon EC2 environment.

To view metrics with Amazon CloudWatch, follow these steps:

  • Open the Amazon EC2 console and select a running Amazon EC2 instance.
  • Select the Monitoring tab to access Amazon CloudWatch metrics for that instance.
  • To see what percentage of disk space or memory is in use, install the CloudWatch agent on the instance. The agent captures internal performance data, including system-level metrics and logs from the server. On Windows, it captures Windows Performance Monitor counters, while on Linux, it captures system-level metrics.
  • Configure the CloudWatch agent to specify the metrics you want to collect.
  • Start the CloudWatch agent and instruct it to use your agent configuration.
  • Review the data collected by the CloudWatch agents in the CloudWatch console. Go to Metrics on the left-hand navigation pane and select the custom namespace for CWAgent.

By following these steps, you can effectively use Amazon CloudWatch to monitor disk performance in AWS, providing valuable insights into your EC2 instances and EBS storage.


Monitoring EBS latency

To effectively monitor EBS latency, you can utilise the following metrics and tools:

VolumeTotalReadTime and VolumeTotalWriteTime:

These metrics report the total time, in seconds, spent by all read and write operations that completed within a specified period. The sum of each metric gives the total time of all completed operations. Dividing that total by the number of completed operations over the same period (VolumeReadOps or VolumeWriteOps) gives the average time per operation. Monitoring these metrics can help identify potential bottlenecks and performance issues.
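The per-operation average can be worked out with a small calculation. The sketch below assumes you already have the summed VolumeTotalReadTime (in seconds) and VolumeReadOps for the same period; the function name and the sample numbers are illustrative.

```python
def average_latency_ms(total_time_seconds, ops):
    """Average time per operation in milliseconds.

    total_time_seconds: sum of VolumeTotalReadTime (or WriteTime) for a period.
    ops: sum of VolumeReadOps (or WriteOps) for the same period.
    Returns None when no operations completed, since latency is undefined.
    """
    if ops == 0:
        return None
    return total_time_seconds * 1000.0 / ops


# Illustrative numbers: 4 seconds of total read time over 2,000 reads.
print(average_latency_ms(4.0, 2000))  # 2.0
```

The same formula applies to writes by substituting VolumeTotalWriteTime and VolumeWriteOps.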

VolumeAvgReadLatency and VolumeAvgWriteLatency:

These metrics allow you to monitor the average latency for read and write operations on an EBS volume. By comparing the average latency with your desired performance requirements, you can identify areas for improvement.

VolumeQueueLength:

The volume queue length represents the number of pending I/O requests for a device. It is essential to calibrate the queue length correctly with I/O size and latency to avoid bottlenecks. A low queue length and a high number of available IOPS can help maintain high IOPS while keeping latency low.

CloudWatch Alarms:

Amazon CloudWatch provides alarms that can monitor a single metric over a specified time period. You can set up alarms to watch EBS latency metrics and receive notifications or take automated actions when the metrics exceed predefined thresholds.

Detailed Performance Statistics:

By accessing detailed performance statistics for EBS volumes attached to Nitro-based EC2 instances, you can derive average latency and IOPS. These statistics help identify areas where you may need to increase provisioned IOPS or throughput limits to optimise application performance.
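Because these statistics are cumulative counters, rates are derived from the difference between two samples. The sketch below shows that arithmetic under the assumption that each sample carries a timestamp, a cumulative read-operation count, and cumulative read time in microseconds; the `VolumeSnapshot` class and its field names are hypothetical and do not mirror the exact output of any tool.

```python
from dataclasses import dataclass


@dataclass
class VolumeSnapshot:
    """One sample of cumulative counters (field names are illustrative)."""
    timestamp_s: float   # when the sample was taken, in seconds
    read_ops: int        # cumulative completed read operations
    read_time_us: int    # cumulative time spent on reads, in microseconds


def derive_read_stats(prev, curr):
    """Return (read IOPS, average read latency in microseconds) between samples."""
    interval = curr.timestamp_s - prev.timestamp_s
    if interval <= 0:
        raise ValueError("samples must be in increasing time order")
    ops = curr.read_ops - prev.read_ops
    iops = ops / interval
    # Latency is undefined when no reads completed in the interval.
    avg_latency_us = (curr.read_time_us - prev.read_time_us) / ops if ops else None
    return iops, avg_latency_us


# Illustrative samples taken 10 seconds apart.
before = VolumeSnapshot(0.0, 1_000, 500_000)
after = VolumeSnapshot(10.0, 6_000, 3_000_000)
print(derive_read_stats(before, after))  # (500.0, 500.0)
```

Comparing the derived IOPS with the volume's provisioned limit is one way to judge whether more provisioned IOPS or throughput is needed.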

Custom Metrics with CloudWatch Agent:

The CloudWatch agent can be installed on EC2 instances to collect custom metrics at a granular level, up to 1-second intervals. This allows you to monitor EBS latency at a very detailed level and set up automated responses to certain conditions, such as low free disk space.

By utilising these tools and metrics, you can effectively monitor EBS latency, identify performance bottlenecks, and optimise your application's performance.

Frequently asked questions

Amazon provides various tools to monitor Amazon EC2, including the Amazon EC2 and CloudWatch console dashboards, which offer an overview of the state of your Amazon EC2 environment.

You can use AWS Systems Manager, which doesn't require opening inbound ports or managing public IP addresses, SSH keys or certificates. You can also use Amazon CloudWatch, which is an AWS monitoring service.

You can download and install the CloudWatch agent manually using the command line, or you can integrate it with SSM. The general flow involves creating IAM roles, downloading the agent package, modifying the configuration file, and installing and starting the agent on your servers.

There are three main categories of metrics: IOPS and throughput (completed read/write operations and bytes read/written), volume status and performance (latency, queue length, idle time, burst balance), and disk space usage.
