If you are using Elastic Compute Cloud (EC2) services, you are almost certainly using Elastic Block Store (EBS) as well. EBS is a key component of AWS and is used for a wide range of purposes, including big data analysis, data storage, and the operation of virtual machines. Despite this prevalence, many adopters struggle to get the best performance out of their EBS configurations.

Tips for Optimizing AWS EBS

Whether you’re configuring EBS for the first time or fine-tuning your current configuration, the following tips can help you gain performance and decrease costs.

1. Use the correct volume size

EBS volumes can be enlarged during operation, so there is little reason to pay for space you aren’t using. When initializing your volumes, try to avoid overprovisioning. Volumes can be grown in place but not shrunk, so starting small and expanding later is the cheaper direction to err in.

Additionally, the maximum size of most EBS volume types is 16 TiB, but only if you avoid Master Boot Record (MBR) partitioning, which caps a volume at 2 TiB (2^32 sectors of 512 bytes) and has traditionally been required for Windows boot volumes. If you can stick to Linux boot volumes and GUID Partition Table (GPT) partitioning, you can make sure you have access to the full size.

And if you have disabled the auto-deletion of attached volumes when you delete EC2 instances, keep in mind that your volumes will remain and continue to be billed, but will be unusable unless attached to another instance. Periodic audits can help clean up these orphaned volumes and reduce unnecessary costs.
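A periodic audit for orphaned volumes can be sketched in a few lines of Python. This is a minimal illustration, not a complete tool: it assumes each volume is a dictionary shaped like an entry in the "Volumes" list returned by the EC2 DescribeVolumes API, and the sample data and function names are hypothetical.

```python
from typing import Dict, Iterable, List


def find_unattached(volumes: Iterable[Dict]) -> List[Dict]:
    """Return volumes that are not attached to any instance.

    Assumes the DescribeVolumes shape: an unattached volume reports
    State == "available" and an empty "Attachments" list.
    """
    return [
        v for v in volumes
        if v.get("State") == "available" and not v.get("Attachments")
    ]


def total_size_gib(volumes: Iterable[Dict]) -> int:
    """Sum the provisioned size (GiB) of the given volumes."""
    return sum(v.get("Size", 0) for v in volumes)


# Hypothetical sample standing in for a real DescribeVolumes response.
sample = [
    {"VolumeId": "vol-aaa", "State": "in-use", "Size": 100,
     "Attachments": [{"InstanceId": "i-123"}]},
    {"VolumeId": "vol-bbb", "State": "available", "Size": 500,
     "Attachments": []},
    {"VolumeId": "vol-ccc", "State": "available", "Size": 50,
     "Attachments": []},
]

orphans = find_unattached(sample)
print([v["VolumeId"] for v in orphans], total_size_gib(orphans))
```

In a real account, the input would come from the DescribeVolumes API, for example through an SDK, rather than a hard-coded list. Review the results before deleting anything, since an unattached volume may still hold data you need.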

2. Take advantage of RAID

One feature of EBS volumes that you should consider taking advantage of is the ability to create a Redundant Array of Independent Disks (RAID). RAID is an architecture that combines multiple volumes, either mirroring them to eliminate single points of failure or striping data across them to increase performance through workload distribution.

Although you can use any RAID configuration that your OS supports, AWS specifically recommends sticking to RAID 0 or RAID 1. These configurations can be easily created via EBS snapshots, which allow you to quickly duplicate volumes.
  • RAID 0, which stripes data across volumes to distribute workloads and boost performance, can be helpful when you are unable to gain additional performance by changing volume types.
  • RAID 1, which mirrors data across volumes for redundancy, is typically less useful because AWS already provides significant data duplication features, but it might be beneficial for critical applications and data.
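
The trade-off between the two levels can be made concrete with some simple arithmetic. The sketch below is an idealized model with a hypothetical function name: it assumes identical member volumes and ignores OS, instance, and network limits, so treat the numbers as ceilings rather than predictions.

```python
def raid_summary(level: int, volume_count: int, size_gib: int, iops: int) -> dict:
    """Idealized capacity and write-IOPS ceiling for RAID 0 or RAID 1.

    Assumes `volume_count` identical volumes, each with `size_gib` of
    space and `iops` of provisioned performance.
    """
    if level not in (0, 1):
        raise ValueError("only RAID 0 and RAID 1 are modeled here")
    if volume_count < 2:
        raise ValueError("a RAID array needs at least two volumes")

    if level == 0:
        # Striping: capacity and IOPS add up, but losing one volume
        # loses the whole array.
        return {
            "usable_gib": size_gib * volume_count,
            "write_iops": iops * volume_count,
            "survives_volume_loss": False,
        }
    # Mirroring: every write goes to every member, so capacity and
    # write IOPS stay at the level of a single volume.
    return {
        "usable_gib": size_gib,
        "write_iops": iops,
        "survives_volume_loss": True,
    }


print(raid_summary(0, 2, 500, 3000))
print(raid_summary(1, 2, 500, 3000))
```

With two 500 GiB volumes, the RAID 0 array offers twice the space and twice the write ceiling of the RAID 1 array, at the cost of redundancy.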

3. Tag your volumes

Tagging your EBS volumes at creation will make them significantly easier to manage and will allow you to take advantage of numerous other features, including metrics tracking and automation of EBS snapshots. With tags, you will be able to more easily search for specific resources, perform cost analysis, and autoscale your resources.
If you skipped tagging on initial set-up or are using a resource type that doesn’t allow tagging at creation, it is possible and recommended to go back and apply tags. This process can be tedious, or at scale impractical, if done manually, but if performed through Lambda functions or custom scripts in combination with an SDK it should be manageable.
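
As a rough sketch of what such a script might check, the following Python finds volumes that are missing required tags. The required tag keys are a hypothetical policy, and the input is assumed to follow the shape of the EC2 DescribeVolumes response, where tags appear as a list of Key/Value pairs.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical tagging policy; substitute your own required keys.
REQUIRED_TAGS = {"Owner", "Environment"}


def missing_tags(volume: Dict, required: Set[str] = REQUIRED_TAGS) -> Set[str]:
    """Return the required tag keys that the volume does not carry."""
    present = {tag["Key"] for tag in volume.get("Tags", [])}
    return set(required) - present


def untagged_report(volumes: Iterable[Dict],
                    required: Set[str] = REQUIRED_TAGS) -> Dict[str, List[str]]:
    """Map each non-compliant VolumeId to its sorted missing tag keys."""
    report = {}
    for volume in volumes:
        missing = missing_tags(volume, required)
        if missing:
            report[volume["VolumeId"]] = sorted(missing)
    return report


# Hypothetical sample data in the DescribeVolumes shape.
sample = [
    {"VolumeId": "vol-aaa",
     "Tags": [{"Key": "Owner", "Value": "data-team"},
              {"Key": "Environment", "Value": "prod"}]},
    {"VolumeId": "vol-bbb", "Tags": [{"Key": "Owner", "Value": "web"}]},
    {"VolumeId": "vol-ccc"},  # no tags at all
]

print(untagged_report(sample))
```

The report could then drive a follow-up step that applies the missing tags through the EC2 CreateTags API, which is the part a Lambda function or SDK script would typically automate.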

4. Understand burst credits

Burst credits are used to temporarily boost performance during periods of high activity. When a volume is initialized, you automatically start with enough credits to support an increase of up to 3000 I/O operations per second (IOPS) for 30 minutes. After that, your performance will drop to baseline and you will not regenerate credits until your activity decreases. Not many are aware of this function of AWS, at least until the first time they experience an unexpected slowdown. To avoid this, set up burst credit monitoring from the start, and respond accordingly when your credits start to drop.

Should you find that a lack of burst credits is frequently affecting your productivity, consider using the previously mentioned RAID 0 configuration, which will grant additional burst credit pools and allow you to distribute IOPS.

You can also simply increase your volume size if it is currently under 1 TiB. Once volumes reach this size, they are no longer restricted by burst credits, because their baseline performance already matches the burst rate. However, you should only do this if the increase in storage cost is worth the performance boost.
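
The burst arithmetic can be worked through with a short calculation. The sketch below assumes the gp2 volume type and its published parameters (a baseline of 3 IOPS per GiB with a floor of 100 and a cap of 16,000, and a burst rate of 3000 IOPS); the text above does not name a volume type, so treat these constants as assumptions. The initial credit balance is derived from the figures above: 3000 IOPS for 30 minutes is 5.4 million I/O credits.

```python
BURST_IOPS = 3000                  # burst rate from the text above
INITIAL_CREDITS = 3000 * 30 * 60   # 3000 IOPS for 30 minutes = 5,400,000
IOPS_PER_GIB = 3                   # assumed gp2 baseline rate
MIN_BASELINE_IOPS = 100            # assumed gp2 baseline floor
MAX_BASELINE_IOPS = 16000          # assumed gp2 baseline cap


def baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a volume of the given size (gp2 assumptions)."""
    return min(MAX_BASELINE_IOPS, max(MIN_BASELINE_IOPS, IOPS_PER_GIB * size_gib))


def burst_seconds(size_gib: int, credits: float = INITIAL_CREDITS):
    """Seconds a volume can sustain the full burst rate.

    Credits drain at (burst rate - baseline) per second, because the
    baseline keeps refilling the bucket while the volume bursts.
    Returns None when the baseline already meets the burst rate.
    """
    base = baseline_iops(size_gib)
    if base >= BURST_IOPS:
        return None
    return credits / (BURST_IOPS - base)


for size in (100, 500, 1000):
    print(size, baseline_iops(size), burst_seconds(size))
```

Under these assumptions, a 100 GiB volume can burst for about 33 minutes from a full bucket, a 500 GiB volume for an hour, and a 1000 GiB volume is not credit-limited at all, which is the reasoning behind growing a volume toward 1 TiB.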

5. Watch your metrics

Taking advantage of the metrics that AWS automatically collects for you is a good way of monitoring your configuration and provides key information for assessing ROI. Collected metrics and volume status information are accessible via the CloudWatch console, the API, or the CLI, but keep in mind that there may be a cost for using them.

You’ll want to look at a breakdown of the metrics available to understand the specific ones to watch according to your goals and concerns. You’ll use a combination of disk I/O, disk activity, and latency metrics. Doing so will help you determine if your volumes are sized correctly, if you could benefit from a caching layer or load balancer, and if you’re using the best volume types.

You should also take advantage of built-in alert capabilities, which can notify you when volumes drop below a specified performance level or if they fail.
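
As an illustration of how raw metrics turn into the numbers you actually compare, the sketch below derives average IOPS and average latency from per-period sums. It assumes the inputs are Sum statistics of the CloudWatch EBS metrics VolumeReadOps, VolumeWriteOps, and VolumeTotalReadTime over one collection period; the sample values and function names are hypothetical.

```python
def average_iops(read_ops: float, write_ops: float, period_seconds: float) -> float:
    """Average IOPS over a period, from summed read and write operations."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return (read_ops + write_ops) / period_seconds


def average_latency_ms(total_time_seconds: float, ops: float) -> float:
    """Average latency per operation in milliseconds.

    `total_time_seconds` is the summed time spent on the operations in
    the period (for example VolumeTotalReadTime), and `ops` is the
    matching operation count (for example VolumeReadOps).
    """
    if ops == 0:
        return 0.0  # an idle period has no latency to report
    return total_time_seconds * 1000 / ops


# Hypothetical sums for one 5-minute (300 s) period.
read_ops, write_ops = 90_000, 30_000
total_read_time = 45.0  # seconds

print(average_iops(read_ops, write_ops, 300))
print(average_latency_ms(total_read_time, read_ops))
```

Comparing the derived IOPS against a volume's baseline, and watching latency alongside it, gives a quick indication of whether the volume is undersized or the workload would benefit from a different volume type.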