How to reduce RDS storage size in an automated way

August 10, 2023

Introduction

Amazon RDS is a relational database service available on Amazon Web Services. It is essentially a managed database server with a volume for the data. Both cost money. Having a large dataset is not the only reason to have a big volume: the volume size determines the number of I/O operations per second (IOPS) the volume can do. If you have high load on your RDS instance, you may very well run out of IOPS, which can result in a disaster. Scaling up the volume to avoid such disasters is easy, but reducing RDS storage size is much more difficult, because downsizing RDS storage is not officially supported by AWS. If the load on your RDS instance goes down (e.g. due to refactoring), you may end up with a database whose oversized volume costs lots of money for no reason.
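To make the link between volume size and IOPS concrete, here is a minimal sketch. It assumes gp2-style General Purpose SSD storage, where baseline performance is 3 IOPS per GiB with a floor of 100 IOPS. Burst credits and upper limits are deliberately left out, and other storage types (for example gp3 or Provisioned IOPS) scale differently. The function name is ours, purely for illustration.

```python
def gp2_baseline_iops(size_gib: int, iops_per_gib: int = 3, floor: int = 100) -> int:
    """Baseline IOPS of a gp2-style volume: grows with size, never below the floor.

    Assumption: 3 IOPS per GiB and a 100 IOPS minimum. Burst behaviour and
    per-volume maximums are ignored in this sketch.
    """
    return max(floor, size_gib * iops_per_gib)


if __name__ == "__main__":
    for size in (20, 100, 500, 1000):
        print(f"{size:>5} GiB -> {gp2_baseline_iops(size)} baseline IOPS")
```

Under these assumptions a 1000 GiB volume gets 3000 baseline IOPS, and halving it to 500 GiB also halves the baseline to 1500. That is why a volume may have been sized for load rather than for data, and why it is worth checking the IOPS you actually need before picking the smaller size.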

Reduce RDS storage manually

The basic process for reducing RDS storage is the following:

Create a new RDS instance with a smaller volume
Stop all applications that use the old RDS instance
Dump all databases from the old RDS instance
Restore all databases to the new RDS instance
Update all applications to use the new RDS instance
Hope for the best

This is neither a particularly fast nor an easy process. If you want to try it out yourself, please refer to this article.
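For PostgreSQL, the dump and restore steps of the manual process could look roughly like the sketch below. It only builds and prints the pg_dump and pg_restore command lines, so you can review them before running anything. The hostnames, user name, database name and file name are hypothetical placeholders. The sketch also assumes that the target database already exists on the new instance and that passwords are supplied separately, for example through the PGPASSWORD environment variable or a ~/.pgpass file.

```python
import shlex


def dump_command(host: str, user: str, dbname: str, outfile: str) -> list[str]:
    """Build a pg_dump command that writes a custom-format archive."""
    return [
        "pg_dump",
        "--host", host,
        "--username", user,
        "--format", "custom",   # custom format is what pg_restore expects
        "--file", outfile,
        dbname,
    ]


def restore_command(host: str, user: str, dbname: str, infile: str) -> list[str]:
    """Build a pg_restore command that loads the archive into an existing database."""
    return [
        "pg_restore",
        "--host", host,
        "--username", user,
        "--dbname", dbname,
        infile,
    ]


if __name__ == "__main__":
    # Hypothetical endpoints and names, for illustration only.
    old_host = "old-db.example.internal"
    new_host = "new-db.example.internal"
    print(shlex.join(dump_command(old_host, "postgres", "appdb", "appdb.dump")))
    print(shlex.join(restore_command(new_host, "postgres", "appdb", "appdb.dump")))
```

To actually execute the commands, pass the returned lists to subprocess.run(..., check=True) so that a failed dump or restore stops the process instead of going unnoticed.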

That said, we recommend automating the process. The easiest way might be to use our rds-resize.py script. For details see below.

Reduce RDS storage in an automated way

When we had to resize RDS, one of the requirements was that the downsizing process must be testable outside of a production environment. The process also had to be reliable and produce consistent results. Therefore our only option was to automate it, which we would have done anyway given our policy of always building infrastructure as code.

So, to reduce RDS storage size we wrote the rds-resize script. It automates all the steps above except those that start and stop applications that use RDS. It still has some rough edges and only supports PostgreSQL, but it seems to do its job very reliably. We base this claim on a dozen or so RDS downscalings done on live systems, some of which are critical production systems. On a high level the script takes these steps:

Verify that the databases are not in use. If they are, exit without any action.
Create a new RDS instance. This step may be skipped when testing dump and restore procedures.
Dump globals (e.g. roles) from the old RDS instance
Restore globals to the new RDS instance
Restore credentials on the new RDS instance. They can't be dumped and restored due to security considerations.
Dump databases from the old RDS instance
Restore databases to the new RDS instance
Sanity check the new database. This includes table and session count comparison for old and new databases. If the counts do not match then something might be wrong.
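The first and last steps above can be sketched as follows. This is not code from rds-resize.py but a minimal illustration of the idea, assuming a DB-API cursor that uses %s placeholders (as psycopg2 does). The query constants and function names are our own, and what exactly gets counted in the real script may differ.

```python
# Sessions connected to a database, excluding the connection running the query.
ACTIVE_SESSIONS_SQL = """
SELECT count(*) FROM pg_stat_activity
WHERE datname = %s AND pid <> pg_backend_pid()
"""

# Ordinary tables outside PostgreSQL's own schemas.
TABLE_COUNT_SQL = """
SELECT count(*) FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_schema NOT IN ('pg_catalog', 'information_schema')
"""


def count_active_sessions(cursor, dbname: str) -> int:
    """Return the number of other sessions connected to dbname."""
    cursor.execute(ACTIVE_SESSIONS_SQL, (dbname,))
    return cursor.fetchone()[0]


def count_tables(cursor) -> int:
    """Return the number of user tables in the database the cursor is connected to."""
    cursor.execute(TABLE_COUNT_SQL)
    return cursor.fetchone()[0]


def compare_counts(old: dict, new: dict) -> dict:
    """Return {name: (old_count, new_count)} for every name whose counts differ.

    A name missing from one side shows up with None on that side.
    """
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}


if __name__ == "__main__":
    # Hypothetical per-database table counts gathered from both instances.
    old_counts = {"appdb": 42, "reports": 7}
    new_counts = {"appdb": 42, "reports": 6}
    print(compare_counts(old_counts, new_counts) or "counts match")
```

If count_active_sessions returns anything other than zero, the safe choice is to stop before touching anything. Likewise, a non-empty result from compare_counts does not prove that data was lost, but it is a strong hint to look closer before pointing applications at the new instance.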

The rds-resize Git repository includes a Podman container configuration. It allows you to launch the RDS resizing environment with minimal effort, assuming you have Podman installed. This is the case with RHEL 8 and 9 as well as Fedora. Docker will likely work as a Podman substitute, but we have not tested it.

If you encounter any issues with rds-resize.py, please open a GitHub issue or, better yet, create a pull request. Happy resizing!

#aws #rds

Samuli Seppänen
