
It is trivial to downsize an AWS RDS instance. It is equally trivial to upsize an AWS RDS volume. However, to downsize an AWS RDS volume you need to jump through a lot of hoops.
So, why would you ever want to downsize an AWS RDS volume? One reason might be that you increased the volume size simply to get more burst credits for the volume, then refactored your workload so it no longer needs that many credits and now want to save money. Or maybe you genuinely had a huge amount of data on your RDS instance's volume, removed some of it, and got stuck with a volume far larger than you need.
One way to shrink an RDS volume is to use Boto3 in Python to spin up a new RDS instance: you create a copy of the existing instance but give the new one a smaller volume. Here is an example of using Python to shrink an RDS instance that runs PostgreSQL.
Initial Setup
This Python script uses several libraries, which are imported at the top of the script. You can store all the configuration items in a YAML file, but be sure to .gitignore it since it will contain secrets.
from subprocess import Popen, PIPE
import json
import yaml
import psycopg2
import boto3
import botocore.exceptions
import sys
import os
import logging
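The functions below rely on a few module-level names: config, rds, psql_admin, psql_password, and master_rds_address. A minimal sketch of that setup is shown here; the config file name and its keys are assumptions for illustration, not part of the original script:
# Minimal setup sketch; the config file name and key names here are
# assumptions for illustration.
logging.basicConfig(level=logging.INFO)

with open('config.yaml') as f:
    config = yaml.safe_load(f)

psql_admin = config['psql_admin']        # assumed key
psql_password = config['psql_password']  # assumed key
master_rds_address = config['master_rds_address']  # refreshed later by create_rds

# Boto3 picks up credentials from the usual AWS credential chain
rds = boto3.client('rds', region_name=config['aws_region'])

# The dump/restore functions below write to ./dump
os.makedirs('./dump', exist_ok=True)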
def run_process(cmd: list[str], use_shell: bool = False):
    logging.debug(f'launching process with cmd: {cmd}')
    if use_shell:
        cmd = ' '.join(cmd)
    p = Popen(cmd, stdout=PIPE, stderr=PIPE, shell=use_shell)
    stdout, stderr = p.communicate()
    if stdout:
        logging.info(stdout)
    if stderr:
        logging.error(stderr)
    exit_code = p.wait()
    if exit_code:
        logging.warning(f'process command {cmd} exited with {exit_code}')
    logging.debug('process closed')
The run_process function is a small helper that runs a command (optionally through the shell), waits for it to finish, and logs its output and exit code with Python's logging library.
Dump Databases
The first step is to manually stop all services that hold connections to the databases. The script then connects to each database and dumps all of the data that is of interest to you. The dumps will be restored later onto the new RDS instance we will be creating.
You can check if a database is in use with something like:
def check_db_in_use(db_name: str) -> bool:
    # Connect to the maintenance database
    conn = psycopg2.connect(
        host=master_rds_address,
        database='postgres',
        user=psql_admin,
        password=psql_password
    )
    cur = conn.cursor()
    # Execute a query to check if the database is in use;
    # the name is passed as a parameter to avoid SQL injection
    cur.execute(
        """
        SELECT count(*) FROM pg_stat_activity
        WHERE datname = %s AND state = 'active';
        """,
        (db_name,)
    )
    result = cur.fetchone()[0]
    # Close the cursor and connection
    cur.close()
    conn.close()
    # In use if any active sessions were found
    return result > 0
The command to dump a PostgreSQL database is something like pg_dump -U {admin} -h {host} -F c -f {dump_file} {db_name}. The -F c flag tells it to dump in the custom archive format, which pg_restore can read later. You can dump a database with a function like so:
# Example of a database dump function
def dump_db(db_name: str, dump_file: str = ''):
    if not dump_file:
        dump_file = f'./dump/{db_name}.dump'
    cmd = [
        'pg_dump', '-U', psql_admin, '-h', master_rds_address,
        '-F', 'c', '-f', dump_file, db_name
    ]
    run_process(cmd)
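If you do not want to hard-code the list of databases, one option is to enumerate them from the pg_database catalog and dump everything that is not in use. This is a sketch; list_databases is a hypothetical helper, not part of the original script:
def list_databases() -> list[str]:
    # Hypothetical helper: enumerate user databases on the master,
    # skipping templates and RDS-internal databases
    conn = psycopg2.connect(
        host=master_rds_address,
        database='postgres',
        user=psql_admin,
        password=psql_password
    )
    cur = conn.cursor()
    cur.execute("""
        SELECT datname FROM pg_database
        WHERE datistemplate = false
          AND datname NOT IN ('postgres', 'rdsadmin');
    """)
    databases = [row[0] for row in cur.fetchall()]
    cur.close()
    conn.close()
    return databases

# Dump everything that currently has no active sessions
for db_name in list_databases():
    if check_db_in_use(db_name):
        logging.warning(f'{db_name} still has active sessions, skipping')
    else:
        dump_db(db_name)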
You will also need to dump the globals, which contain information such as which roles get which permissions. The command generally looks like pg_dumpall -U {admin} -h {host} -f {dump_file} --no-role-passwords -g. The --no-role-passwords flag works around the permission issues with dumping account passwords on RDS (the passwords will need to be set again later), and -g means dump global objects only. The code would look something like:
def dump_globals(dump_file: str = './dump/globals.sql'):
    cmd = [
        'pg_dumpall', '-U', psql_admin, '-h', master_rds_address,
        '-f', dump_file, '--no-role-passwords', '-g'
    ]
    run_process(cmd)
Downsize AWS RDS volume
This function is in charge of creating a new RDS instance based on the one we want to shrink. Technically speaking this is not downsizing, because AWS currently offers no way to shrink an existing RDS volume in place; we are migrating to a smaller one instead. You will notice a dictionary called config, which is loaded elsewhere in the program from the configuration file.
The general logic is as follows:
- Get the master RDS instance we are cloning with rds.describe_db_instances (see the Boto3 docs)
- Create a new dictionary with all the properties you want to copy
- Change the allocated_storage and max_allocated_storage values to the smaller target sizes
- Create the new RDS instance with rds.create_db_instance (also in the Boto3 docs)
- Get the new instance's address and return it to be referenced later
def create_rds(db_identifier: str) -> str:
    global master_rds_address
    allocated_storage = config['allocated_storage']
    max_allocated_storage = config['max_allocated_storage']
    master_db_identifier = config['master_rds_identifier']
    result = rds.describe_db_instances(DBInstanceIdentifier=master_db_identifier)
    master_db_stats = result['DBInstances'][0]
    master_rds_address = master_db_stats['Endpoint']['Address']
    vpc_security_group_ids = []
    for group in master_db_stats['VpcSecurityGroups']:
        vpc_security_group_ids.append(group['VpcSecurityGroupId'])
    db_subnet_group_name = master_db_stats['DBSubnetGroup']['DBSubnetGroupName']
    new_db_stats = {
        'DBName': master_db_stats['DBName'],
        'AllocatedStorage': allocated_storage,
        'MaxAllocatedStorage': max_allocated_storage,
        'DBInstanceIdentifier': db_identifier,
        'DBInstanceClass': master_db_stats['DBInstanceClass'],
        'MasterUserPassword': psql_password,
        'DBSubnetGroupName': db_subnet_group_name,
        'Engine': master_db_stats['Engine'],
        'EngineVersion': master_db_stats['EngineVersion'],
        'MasterUsername': master_db_stats['MasterUsername'],
        'AvailabilityZone': master_db_stats['AvailabilityZone'],
        'PreferredMaintenanceWindow': master_db_stats['PreferredMaintenanceWindow'],
        'BackupRetentionPeriod': master_db_stats['BackupRetentionPeriod'],
        'VpcSecurityGroupIds': vpc_security_group_ids,
        'AutoMinorVersionUpgrade': master_db_stats['AutoMinorVersionUpgrade'],
        'CopyTagsToSnapshot': master_db_stats['CopyTagsToSnapshot'],
        'DeletionProtection': master_db_stats['DeletionProtection'],
        # describe_db_instances returns 'EnabledCloudwatchLogsExports', while
        # create_db_instance expects 'EnableCloudwatchLogsExports' (no 'd')
        'EnableCloudwatchLogsExports': master_db_stats['EnabledCloudwatchLogsExports']
    }
    rds.create_db_instance(**new_db_stats)
    # Wait for the instance to become available
    waiter = rds.get_waiter('db_instance_available')
    waiter.wait(DBInstanceIdentifier=db_identifier)
    result = rds.describe_db_instances(DBInstanceIdentifier=db_identifier)
    new_db_stats = result['DBInstances'][0]
    return new_db_stats['Endpoint']['Address']
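Note that describe_db_instances returns more settings than this sketch copies; depending on your setup you may also want to carry over things like StorageType, MultiAZ, or the parameter and option groups, which are left at their defaults here.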
Restore Globals
Next we run the command psql -U {admin} -h {host} -d postgres < {dump_file} to restore the globals from the dump file created above. Notice that run_process is called with use_shell set to True. This is because the < redirection is interpreted by the shell and causes an issue with the normal use of the Popen function.
Code should look something like:
def restore_globals(new_rds_address: str, restore_file: str = './dump/globals.sql'):
    if not os.path.exists(restore_file):
        logging.error(f'file {restore_file} does not exist!')
        return
    cmd = [
        'psql', '-U', psql_admin, '-h', new_rds_address,
        '-d', 'postgres', '<', restore_file
    ]
    run_process(cmd, True)
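Alternatively, if you would rather avoid the shell entirely, psql's -f flag reads the file directly, so the command list can go through Popen without use_shell. A small variation on the function above:
def restore_globals_no_shell(new_rds_address: str, restore_file: str = './dump/globals.sql'):
    # Variation that uses psql's -f flag instead of shell redirection
    if not os.path.exists(restore_file):
        logging.error(f'file {restore_file} does not exist!')
        return
    cmd = [
        'psql', '-U', psql_admin, '-h', new_rds_address,
        '-d', 'postgres', '-f', restore_file
    ]
    run_process(cmd)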
Restore Databases
Then we use the command pg_restore -U {admin} -h {host} -F c --create -d postgres {dump_file} to restore each database from the dump files created before. With --create, pg_restore connects to the postgres maintenance database, recreates the dumped database, and then restores into it.
The code:
def restore_db(db_name: str, new_rds_address: str, restore_file: str = ''):
    if not restore_file:
        restore_file = f'./dump/{db_name}.dump'
    if not os.path.exists(restore_file):
        logging.error(f'file {restore_file} does not exist!')
        return
    cmd_restore_db = [
        'pg_restore', '-U', psql_admin, '-h', new_rds_address,
        '-F', 'c', '--create', '-d', 'postgres', restore_file
    ]
    run_process(cmd_restore_db)
Restore Service Passwords
The last couple of steps are to restore the passwords, since the globals were dumped without them, and to allow the users to log in again; the LOGIN privilege is removed by default for security reasons and needs to be re-enabled. The code:
def restore_password(user: str, password: str, new_rds_address: str):
    sql_reset = f"ALTER USER {user} WITH PASSWORD '{password}';"
    sql_login = f"ALTER ROLE {user} LOGIN;"
    cmd = [
        'psql', '-U', psql_admin, '-h', new_rds_address,
        '-d', 'postgres', '-c', sql_reset
    ]
    cmd2 = [
        'psql', '-U', psql_admin, '-h', new_rds_address,
        '-d', 'postgres', '-c', sql_login
    ]
    run_process(cmd)
    run_process(cmd2)
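To tie the steps together, a minimal driver might look like the following sketch. The databases and service_users config keys are assumptions for illustration (you could also enumerate the databases with list_databases from the earlier sketch):
# Hypothetical driver; 'databases' and 'service_users' are assumed
# config keys, not part of the original script.
def main():
    db_names = config['databases']
    for db_name in db_names:
        if check_db_in_use(db_name):
            logging.error(f'{db_name} still has active sessions, aborting')
            sys.exit(1)
    dump_globals()
    for db_name in db_names:
        dump_db(db_name)
    new_rds_address = create_rds('shrunk-rds-instance')
    restore_globals(new_rds_address)
    for db_name in db_names:
        restore_db(db_name, new_rds_address)
    for user, password in config['service_users'].items():
        restore_password(user, password, new_rds_address)

if __name__ == '__main__':
    main()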
Downsize AWS RDS volume: final thoughts
After the changes are made, you will want to verify that the data restored properly and test your services against the new RDS instance; one quick check is sketched below. If everything looks good, the old RDS instance can eventually be retired.
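One quick way to check the restore is to compare row counts for a few important tables between the old and new instances. A rough sketch, where the table names are assumed to come from your own trusted list:
def compare_row_counts(db_name: str, table: str, new_rds_address: str) -> bool:
    # Compare one table's row count on the master and the new instance
    counts = []
    for host in (master_rds_address, new_rds_address):
        conn = psycopg2.connect(
            host=host, database=db_name,
            user=psql_admin, password=psql_password
        )
        cur = conn.cursor()
        # The table name is interpolated directly, so it must come from
        # a trusted list, never from user input
        cur.execute(f'SELECT count(*) FROM {table};')
        counts.append(cur.fetchone()[0])
        cur.close()
        conn.close()
    return counts[0] == counts[1]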
If you're interested in the above script, let us know and we'll be happy to push it to our GitHub.