Learnings from Running Amazon Aurora MySQL in Production
Using Aurora MySQL is a no brainer decision when running workloads in the Production environment. The Overhead of running a database is greatly reduced due to the managed services offered by cloud providers, especially with AWS. Some of the learning from running Aurora MySQL in Production
Fine Tuning the Application
- Ensure connection pooling is used optimally. Opening and closing the connection should be done with utmost care as this can kill the database.
- Timeout for queries has to be kept configured depending on the requirement. Long-running queries can bring the database to its knees.
- Fine-tune queries for better performance. Avoid queries with like, distinct that hit the database performance
- Analyze the slow queries using the Slow Query Log and refine accordingly.
- Find commands that can help to kill the long-running queries in case some of them end up running
- Sharding can be considered to split data across different databases for better performance. This can be considered when the total number of rows in a table crosses a few million. This can slow the database during fast deletes and inserts.
- Sharding can be done in the same DB instance with different databases. Sharding can be done based on any type of attribute. Depending on the geography, customer/client id, language or any other parameter as well.
Scaling
- Use Read Replicas to improve the read throughput.
- Read Replicas endpoint based on the application use-case can be considered so that the same replica is not hammered with too many requests.
- Multiple Writer setup for multi-region configuration.
- Vertical scaling can be considered only when none of the other options brings in the required performance.
Redundancy
- Multi-AZ configuration for better availability
- Multi-Region configuration with multiple writers and replication setup.
- Aurora has a Global Database concept that can also be used as it provides better convenience and ease of setup/ configuration.
Backup and Replication
- Schedule regular backups and ensure backups are available in a different region.
- Replication latency should be kept to a minimum.
Alerts and Monitoring
- Leverage Aurora Insights Dashboard
- Helps to view the Query Performance
- Database level performance can be ascertained
- DB Timeouts and Deadlocks, slow queries can be determined
- CPU usage and query latency be easily
- Configure High CPU Utilization alert and DB Node restart alerts as part of monitoring. There are other useful database metrics available that can be used for better observability.
- Understand Slow Query Logs - how they can be viewed and downloaded to analyze the bottlenecks.
Aurora Serverless
- Aurora Serverless is the new kid in the block and can be considered based on the required elasticity
- Scaling requirements are handled by Aurora itself
- Charges based on the running time of the CPU
- Instead of using database connections, Serverless also provides an API that can be used to interact with the database.
These are some of the aspects that needed attention while running Amazon Aurora MySQL in Production. I will keep this posted updated with any new interesting findings that I come across with running Aurora MySQL on Production.