Cloudforums.net was built with the goal of delivering a modern cloud application on a reasonable budget while utilizing enhanced AWS services and automating the build with Terraform.
Our Discourse VPC on AWS includes public and private subnets: the public subnet carries external web traffic and the private subnet carries internal database traffic. We use an EC2 instance for our web/application server and Amazon RDS for our database. Communication between the web/application server and the database goes through the private subnet.
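A minimal Terraform sketch of a public/private subnet layout like the one described above. It is illustrative only, not our actual script: the CIDR ranges, availability zone names, and resource names are all assumptions, and the internet gateway and route tables are omitted for brevity.

```hcl
# Illustrative only: CIDR ranges, AZ names, and resource names are assumptions.
resource "aws_vpc" "discourse" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
}

# Public subnet: hosts the EC2 web/application server.
resource "aws_subnet" "public" {
  vpc_id                  = aws_vpc.discourse.id
  cidr_block              = "10.0.1.0/24"
  availability_zone       = "us-east-1a"
  map_public_ip_on_launch = true
}

# Private subnets: hold the RDS database and get no route to the internet.
resource "aws_subnet" "private_a" {
  vpc_id            = aws_vpc.discourse.id
  cidr_block        = "10.0.11.0/24"
  availability_zone = "us-east-1a"
}

resource "aws_subnet" "private_b" {
  vpc_id            = aws_vpc.discourse.id
  cidr_block        = "10.0.12.0/24"
  availability_zone = "us-east-1b"
}

# RDS needs a subnet group that spans at least two availability zones.
resource "aws_db_subnet_group" "db" {
  name       = "discourse-db"
  subnet_ids = [aws_subnet.private_a.id, aws_subnet.private_b.id]
}
```

Keeping the database subnets free of any internet route means the database is reachable only from inside the VPC.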
We use strict security groups between EC2 instances and services. No non-web traffic is allowed inbound to our public instances. A WAF filters web requests, so malicious requests such as SQL injection or XSS are blocked. We also follow industry best practice for MFA and SSH through JumpCloud.
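As a rough illustration of the "web traffic only" rule, security groups along these lines would allow HTTP/HTTPS into the web tier and let only the web tier reach the database. This is a sketch under assumptions, not our real configuration: the group names and the vpc_id variable are hypothetical, and port 5432 assumes the PostgreSQL database that Discourse uses.

```hcl
# Hypothetical input: the ID of the VPC that holds both tiers.
variable "vpc_id" {
  type = string
}

# Web tier: only HTTP and HTTPS are allowed in from the internet.
resource "aws_security_group" "web" {
  name        = "forum-web"
  description = "Allow only web traffic inbound"
  vpc_id      = var.vpc_id

  ingress {
    description = "HTTP"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTPS"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Outbound traffic is unrestricted in this sketch.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Database tier: reachable only from members of the web security group.
resource "aws_security_group" "db" {
  name        = "forum-db"
  description = "Allow PostgreSQL from the web tier only"
  vpc_id      = var.vpc_id

  ingress {
    description     = "PostgreSQL from web tier (assumed port)"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.web.id]
  }
}
```

Referencing the web security group instead of an IP range means the database rule keeps working when the web instance is replaced and gets a new address.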
You may wonder how DNS is configured automatically by our Terraform scripts when the public IP address changes. As part of the Terraform script, the public IP is pulled from the instance and an A record is created for the domain through Route 53.
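The DNS step can be sketched roughly as follows. The variable names, instance type, and TTL are assumptions for illustration; the point is only that the A record reads the instance's public_ip attribute, so Terraform updates the record whenever the instance is recreated.

```hcl
# Hypothetical inputs: hosted zone ID, domain name, and AMI.
variable "zone_id" {
  type = string
}

variable "domain_name" {
  type    = string
  default = "cloudforums.net"
}

variable "ami_id" {
  type = string
}

# The web/application server (instance type is an assumption).
resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t3.small"
}

# A record pointing the domain at the instance's current public IP.
resource "aws_route53_record" "forum" {
  zone_id = var.zone_id
  name    = var.domain_name
  type    = "A"
  ttl     = 300
  records = [aws_instance.web.public_ip]
}
```

Because the record depends on the instance, a single terraform apply creates the instance first and then writes the matching DNS record.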
While our installation is not highly available (HA) due to cost, we can spin up a new installation fairly quickly in the event of a disaster. Our entire installation process is scripted in Terraform, so we can have the site running in a different region in a matter of minutes.
Scenario 1 - EC2 Availability zone is down
This one is quite easy to fix because our database is in a different availability zone. We would just run terraform apply to spin up the instance in another availability zone. The new instance would come up and automatically connect to the RDS service.
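One way to make this kind of move a one-line change is to drive the instance placement from a variable. The sketch below is an assumption about how that could be structured, not our actual script; the variable names, default zone, and instance type are hypothetical.

```hcl
# Hypothetical input: which availability zone the web instance should run in.
variable "web_az" {
  description = "Availability zone for the web/application instance"
  type        = string
  default     = "us-east-1a"
}

# Hypothetical input: map of availability zone name to public subnet ID.
variable "public_subnet_ids" {
  description = "Public subnet ID for each availability zone"
  type        = map(string)
}

variable "ami_id" {
  type = string
}

# The subnet decides the zone, so changing web_az replaces the instance
# with a new one in the chosen zone.
resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t3.small"
  subnet_id     = var.public_subnet_ids[var.web_az]
}
```

With this layout, running terraform apply -var="web_az=us-east-1b" would rebuild the instance in the other zone while leaving the database untouched.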
Scenario 2 - RDS service goes down
This one is a little less convenient. Since we are just a for-fun forum, it was not cost-effective for us to use HA with our RDS service. Our strategy is to take regular backups. Right now the backup interval is hourly, so we could potentially lose data. We also have a Terraform script for spinning up RDS, so we can get the database back up quickly. The remaining question is how much data we would lose: at most one hour's worth.
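A restore along these lines could be expressed in Terraform by looking up the most recent snapshot and creating a new database instance from it. This is a hedged sketch, not our actual recovery script: the instance identifiers, instance class, and subnet group variable are hypothetical.

```hcl
# Hypothetical inputs: the failed instance's identifier and the DB subnet group.
variable "source_db_identifier" {
  type    = string
  default = "cloudforums-db"
}

variable "db_subnet_group_name" {
  type = string
}

# Look up the newest snapshot taken of the original database.
data "aws_db_snapshot" "latest" {
  db_instance_identifier = var.source_db_identifier
  most_recent            = true
}

# Create a replacement instance from that snapshot. Engine and storage
# settings are inherited from the snapshot itself.
resource "aws_db_instance" "restored" {
  identifier           = "cloudforums-db-restored"
  instance_class       = "db.t3.micro"
  snapshot_identifier  = data.aws_db_snapshot.latest.id
  db_subnet_group_name = var.db_subnet_group_name
  multi_az             = false # single-AZ to keep costs down
  skip_final_snapshot  = true

  # Avoid rebuilding the database each time a newer snapshot appears.
  lifecycle {
    ignore_changes = [snapshot_identifier]
  }
}
```

Anything written after the latest snapshot is lost on restore, which is where the one-hour worst case comes from.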