We can follow these best practices to build a resilient system in AWS:
- Backup: We need a fast and effective backup and restore strategy for our data. Both the backup and the restore process should be automated.
- Reboot: Nodes in AWS can crash and be restarted or replaced at any time, so it is good practice to build threads and processes that resume automatically when a node reboots.
- Re-sync: After a failure, the system should be able to re-sync itself by reloading messages from queues.
- Images: We need to maintain pre-configured and pre-optimized virtual machine images from which the system can be restored. These images should also be configured to restart processes automatically on reboot.
- In-memory sessions: Wherever possible, we should minimize the use of in-memory sessions and stateful user context, because state held only in a node's memory is lost when that node fails.
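
The Reboot and Re-sync points above can be illustrated with a minimal sketch. It is not AWS-specific: a plain Python list stands in for a durable queue, a local JSON file stands in for durable storage, and the names `CheckpointedWorker`, `run`, `max_messages`, and `demo` are hypothetical, chosen only for this example. The idea shown is that a worker keeps its progress outside of memory, so a restarted node can pick up where the previous one stopped.

```python
import json
import os
import tempfile


class CheckpointedWorker:
    """Hypothetical worker that records its progress in a checkpoint file."""

    def __init__(self, checkpoint_path):
        self.checkpoint_path = checkpoint_path
        self.processed = []  # in-memory only; lost if the node crashes
        self.next_offset = self._load_checkpoint()

    def _load_checkpoint(self):
        # On first start there is no checkpoint, so begin at offset 0.
        try:
            with open(self.checkpoint_path) as f:
                return json.load(f)["next_offset"]
        except FileNotFoundError:
            return 0

    def _save_checkpoint(self):
        # Write to a temporary file and rename it, so a crash in the
        # middle of a write never leaves a half-written checkpoint.
        tmp_path = self.checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_offset": self.next_offset}, f)
        os.replace(tmp_path, self.checkpoint_path)

    def run(self, log, max_messages=None):
        """Process messages from `log`, starting at the saved offset."""
        handled = 0
        while self.next_offset < len(log):
            if max_messages is not None and handled >= max_messages:
                break  # used below to simulate a crash partway through
            message = log[self.next_offset]
            self.processed.append(message.upper())  # stand-in for real work
            self.next_offset += 1
            self._save_checkpoint()
            handled += 1
        return handled


def demo():
    log = ["a", "b", "c", "d", "e"]  # assumption: stand-in for a durable queue
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "checkpoint.json")

        first = CheckpointedWorker(path)
        first.run(log, max_messages=2)  # the "node" stops after two messages

        second = CheckpointedWorker(path)  # the "rebooted" node
        second.run(log)  # resumes from the checkpoint, not from the start
        return first.processed, second.processed


if __name__ == "__main__":
    print(demo())
```

In this sketch the second worker handles only the messages the first one did not finish. With a real queue service, a message may be delivered more than once, so the processing step would also need to be safe to repeat.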