We are looking for an experienced AWS Developer responsible for making our app more scalable and reliable. We are currently running our services on EC2 machines using Auto Scaling Groups.
Our current monitoring is unreliable, so you will be responsible for setting up a monitoring stack. The metrics it collects will also be used for service capacity planning.
Our deployment model needs an update. Automatic rollbacks are currently not possible, and every time a new version is deployed to our production servers, we experience a short period of downtime. The current CI/CD pipeline is also unreliable, and we are looking to migrate to the AWS CI/CD stack.
Responsibilities
Understand the current application infrastructure and suggest changes to it.
Define and document best practices and strategies regarding application deployment and infrastructure maintenance.
Migrate our infrastructure with zero downtime to a highly available, scalable one.
Set up a monitoring stack.
Define service capacity planning strategies.
Implement the application's CI/CD pipeline using the AWS CI/CD stack.
Write infrastructure as code using CloudFormation or similar.
Skills
Experience with the core AWS services, plus the specifics mentioned in this job description.
Good background in Linux/Unix administration.
Experience with Docker and Kubernetes.
Strong grasp of security best practices (e.g., using IAM roles and KMS).
Experience with monitoring solutions such as CloudWatch, Prometheus, and the ELK stack.
Previous exposure to large-scale systems design.
Ability to troubleshoot distributed systems.
Knowledge of writing infrastructure as code (IaC) using CloudFormation or Terraform.
Experience with building or maintaining cloud-native applications.
Past experience with serverless approaches using AWS Lambda, for example with the AWS Serverless Application Model (AWS SAM), is a plus.