3+ years of experience with AWS services including SQS, S3, Step Functions, EFS, Lambda, and OpenSearch.
Strong experience in API integrations, including work with large-scale API endpoints.
Proficiency in PySpark for data processing and parallelism in large-scale ingestion pipelines.
Experience with AWS OpenSearch APIs for managing search indices.
Terraform expertise for automating and managing cloud infrastructure.
Hands-on experience with AWS SageMaker, including working with machine learning models and inference endpoints.
Strong understanding of data flow architectures, document stores, and journal-based systems.
Experience in parallelizing data processing workflows to meet strict performance and SLA requirements.
Familiarity with AWS tools like CloudWatch for monitoring pipeline performance.
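The parallel-ingestion and OpenSearch-indexing requirements above can be pictured with a minimal, library-free sketch. All names here (`chunk`, `to_bulk_actions`, `ingest_parallel`, the index name `"docs"`) are hypothetical; a real pipeline would distribute batches across PySpark partitions or Lambda invocations and POST each action list to the OpenSearch `_bulk` endpoint rather than returning it.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk(records, size):
    """Split a record stream into fixed-size batches for bulk indexing."""
    for i in range(0, len(records), size):
        yield records[i:i + size]

def to_bulk_actions(batch, index_name):
    """Shape one batch as OpenSearch-style bulk actions (action line + document)."""
    actions = []
    for doc in batch:
        actions.append({"index": {"_index": index_name, "_id": doc["id"]}})
        actions.append(doc)
    return actions

def ingest_parallel(records, index_name, batch_size=500, workers=4):
    """Fan batches out across a thread pool to overlap I/O-bound bulk requests.
    This sketch returns the shaped payloads; production code would send them."""
    batches = list(chunk(records, batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda b: to_bulk_actions(b, index_name), batches))
```

Batching keeps each bulk request under the cluster's payload limit, while the thread pool keeps multiple requests in flight to meet throughput SLAs.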
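For the large-scale API integration bullet, one recurring concern is handling throttling from high-volume endpoints. A minimal retry-with-exponential-backoff sketch (function and parameter names are illustrative, not tied to any specific AWS SDK) might look like:

```python
import random
import time

def call_with_backoff(fn, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Retry a flaky API call with exponential backoff plus jitter.

    `sleep` is injectable so tests can skip real waiting; jitter spreads
    retries out so many clients do not hammer the endpoint in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # exhausted all attempts; surface the last error
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

In AWS-backed pipelines this pattern typically wraps calls behind SDK-level retry configuration; the sketch just shows the core idea.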
Additional Preferred Qualifications:
Strong problem-solving and debugging skills in distributed systems.
Prior experience in optimizing ingestion pipelines with a focus on cost-efficiency and scalability.
Solid understanding of distributed data processing and workflow orchestration in AWS environments.
Soft Skills:
Strong communication and collaboration skills, with the ability to work effectively with cross-functional teams.
Ability to work in a fast-paced environment and deliver high-quality results under tight deadlines.
Analytical mindset, with a focus on performance optimization and continuous improvement.