Key Responsibilities
Acquire, ingest, and process data from multiple sources and systems
Create and enhance data collection pipeline scripts (DAGs) and ETL (Extract, Transform, Load) processes (see the sketch after this list)
Create and support internal APIs and products that enable workflows
Support feature requests from analysts for tools/dashboards
Create and update documentation as needed
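For context on the "pipeline scripts (DAGs)" responsibility above, here is a minimal sketch of the kind of ETL DAG this role would maintain, assuming Apache Airflow 2.x; the task names, placeholder logic, and daily schedule are illustrative only, not requirements of the role.

```python
# Minimal extract -> transform -> load DAG sketch (Airflow 2.x assumed).
# All source/table names and logic here are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    """Pull raw records from an upstream source (placeholder logic)."""
    return [{"id": 1, "value": 42}]


def transform(**context):
    """Reshape the extracted records pulled from XCom (placeholder logic)."""
    rows = context["ti"].xcom_pull(task_ids="extract")
    return [{**row, "value": row["value"] * 2} for row in rows]


def load(**context):
    """Write transformed rows to the target store (placeholder logic)."""
    rows = context["ti"].xcom_pull(task_ids="transform")
    print(f"Loading {len(rows)} rows")


with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Declare task ordering: extract, then transform, then load.
    extract_task >> transform_task >> load_task
```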
Desired Skills
Ability to build and optimize data sets, data pipelines, and architectures
Ability to perform root cause analysis on external and internal processes and data to identify opportunities for improvement and answer questions
Excellent analytical skills for working with unstructured datasets
Ability to build processes that support data transformation, workload management, data structures, dependency management, and metadata
Experience with auto-scaling, performance testing, and capacity planning
Experience owning production infrastructure, as well as designing and building build/deploy and monitoring systems
Exceptional analytical and problem-solving skills
Acute attention to detail
Excellent collaboration skills
Technology Skills
Proficient with Python, including object-oriented programming, building data pipelines, and automated unit testing
Proficient with SQL and familiar with ORM technologies such as SQLAlchemy or Django ORM (a short sketch follows this list)
Proficient with Linux/Bash scripting for working with EC2/Batch servers
Familiarity with a cloud platform such as AWS, Google Cloud, or DigitalOcean
Familiarity with version control systems such as Git, SVN, or Mercurial
Familiarity with CI/CD pipelines such as Jenkins or AWS CodeDeploy
Familiarity with container technologies such as Docker for containerizing production code
Familiarity with Infrastructure as Code (IaC) tools such as Terraform for automating cloud deployments
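As a concrete illustration of the SQL/ORM bullet above, here is a minimal SQLAlchemy sketch, assuming SQLAlchemy 2.x; the Event model, its columns, and the in-memory SQLite database are hypothetical examples, not systems named in this posting.

```python
# Minimal SQLAlchemy ORM sketch (SQLAlchemy 2.x assumed).
# The Event model and its columns are hypothetical placeholders.
from sqlalchemy import Integer, String, create_engine, select
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column


class Base(DeclarativeBase):
    pass


class Event(Base):
    __tablename__ = "events"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    source: Mapped[str] = mapped_column(String(50))


# In-memory SQLite keeps the example self-contained and runnable.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Event(source="api"))
    session.commit()

    # Query rows back with the 2.x select() style.
    rows = session.scalars(select(Event).where(Event.source == "api")).all()
    print([event.id for event in rows])
```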
Preferred Skills
AWS Certification