Key Responsibilities
• Acquire, ingest, and process data from multiple sources and systems
• Create and enhance data collection pipeline scripts (DAGs) and ETL (Extract, Transform, Load) processes (a minimal DAG sketch follows this list)
• Create and support internal APIs and products that enable workflows
• Support analysts' feature requests for tools and dashboards
• Create and update documentation as needed
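Purely as an illustration of the kind of pipeline script this role maintains, the sketch below assumes Apache Airflow (the DAG terminology above suggests it); the dag_id, task names, and extract/transform/load helpers are hypothetical placeholders, not an existing pipeline.

    # Hypothetical Airflow DAG sketch: a three-step extract -> transform -> load pipeline.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract(**context):
        # Placeholder: pull raw records from a source system or API.
        return [{"id": 1, "value": "raw"}]

    def transform(**context):
        # Placeholder: normalize the records produced by the extract task.
        records = context["ti"].xcom_pull(task_ids="extract")
        return [{**r, "value": r["value"].upper()} for r in records]

    def load(**context):
        # Placeholder: write the transformed rows to the warehouse.
        rows = context["ti"].xcom_pull(task_ids="transform")
        print(f"loading {len(rows)} rows")

    with DAG(
        dag_id="example_etl",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_transform = PythonOperator(task_id="transform", python_callable=transform)
        t_load = PythonOperator(task_id="load", python_callable=load)

        t_extract >> t_transform >> t_load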
Desired Skills
• Ability to build and optimize data sets, data pipelines, and architectures
• Ability to perform root cause analysis on internal and external processes and data to identify opportunities for improvement and answer questions
• Excellent analytical skills for working with unstructured datasets
• Ability to build processes that support data transformation, data structures, metadata, and dependency and workload management
• Experience with auto-scaling, performance testing, and capacity planning
• Experience owning production infrastructure and designing and building build/deploy and monitoring systems
• Exceptional analytical and problem-solving skills
• Acute attention to detail
• Excellent collaboration skills
Technology Skills
• Proficient with Python; comfortable with object-oriented programming, building data pipelines, and automated unit testing (see the sketch after this list)
• Proficient with SQL and familiar with ORM libraries such as SQLAlchemy or Django ORM
• Proficient with Linux/Bash scripting for working with EC2 and AWS Batch servers
• Familiarity with a cloud platform such as AWS, Google Cloud, or DigitalOcean
• Familiarity with version control systems such as Git, SVN, or Mercurial
• Familiarity with CI/CD pipelines such as Jenkins or AWS CodeDeploy
• Familiarity with container technologies such as Docker for containerizing production code
• Familiarity with infrastructure-as-code (IaC) tools such as Terraform for automating cloud deployments
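To illustrate the Python, SQLAlchemy, and unit-testing combination listed above, here is a minimal sketch; the Event model, the ingest helper, and the table name are hypothetical examples, not part of any existing codebase.

    # Hypothetical sketch: a SQLAlchemy ORM model, a small load helper, and a unit test.
    import unittest

    from sqlalchemy import Column, Integer, String, create_engine
    from sqlalchemy.orm import Session, declarative_base

    Base = declarative_base()

    class Event(Base):
        # Illustrative ORM model for a row landed by a pipeline.
        __tablename__ = "events"

        id = Column(Integer, primary_key=True)
        source = Column(String, nullable=False)
        payload = Column(String)

    def ingest(session, source, payload):
        # Placeholder load step: insert a single event row and return its id.
        event = Event(source=source, payload=payload)
        session.add(event)
        session.commit()
        return event.id

    class IngestTest(unittest.TestCase):
        def test_ingest_inserts_row(self):
            # An in-memory SQLite database keeps the test self-contained.
            engine = create_engine("sqlite://")
            Base.metadata.create_all(engine)
            with Session(engine) as session:
                ingest(session, source="api", payload="{}")
                self.assertEqual(session.query(Event).count(), 1)

    if __name__ == "__main__":
        unittest.main()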
Preferred Skills
• AWS Certification