Experience and Skills:
● 3-8 years of relevant experience
● Expert user of Python & Presto SQL
● Working experience with the Hadoop ecosystem, Hive, and Kubernetes
● Hands-on use of machine learning or statistical libraries and frameworks such as PySpark
Roles & Responsibilities:
● Data Engineering and Technical Delivery
● Prepare data for analysis using Presto SQL or a domain-specific tool (e.g., Omniture for Digital), visualize the data, and deliver to specification
● Scrape the web with Python to pull basic datasets from popular websites (e.g., LinkedIn) as required, and parse JSON objects into tabular format (see the Python sketch after this list)
● Good knowledge of databases/SQL and relevant tools such as R or Python, as well as Omniture (if digital)
● Experience with frameworks such as PySpark to handle large datasets (see the PySpark sketch at the end of this section)
● Show drive to increase the breadth and depth of tools and systems: create data schemas, build pipelines, collect data, and move it into storage
● Prepare data as part of ETL or ELT processes
● Stitch data together with scripting languages, often working with DBAs to construct datastores or data models
● Ensure data is ready to use, and employ frameworks and microservices to serve it
● Design, build, and optimize application containerization and orchestration with Docker and Kubernetes
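
For illustration, here is a minimal sketch of the JSON-to-tabular task referenced in the list above. The endpoint URL and field names are hypothetical placeholders, not a real API; a production job would add retries, pagination, and schema checks.

import requests
import pandas as pd

API_URL = "https://example.com/api/profiles"  # hypothetical endpoint

def fetch_profiles_as_table(url: str = API_URL) -> pd.DataFrame:
    """Fetch a JSON payload and flatten the nested objects into a flat table."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    records = response.json()  # expected shape: a list of JSON objects
    # json_normalize flattens nested keys, e.g. {"name": {"first": ...}}
    # becomes a "name.first" column in the resulting DataFrame.
    return pd.json_normalize(records)

if __name__ == "__main__":
    print(fetch_profiles_as_table().head())
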
● Stakeholder Engagement
● Grasp requirements on calls and deliver to specification; present to Senior Management & Leadership
● Present findings to team leads/managers and to external stakeholders
● Drive stakeholder engagement by leading complex analytical projects, including bottom-up projects
● Develop executive presentations with guidance
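
For illustration, a minimal PySpark sketch of the ETL pattern described under Data Engineering: read raw JSON from storage, clean and reshape it, and write it back as a partitioned, analysis-ready table. The storage paths and column names are hypothetical placeholders.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl_sketch").getOrCreate()

# Extract: read raw JSON events from storage (path is a placeholder).
raw = spark.read.json("s3://example-bucket/raw/events/")

# Transform: keep fields of interest and derive a date column to partition by.
cleaned = (
    raw
    .filter(F.col("user_id").isNotNull())
    .withColumn("event_date", F.to_date("event_ts"))
    .select("user_id", "event_type", "event_date")
)

# Load: write the prepared table back to storage, partitioned for analysts.
cleaned.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/events/"
)

spark.stop()
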