Job Description:
We are looking for Data Engineers to join our team. You will use various methods to transform raw data into useful data systems; for example, you'll create algorithms and conduct statistical analysis. Overall, you'll strive for efficiency by aligning data systems with business goals. To
succeed in this position, you should have strong analytical skills and the ability to combine data from different sources. Data engineer skills also include familiarity with several programming languages and knowledge of machine learning methods.
Job Requirements:
Participate in the customer's system design meetings and collect the functional/technical requirements.
Build up data pipelines for consumption by the data science team.
Skilled in ETL processes and tools.
Clear understanding of and experience with Python and PySpark, or Spark and Scala, along with Hive, Airflow, Impala, Hadoop, and RDBMS architecture.
Experience in writing Python programs and SQL queries.
Experience in SQL Query tuning.
Experience in shell scripting (Unix/Linux).
Build and maintain data pipelines in Spark/PySpark with SQL and Python or Scala.
Knowledge of cloud technologies (Azure/AWS/GCP, etc.) is a plus.
Good to have: knowledge of Kubernetes, CI/CD concepts, and Apache Kafka.
Suggest and implement best practices in data integration.
Guide the QA team in defining system integration tests as needed.
Split the planned deliverables into tasks and assign them to the team.
Maintain and deploy the ETL code and follow Agile methodology.
Work on optimization wherever applicable.
Good oral, written, and presentation skills.
Preferred Qualifications:
Degree in Computer Science, IT, or a similar field; a Master's is a plus.
Hands-on experience with Python and PySpark, or with Spark and Scala.
Great numerical and analytical skills.
Working knowledge of cloud platforms such as MS Azure, AWS, etc.