Job Location: Pune (Work from Office)
Employment Type: Full-Time
Experience: 4+ Years

Job Overview:

We are looking for a skilled Data Engineer to design, build, and optimize scalable data pipelines and systems. The ideal candidate will have hands-on experience with data lake development, orchestration tools, and big data technologies. A strong foundation in Python and SQL is essential, along with expertise in optimizing data processing frameworks.

Key Responsibilities:

  • Design and implement robust data pipelines to support analytics and reporting.
  • Develop and maintain data lakes for efficient data storage and processing.
  • Optimize Spark/PySpark and Hive applications for performance and scalability.
  • Implement and manage workflow orchestration using Apache Airflow.
  • Collaborate with cross-functional teams to ensure data quality, reliability, and accessibility.


Required Skills:

  • Programming: Advanced proficiency in Python.
  • SQL Expertise: Strong ability to write and optimize complex SQL queries.
  • Big Data Tools: Hands-on experience with Spark and Hive.
  • Data Lakes: Proven experience in data lake development.
  • Orchestration: Experience with Apache Airflow or similar tools.


Preferred Skills (Good to Have):

  • Familiarity with Trino or Amazon Athena.
  • Experience with Snowflake.
  • Understanding of data quality frameworks and best practices.
  • Knowledge of object storage solutions such as Amazon S3.

Qualifications:

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
  • Demonstrated ability to design and optimize scalable data solutions.
  • Strong problem-solving and analytical skills.
