Job Overview
Responsibilities
- Develop and maintain scalable data pipelines using Python, Spark/PySpark, and SQL.
- Design and implement efficient data ingestion processes from a variety of sources, including streaming sources such as Kafka.
- Build and maintain ETL processes to transform and cleanse data (a minimal example is sketched after this list).
- Collaborate with data scientists and analysts to understand data requirements and ensure data quality.
- Troubleshoot and resolve data pipeline issues.
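A minimal PySpark sketch of the kind of batch ETL described above. The input path, output path, and column names (order_id, amount) are hypothetical placeholders, not details of this role's actual pipelines.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Ingest raw data (path and layout are hypothetical).
raw = spark.read.parquet("s3://bucket/raw_orders/")

# Transform and cleanse: deduplicate, enforce types, drop bad rows.
clean = (
    raw.dropDuplicates(["order_id"])
       .withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("amount") > 0)
)

# Persist the cleansed output for downstream analysts.
clean.write.mode("overwrite").parquet("s3://bucket/clean_orders/")
```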
Essential Skills
- Strong proficiency in Python programming.
- Extensive experience with Spark/PySpark for large-scale data processing.
- Solid SQL skills for working with relational databases.
- Hands-on experience with Kafka or similar streaming technologies (see the sketch below).
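A minimal Spark Structured Streaming sketch of Kafka ingestion, assuming placeholder broker addresses, topic name, and event schema:

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-ingest-sketch").getOrCreate()

# Hypothetical event schema; real topics and fields would differ.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("value", DoubleType()),
])

# Subscribe to a Kafka topic as a stream (broker and topic are placeholders),
# then parse each message's JSON payload into typed columns.
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "events")
         .load()
         .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
         .select("e.*")
)

# Write parsed events out; checkpointing makes the stream restartable.
query = (
    events.writeStream.format("parquet")
          .option("path", "s3://bucket/events/")
          .option("checkpointLocation", "s3://bucket/checkpoints/events/")
          .start()
)
```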
Good to Have
- Experience with data modeling tools like Erwin.
- Basic knowledge of cloud platforms, preferably AWS.
- Experience with workflow orchestration tools such as Airflow (see the sketch after this list).
- Familiarity with containerization technologies (Docker, Kubernetes).
- Version control experience with Git.
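A minimal Airflow 2.x DAG sketch of the orchestration piece, chaining two hypothetical Spark jobs; the dag_id, schedule, and task commands are placeholders:

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

# A hypothetical daily pipeline; the scripts invoked here are placeholders.
with DAG(
    dag_id="daily_etl_sketch",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = BashOperator(task_id="ingest", bash_command="spark-submit ingest.py")
    transform = BashOperator(task_id="transform", bash_command="spark-submit transform.py")

    # Ordering: ingest must finish before transform starts.
    ingest >> transform
```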