Job Description
Hiring for our client firm based in Lahore, Islamabad, and Karachi
Job Nature: Hybrid
Experience: 5 years or more
Responsibilities:
- Design and develop robust, scalable, high-performance data pipelines and ETL processes to extract, transform, and load data from various sources into our data warehouse or data lake.
- Collaborate with stakeholders to understand their data requirements, then design and implement appropriate data models and database schemas to support those needs.
- Optimize data pipelines and ETL processes for performance and efficiency, ensuring timely and accurate data delivery to end-users.
- Monitor, troubleshoot, and resolve issues related to data quality, consistency, and integrity, ensuring the reliability and correctness of our data systems.
- Implement and maintain data governance practices and policies, ensuring compliance with data privacy and security regulations.
- Collaborate with data scientists and analysts to provide them with the necessary data infrastructure and tools for conducting advanced analytics and deriving insights.
- Stay up to date with the latest trends and technologies in data engineering, and recommend innovative solutions to improve existing processes and systems.
- Document data engineering processes, data flows, and system architectures to ensure knowledge sharing and maintain an up-to-date repository of technical documentation.
- Work closely with cross-functional teams, including software engineers and infrastructure teams, to optimize data infrastructure and ensure its seamless integration with other systems.
Requirements:
- 5+ years of proven experience in data warehousing.
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- Proven experience as a Data Engineer or in a similar role, working with large-scale data processing and ETL pipelines.
- Strong programming skills in languages such as Python, Java, or Scala, with experience in data manipulation and processing frameworks like Apache Spark.
- Experience with SQL and database technologies (e.g., relational databases, SQL queries, data modeling).
- Proficiency with big data technologies such as Hadoop and Hive, and knowledge of distributed systems and cloud computing platforms (e.g., AWS, Azure, GCP).
- Familiarity with data integration and workflow management tools such as Apache Airflow.
- Knowledge of data warehousing concepts and experience with data warehousing solutions are highly desirable.
- Strong analytical and problem-solving skills, with the ability to analyze complex data-related issues and propose effective solutions.
- Excellent communication and collaboration skills, with the ability to work effectively in a cross-functional team environment.
- Attention to detail and a strong commitment to delivering high-quality work within established timelines.