PySpark Developer

Job Description

This role is responsible for developing and implementing data warehouse solutions for company-wide applications and managing large structured and unstructured data sets. The candidate will be expected to analyze complex customer requirements and work with the data warehouse architect to gather, define, and document data transformation rules and their implementation. The candidate will also be expected to have a broad understanding of the data acquisition and integration space and be able to weigh the pros and cons of different architectures and approaches.

You will have a chance to learn and work with multiple technologies and thought leaders in the domain.

Responsibilities:

Translate business requirements into technical requirements
ETL development using native and third-party tools and utilities
Write and optimize complex SQL logic and translate it into PySpark
Design and develop code, scripts, and data pipelines that leverage structured and unstructured data
Build data ingestion pipelines and ETL processes, including low-latency data acquisition and stream processing
Design and develop processes/procedures for integration of data warehouse solutions in an operative IT environment.
Monitor performance and advise on any necessary changes.
Create and maintain the technical documentation required to support solutions.

Skills and Qualifications:

The candidate should have 2-3 years of experience in the BI/Data Engineer field with PySpark
Good understanding and basic working experience with at least one cloud service provider: AWS, Azure, or Google Cloud, and their native tools. (Azure Preferred)
Hands-on experience with at least one ETL tool like SSIS or Data Factory (Preferred).
Strong RDBMS concepts and SQL development skills
Knowledge of data modeling and data mapping
Experience with Data Integration from multiple data sources
Good understanding of data warehouse and ETL concepts
Understanding of one or more business areas and industries: Telecom, Retail, Financial, etc.
Knowledge of Big Data technologies such as Pig, Hive, Spark, Kafka, NiFi (Good to have)
Experience in at least one development or scripting language, e.g. Java, Groovy, or Python (Preferred)
Good understanding of Agile delivery methodologies.
Training/Certification on any cloud service will be a plus.
Good communication and analytical skills
Ability to work in a dynamic and collaborative team environment, demonstrating excellent interpersonal skills
