Responsible for building, deploying, and maintaining mission-critical analytics solutions that process data quickly at big-data scale.
Contributes design, code, configurations, and documentation for components that manage data ingestion, real-time streaming, batch processing, and data extraction, transformation, and loading (ETL) across multiple data stores.
Owns one or more key components of the infrastructure and works to continually improve them, identifying gaps and raising the platform's quality, robustness, maintainability, and speed.
Cross-trains other team members on the technologies being developed, while continuously learning new technologies from them.
Interacts with engineering teams and ensures that solutions meet customer requirements in terms of functionality, performance, availability, scalability, and reliability.
Performs development, QA, and DevOps roles as needed to ensure total end-to-end responsibility for solutions.
Contributes to Center of Excellence (CoE) activities and community building, and promotes engineering excellence and best practices.
Requirements:
3+ years of experience coding and building ETL pipelines in SQL, preferably with SSIS and Azure Data Factory, with solid CS fundamentals including data structure and algorithm design.
2+ years contributing to production deployments of large backend data processing and analysis systems.
1+ years of experience working with a combination of any of the following technologies: Hadoop, MapReduce, Pig, Hive, Impala, Spark, Kafka, Storm, SQL data warehouses, and NoSQL stores (such as HBase and Cassandra).
1+ years of experience with Azure data platform services: Azure Data Factory, Azure Synapse, ADLS Gen2, Azure SQL, MongoDB, Azure Databricks, and Delta Lake.
Knowledge of SQL and MPP databases (e.g., Azure SQL DWH, PDW, Oracle Exadata, Vertica, Netezza, Greenplum, Aster Data).
Knowledge of professional software engineering best practices for the full software development life cycle.
Knowledge of data warehousing design, implementation, and optimization.
Knowledge of data quality testing, automation, and results visualization.
Knowledge of BI report and dashboard design and implementation.
Knowledge of the development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations.
Experience participating in an Agile software development team (e.g., Scrum).
Experience designing, documenting, and defending designs for key components in large distributed computing systems.
A consistent track record of delivering exceptionally high-quality software on large, complex, cross-functional projects.
Demonstrated ability to learn new technologies quickly and independently.
Ability to handle multiple competing priorities in a fast-paced environment.
Undergraduate degree in Computer Science or Engineering from a top CS program required. Master's degree preferred.
Experience supporting data scientists and complex statistical use cases is highly desirable.