The primary responsibilities of our data engineer include designing, implementing, maintaining, and continually improving our data infrastructure. They must collaborate with data analysts and data scientists to ensure high-quality, easily accessible data while maintaining security and acceptable performance levels. They should possess the vision to design a roadmap for incremental, dynamic, fault-tolerant, and high-performance pipelines in collaboration with platform developers.
Design and implement data architecture to meet evolving business needs.
Design, implement, and maintain scalable batch data pipelines (SSIS, Airflow).
Designing, implementing, and maintaining scalable real-time data pipelines (Kafka, Debezium, and Spark) is a plus.
Skills:
Proficiency in writing complex queries and familiarity with SQL objects such as stored procedures and functions to implement business logic in pipelines.
Experience in performance tuning of SQL Server instances and queries.
Strong programming skills in Python, Java, and C#.
Experience in designing, deploying, and managing containerized applications using Docker.
Experience with object and block storage systems such as MinIO and Ceph is a plus.
Familiarity with monitoring systems such as Grafana, Prometheus, and Prometheus exporters.