DevOps Engineer

Maadiran, Tehran

Posted 2 months ago

Job Description

We're seeking a versatile DevOps/SRE engineer to optimize and enhance the reliability of our credit sales system. This hybrid role will bridge the gap between development and operations, focusing on both system performance and operational efficiency. You'll work hands-on to troubleshoot and resolve performance bottlenecks, implement infrastructure automation, and ensure the system is reliable, scalable, and easily maintainable.

Key Responsibilities:

System Performance and Reliability:

  • Ensure high availability, reliability, and optimal performance of our platforms.
  • Identify and resolve performance bottlenecks and system failures.
  • Implement best practices for disaster recovery and business continuity.

Monitoring and Alerting:

  • Develop and maintain comprehensive monitoring, logging, and alerting systems.
  • Proactively detect and address issues before they impact users.

Infrastructure Optimization:

  • Optimize infrastructure for efficient resource utilization and cost-effectiveness.
  • Design scalable systems that can handle increased load and growth.

Automation and CI/CD:

  • Automate deployment, configuration management, and other repetitive tasks.
  • Design, implement, and manage CI/CD pipelines to streamline code integration and delivery.

Collaboration and Mentoring:

  • Collaborate closely with development and operations teams to ensure seamless integration and deployment of new features.
  • Mentor and train team members to enhance their DevOps and SRE skills.

Security and Compliance:

  • Ensure systems are secure and compliant with relevant regulations and standards.
  • Implement security best practices across all stages of development and deployment.

Incident Response:

  • Lead incident response efforts to quickly resolve outages and performance issues.
  • Conduct post-mortems to identify root causes and implement preventive measures.

System Performance Optimization:

  • Identify and address performance bottlenecks in the credit sales system, including those arising from third-party integrations.
  • Analyze system metrics and logs to identify areas for improvement.
  • Implement performance-tuning strategies and best practices.

Infrastructure Automation:

  • Design and implement infrastructure automation using tools like Ansible, Terraform, or similar technologies.
  • Automate deployment, configuration, and scaling processes to improve efficiency and reduce manual errors.

System Reliability:

  • Develop and maintain monitoring and alerting systems to detect and respond to incidents proactively.
  • Participate in incident management and troubleshooting, minimizing downtime and impact on users.
  • Implement disaster recovery strategies and ensure system resilience.

Continuous Integration and Deployment (CI/CD):

  • Design and implement CI/CD pipelines to streamline development and deployment processes.
  • Automate testing and quality assurance to ensure the system's reliability and performance.

Collaboration:

  • Work closely with the development team to improve code quality, optimize deployments, and foster a culture of continuous improvement.
  • Collaborate with stakeholders to understand requirements and prioritize tasks.

Qualifications:

Experience:

  • 5+ years of experience in a combined SRE/DevOps role or similar.

Technical Skills:

  • Strong knowledge of Microsoft technologies (e.g., .NET, MVC).
  • Proficiency with cloud platforms (e.g., AWS, Azure, Google Cloud).
  • Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
  • Expertise in CI/CD tools (e.g., Jenkins, GitLab CI, Azure DevOps).
  • Solid understanding of monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
  • Familiarity with infrastructure as code (IaC) tools (e.g., Terraform, Ansible).
  • Strong grasp of networking, security, and system architecture.

Problem-Solving:

  • Exceptional analytical and problem-solving skills, with a proactive approach to identifying, diagnosing, and resolving complex issues.

Communication:

  • Excellent communication and interpersonal skills, with the ability to collaborate effectively with developers, cross-functional teams, stakeholders, and third-party vendors.

Adaptability:

  • Ability to thrive in a fast-paced environment and adapt to changing priorities.

Certifications:

  • Relevant certifications (e.g., AWS Certified DevOps Engineer, Certified Kubernetes Administrator, Google Professional Cloud DevOps Engineer) are a plus.

Experience:

  • Proven track record in DevOps or SRE roles, demonstrating expertise in both system administration and software development practices.

Technical Skills:

  • Strong proficiency in troubleshooting and performance optimization for web applications, preferably using MVC .NET and Microsoft technologies.
  • Experience with infrastructure automation tools (e.g., Ansible, Terraform).
  • Expertise in CI/CD pipelines and tools (e.g., Jenkins, GitLab CI).
  • Familiarity with cloud infrastructure and technologies (e.g., Azure, AWS).
  • Scripting skills (e.g., Python, PowerShell) for automation and tooling.
  • Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).

Nice to Have:

  • Experience with financial systems or credit sales platforms.
  • Familiarity with Agile development methodologies.
  • Experience with SRE best practices and methodologies.
  • Relevant certifications (e.g., DevOps Engineer, SRE certifications from Google, AWS, or Microsoft).
