Site Reliability Engineer

Cognism

Software Engineering
Croatia
Posted on Thursday, June 13, 2024

Cognism is a market leader in international sales intelligence. Access to our premium data has helped a wide variety of global revenue teams change their approach to prospecting, resulting in predictable and prosperous outcomes.

Following multiple successful funding rounds and the acquisition of Mailtastic (2020), an email signature solution provider, and Kaspr (2022), a Paris-based sales prospecting tool, there has never been a more exciting time to join us.

As we grow, one of our main objectives is to continue hiring individuals who are both a professional and a cultural fit for our company. Our values are at the core of everything we do!

Our people:

  • Are Nice!
  • Are Collaborative. We’re in this together!
  • Are Solution-Focused. For every problem, we’ve got a solution!
  • Are Understanding.
  • Celebrate Individual Contributors.

We are committed to creating a diverse and inclusive global workplace that encourages you to achieve your goals while having fun along the way!

Summary:

As a Site Reliability Engineer (SRE) at Cognism, you'll play a crucial role in ensuring the reliability, scalability, and performance of our systems. You'll collaborate with cross-functional teams to design, implement, and maintain our infrastructure, with a focus on automation, monitoring, and incident response.

If you are detail-oriented, with excellent organizational skills and experience in this field, we’d like to hear from you.

Key Responsibilities:

  • Monitoring and Alerting: Implement robust monitoring and alerting solutions to proactively identify and resolve potential issues before they impact the system's performance or availability.

  • Incident Management: Lead incident response efforts, investigate root causes of incidents, and implement preventative measures to minimize future occurrences.

  • Scalability and Performance Optimization: Collaborate with software engineers to optimize application performance and scalability, ensuring seamless operation under varying workloads.

  • Reliability Engineering: Continuously improve system reliability through capacity planning, disaster recovery planning, and fault-tolerant design.

  • Cross-Functional Collaboration: Work closely with development, operations, and QA teams to streamline the deployment process, improve system reliability, and enhance overall product quality.

  • Documentation and Knowledge Sharing: Document system architecture, configurations, and procedures, and actively participate in knowledge-sharing initiatives to foster a culture of learning and collaboration.

Our ideal candidate:

Required:

  • University degree in Computer Science or a software engineering-related field

  • Proven experience as a Site Reliability Engineer or in a similar role

  • Experience with monitoring tools and log analysis (Prometheus, Grafana, ELK stack)

  • Proficiency in using OpenTelemetry for distributed tracing and application instrumentation

  • Experience with the core AWS services (EC2, ECS, EKS, VPC, S3, IAM, Lambda, Route 53, CloudFront)

  • Solid understanding of Linux operating systems

  • Knowledge of coding or scripting for systems automation (Python, Bash, etc.)

  • Experience with network and system administration

  • 2+ years of experience working with cloud technologies (AWS, Azure, GCP, etc.)

  • Proficiency in spoken and written English

Considered a plus:

  • Infrastructure as code (Terraform)

  • Experience with Java/Scala and Python development

  • Experience with containerization and clustering technologies (Docker, Kubernetes, etc.)

  • Familiarity with CI/CD pipelines and associated tools (e.g., GitHub Actions)

  • Experience with REST and microservices orchestration

We look forward to hearing from you!