Sertis is a leading Data and AI company based in the heart of Bangkok.


We provide both off-the-shelf and customised solutions for our clients, ranging from data infrastructure, BI development, and data-driven business insights to forecasting, optimization, and computer vision. Our expert team of data and AI consultants works closely with clients across industries such as retail, manufacturing, banking, energy, airlines, agriculture, and healthcare to understand their business needs and deliver bespoke solutions, using cutting-edge technologies that are just right for them.

Our aim is to be one of the leading Data and AI companies globally, where a diverse mix of talent wants to come, stay, and do their best work. We pride ourselves on bringing in talent from around the world that is not only the best, but also nice. We recognise that our company runs on our people's hard work and dedication, and we maintain a culture that encourages learning, growth opportunity, innovative contribution, and a sense of ownership.

About the Job

Our Senior-Lead Site Reliability Engineer will be responsible for improving the efficiency and reliability of our software development and deployment processes, as well as ensuring the availability, performance, and scalability of our systems and services. You will work closely with our Machine Learning, Software Engineering, Quality Assurance, and Data Engineering teams to automate and streamline the build, test, and deployment of our systems and services. Additionally, you will be responsible for designing and building new infrastructure, supporting pre-sales activities, and continuously improving our CI/CD pipeline, monitoring, and processes.

What are the responsibilities of a Senior-Lead Site Reliability Engineer?
In this role, you will get to:

  • Automate infrastructure provisioning, configuration management, and deployment processes

  • Ensure the availability, performance, and scalability of our systems and services by continuously monitoring and maintaining them

  • Implement and maintain SLAs to meet or exceed customer expectations and to ensure that the systems are operating effectively and efficiently

  • Design, build, and maintain the CI/CD pipeline to ensure the efficient and reliable deployment of software releases

  • Develop and maintain runbooks, playbooks, and procedures for responding to incidents and performing regular maintenance activities

  • Work closely with the different engineering teams to identify and resolve production issues, establish and implement best practices for reliability and performance, and improve the overall quality and efficiency of the systems

  • Conduct incident response and post-mortem analysis to identify root causes and prevent future incidents

  • Share your expertise across the team and mentor junior and mid-level engineers through code reviews, 1:1 sessions, workshops, or knowledge-sharing sessions, to enhance their technical skills and understanding of SRE best practices

  • Participate in recruitment by evaluating and interviewing candidates, and help improve our recruitment processes

Requirements for this role are:

  • 5-8 years of hands-on experience in designing, building, and maintaining cloud infrastructure, and in applying DevOps and SRE practices to large-scale systems

  • In-depth knowledge of container orchestration principles and techniques, including hands-on experience with Docker and platforms such as Kubernetes, to effectively manage and deploy containerized applications at scale

  • In-depth knowledge of cloud infrastructure and its components, including virtual machines, serverless computing, storage, networking, and security, with hands-on experience in deploying and managing applications in cloud environments, to ensure optimal utilization and cost-effectiveness of cloud resources

  • Ability to design and build new infrastructure and continuously improve the CI/CD pipelines

  • Strong automation and IaC skills, including experience with tools such as Terraform, AWS CDK, Flux/ArgoCD, Helm, and GitLab CI

  • Ability to scope projects, define architectures, and choose technologies based on project requirements

  • Experience with monitoring (Prometheus/Grafana, Kibana/Elasticsearch preferred) and defining SLAs

  • A secure-by-design mindset and an understanding of the importance of security in the development and deployment process

  • Strong problem-solving skills and experience in troubleshooting production issues

  • Ability to work collaboratively with different teams

  • Leadership skills and ability to mentor junior and mid-level Site Reliability Engineers

  • Familiarity with Agile, DevOps and SRE best practices and methodologies

  • Proactiveness in keeping up to date with the latest technology and industry trends

  • Excellent written and verbal communication skills, and the ability to communicate effectively with technical and non-technical stakeholders

It’s nice if you have experience with:

  • Gathering customer requirements and estimating project scope during pre-sales interactions

  • Working on multiple projects simultaneously

  • Using penetration testing tools such as Nessus, Nikto, and Nmap

  • Holding certifications from cloud providers or the CNCF (such as AWS, GCP, Azure, or CKA/CKAD)

  • Optimizing cloud costs through various strategies

  • Utilizing service mesh technologies, such as Istio or Linkerd

  • Implementing canary deployment strategies for testing and deploying new releases in a controlled and safe manner

Who excels in this role? 

  • Someone who loves getting things done!

  • Open-minded - eager to ask for comments/suggestions for improvement

  • Passionate about anything and everything data

  • Able to share and suggest ideas

  • Loves doing tons of research

  • Has a can-do and will-do attitude!

  • Ready to tackle any challenge

What are some of the benefits of working at Sertis? 

  • Hybrid working environment

  • Up early or slow starter in the morning? We have flexible office hours

  • Mentorship programs for every level, from fresh graduates to executive-level coaching

  • Learning support to help you build your skill set and grow your career

  • Work with and learn from the best in the industry, and share your ideas with like-minded individuals

  • We cultivate intelligence and learning so that our experts can become community leaders in their respective fields in the tech industry

  • Amazing colleagues to enjoy company social outings, parties, and events with

  • Results-oriented workplace: we provide direction, not orders, and give you the autonomy to deliver your best work

  • We work at the frontier of innovation in the AI industry

  • Work on meaningful solutions that solve real-life problems and challenges

  • We run like a startup, and embrace the adventure; we focus on getting things done, while still having a down-to-earth and informal culture

This is your chance to build your career in a growing data-driven industry.


Senior-Lead Site Reliability Engineer

Engineering | Full-time

Be a part of



“Lay the greatest foundation for the best AI solution”

5 steps for interview

*Steps will vary depending on department
*Most if not all interviews may be conducted virtually

Benefits & Perks


Hybrid Work

Good work can be done from anywhere. 

Vacation and Personal Leave

We encourage all our teams to take time off, recharge and refocus.

Learning Support

We support you to learn, explore and grow your creativity.


Fitness Support

Keeping Sertizens healthy in all areas of life is a priority.


Referral Bonus

Good people know good people! 


Provident Fund

It's important to think about the future. 
