Site Reliability Engineer (W/M)

  • EPFL
  • Geneva, Geneve, Switzerland
  • 12/10/2020
Full time | Data Science | Data Analytics | Big Data | Data Management | Statistics

Job Description

The École polytechnique fédérale de Lausanne (EPFL) is one of the most dynamic university campuses in Europe and ranks among the top 20 universities worldwide. EPFL employs 6,000 people supporting the three main missions of the institution: education, research and innovation. The EPFL campus offers an exceptional working environment at the heart of a community of 16,000 people, including over 10,000 students and 3,500 researchers from 120 different countries.

Your mission :
The aim of the EPFL Blue Brain Project (BBP), a Swiss brain research initiative founded and directed by Professor Henry Markram, is to establish simulation neuroscience as a complementary approach alongside experimental, theoretical and clinical neuroscience to understanding the brain, by building the world’s first biologically detailed digital reconstructions and simulations of the mouse brain.
We are now looking for an experienced Site Reliability Engineer to work on our high-performance computing (HPC) and other mission-critical IT systems.

Main duties and responsibilities include :

  • Ensuring reliable product launches and successful periodic upgrades of our 1200+ node HPC cluster, on-premises cloud and container platforms, large-scale parallel file system, NAS and other IT platforms, with the help of modern software development, configuration management, CI/CD and infrastructure-as-code approaches
  • Improving IT service reliability by implementing SRE best practices for availability, performance, emergency response and capacity planning
  • Developing monitoring, logging and metrics tools to embrace and minimize risks
  • Automating IT processes - in order to get rid of toil, technical debt and manual work - using modern software engineering practices
  • Contributing to IT security, e.g. by establishing industry best practices with regard to periodic patching and other proactive IT security measures

Your profile :
We expect you to have strong experience in the following areas:

  • Linux (e.g. RedHat/CentOS, Ubuntu) in production server environments
  • Physical server hardware and data centre infrastructure
  • Virtualized and containerized infrastructure
  • Network concepts (e.g. IP routing, DNS, VLANs)
  • Configuration & provisioning tools (e.g. Puppet, Ansible, Chef, Saltstack)
  • Programming and scripting (e.g. Python, bash)

Experience in the following areas would be an advantage:

  • Operating large-scale storage systems (e.g. NetApp, Spectrum Scale), filesystems, data archiving (e.g. TSM) and information lifecycle management (ILM)
  • Operating large-scale, Linux-based hardware infrastructure
  • Operating virtualization, cloud and container platforms (e.g. VMware, K8s)
  • Operating data centre networks built on Ethernet or InfiniBand
  • Operating HPC systems and software (e.g. Slurm, cluster managers)
  • Architecting, implementing & monitoring secure IT infrastructure
  • Stakeholder relationships, team leadership & management

Our desired candidate would have

  • Bachelor's or Master's degree in computer science, or a similar degree or equivalent working experience
  • Detail-oriented, cautious & professional working practices and attitude
  • Understanding of TCO, compliance and IT governance factors
  • Experience managing and completing large-scale IT projects
  • Interest in working in a collaborative and multi-cultural environment
  • Proven ability to work both independently and in team-based environments
  • Fluent communication in English (written and spoken)

We offer :

  • An internationally recognized research project using state-of-the-art HPC infrastructure
  • A dynamic, interdisciplinary and international working environment in picturesque Geneva
  • An opportunity to get your hands dirty with new technologies as they emerge

Start date :
As soon as possible

Term of employment :
Fixed-term (CDD)

Duration :
CDD (renewable) or CDI, negotiable

Contact :
Please provide your CV and a cover letter (in English) in PDF format.

Remark :
Only candidates who apply through the EPFL website or our partner Jobup’s website will be considered.