Infomaniak represents 30 years of expertise and more than 290 passionate individuals, with a common ambition: to create an ethical cloud without compromising on ecology, privacy and humanity.
We create data centers at the forefront of ecological innovation and develop IaaS, PaaS, and SaaS services entirely hosted and developed in Switzerland, for both B2B and B2C. Our offering includes a suite of online collaborative applications as well as cloud hosting, streaming, marketing, and event solutions.
With millions of users and the trust of public and private organizations across Europe — such as RTBF, the United Nations, central banks, more than 3,000 radio and TV stations, and numerous metropolitan areas and security agencies — Infomaniak is an independent company committed to technological independence in Europe, the local economy and a more sustainable digital world for the planet.
Are you ready to join a rapidly growing company, give your best, and grow with us to contribute to the development of an ethical alternative to the web giants? Then we can't wait to meet you!
We are looking for an:
AI DevOps / Infrastructure / Optimisation
Context:
Infomaniak develops an open-source AI platform hosted in its own Swiss data centers. We deploy language models at scale and build intelligent agents for our products (kChat, kMeet, kDrive). We are looking for an AI DevOps engineer to design, implement, and optimize our AI agents, with a focus on quality, reliability, and user experience.
Your responsibilities:
- Deployment & Orchestration: Deploy, maintain and optimize LLMs on Kubernetes while maximizing the efficiency of GPU / Compute resources.
- CI/CD & Automation: Improve and industrialize our GitLab CI pipelines for AI models (build, test, deployment, rollback). Manage deployments via Flux CD (GitOps).
- Monitoring & Observability: Strengthening our Prometheus / Grafana / Victoria Metrics stack for fine visibility on performance, GPU consumption, latency, availability and overall health of AI services.
- Resource optimization: Working on cost and performance efficiency (autoscaling, scheduling, quota management, image optimization, etc.)
- Quality & Reliability: Ensuring the robustness, security, and reproducibility of deployments in a critical environment
The profile that excites us:
- Proficiency in modern LLM serving frameworks such as vLLM, TGI, or TensorRT-LLM (a minimal example follows this list).
- Proficiency in GitLab CI (pipelines, runners, variables, integration with Kubernetes).
- Proven experience in Kubernetes (operators, Helm, CRDs, networking, autoscaling).
- Experience with Flux CD (GitOps, HelmReleases, Kustomize, deployment automation).
- Experience with Prometheus / Grafana (dashboards, alerting, exporters).
- Knowledge of GPU infrastructures (NVIDIA, CUDA, GPU scheduling, monitoring).
- A taste for quality, reliability and performance.
- Ability to work in a critical environment (demanding SLAs, high availability).
- Good ability to collaborate with ML and Dev teams.
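As an illustration of the serving-framework point above, here is a minimal vLLM sketch using its offline inference API; the model name is only a placeholder and a suitable GPU is assumed.

```python
from vllm import LLM, SamplingParams

# Placeholder checkpoint; any model that fits on the available GPU would do.
llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")
params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=128)

outputs = llm.generate(["Explain GitOps in one sentence."], params)
print(outputs[0].outputs[0].text)
```

In production the same engine is typically exposed through vLLM's OpenAI-compatible HTTP server rather than called in-process, which is what the sketch after the stack list below assumes.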
A plus:
- Technical curiosity, a taste for innovative challenges and optimization.
- Open-source contributions or side projects are welcome.
- You enjoy working in a team and demonstrate positive communication skills.
- Your humor, flexibility, and team spirit are essential assets for working in a fun environment.
The technical stack we use:
- LangChain
- Pydantic-ai
- vLLM
- FastAPI
- GitLab
- Sentry
- Qdrant
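To show how a few of these pieces fit together, a hedged sketch of a FastAPI endpoint forwarding a prompt to a vLLM deployment through its OpenAI-compatible API; the base URL and model name are invented placeholders, not actual Infomaniak endpoints.

```python
from fastapi import FastAPI
from openai import OpenAI
from pydantic import BaseModel

app = FastAPI()

# vLLM exposes an OpenAI-compatible server, so the standard openai client
# can talk to a self-hosted deployment. URL and model are placeholders.
client = OpenAI(base_url="http://vllm.example.internal:8000/v1", api_key="unused")


class Prompt(BaseModel):
    text: str


@app.post("/complete")
def complete(prompt: Prompt) -> dict:
    resp = client.chat.completions.create(
        model="placeholder-model",
        messages=[{"role": "user", "content": prompt.text}],
    )
    return {"answer": resp.choices[0].message.content}
```

Saved as main.py, this runs locally with `uvicorn main:app --reload`.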
The position:
- Permanent contract
- Workload: 80-100%
- Location: Geneva
- Availability: As soon as possible
The steps in the recruitment process:
- An initial technical interview to validate your skills.
- A second interview in our offices.