Portfolio Jobs

Browse through our portfolio jobs and select a suitable opportunity.

Our HTGF Portfolio of startups offers excellent opportunities and is always hiring talented people.

(Senior) Data Manager / Data Governance for AI (f/m/d)

Mbiomics GMBH

Software Engineering, Data Science
Munich, Germany
Posted on Oct 18, 2025

Company Summary:

mbiomics was founded in 2020 by an experienced founding team with the vision to deliver effective microbiome-based therapeutics that will revolutionize the treatment of many chronic diseases. We recently closed a Series A funding round, which enables us to start developing game-changing live-bacterial therapeutics (LBT). mbiomics leverages its tailored microbiome analysis platform to overcome challenges in LBT development by providing precision profiling data for improved bacterial consortia selection, patient stratification, and patient monitoring in clinical trials.

Position Summary:

You will be responsible for establishing mbiomics’ data governance framework while also building the data infrastructure to make research and clinical data usable, compliant, and AI/ML-ready. In this role you will set standards, implement cloud-based solutions, and collaborate closely with scientists to improve the capture, organization, and accessibility of our data.

As a growing startup, this position combines research and clinical data governance leadership with hands-on implementation. You will also play a key role in preparing mbiomics for the next generation of AI/ML applications by ensuring data is standardized, annotated, and interoperable, and by enabling teams to make effective use of enterprise AI tools.

Responsibilities/Duties:

Governance & Compliance

  • Define and implement data governance policies (data lifecycle, data security, data integrity, metadata, standards, access, retention, lineage, alignment to infrastructure).
  • Promote FAIR data principles across all R&D teams.
  • Ensure compliance with data protection regulations (e.g., GDPR, EU AI Act, HIPAA where applicable).
  • Create governance documentation and ensure adoption of good practices across the company.

Implementation & Infrastructure

  • Build and maintain pipelines/workflows for ingesting, organizing, and validating data (GCP/BigQuery preferred).
  • Automate repetitive tasks such as metadata capture, file organization, and ontology mapping.
  • Put in place a data catalogue and documentation systems for discoverability and reusability.

AI Readiness & Enablement

  • Ensure datasets are properly structured, annotated, and versioned for AI/ML model training.
  • Collaborate with computational biology and data science teams to prepare training datasets.
  • Evaluate and integrate enterprise AI tools (LLM copilots, agents, workflow assistants) to accelerate documentation, validation, and reporting.
  • Help non-programmers use safe, structured workflows with AI assistants for data exploration and reporting.

Collaboration & Cross-Functional Role

  • Work closely with scientists to implement structured data capture at the point of generation.
  • Provide training and support to promote adoption of governance standards and AI-ready practices.
  • Build lightweight dashboards or interfaces for exploration, QC, and usability.
  • Serve as a bridge between research, computational biology, IT, and leadership.
  • Communicate progress, risks, and needs clearly across stakeholders.

Required Skills and Competencies:

  • Strong understanding of data governance frameworks, FAIR principles, and metadata standards.
  • Hands-on experience with cloud-based data management (GCP preferred).
  • Strong skills in Python/SQL for data wrangling and pipeline development.
  • Experience preparing datasets for AI/ML model training.
  • Knowledge of at least one data management framework (e.g., Amsterdam, DAMA-DMBOK).
  • Comfort with enterprise AI tools (LLM copilots, agent frameworks, documentation assistants).
  • Familiarity with modern ML tooling (e.g., PyTorch, HuggingFace, OpenAI API, RAG architectures) as well as traditional approaches.
  • Comfortable working with non-programmers in research settings marked by ambiguity and iterative feedback cycles.
  • Excellent communication and training skills for working with non-programmers.
  • Highly organized, detail-oriented, and comfortable balancing strategy with hands-on execution.
  • Desire to pursue professional development by acquiring new skills, presenting work at internal and external venues, and acting on feedback.

Preferred

  • Background in life sciences, bioinformatics, or NGS data.
  • Familiarity with workflow/containerization tools (Nextflow, Docker).
  • Experience with clinical or regulatory data management.
  • Knowledge of ontologies and controlled vocabularies in biology.

Nice to Have

  • Familiarity with CDISC

Education and Experience

  • College degree in related field (CS / software engineering / bioinformatics)
  • At least 6-12 months of industry experience
  • Experience with data analysis and workflow development

Environment at mbiomics

  • Experience the unique dynamics and spirit of a biotech start-up
  • Our team? A colorful mix of international talents, humor, and brilliant minds.
  • We offer flexible working hours and 30 days of vacation.
  • Benefits include our job ticket offer, free coffee, fruit, and candy, and regular team events.