Master Thesis - Graph-aware RAG for Innovation Scouting
Master thesis project focused on graph-aware retrieval pipelines and innovation scouting.
MSc Applied Data Science
I build production-ready AI systems that turn noisy data into reliable decisions.
Data Scientist and ML Engineer with a clinical background and a strong applied AI focus. I build robust, testable systems from ingestion to deployment, with special emphasis on LLM and retrieval-based architectures in constrained environments.
My recent work combines graph-aware retrieval, evaluation-driven experimentation, and operation of on-premises infrastructure. I focus on delivering systems that are technically sound and practical for real stakeholders.
Experience and education timeline.
Built reproducible ML pipelines for product categorization and tariff workflow automation.
Mar 2025 - Oct 2025
Assisted with clinical data analysis projects.
Sep 2021 - Nov 2021
Worked on epidemiological studies and reporting.
Oct 2020 - Jan 2021
Provided physiotherapy services across multiple settings.
2016 - 2024
Designed and evaluated graph-aware retrieval pipelines, with results presented in technical talks.
2025 - May 2026
Current GPA: 5.6 / 6.
Feb 2023 - Present
Best Graduate 2022, GPA: 5.6 / 6.
2019 - 2022
Bachelor foundation in physiotherapy and health sciences.
2013 - 2016
A selection of the technologies I use in production and research workflows.
Core implementation and analytics languages.
Modeling, statistics, and data science frameworks.
Daily development, notebooks, and collaboration tooling.
Databases, orchestration, and distributed data systems.
Cloud infrastructure and business intelligence tooling.
Retrieval and generation methods used in current AI work.
Dense/sparse storage and graph-based retrieval systems.
Efficient adaptation and evaluation workflows.
Observability, quality tracking, and operations metrics.
Secure infrastructure and production deployment patterns.
Orchestration and multimodal processing patterns.
Current focus areas across retrieval systems, secure deployment, and operational AI quality.
Designing graph-enhanced retrieval paths for explainable innovation scouting and more robust context grounding.
Running LLM and RAG workloads on on-premises infrastructure with controlled model and artifact handling.
Building tool-calling workflows and multimodal pipelines that combine text, image, and audio components.
Tracking latency, throughput, token cost, and retrieval quality to improve reliability in production settings.
For specific questions, please email me.
Try asking: "In which projects can you support us?"
A selection of my public ML engineering, analytics, and applied research projects from my studies.
Master thesis project focused on graph-aware retrieval pipelines and innovation scouting.

End-to-end multilingual RAG pipeline from data ingestion to answer evaluation.

Enhanced collaborative filtering and content models for better music suggestions.

Explores app features to predict what makes a future hit on the Play Store.

Hands-on lab comparing MongoDB to traditional SQL approaches.

Interactive visual analysis of how location and host attributes impact prices.

Replication of Andersson (2019) using the Synthetic Control Method.

Highlights regional differences in painkiller availability and cost.

Streamlit app showcasing a new metric for ice hockey player performance.

Population-based survey exploring gender-specific fall risk factors.
A compact feed for project notes and engineering write-ups.
2026-02-08
This section will include technical writing on applied ML systems, data pipelines, and model evaluation.
Send a short message and I will get back to you. You can also connect directly via GitHub, LinkedIn, or email.