Hallucination Detection

Description

The goal of this project is to evaluate how well methods for detecting LLM hallucinations generalize across datasets and domains. A key challenge is that many datasets used to tune hallucination detectors are implicitly tied to specific model versions, so they may become outdated as models improve. This project focuses on cross-dataset robustness: whether a detector tuned on one dataset remains effective on others.

The primary task is to implement and evaluate a hallucination detector based on geometric methods. The detector should be implemented end-to-end, its parameters tuned on a reference dataset, and then evaluated on one or more additional datasets from the list below. This requires preparing and analyzing the data so that the tuning and evaluation setups are comparable. All experiments will be conducted with local language models.
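The tune-then-transfer protocol described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the detector is assumed to emit one scalar score per sample (higher meaning more likely hallucinated), the single tuned parameter is assumed to be a decision threshold, and balanced accuracy and AUROC are assumed as metrics. The function names and the toy scores are hypothetical and are not part of the project description or of any particular geometric method.

```python
from typing import Dict, Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Chance that a random hallucinated sample (label 1) outscores a faithful one (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def balanced_accuracy(scores: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """Mean of recall on hallucinated and on faithful samples at a fixed threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("balanced accuracy needs at least one sample of each class")
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    return 0.5 * (tp / n_pos + tn / n_neg)


def tune_threshold(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pick the threshold that maximizes balanced accuracy on the reference dataset."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: balanced_accuracy(scores, labels, t))


def evaluate_transfer(threshold: float, scores: Sequence[float], labels: Sequence[int]) -> Dict[str, float]:
    """Apply a frozen threshold to another dataset; nothing is re-tuned here."""
    return {
        "balanced_accuracy": balanced_accuracy(scores, labels, threshold),
        "auroc": auroc(scores, labels),  # threshold-free, shows ranking quality
    }


# Toy, made-up detector scores (assumption: illustration only, not real results).
ref_scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ref_labels = [0, 0, 0, 1, 1, 1]
tgt_scores = [0.4, 0.5, 0.6, 0.65, 0.75, 0.95]
tgt_labels = [0, 0, 1, 0, 1, 1]

threshold = tune_threshold(ref_scores, ref_labels)
in_domain = evaluate_transfer(threshold, ref_scores, ref_labels)
cross_dataset = evaluate_transfer(threshold, tgt_scores, tgt_labels)
print("tuned threshold:", threshold)
print("reference:", in_domain)
print("target:", cross_dataset)
```

Reporting a threshold-free metric next to the thresholded one helps separate two failure modes: a detector whose ranking degrades on the new dataset, and one whose ranking holds but whose tuned threshold no longer fits the score distribution.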

Candidate datasets include:

  • HaluEval (published in ACL), 2023;

  • MedHal, 2025;

  • Med-HALT (published in ACL), 2023;

  • MedHallu, 2025;

  • Med-MMHL, 2023;

  • HalluQuestQA (published in ACL), 2025.

The expected outcome is (i) a working, well-documented implementation of a hallucination detector, and (ii) empirical observations on the stability, transferability, and limitations of hallucination detection methods across datasets and domains. Any justified modifications, extensions, or improvements to the baseline method are explicitly encouraged and considered a valuable result of the project.

Requirements

  • Basic proficiency in Python and experience using it for data analysis or machine learning tasks.

  • General familiarity with machine learning and deep learning concepts.

  • Introductory knowledge of language models and common issues such as hallucinations or adversarial attacks on models.

  • Ability and willingness to read research papers and implement methods described in them.

  • Readiness to work with datasets, run experiments, and analyze results.

  • Interest in research-style experimentation and improving existing methods.


The applicant is required to complete a test task described below.

Admission

Internship Projects Summer/Fall 2026

Contact details

internship@jetbrains.com

Preferred internship location

Armenia
Cyprus
Czechia
Germany
Netherlands
Poland
Serbia
Spain
UK

Technologies

Deep learning
Natural languages
Python

Area

Machine Learning

Internship timing preferences

Part-time acceptable
Applications by 16.03.2026
Interview by 17.04.2026
Feedback and final results by 22.04.2026