As the Data Ingestion team, we are responsible for the SDK used to transmit user statistics and for reviewing telemetry-related code within JetBrains. Our primary goal is to ensure that no Personally Identifiable Information (PII) or private data is collected, that users remain strictly anonymous, and that all telemetry functions as intended.
To strengthen our review process, you will implement an LLM-driven validation layer that automatically audits collected statistics for PII leaks and helps us ensure that all data remains non-identifiable.
Expand Detection Logic: You will broaden our current list of deanonymization triggers.
Prompt Engineering: You will develop specialized LLM prompts based on these expanded detection criteria.
Mandatory Quality Gates: These LLM checks will be deployed as mandatory quality gates within the IntelliJ IDEA development workflow.
Automated Feedback Loop: If a validation check fails, the system will automatically post a detailed diagnostic report to the relevant merge request, so that developers see the problem immediately and can resolve it quickly.
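To give a feel for the shape of this work, here is a minimal Python sketch of the flow described above: a trigger list feeds a prompt, the model's answer becomes a pass/fail verdict, and a failure is turned into a report. Everything in it is an illustrative assumption rather than our actual implementation: the trigger wording, the PASS/FAIL answer format, and the names build_prompt, parse_verdict, run_gate, and format_report are hypothetical. The model call is passed in as a plain function, so no specific LLM or code-hosting API is implied.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical deanonymization triggers; the real list is what you would expand.
TRIGGERS = [
    "user names, e-mail addresses, or account IDs",
    "file paths, project names, or hostnames",
    "free-form strings typed by the user",
]


@dataclass
class Verdict:
    passed: bool
    findings: List[str]


def build_prompt(diff: str, triggers: List[str]) -> str:
    """Turn the trigger list and a code diff into a single review prompt."""
    rules = "\n".join(f"- {t}" for t in triggers)
    return (
        "You review telemetry code for anonymity.\n"
        "Flag any collected field that matches one of these triggers:\n"
        f"{rules}\n"
        "Answer with PASS, or FAIL followed by one finding per line.\n\n"
        f"Diff:\n{diff}"
    )


def parse_verdict(answer: str) -> Verdict:
    """Parse the assumed PASS/FAIL format; anything unclear counts as a failure."""
    lines = [ln.strip() for ln in answer.strip().splitlines() if ln.strip()]
    if lines and lines[0].upper() == "PASS":
        return Verdict(True, [])
    # Fail closed: an empty or unexpected answer must not let the change through.
    if lines and lines[0].upper().startswith("FAIL"):
        return Verdict(False, lines[1:])
    return Verdict(False, lines)


def run_gate(diff: str, ask_llm: Callable[[str], str]) -> Verdict:
    """Run one check; ask_llm is any function that maps a prompt to a reply."""
    return parse_verdict(ask_llm(build_prompt(diff, TRIGGERS)))


def format_report(verdict: Verdict) -> str:
    """Render the text that would be posted to the merge request."""
    if verdict.passed:
        return "Telemetry anonymity check: passed."
    items = "\n".join(f"- {f}" for f in verdict.findings) or "- (no details returned)"
    return "Telemetry anonymity check: FAILED\n" + items
```

In this sketch the gate fails closed: if the model returns something that is not a clear PASS, the check is treated as failed. That is one possible design choice for a mandatory quality gate, not a stated requirement of the project.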
What you'll learn:
LLM Prompt Engineering
CI/CD Pipeline Logic, GitHub APIs
Must Have:
Good understanding of programming languages and paradigms
Good written and verbal communication skills
Self-motivation with the ability to work independently on research tasks
Nice to Have:
Basic understanding of JVM-based languages (Java, Kotlin)