To ensure a fair and standardized evaluation, all participants must adhere to the following rules:
Any attempt to determine the real-world identity of a speaker, or to link speakers in the challenge data to individuals or records in external datasets, services, or metadata (“data recombination”), is strictly forbidden. This includes (but is not limited to) voiceprint matching against external corpora, cross-dataset record linkage, or the use of personally identifying metadata. Violations will result in disqualification, and the organizers will notify the participants' hosting institutions.
This challenge evaluates spoken language recognition on the provided splits. Submission to the closed condition is mandatory; submission to the open condition is optional. For each condition, the final evaluation consists of two tasks: (1) Language identification (35 seen languages): predict the language of each test utterance; results are reported as macro-averaged accuracy. (2) Unseen language recognition (40 unseen languages): an enrollment-based task in which each enrollment ID has 20–65 s of audio, and for each trial (enrollment ID, test utterance) the system outputs a detection score; results are reported as equal error rate (EER). The exact submission format (e.g., language labels for identification, per-trial scores for unseen language recognition) will be specified when the evaluation phase opens. See the Evaluation Plan and the Baseline Systems for full details.
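Both metrics are standard. The sketch below is only an illustration of how macro-averaged accuracy and EER could be computed from system outputs, assuming NumPy and scikit-learn; it is not the official scoring code, and the official Evaluation Plan takes precedence.

```python
# Illustrative sketch of the two evaluation metrics (not the official scoring code).
# Assumes NumPy and scikit-learn are available.
import numpy as np
from sklearn.metrics import recall_score, roc_curve

def macro_accuracy(y_true, y_pred):
    """Macro-averaged accuracy over the seen languages,
    here taken as the unweighted mean of per-language recall."""
    return recall_score(y_true, y_pred, average="macro")

def equal_error_rate(labels, scores):
    """EER for the unseen-language trials: labels are 1 for target trials
    (the test utterance matches the enrolled language) and 0 for non-targets;
    scores are the system's per-trial detection scores."""
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))       # operating point where FPR ≈ FNR
    return (fpr[idx] + fnr[idx]) / 2.0

# Toy example with made-up labels and scores:
print(macro_accuracy(["eng", "fra", "fra"], ["eng", "fra", "eng"]))
print(equal_error_rate([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```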
Manual correction or re-labeling of the officially provided challenge data is strictly prohibited.
The use of publicly available, pre-trained models is permitted and encouraged. This includes (but is not limited to):
All pre-trained models used must be explicitly and thoroughly declared in the system description paper, including their source, training data, and how they were integrated into the system.
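As one illustration of such use, a publicly released self-supervised encoder could serve as a frozen feature extractor on top of which a language classifier is trained. The sketch below assumes the Hugging Face transformers library, and the wav2vec 2.0 checkpoint name is only an illustrative example, not an endorsed or required model; any such use would have to be declared as described above.

```python
# Minimal sketch: a public pre-trained encoder used as a frozen feature extractor.
# Assumes the Hugging Face `transformers` library; the checkpoint is only an example.
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

checkpoint = "facebook/wav2vec2-base"  # illustrative public checkpoint
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(checkpoint)
encoder = Wav2Vec2Model.from_pretrained(checkpoint)
encoder.eval()  # frozen in this sketch: no fine-tuning

def utterance_embedding(waveform_16k: torch.Tensor) -> torch.Tensor:
    """Mean-pool the encoder's hidden states into one embedding per utterance."""
    inputs = feature_extractor(waveform_16k.numpy(), sampling_rate=16000,
                               return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state  # (1, frames, dim)
    return hidden.mean(dim=1).squeeze(0)              # (dim,)

# A downstream language classifier (e.g., a linear layer over this embedding)
# would then be trained on the challenge training data.
```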
The use of external, non-speech data for data augmentation (e.g., noise or reverberation from public corpora like MUSAN) is permitted and encouraged, but must be explicitly and thoroughly disclosed in the system description paper.
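As an illustration, additive-noise augmentation typically mixes a noise clip (e.g., from MUSAN) into the speech at a randomly chosen signal-to-noise ratio. The minimal NumPy sketch below shows one common way to do this; the waveforms and the SNR range are placeholders, not challenge specifications.

```python
# Minimal sketch of additive-noise augmentation at a random SNR (illustration only;
# the waveforms and SNR range below are placeholders, not challenge specifications).
import numpy as np

def add_noise(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix `noise` into `speech` at the requested signal-to-noise ratio (in dB)."""
    # Tile or trim the noise to match the speech length.
    if len(noise) < len(speech):
        noise = np.tile(noise, int(np.ceil(len(speech) / len(noise))))
    noise = noise[: len(speech)]

    speech_power = np.mean(speech ** 2) + 1e-12
    noise_power = np.mean(noise ** 2) + 1e-12
    # Scale the noise so that 10*log10(speech_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise

# Example: augment a speech segment with noise at an SNR drawn from [5, 20] dB.
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)   # stand-in for a 1 s speech waveform at 16 kHz
noise = rng.standard_normal(8000)     # stand-in for a noise clip (e.g., from MUSAN)
augmented = add_noise(speech, noise, snr_db=rng.uniform(5, 20))
```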
Each participating team must submit results on the evaluation set via the CodaBench platform. The link to the CodaBench competition will be announced when the evaluation phase opens. Until then, submission format, file names, and exact procedures are not finalized. Please check the Important Dates and this page for updates.
To be eligible for inclusion in the final ranking and challenge results, participants must submit a system description paper to the dedicated Odyssey 2026 workshop.
Each submission must be accompanied by a detailed system description paper in the Odyssey format that specifies:
The exact submission format (file names, structure, and content) will be published when the evaluation phase opens. At that time we will provide:
We will release the evaluation data and the evaluation trial pair lists when the evaluation phase opens. Until then, we do not disclose details about the evaluation set or submission files. Please check back when the evaluation phase begins (see Important Dates).
The CodaBench platform will host the leaderboard and submission system for the TidyLang 2026 Challenge. The competition link is not yet available and will be shared with registered participants when the evaluation phase opens. We will post the link on this website and communicate it via the registration contact information.
Coming soon — stay tuned!