Following the success of the first and second DIHARD challenges, we are pleased to announce the Third DIHARD Speech Diarization Challenge (DIHARD III). As with other evaluations in this series, DIHARD III is intended to both:
- support speaker diarization research through the creation and distribution of novel data sets
- measure and calibrate the performance of systems on these data sets
The task evaluated in the challenge is speaker diarization; that is, the task of determining “who spoke when” in a multispeaker environment based only on audio recordings. As with DIHARD I and DIHARD II, development and evaluation sets will be provided by the organizers, but there is no fixed training set; participants are free to train their systems on any proprietary and/or public data. Once again, these development and evaluation sets will be drawn from a diverse sampling of sources, including monologues, map task dialogues, broadcast interviews, sociolinguistic interviews, meeting speech, speech in restaurants, clinical recordings, and YouTube videos.
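For readers new to the task: the primary metric in the DIHARD series is diarization error rate (DER), the fraction of reference speech time that is missed, falsely detected, or attributed to the wrong speaker. The following is a minimal frame-based sketch of that computation, not the official scoring software; it is a hypothetical helper that omits details such as the forgiveness collar and uses brute-force speaker mapping where production scorers use the Hungarian algorithm.

```python
from itertools import permutations

def der(reference, hypothesis, step=0.01):
    """Frame-based approximation of diarization error rate (DER).

    reference / hypothesis: lists of (speaker, onset_sec, offset_sec).
    DER = (missed speech + false alarm + speaker confusion) / total reference speech.
    Simplified sketch: no forgiveness collar, brute-force speaker mapping.
    """
    end = max(offset for _, _, offset in reference + hypothesis)
    n = int(round(end / step))
    # For each frame, collect the set of active speakers.
    ref = [set() for _ in range(n)]
    hyp = [set() for _ in range(n)]
    for frames, segments in ((ref, reference), (hyp, hypothesis)):
        for speaker, onset, offset in segments:
            for i in range(int(round(onset / step)), int(round(offset / step))):
                frames[i].add(speaker)

    ref_spk = sorted({s for f in ref for s in f})
    hyp_spk = sorted({s for f in hyp for s in f})
    # Brute-force the optimal one-to-one speaker mapping (fine for a few
    # speakers; real scorers solve this with the Hungarian algorithm).
    if len(hyp_spk) <= len(ref_spk):
        mappings = (dict(zip(hyp_spk, p)) for p in permutations(ref_spk, len(hyp_spk)))
    else:
        mappings = (dict(zip(p, ref_spk)) for p in permutations(hyp_spk, len(ref_spk)))
    best_correct = max(
        sum(len({m.get(s) for s in h} & r) for r, h in zip(ref, hyp))
        for m in mappings
    )

    total = sum(len(f) for f in ref)
    missed = sum(max(len(r) - len(h), 0) for r, h in zip(ref, hyp))
    false_alarm = sum(max(len(h) - len(r), 0) for r, h in zip(ref, hyp))
    confusion = sum(min(len(r), len(h)) for r, h in zip(ref, hyp)) - best_correct
    return (missed + false_alarm + confusion) / total

# Reference: A speaks 0-5 s, B speaks 5-10 s; the hypothesis shifts the
# boundary to 6 s, so 1 s of 10 s is confused.
reference = [("A", 0.0, 5.0), ("B", 5.0, 10.0)]
hypothesis = [("spk1", 0.0, 6.0), ("spk2", 6.0, 10.0)]
print(der(reference, hypothesis))  # -> 0.1
```

Official scoring for the challenge is performed by the organizers' tooling against RTTM-format reference files; the sketch above is only meant to make the metric concrete.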
Participation in the evaluation is open to all who are interested and willing to comply with the rules laid out in the evaluation plan. There is no cost to participate and the web interface, data, scoring software, and any baselines are provided free of charge.
For all details concerning the overall challenge design, tasks, scoring metrics, datasets, rules, and data formats, please consult the latest version of the official evaluation plan:
- Third DIHARD Challenge Evaluation Plan (version 1.2) UPDATED October 30th, 2020
- Third DIHARD Challenge Evaluation Plan (version 1.1)
- Third DIHARD Challenge Evaluation Plan (version 1.0)
The results of the challenge will be presented at a dedicated post-evaluation workshop, to be held on January 23rd, 2021. Due to continued COVID-19-related disruption, this workshop will be entirely virtual. For additional details, please see the workshop website.
| Milestone | Date |
| --- | --- |
| development set release | September 21st, 2020 |
| evaluation set release | September 21st, 2020 |
| evaluation server opens | October 30th, 2020 |
| evaluation period ends | |
| abstract submission deadline | |
| workshop | January 23rd, 2021 |
| system descriptions due | |
We would like to thank NIST for hosting DIHARD III through the OpenSAT evaluation series. All evaluation activities (registration, system submission, scoring, and leaderboard display) will be conducted using NIST-maintained web interfaces.
- Neville Ryant, Linguistic Data Consortium, University of Pennsylvania
- Kenneth Church, Baidu Research
- Christopher Cieri, Linguistic Data Consortium, University of Pennsylvania
- Jun Du, University of Science and Technology of China
- Sriram Ganapathy, Electrical Engineering Department, Indian Institute of Science
- Mark Liberman, Linguistic Data Consortium, University of Pennsylvania