Instructions for participation

Registration

To register for the evaluation, participants should email dihardchallenge@gmail.com with the subject line “REGISTRATION” and the following details:

Organization – the organization competing (e.g., NIST, BBN, SRI)
Team name – the name to be displayed on the leaderboard
Tracks – which tracks they will be competing in

Data license agreement

One participant from each site must sign the data license agreement and its addendum and return them to LDC: (1) by email to ldc@ldc.upenn.edu or (2) by facsimile, Attention: Membership Office, fax number (+1) 215-573-2175. They will also need to create an LDC Online user account, which will be used to download the dev and eval releases.

Zenodo registration

To submit system results, participants will need to create an account with Zenodo.

Paper submission

For challenge participants contributing papers to the Interspeech special session, the deadline for submission of papers is March 23, 2018 (midnight GMT). Please submit your paper by this date via the Interspeech submission system at https://www.softconf.com/i/interspeech2018/. As the topic, choose ONLY the special session: 13.3 The First DIHARD Speech Diarization Challenge.

IMPORTANT: Papers must be registered in the Interspeech submission system by March 16 (midnight GMT). While the title, abstract, author list, and PDF may all be changed after this date, a version MUST be submitted to the system with the correct topic (13.3 The First DIHARD Speech Diarization Challenge) by March 16.
Papers should not repeat the descriptions of the dataset composition or annotation process, but should cite the evaluation plan:

Ryant, N., Church, K., Cieri, C., Cristia, A., Du, J., Ganapathy, S., and Liberman, M. (2018). First DIHARD Challenge Evaluation Plan. https://zenodo.org/record/1199638.
and the DIHARD and SEEDLingS corpora:

Bergelson, E. (2016). Bergelson Seedlings HomeBank Corpus. doi: 10.21415/T5PK6D.

Ryant et al. (2018). DIHARD Corpus. Linguistic Data Consortium.
Papers may report additional results on other corpora.
Accepted papers may update their results on the development and evaluation sets during the paper revision period (June 3-17), though any such updates will not be reflected on the official challenge leaderboard.

Results submission

The deadline for submission of final system results is March 23, 2018 (midnight GMT). Results should be submitted via Zenodo according to the instructions on the submissions page.

The scores for the most recently submitted output for each system for each track will be displayed on the leaderboard, which will update in real time until the challenge submission deadline, at which point it will be frozen.
Teams may submit results from multiple systems, in which case all will be displayed on the leaderboard and included in any post-challenge discussion of results.
If a team wishes to remove a system (e.g., remove a test submission or a mistake) or change its name, they should contact the DIHARD organizers at dihardchallenge@gmail.com. These are the only changes to the leaderboard that will be allowed post-deadline.
Each submitted system must be accompanied by a detailed system description following the provided template. These system descriptions should be submitted as PDFs by email to dihardchallenge@gmail.com by March 31, 2018 (midnight GMT). Preferably, each team will send a single email with one attachment per system description.
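
The leaderboard behavior described above (only the most recent output per system per track is shown, and the board is frozen at the deadline) can be illustrated with a minimal Python sketch. This is not the organizers' scoring code: the `Submission` record, its field names, and the `leaderboard_entries` function are hypothetical, and the exact freeze instant is left as a parameter because the text gives the deadline only as "March 23 (midnight GMT)".

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Submission(NamedTuple):
    # Hypothetical record; the challenge does not publish a schema.
    system: str
    track: str
    submitted_at: datetime  # timezone-aware timestamp
    score: float


def leaderboard_entries(submissions, freeze_at):
    """Return the most recent submission per (system, track) made by freeze_at."""
    latest = {}
    for sub in submissions:
        if sub.submitted_at > freeze_at:
            continue  # the leaderboard is frozen at the deadline
        key = (sub.system, sub.track)
        if key not in latest or sub.submitted_at > latest[key].submitted_at:
            latest[key] = sub
    return latest
```

A team with several systems simply appears under several `(system, track)` keys, which matches the statement that all of a team's systems are displayed.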

Rules

The 2018 DIHARD challenge is an open evaluation where the test data is sent to participants, who will process the data locally and submit their system outputs to LDC via Zenodo for scoring. As such, the participants have agreed to process the data in accordance with the following rules:

Investigation of the evaluation data prior to the end of the evaluation is disallowed.
Automatic identification of the domain of the test utterance is allowed.
During the evaluation period, each team may make at most two submissions per day per system. Additional submissions past the first two each day will be ignored.
Most of the test data is unexposed, either actually or effectively, but portions have been partially exposed in the following corpora:

HCRC Map Task Corpus (LDC93S12)
DCIEM Map Task Corpus (LDC96S38)
MIXER6 Speech (LDC2013S03)
NIST SRE10 evaluation data
NIST SRE12 evaluation data

Participants in the 2017 JSALT Summer Workshop would have had access to an earlier version of the following sources:

ADOS
SEEDLingS
YouthPoint

While participants are encouraged to submit papers to the special session at Interspeech 2018, this is not a requirement for participation.
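
As an illustration of the submission cap in the rules above (at most two submissions per day per system, with later ones ignored), here is a minimal Python sketch. It is not the organizers' implementation: the function name and the `(system, timestamp)` input format are hypothetical, and treating a "day" as a GMT/UTC calendar day is an assumption, since the rules do not say where the day boundary falls.

```python
from collections import defaultdict
from datetime import datetime, timezone

MAX_PER_DAY = 2  # from the rules: at most two submissions per day per system


def accepted_submissions(events):
    """Filter (system, timestamp) pairs down to those that count.

    Timestamps must be timezone-aware. Submissions past the first two
    per system per day are dropped, mirroring "will be ignored".
    """
    counts = defaultdict(int)
    accepted = []
    for system, ts in sorted(events, key=lambda e: e[1]):
        # Assumption: a "day" is a GMT/UTC calendar day.
        day = ts.astimezone(timezone.utc).date()
        if counts[(system, day)] < MAX_PER_DAY:
            counts[(system, day)] += 1
            accepted.append((system, ts))
    return accepted
```

Because the count is keyed on both system and day, a team running two systems could make two submissions per day for each of them under this reading.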