
The Second DIHARD Speech Diarization Challenge

DIHARD II is the second in a series of diarization challenges focusing on "hard" diarization; that is, speaker diarization for challenging recordings where there is an expectation that the current state-of-the-art will fare poorly. As with other evaluations in this series, DIHARD II is intended to both:

  • support speaker diarization research through the creation and distribution of novel data sets
  • measure and calibrate the performance of systems on these data sets.


Following the success of the First DIHARD Challenge,
we are pleased to announce the Second DIHARD Challenge (DIHARD II).

The task evaluated in the challenge is speaker diarization; that is, the task of determining "who spoke when" in a multispeaker environment based only on audio recordings. As with DIHARD I, development and evaluation sets are provided by the organizers, but there is no fixed training set; participants are free to train their systems on any proprietary and/or public data. Once again, these development and evaluation sets are drawn from a diverse sampling of sources including monologues, map task dialogues, broadcast interviews, sociolinguistic interviews, meeting speech, speech in restaurants, clinical recordings, extended child language acquisition recordings from LENA vests, and YouTube videos. However, there are several key differences from DIHARD I:

  • two tracks evaluating diarization of multi-channel recordings have been added; these tracks use recordings of dinner parties provided by the organizers of CHiME-5
  • the evaluation period has been lengthened (from 4 weeks to 16 weeks)
  • Jaccard Error Rate replaces mutual information as the secondary metric
  • baseline systems and results will be provided to participants

The challenge will run from February 14th, 2019 through July 1, 2019, and results will be presented at a special session at Interspeech 2019 in Graz, Austria. Participation in the evaluation is open to all who are interested and willing to comply with the rules laid out in the evaluation plan. There is no cost to participate, though participants are encouraged to submit a paper to the corresponding Interspeech 2019 special session.


For questions not answered in this document, or to join the DIHARD mailing list, please contact dihardchallenge@gmail.com.

Evaluation plan

For all details concerning the overall challenge design, tasks, scoring metrics, datasets, rules, and data formats, please consult the latest version of the official evaluation plan.

Important dates

  • Registration period: January 30 through March 15, 2019
  • Launch (release of DIHARD II development and evaluation sets + scoring code): February 28, 2019
  • Scoring server opens: March 12, 2019
  • Baselines released: week of March 11, 2019
  • Interspeech paper registration deadline: March 29, 2019
  • Interspeech submission deadline: April 5, 2019
  • End of challenge / final Interspeech deadline: July 1, 2019
  • System descriptions due: August 16, 2019
  • Interspeech 2019 special session: September 15-19, 2019

    The deadline for submission of final system outputs is midnight on July 1, 2019.

    Organizers

    Kenneth Church
    (Baidu Research, Sunnyvale, CA, USA)
    Christopher Cieri
    (Linguistic Data Consortium)
    Alejandrina Cristia
    (Laboratoire de Sciences Cognitives et Psycholinguistique, ENS, Paris, France)
    Jun Du
    (University of Science and Technology of China, Hefei, China)
    Sriram Ganapathy
    (Electrical Engineering Department, Indian Institute of Science, Bangalore, India)
    Mark Liberman
    (Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA, USA)
    Neville Ryant
    (Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA, USA)

    In collaboration with

    the organizers of the CHiME-5 Challenge



    Communications Team

    Sunghye Cho
    (Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA, USA)
    Rachid Riad
    (Laboratoire de Sciences Cognitives et Psycholinguistique, ENS, Paris, France)
    Lei Sun
    (University of Science and Technology of China, Hefei, China)

    Software

    Scoring

    The official scoring tool is maintained as a github repo (v1.1.0). To score a set of system output RTTMs sys1.rttm, sys2.rttm, ... against corresponding reference RTTMs ref1.rttm, ref2.rttm, ... using the un-partitioned evaluation map (UEM) all.uem, the command line would be:

        $ python score.py -u all.uem -r ref1.rttm ref2.rttm ... -s sys1.rttm sys2.rttm ...

    The overall and per-file results for DER and JER (and many other metrics) will be printed to STDOUT as a table. For additional details about scoring tool usage, please consult the documentation for the github repo.
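
    The primary metric, DER, is the total of missed speech, false alarm, and speaker confusion time divided by total scored speech time; JER is derived from a per-speaker Jaccard index (see the evaluation plan for its exact definition). As a rough numerical illustration of the DER arithmetic only (the durations below are hypothetical and are not produced by score.py):

        # Hypothetical durations (in seconds) for one recording; score.py derives
        # these quantities from the reference/system RTTMs and the UEM.
        scored_speech = 600.0   # total reference speech time within scored regions
        missed = 30.0           # reference speech assigned to no system speaker
        false_alarm = 12.0      # system speech where the reference has none
        confusion = 18.0        # speech attributed to the wrong speaker

        der = 100.0 * (missed + false_alarm + confusion) / scored_speech
        print(f"DER = {der:.2f}%")  # -> DER = 10.00%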

    Baseline systems

    We provide three software baselines for speech enhancement, speech activity detection, and diarization:

    • Speech enhancement
      The speech enhancement baseline was prepared by Lei Sun and is based on the system used by USTC and iFLYTEK in their submission to DIHARD I:
        Sun, Lei, et al. (2018). "Speaker diarization with enhancing speech for the First DIHARD Challenge." Proceedings of INTERSPEECH 2018. 2793-2797. (paper)
      It is available on github.
    • Speech activity detection
      The speech activity detection baseline applies the WebRTC voice activity detector to audio processed by the speech enhancement baseline and is maintained as part of that github repo (a brief usage sketch follows this list).
    • Diarization
      The diarization baseline was prepared by Sriram Ganapathy, Harshah Vardhan MA, and Prachi Singh and is based on the system used by JHU in their submission to DIHARD I with the exception that it omits the Variational-Bayes refinement step:
        Sell, Gregory, et al. (2018). "Diarization is Hard: Some experiences and lessons learned for the JHU team in the Inaugural DIHARD Challenge." Proceedings of INTERSPEECH 2018. 2808-2812. (paper)
      The x-vector extractor and PLDA parameters were trained on VoxCeleb I and II using data augmentation (additive noise and reverberation), while the whitening transformation was learned from the DIHARD II development set.

      The trained system, as well as recipes to produce the baseline results for each track, is available on github.
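
    For the speech activity detection baseline above, the frame-level WebRTC decision can be sketched as follows. This is a minimal sketch using the py-webrtcvad package, not the baseline recipe itself; the frame size, aggressiveness mode, and the absence of any smoothing are assumptions, and the actual wrapper lives in the baseline's github repo.

        # Minimal sketch: frame-level speech/non-speech decisions with py-webrtcvad.
        # Parameters are illustrative; the baseline's own wrapper and post-processing
        # are documented in its github repo.
        import webrtcvad

        SAMPLE_RATE = 16000   # WebRTC VAD accepts 8/16/32/48 kHz, 16-bit mono PCM
        FRAME_MS = 30         # frames must be 10, 20, or 30 ms long
        FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2  # 2 bytes per sample

        vad = webrtcvad.Vad(2)  # aggressiveness: 0 (least) to 3 (most aggressive)

        def frame_decisions(pcm: bytes):
            """Yield (start_time_in_seconds, is_speech) for each complete frame."""
            for i in range(0, len(pcm) - FRAME_BYTES + 1, FRAME_BYTES):
                frame = pcm[i:i + FRAME_BYTES]
                yield i / (2 * SAMPLE_RATE), vad.is_speech(frame, SAMPLE_RATE)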
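
    For the diarization baseline above, the core clustering stage can be sketched as agglomerative hierarchical clustering (AHC) over per-segment embeddings. The snippet below is a simplified stand-in rather than the baseline itself: it clusters hypothetical precomputed x-vectors using cosine distance and an arbitrary threshold, whereas the actual recipe scores segment pairs with PLDA.

        # Simplified sketch of the clustering stage: average-linkage AHC over
        # per-segment embeddings. Cosine distance stands in for PLDA scoring and
        # the stopping threshold is hypothetical.
        import numpy as np
        from scipy.cluster.hierarchy import fcluster, linkage
        from scipy.spatial.distance import pdist

        def cluster_segments(xvectors: np.ndarray, threshold: float = 0.5) -> np.ndarray:
            """Return an integer speaker label for each row (segment) of xvectors."""
            dists = pdist(xvectors, metric="cosine")  # condensed pairwise distances
            tree = linkage(dists, method="average")   # average-linkage AHC
            return fcluster(tree, t=threshold, criterion="distance")

        # Toy usage with random vectors standing in for real x-vectors.
        rng = np.random.default_rng(0)
        print(cluster_segments(rng.standard_normal((20, 128))))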

    Instructions

    Registration

    To register for the evaluation, participants should email dihardchallenge@gmail.com with the subject line "REGISTRATION" and the following details:

    • Organization – the organization competing (e.g., NIST, BBN, SRI)
    • Team name – the name to be displayed on the leaderboard; use this same team name when registering for the competition on CodaLab (see Results submission below)
    • Tracks – the tracks the team will be competing in

    Data license agreement

    One participant from each site must sign the data license agreement and return it to LDC: (1) by email to ldc@ldc.upenn.edu or (2) by facsimile, Attention: Membership Office, fax number (+1) 215-573-2175. They will also need to create an LDC Online user account, which will be used to download the dev and eval releases.

    Once this process is complete, you will have access to all annotations plus the non-CHiME audio.

    Participants in tracks 3 and 4 must apply separately to the University of Sheffield for the CHiME-5 data, regardless of whether they participated in CHiME-5. To apply for the multi-channel data, follow the data download instructions on the CHiME-5 website.

    Non-profit organizations should sign the non-commercial license. Everyone else, regardless of use case (even if they are only using the data for non-commercial research), should apply for the commercial license.

    Results submission

    Account creation

    • For system submission and scoring, this year we are using an instance of CodaLab hosted for the challenge.
    • Each team should create one (and only one) account, which will then be used for submitting ALL of that team’s results for scoring. In CodaLab, the daily and lifetime submission limits are tied to user accounts, so it is imperative that each team use a SINGLE account to make ALL submissions.
    • To create an account, navigate to the CodaLab sign-up page and fill out the following fields:
      • username -- username you wish to use; this will be displayed on the leaderboard
      • email -- the contact email address you provided when registering for DIHARD; if you use a different email, when you later attempt to register for a track your request will not be approved
      • password -- password you wish to use for the competition
    • Accept the terms and conditions, and click Sign Up. A confirmation email will then be sent to the email address that you entered. To activate your account, click on the confirmation link in this email.

    Troubleshooting

    • If you do not see a confirmation email, check that it has not been caught by your email provider’s spam filter. You may find it by searching for the subject line “[CodaLab] Confirm email address for your CodaLab account”
    • If you still do not see a confirmation email, try prompting CodaLab to resend it.
    • If you still are unable to get a confirmation email, try using a different email address. Please then let us know at dihardchallenge@gmail.com which address you are using so that we may make a note of this on your registration. This will ensure that when you later register for the tracks, your requests are not denied.
    • Finally, if none of the above work, contact us by email and we will attempt to resolve your issue.

    Setting up your team name

    • In order for your team name to appear next to each submission on the leaderboard, you will need to add it to your CodaLab user profile. Please use the same name you used when registering for the challenge.
    • Access the User Settings page by selecting Settings from your user menu (always found in the top right of the page with your username).
    • Scroll down to the Competition settings section and look for the box titled Team name. Enter your team name into this box.
    • Click Save Changes.

    Registering for tracks

    Results zip archive format
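
    The full layout required of the submission archive is specified in the evaluation plan. As a rough, non-authoritative illustration consistent with FAQ 9 below (a submission consists solely of your system's RTTM files), an archive for one of the single-channel tracks might be assembled as follows; the file and directory names here are hypothetical:

        # Hypothetical sketch: bundle system RTTMs into a flat zip archive for upload.
        # Consult the evaluation plan for the authoritative archive layout.
        import glob
        import os
        import zipfile

        with zipfile.ZipFile("track1_submission.zip", "w", zipfile.ZIP_DEFLATED) as zf:
            for rttm in sorted(glob.glob("sys_output/*.rttm")):
                # Store each RTTM at the archive root (no enclosing directory).
                zf.write(rttm, arcname=os.path.basename(rttm))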

    Submitting results via CodaLab

    • Navigate to the competition page for the track you are submitting to and click on the Participate tab. This will bring up a page that allows you to make new submissions and see previous submissions.
    • In the Method name field, enter the name of the system that you are submitting results for.
    • Click Submit and select the zip file you wish to submit. This will upload the zip file for processing.

    • Below the Submit button you will see a table listing all submissions you have made up to the current date with the following information for each:
      • # -- ordinal number of submission in system; your first submission will be listed as 1
      • SCORE -- DER for the submission; if the scoring is in progress or failed, this will read "---"
      • METHOD NAME -- the name of the system that produced the submission
      • FILENAME -- name of the zip file you submitted
      • SUBMISSION DATE -- date and time of submission in MM/DD/YYYY HH:MM:SS format (all times are UTC)
      • STATUS -- the current status of your submission, which may be one of
        • Submitting -- zip file is being uploaded
        • Running -- upload is successful and scoring script is running
        • Finished -- scoring script finished successfully and results posted to leaderboard
        • Failed -- scoring script failed
      • checkmark -- indicates whether or not submission is on the leaderboard
    • If scoring failed for your submission, click the + symbol to the right of its entry in the table. This will display the following, which may be used for debugging purposes:
      • Method name -- the method name you entered into the form
      • Download your submission -- a download link for the zip file submitted
      • View scoring output log -- the scoring program’s output to STDOUT
      • View scoring error -- the scoring program’s output to STDERR
      • Download output from scoring step -- ignore; downloads a zip file containing files used by CodaLab internally

    Leaderboard

    • After your submission finishes scoring (status “Finished”) it will post to the leaderboard, which is viewable from the Results tab.
    • The leaderboard lists the most recent submission for each system by each team, ranked in ascending order by DER.
    • For each submission on the leaderboard, the following fields are displayed:
      • # -- ranking of system
      • User -- the username for the account that submitted the result
      • Entries -- total number of entries by account that submitted result
      • Date of Last Entry -- date of last entry by user that submitted result in MM/DD/YY format
      • Team Name -- name of team associated with user that submitted result; this is taken from the Team listed on the user’s profile
      • Method Name -- the method name entered at submission time
      • DER -- diarization error rate (in percent) of submission; ranking of this result is indicated in parentheses
      • JER -- Jaccard error rate (in percent) of submission; ranking of this result is indicated in parentheses

    Rules

    • Each team MUST use a SINGLE account to submit all results.
    • The team name listed in that user’s profile must be identical to the one you registered with.
    • Each team is limited to 6 submissions per day.
    • Submissions that are not scored (status shows as “Failed”) do not count against this limit.

    Paper submission

    For challenge participants contributing papers to the Interspeech special session, the deadlines for abstract submission and final paper submission are:
    • Abstract submission -- March 29, 2019, midnight Anywhere on Earth
    • Paper submission -- April 5, 2019, midnight Anywhere on Earth
    • Updates to accepted papers -- July 1, 2019, midnight Anywhere on Earth
    • Please follow the submission instructions on the Interspeech 2019 website. As topic, you should choose ONLY the special session:
      13.13 The Second DIHARD Speech Diarization Challenge (DIHARD II)
    • IMPORTANT: Papers must be registered in the Interspeech submission system by March 29 (midnight Anywhere on Earth). While the title, abstract, authors list, and pdf may all be changed after this date, a version MUST be submitted to the system with the correct topic by midnight on March 29.
    • Papers should not repeat the descriptions of the tasks, metrics, datasets, or baseline systems, but should cite the challenge paper using the following citation:
        Ryant et al. (2019). The Second DIHARD Diarization Challenge: Dataset, task, and baselines. Proceedings of INTERSPEECH 2019. ISCA. Graz, Austria.
    • All papers MUST cite the DIHARD II and SEEDLingS corpora using the following citations:
      • Bergelson, E. (2016). Bergelson Seedlings HomeBank Corpus. doi: 10.21415/T5PK6D.
      • Ryant et al. (2019). DIHARD Corpus. Linguistic Data Consortium.
    • Papers may report additional results on other corpora.
    • Accepted papers may update their results on the development and evaluation sets during the paper revision period.

    System descriptions

    • At the end of the evaluation, all participating teams must submit a full description of their system with sufficient detail for a fellow researcher to understand the approach and data/computational requirements. System descriptions should adhere to the format described in Appendix F of the evaluation plan.
    • System descriptions should be submitted by email to dihardchallenge@gmail.com. Please include the text "SYSTEM DESCRIPTION" in the subject of the email.
    • The deadline for submitted system descriptions is August 16, 2019, midnight Anywhere on Earth.

    Final results

    At the conclusion of the evaluation, all final system outputs will be archived by the organizers on Zenodo. This archive will contain RTTM outputs for all systems appearing on the final leaderboard as well as scoring output and associated metadata.

    Results

    During the evaluation, all results will be displayed on the CodaLab competition leaderboards. For each track we maintain two leaderboards:
    • one consisting of results submitted prior to the Interspeech paper deadline on April 5th
    • one consisting of all results
    Results from the baseline system are posted to the leaderboard under team name DIHARD. These results are also available from the challenge paper and the baseline github repo.

    FAQ (Frequently Asked Questions)

    1. Must I participate in all tracks in the challenge?
      No, researchers may choose to participate in a subset of the tracks. All participants MUST register for at least one of track 1 or track 3 (diarization from reference SAD). Participation in tracks 2 and 4 is optional. For example, you may participate only in track 1; only in track 3; or in tracks 3 and 4. (Other combinations are possible.)

    2. Must I submit a paper to the Interspeech special session?
      No, you are not required to submit to the special session in order to participate. Submission to the session is strongly encouraged, but not mandatory.

    3. My team wishes to submit a paper to the Interspeech special session. What should we include?
      Papers submitted to the special session should include preliminary results on the development and evaluation sets; these results may be updated during the paper revision period. Papers may also report results on other corpora if the authors choose. Papers should not repeat descriptions of the tasks, metrics, datasets, or baselines, but instead cite the challenge paper. For more details, please consult the paper submission instructions.

    4. Are there any restrictions on the training data?
      Participants are free to choose their own training data, whether publicly available or not. The only restriction is that you must not use data that overlaps with the evaluation set. See the rules section of the evaluation plan for a listing of these sources. Please also note that clear descriptions of the data used are required in the final system descriptions document.

    5. My team previously has acquired access to the full SEEDLingS corpus. Can we use this data for training or development?
      No, the SEEDLingS data, whether acquired via HomeBank or some other route, is off limits for all purposes. This includes training and tuning, but also acoustic adaptation.

    6. My team participated in DIHARD I. Can we use the DIHARD I development and evaluation sets for training or development?
      The DIHARD I evaluation set is off limits for ALL PURPOSES. The DIHARD I development set may be used however you wish, though given that it is a subset of the DIHARD II development set, we expect it to have limited utility.

    7. Can I use the DIHARD II development set to do data simulation and augmentation?
      Yes, development data is free to be used in any way you see fit, including for tuning your current diarization system or augmenting training data.
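
      As a concrete illustration of one common form of augmentation, the sketch below mixes a noise signal into a speech signal at a chosen signal-to-noise ratio. It is not a prescribed recipe; the arrays and target SNR are hypothetical, and the baselines' own augmentation (additive noise and reverberation) follows their published recipes.

        # Hypothetical sketch of additive-noise augmentation at a target SNR (dB).
        # Assumes float-valued waveforms sampled at the same rate.
        import numpy as np

        def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
            """Add noise to speech, scaled so the mixture has the requested SNR."""
            noise = np.resize(noise, speech.shape)  # crudely match lengths by tiling/truncating
            p_speech = np.mean(speech ** 2)
            p_noise = np.mean(noise ** 2) + 1e-12
            scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10.0)))
            return speech + scale * noise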

    8. How can I upload the results?
      Please see the results submission instructions.

    9. Which files should I submit?
      All submissions should consist exclusively of RTTMs output by your system. For tracks 1 and 2 there should be one RTTM per FLAC file in the single channel evaluation set. For tracks 3 and 4, there should be one RTTM per Kinect array in each CHiME-5 evaluation set session. For full details about what to submit and formatting of your submission, please consult the results submission instructions.
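
      For reference, RTTM is a plain-text, space-delimited format with one line per speech segment. An illustrative line (the file ID, times, and speaker label here are hypothetical) looks like:

        SPEAKER DH_0001 1 10.52 3.04 <NA> <NA> spk2 <NA> <NA>

      The fields are: segment type, file ID, channel, onset in seconds, duration in seconds, and the speaker label in the eighth field, with the remaining fields set to <NA>. The scoring tool's github repo documents the format in full.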

    10. For the multichannel tracks (tracks 3 and 4), should we produce one RTTM per Kinect array or one for the entire session?
      Please refer to the previous question.

    11. For the multichannel tracks (tracks 3 and 4), can we use multiple Kinect arrays to produce each RTTM? That is, could we opt to use audio from arrays U01, U02, and U03 to produce the RTTM for array U01?
      Participants should produce ONE RTTM per Kinect array, each the output of the system when considering ONLY the channels from that array. For instance, for session S21 they should produce the following RTTMs:
      • S21_U01.rttm -- produced using only the channels from array U01
      • S21_U02.rttm -- produced using only the channels from array U02
      • S21_U03.rttm -- produced using only the channels from array U03
      • S21_U04.rttm -- produced using only the channels from array U04
      • S21_U05.rttm -- produced using only the channels from array U05
      • S21_U06.rttm -- produced using only the channels from array U06

    12. What should I report in the system descriptions document?
      Clear documentation of each system on the final leaderboard is required, providing sufficient detail for a fellow researcher to understand the approach and data/computational requirements. This includes, as mentioned above, explanation of any training data used. For further details, consult the system descriptions instructions.

    13. Are teams with members from multiple organizations allowed?
      Yes, teams spanning multiple organizations are allowed, though one person from each organization within the team must sign and return the LDC Data License Agreement. One individual should serve as the team's point of contact for DIHARD, but every organization with access to the data must sign the evaluation agreement.

    14. I attempted to register an account with CodaLab, but am unable to get a confirmation email. What should I do?
      Please consult our registration troubleshooting tips.

    15. Is it possible to use the information about the number of speakers being 4 in tracks 3 and 4? This information is available in the CHiME-5 website and other CHiME-5 related publications.
      In order to maintain consistency with the single channel tracks, where domain/number of speakers is not known for the evaluation set, using the oracle number of speakers for the CHiME sessions is not allowed.

    Contact Us


    For more information, join our mailing list or email us at dihardchallenge@gmail.com.