Instructions for participants’ submissions to Zenodo
In order to have your system results processed by the scoring server, you will need to create a tarball and upload it to Zenodo. The following instructions will guide you through this process:
- The DIHARD Challenge has two tracks:
- Track 1: Diarization from gold SAD
- Track 2: Diarization from scratch
Each participating team is registered for at least one track. System output for each track should be stored in a directory of RTTM files containing one RTTM file for each FLAC file in the eval set, with a basename identical to that of the FLAC file apart from the extension. For instance:
DH_0001.rttm
DH_0002.rttm
…
DH_0164.rttm
RTTMs should be present for ALL FLAC files. If any RTTMs are missing, your submission will NOT be scored.
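Because a single missing RTTM prevents scoring, it can help to compare basenames before packaging. The following is a minimal sketch, not part of the official tooling; the function name `missing_rttms` and the idea of having the eval FLAC files in a local directory are assumptions made for illustration.

```python
from pathlib import Path


def missing_rttms(flac_dir, rttm_dir):
    """Return sorted basenames of FLAC files that have no matching RTTM file.

    Hypothetical helper: it compares basenames only and does not check
    the contents of any RTTM file.
    """
    flac_names = {p.stem for p in Path(flac_dir).glob("*.flac")}
    rttm_names = {p.stem for p in Path(rttm_dir).glob("*.rttm")}
    return sorted(flac_names - rttm_names)
```

An empty list means every FLAC file has an RTTM with the same basename; any names returned are the recordings that still need output.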
- These output directories will be submitted as a .tar.gz archive with the following structure:
track1/
DH_0001.rttm
DH_0002.rttm
…
DH_0164.rttm
track2/
DH_0001.rttm
DH_0002.rttm
…
DH_0164.rttm
If participating in only one track, then only that track’s directory should be present.
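One way to produce an archive with this layout is Python's standard `tarfile` module. This is only a sketch under assumptions: the helper name `build_submission` and the mapping argument are hypothetical, and the result should still be checked with the official validation scripts.

```python
import tarfile


def _keep_rttm(info):
    """Keep directories and .rttm files; drop everything else."""
    return info if info.isdir() or info.name.endswith(".rttm") else None


def build_submission(archive_path, track_dirs):
    """Write a .tar.gz with one top-level directory per track.

    track_dirs maps an archive directory name ("track1" and/or "track2")
    to the local directory holding that track's RTTM files.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for track_name, rttm_dir in sorted(track_dirs.items()):
            # arcname renames the local directory to track1/ or track2/
            tar.add(str(rttm_dir), arcname=track_name, filter=_keep_rttm)
    return archive_path
```

For a team registered for a single track, the mapping would contain only that track's entry, for example `build_submission("submission.tar.gz", {"track1": "output/sys_a"})`, where the paths are placeholders.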
- Examples of valid .tar.gz archives for the evaluation set:
- To validate the RTTMs in your submission before creating the tarball, use the validate_rttm.py script from the dscore repo with the command:
python validate_rttm.py rec1.rttm rec2.rttm ...
- To validate your .tar.gz archive's directory structure, use the validate_archive.py script with the command:
python validate_archive.py submission.tar.gz
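For a quick look at what ended up at the top level of an archive, the standard library is enough. The sketch below is an illustration only and does not reproduce whatever checks validate_archive.py performs; the function names are hypothetical.

```python
import tarfile
from pathlib import PurePosixPath

EXPECTED_TRACKS = {"track1", "track2"}


def top_level_entries(archive_path):
    """Return the set of top-level names inside a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        parts = (PurePosixPath(name).parts for name in tar.getnames())
        # A leading "./" is collapsed by PurePosixPath; skip empty paths.
        return {p[0] for p in parts if p}


def unexpected_entries(archive_path):
    """Return top-level names other than track1 and track2, sorted."""
    return sorted(top_level_entries(archive_path) - EXPECTED_TRACKS)
```

An empty result from `unexpected_entries` only says that nothing besides `track1/` and `track2/` sits at the top level; the official scripts remain the reference check.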
- Create an account with Zenodo. This step only needs to be done once; additional submissions use the existing account.
- Upload your .tar.gz archive to the DIHARD Zenodo community.
- For upload type, select Dataset.
- Enter your team name (the name under which you registered for the DIHARD challenge) under the Authors section.
- Enter your system name under Title.
- Enter any additional details you desire under Description.
- Pre-reserve a DOI by clicking Reserve DOI. The DOI will be a string in the format 10.SERV/DATACITE, where SERV is a numeric value for the DOI provider and DATACITE is the alphanumeric string ID of your data (e.g., “10.5281/zenodo.1183345”). Retain this DOI, as it will be needed later for the submission form on the DIHARD website.
- Select “Open Access” for Access right (this is the default).
IMPORTANT: You MUST perform this step as directed. If you select anything other than “Open Access”, the cronjobs that download and score submissions will not be able to access your archive.
- Click Save.
- Click Publish.
- Your .tar.gz archive will now be available on Zenodo and publicly viewable. Please be careful: once published, it can only be deleted by emailing Zenodo.
- Submit your DOI via the DIHARD website (details below).
- Fill out the submission form on the DIHARD website.
- Navigate to Submission Form.
- Under System Name enter the name of the system you are submitting results for.
- Under DOI enter the FULL DOI you reserved when uploading to Zenodo (e.g., “10.5281/zenodo.1183345”).
- Under Email address enter the email address that the scoring server should send your scores to.
- Under PIN enter the 5-digit PIN that your team was assigned.
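Two of these fields have a fixed shape, so they can be sanity-checked before submitting. The sketch below assumes only what is stated above: a DOI of the form 10.SERV/DATACITE with a numeric SERV, and a 5-digit PIN. The function name and messages are hypothetical, and the website may apply different or stricter checks.

```python
import re

# Shape taken from the description above: "10." + digits + "/" + identifier.
DOI_PATTERN = re.compile(r"10\.[0-9]+/\S+")
PIN_PATTERN = re.compile(r"[0-9]{5}")


def form_problems(doi, pin):
    """Return a list of human-readable problems; empty if both look fine."""
    problems = []
    if not DOI_PATTERN.fullmatch(doi):
        problems.append("DOI should be the full string, e.g. 10.5281/zenodo.1183345")
    if not PIN_PATTERN.fullmatch(pin):
        problems.append("PIN should be exactly 5 digits")
    return problems
```

Passing the PIN as a string rather than an integer keeps any leading zeros intact.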
- Wait for your results
- The evaluation results will be sent to you by email (at the address you entered on the submission form) and published within one hour on the DIHARD results page.
- If your submission could not be scored (missing track directories, missing RTTMs, invalid RTTMs, etc.), an email with the relevant error messages will be sent instead.
- Upload your .tar.gz archive as a new version of an existing Zenodo record.
- Navigate to the Zenodo record page for the existing system. To see a list of all records you have created, navigate to https://www.zenodo.org/deposit.
- Click New version.
- Upload the new .tar.gz archive.
- Record the DOI listed under Digital Object Identifier. Each version gets a new DOI.
- Click Save.
- Click Publish.
- Fill out the submission form on the DIHARD website.
- Follow the submission form instructions given above, but remember to use the NEW DOI you recorded when uploading the new version.
- Wait for your results.
Rules
- Each team is limited to two submissions per system per day, where a day is defined as beginning at midnight GMT. Submissions past the first two will be ignored.
- Submissions that are not scored due to being invalid do not count against this limit.
- For teams registered for both tracks 1 and 2, partial submissions (that is, submissions containing results for only a single track) will be scored, but they count against the daily limit in the same way as a submission containing results for both tracks.
- There is currently no cap on how many total submissions a team may make per day, as long as they submit no more than two per system. Please do not abuse this by creating identical or near-identical systems to get around the two-per-day limit. If it is discovered that this is happening, we will be forced to impose a cap on total submissions.
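The daily limit can be illustrated with a small sketch. It is not the scoring server's code: the function name, the tuple layout, and the treatment of GMT as UTC are assumptions made for illustration.

```python
from collections import Counter
from datetime import timezone

DAILY_LIMIT = 2  # submissions per system per day, as stated in the rules


def scored_submissions(submissions):
    """Return the (system, timestamp) pairs that fall within the daily limit.

    submissions: iterable of (system_name, timestamp, is_valid) tuples,
    where timestamp is a timezone-aware datetime.
    """
    counts = Counter()
    kept = []
    for system, when, is_valid in sorted(submissions, key=lambda s: s[1]):
        if not is_valid:
            continue  # invalid submissions do not count against the limit
        day = when.astimezone(timezone.utc).date()  # day starts at midnight GMT
        if counts[(system, day)] < DAILY_LIMIT:
            counts[(system, day)] += 1
            kept.append((system, when))
    return kept
```

Because the count is keyed on the system name and the date, a partial submission covering one track uses up the same slot as a submission covering both.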