Human in the loop #2443
@IdirLISN I find this feature a bit confusing.

Naming and placement
First of all, as shown on your screenshot, we have an option "Auto-run submission". When it is disabled, organizers need to validate submissions before sending them.
I understand that this new feature is a variation of that, where the validation happens at the compute worker level. However, it is confusing to have completely different naming and a separate checkbox for a variation of the same feature. We could, for instance, have an additional option when disabling "Auto-run submission": "Validate submissions from website" vs. "Validate submissions from compute worker", or something like this. Also, the name "Human in the loop" has different definitions across the community, but it usually refers to a setup where the evaluation itself is done by humans (not just the pre-run validation).

How does it work?
Question: concretely, how do the organizers receive and validate the submissions? Is it done directly inside the compute worker through the command line? How does it work if there are 10 workers in the queue? Did you document this?

This validation procedure is meant to secure datasets when they are on the compute worker side. The organizer just activates the option in the edit section of the competition, and then someone on the CW side validates the scoring before sending it. As I progress on the feature, I will elaborate on the PR to make it clear. Thank you for your review :)

@ mention of reviewers
@acletournel
@ObadaS
@wlln
@Didayolo
A brief description of the purpose of the changes contained in this PR.
The Human in the loop (HITL) feature lets organizers enable this option for their competitions.

If this option is enabled, each submission requires validation on the compute worker side.

If no validation occurs before the timeout, the submission is marked as failed.
If the compute worker administrator validates the scoring file, the scoring is returned to Codabench.
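
The two outcomes above can be sketched as a polling loop with a deadline. This is only an illustration of the described behavior, not the code in this PR: `wait_for_validation`, `is_validated`, the `Status` values, and the timeout and polling parameters are all hypothetical names chosen for the example, and how the administrator actually records the approval is left out.

```python
import time
from enum import Enum
from typing import Callable


class Status(Enum):
    # Hypothetical status names, chosen for this sketch only.
    FINISHED = "Finished"
    FAILED = "Failed"


def wait_for_validation(
    is_validated: Callable[[], bool],
    timeout_s: float,
    poll_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Status:
    """Poll until a human validates the scoring file or the timeout expires.

    is_validated is an assumed callback that returns True once the compute
    worker administrator has approved the scoring file.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if is_validated():
            # Approved in time: the scoring may be returned to the platform.
            return Status.FINISHED
        sleep(poll_interval_s)
    # Nobody validated before the deadline: the submission fails.
    return Status.FAILED
```

The `clock` and `sleep` parameters are injected so the timeout path can be hand tested without actually waiting. With several workers in the queue, each worker would presumably run its own wait for the submissions it holds, but that is an assumption and not something this description confirms.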
Issues this PR resolves
It secures the scoring process.
A checklist for hand testing
Caution
Compute worker changes.
The feature still needs some polish.
Checklist