To evaluate a new solution (model), participants upload its predictions on the test dataset via the Upload page. See the Data Formats section of the Documentation page for information on the upload format. A score is then calculated automatically for the solution.
The score reported is based on the weighted average of the absolute error per target (i.e. on the relative radii) across all test set examples $i$ and all wavelengths $j$ and is given by:
where $y_{ij}$ is the true relative radius and $\hat{y}_{ij}$ the predicted relative radius of the $j$-th wavelength of the $i$-th test set example and the corresponding weight $w_{ij}$ is given by:
with $\sigma_{ij}^2$ being the variance of relative stellar flux caused by the observing instrument at the $j$-th wavelength of the $i$-th example and $\delta F_{ij}$ the variation of the relative stellar flux caused by stellar spots in the $j$-th wavelength of the $i$-th example.
$\sigma_{ij}$ is an estimation based on an ARIEL-like instrument, given its current design, while $\delta F_{ij}$ is calculated based on the stellar flux $F^{\mathrm{star}}_{ij}$ and the spot flux $F^{\mathrm{spot}}_{ij}$ in the $j$-th wavelength of the $i$-th example:
As we see, both sources of noise (photon and stellar spot) are wavelength-dependent and target-dependent: they depend on the star and therefore differ for each datapoint.
The higher the score, the better your ranking. The maximum achievable score is 10000. The score is not lower-bounded (i.e. can be negative), but reasonable attempts (e.g. predicting the average target value for all test datapoints) should not produce scores below 4000. Upon registration to the competition, an entry with your username and a score of 0 will automatically appear on the leaderboard.
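The scoring idea described above (a weighted average of absolute errors over all examples and wavelengths, subtracted from a maximum of 10000) can be illustrated with a minimal sketch. It is not the official scoring code: the function name `weighted_score`, the `scale` parameter, and the nested-list layout (one row per test example, one entry per wavelength) are assumptions made for illustration, and the weights are passed in as given rather than derived from instrument or spot noise.

```python
def weighted_score(y_true, y_pred, weights, max_score=10000.0, scale=1.0):
    """Hypothetical sketch: max_score minus a scaled weighted mean absolute error.

    y_true, y_pred, weights: nested lists of equal shape,
    indexed as [example][wavelength].
    scale: assumed multiplier on the weighted error; the value used by the
    competition is not reproduced here.
    """
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("inputs must contain the same number of examples")

    weighted_error = 0.0  # sum of weight * |prediction - truth|
    weight_total = 0.0    # sum of weights, used to normalise

    for row_true, row_pred, row_w in zip(y_true, y_pred, weights):
        if not (len(row_true) == len(row_pred) == len(row_w)):
            raise ValueError("each example must have the same number of wavelengths")
        for truth, pred, w in zip(row_true, row_pred, row_w):
            weighted_error += w * abs(pred - truth)
            weight_total += w

    if weight_total == 0:
        raise ValueError("weights must not sum to zero")

    return max_score - scale * weighted_error / weight_total


# One example with two wavelengths; only the second prediction is off (by 2.0)
# and it carries weight 3 out of a total of 4.
print(weighted_score([[1.0, 2.0]], [[1.0, 4.0]], [[1.0, 3.0]]))  # 9998.5
```

In this sketch a perfect prediction returns exactly `max_score`, and errors at heavily weighted (low-noise) wavelengths lower the score more than the same errors at lightly weighted ones.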
For transparency of the evaluation process, we will also provide a file with the coefficients of the test set examples, along with the ground truth (target values $y_{ij}$) after the end of the competition (15th August 2019).