Summary of results
The goals of this study were (1) to develop in silico models to predict the environmental toxicity of molecules against T. pyriformis, using data measured in the laboratory of Prof. T.W. Schultz, and (2) to estimate the prediction intervals for new compounds. The challenge was a big success: we had 108 participants from more than 25 countries.
Two Final Winners were identified:
Dr. Gavin C. Cawley, School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, U.K.
Dr. Olga Obrezanova, Principal Scientist at Optibrium Ltd., Cambridge, United Kingdom
We sincerely congratulate them!
Gavin obtained the lowest RMSE on the Blind test set (RMSE = 0.74). Olga obtained a similar result (RMSE = 0.76, not significantly different from Gavin's); in addition, the prediction intervals (confidence values) provided by her approach yielded the highest likelihood score for the Blind set.
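To illustrate how two RMSE values such as these can be compared, here is a minimal Python sketch. It assumes a paired bootstrap over molecules, which is our choice for illustration only; the test actually used for the challenge is not described here. The function names and parameters are hypothetical.

```python
import math
import random


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def bootstrap_rmse_difference(observed, pred_a, pred_b, n_resamples=2000, seed=0):
    """Approximate 95% percentile interval for RMSE(a) - RMSE(b).

    Assumption: paired bootstrap, i.e. the same resampled molecules are used
    for both models. This is an illustrative test, not the challenge's own.
    """
    rng = random.Random(seed)
    n = len(observed)
    diffs = []
    for _ in range(n_resamples):
        # Resample molecule indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        obs = [observed[i] for i in idx]
        a = [pred_a[i] for i in idx]
        b = [pred_b[i] for i in idx]
        diffs.append(rmse(obs, a) - rmse(obs, b))
    diffs.sort()
    lower = diffs[int(0.025 * n_resamples)]
    upper = diffs[int(0.975 * n_resamples) - 1]
    return lower, upper
```

Under this sketch, an interval that contains zero would mean the two models are not called significantly different.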
We used the confidence values provided by Olga (and by the other First Pass winners, see below) as a distance to model. The observed accuracy of predictions was calibrated against the distance to model using the results for the Known set, and the resulting curve was applied to estimate the likelihood of the calculated errors for the Blind set. Thus, we did not use the predicted intervals directly but as a proxy for calibrating the accuracy of predictions as a function of distance to model (for details, see Tetko et al., 2008). Note that providing prediction intervals was optional; if none were provided, we assumed the same distance to model for all molecules. However, the overall estimated error for Olga's model (RMSE = 0.48) is quite different from the one calculated for the Blind set (RMSE = 0.76). These results will be analyzed further.
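The calibration idea can be sketched in a few lines of Python. This is an illustration only, not the procedure of Tetko et al., 2008. It assumes equal-count binning of the Known set by distance to model and a Gaussian error model within each bin; all function names are hypothetical.

```python
import math


def calibrate(distances, errors, n_bins=3):
    """Build a calibration curve from the Known set.

    Molecules are sorted by distance to model and split into equal-sized
    bins; each bin stores its largest distance and the RMSE of its errors.
    Assumption: equal-count binning; the published procedure may differ.
    """
    pairs = sorted(zip(distances, errors))
    size = math.ceil(len(pairs) / n_bins)
    curve = []
    for start in range(0, len(pairs), size):
        chunk = pairs[start:start + size]
        sigma = math.sqrt(sum(e * e for _, e in chunk) / len(chunk))
        curve.append((chunk[-1][0], max(sigma, 1e-6)))  # avoid a zero spread
    return curve


def expected_sigma(curve, distance):
    """Look up the calibrated error spread for a molecule's distance."""
    for upper_bound, sigma in curve:
        if distance <= upper_bound:
            return sigma
    return curve[-1][1]  # beyond the calibrated range: use the last bin


def log_likelihood(curve, distances, errors):
    """Gaussian log-likelihood of observed errors given calibrated spreads."""
    total = 0.0
    for d, e in zip(distances, errors):
        s = expected_sigma(curve, d)
        total += -0.5 * math.log(2 * math.pi * s * s) - (e * e) / (2 * s * s)
    return total


def estimated_rmse(curve, distances):
    """Overall error that the calibration curve predicts for a set."""
    sigmas = [expected_sigma(curve, d) for d in distances]
    return math.sqrt(sum(s * s for s in sigmas) / len(sigmas))
```

In this sketch, `calibrate` would be run on the Known set, and `log_likelihood` and `estimated_rmse` on the Blind set. If every molecule has the same distance to model, calling `calibrate` with `n_bins=1` gives a single global error spread.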
We identified eight additional First Pass winners. All of these participants obtained results not significantly different from Gavin's lowest RMSE. Their names and descriptions of their methods are listed on the page with the summary of challenge results.
We are going to publish an article summarizing the results under the joint authorship of the First Pass winners and the challenge organizers. A preliminary summary and statistics of the challenge can be found in the PowerPoint presentation, which can be downloaded here.
We have preliminary proposals from IJNS, SAR and QSAR in Environmental sciences, and the International Journal of Environmental Research and Public Health to publish special issues with the results of individual methods. If you are interested in these offers, please contact us, expressing your interest in submitting an article and indicating the journal you prefer.
We plan to organize a special session at the ICANN'2010 conference in Thessaloniki (15-18.09.2010). All participants are welcome to submit a methodological article to this session and to take part in the conference.
Once again, thank you for your participation and, hopefully, see you next year at the ICANN conference!
Best regards,
Igor V. Tetko, Terry W. Schultz and Wlodzislaw Duch