https://doi.org/10.1351/goldbook.11531
The molecules not used to devise a computational model but instead are used during model development to test for possible over-fitting as seen when additional
properties are added to a model but the predicted potencies of the validation set become less accurate.