https://doi.org/10.1351/goldbook.11506
The process whereby the molecules in a dataset are partitioned into a core (the "scaffold"), which is usually user-specified, and the substituents ("R-groups") found at each position on that core.
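
The partitioning described above can be illustrated with a minimal sketch in plain Python. It is not how any particular cheminformatics toolkit implements the procedure. Everything in it is an assumption made for illustration: the function name `decompose`, the toy molecule representation (a dict of heavy-atom symbols plus a list of bonds), and the premise that the scaffold atoms of each molecule have already been identified and listed in position order, so no substructure matching is performed. Each R-group is reported as a crude label made of its sorted element symbols, not as a real structure.

```python
from collections import defaultdict


def decompose(atoms, bonds, core_atoms):
    """Split one molecule into scaffold positions and their R-groups.

    atoms:      dict mapping atom index -> element symbol (heavy atoms only)
    bonds:      iterable of (i, j) index pairs
    core_atoms: list of scaffold atom indices; entry k defines position R(k+1)
                (hypothetical convention, assumed to be supplied by the user)
    Returns a dict such as {"R1": ["C"], "R2": [], ...}.
    """
    adjacency = defaultdict(set)
    for i, j in bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)

    core = set(core_atoms)
    rgroups = {}
    for k, core_atom in enumerate(core_atoms, start=1):
        found = []
        # Every non-core neighbour starts one substituent at this position.
        for start in sorted(adjacency[core_atom] - core):
            seen = {start}
            stack = [start]
            # Flood fill through non-core atoms only, i.e. cut the core bonds.
            while stack:
                atom = stack.pop()
                for nxt in adjacency[atom] - core - seen:
                    seen.add(nxt)
                    stack.append(nxt)
            found.append("".join(sorted(atoms[a] for a in seen)))
        rgroups[f"R{k}"] = found
    return rgroups


# Toy dataset: two molecules sharing a six-membered carbon ring (atoms 0-5).
ring_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
ring_atoms = {i: "C" for i in range(6)}

# Molecule A: one carbon on ring atom 0, one chlorine on ring atom 3.
mol_a = ({**ring_atoms, 6: "C", 7: "Cl"}, ring_bonds + [(0, 6), (3, 7)])
# Molecule B: a two-carbon chain on ring atom 0, one oxygen on ring atom 3.
mol_b = ({**ring_atoms, 6: "C", 7: "C", 8: "O"},
         ring_bonds + [(0, 6), (6, 7), (3, 8)])

core = [0, 1, 2, 3, 4, 5]
table = [decompose(atoms, bonds, core) for atoms, bonds in (mol_a, mol_b)]

for row in table:
    print(row)
```

Each row of `table` corresponds to one molecule and each key to one scaffold position, which is the tabular form in which such decompositions are usually inspected. In this simplified sketch a substituent that bridges two scaffold positions would be listed under both, and bond orders, hydrogens and stereochemistry are ignored.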