The files below give the similarity between the original training samples and the test samples. In each file, the first line lists the PDB IDs of the test proteins, and the first column lists the PDB IDs of the training proteins.
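The layout described above can be read with a few lines of Python. This is only a sketch under assumptions not stated in this file: fields are taken to be whitespace-separated, the header line is taken to contain only test PDB IDs (no corner label), and every remaining field in a row is taken to be the numeric similarity between that row's training protein and the corresponding test protein. The function names and the demo IDs and values are made up for illustration.

```python
import io


def read_similarity_matrix(handle):
    """Parse one similarity file into (test_ids, {(train_id, test_id): value}).

    Assumes whitespace-separated fields and a header made only of test IDs.
    """
    test_ids = handle.readline().split()
    sims = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        train_id, values = fields[0], fields[1:]
        if len(values) != len(test_ids):
            raise ValueError(
                f"row {train_id!r} has {len(values)} values, "
                f"expected {len(test_ids)}"
            )
        for test_id, value in zip(test_ids, values):
            sims[(train_id, test_id)] = float(value)
    return test_ids, sims


def max_similarity_per_test(test_ids, sims):
    """For each test protein, return the highest similarity to any training protein."""
    best = {}
    for (train_id, test_id), value in sims.items():
        if test_id not in best or value > best[test_id]:
            best[test_id] = value
    return best


# Made-up demo data in the assumed layout (not taken from the real files).
demo = io.StringIO("1abc 2xyz\n3def 0.91 0.12\n4ghi 0.30 0.77\n")
test_ids, sims = read_similarity_matrix(demo)
best = max_similarity_per_test(test_ids, sims)
```

If the real files use a different delimiter or include a label in the first header field, the `split()` calls and the header handling would need to be adjusted accordingly.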
Y. Li and J. Yang, Structural and Sequence Similarity Makes a Significant Impact on Machine-Learning-Based Scoring Functions for Protein-Ligand Interactions, Journal of Chemical Information and Modeling, 57: 1007-1012 (2017).