Training dataset

  • training dataset (552 sequences)

    Validation dataset

  • validation dataset (227 sequences)

    Independent test datasets

  • TE210 dataset (210 sequences)
  • TE83 dataset (83 sequences)

    CAID datasets

  • CAID2 binding dataset (78 sequences)
  • CAID2 linker dataset (40 sequences)
  • CAID3 binding dataset (51 sequences)
  • CAID3 linker dataset (20 sequences)

    Explanation of the datasets

    For Training, Validation, and Independent Test Datasets

    The datasets in all three groups share the same structure:

    For CAID Datasets

    All four datasets share the same structure: