Each protein is represented using 3 lines:
1. > UniProtID or DisProtID (can be used to trace the evidence for the annotations in the BioLiP or DisProt database)
2. Amino acid sequence
3. Binding annotation, one character per residue: "0" non-binding (used as negative for training and evaluation); "1" DNA-binding (used as positive for training and evaluation); "2" binding a non-DNA ligand (used as negative for training and evaluation, and also used to assess cross-predictions); "X" residue that lacks an annotation.
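
The three-line layout above can be read with a short parser. The following is a minimal Python sketch, not part of the dataset's own tooling. The names parse_records, ProteinRecord and dna_binding_targets are hypothetical. It assumes that the annotation line has exactly one character per residue, that blank lines can be ignored, and that residues marked "X" should be masked out of training and evaluation (the description above does not say how they are handled).

```python
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class ProteinRecord(NamedTuple):
    protein_id: str  # UniProt or DisProt ID, without the leading ">"
    sequence: str    # amino acid sequence
    labels: str      # one of "0", "1", "2", "X" per residue


VALID_LABELS = set("012X")


def parse_records(lines: Iterable[str]) -> Iterator[ProteinRecord]:
    """Yield one ProteinRecord per group of three non-empty lines."""
    cleaned = [ln.strip() for ln in lines if ln.strip()]
    if len(cleaned) % 3 != 0:
        raise ValueError("expected a multiple of 3 non-empty lines")
    for i in range(0, len(cleaned), 3):
        header, sequence, labels = cleaned[i:i + 3]
        if not header.startswith(">"):
            raise ValueError(f"header must start with '>': {header!r}")
        # Assumption: sequence and annotation line have equal length.
        if len(sequence) != len(labels):
            raise ValueError(f"length mismatch in record {header!r}")
        if not set(labels) <= VALID_LABELS:
            raise ValueError(f"unexpected label in record {header!r}")
        yield ProteinRecord(header[1:].strip(), sequence, labels)


def dna_binding_targets(labels: str) -> Tuple[List[int], List[bool]]:
    """Map labels to binary targets plus a mask.

    "1" is positive; "0" and "2" are negative, as described above.
    Assumption: "X" residues are masked out (mask is False for them).
    """
    targets = [1 if c == "1" else 0 for c in labels]
    mask = [c != "X" for c in labels]
    return targets, mask


if __name__ == "__main__":
    # Made-up record for illustration only; not taken from the dataset.
    example = [">EXAMPLE1", "MKTAY", "0120X"]
    for record in parse_records(example):
        print(record.protein_id, dna_binding_targets(record.labels))
```

To assess cross-predictions, the positions labelled "2" can be selected in the same way, for example with [c == "2" for c in labels].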