Each protein is represented by three lines:
1. A header line: ">" followed by the UniProt ID or DisProt ID (can be used to trace the evidence for the annotations in the BioLip or DisProt database)
2. Amino acid sequence
3. Per-residue binding annotation:
   "0": non-binding (used as a negative for training and evaluation)
   "1": DNA-binding (used as a positive for training and evaluation)
   "2": binds a non-DNA ligand (used as a negative for training and evaluation, and to assess cross-predictions)
   "X": residue lacks an annotation
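
The three-line record layout described above could be read with a short parser along these lines. This is a minimal sketch, not the dataset authors' own loader. It assumes that the annotation line has exactly one character per residue (same length as the sequence), that blank lines can be ignored, and that "X" residues are excluded from training; none of these is stated explicitly in the description. The names `ProteinRecord`, `parse_records`, and `training_labels`, as well as the record `EXAMPLE_ID`, are hypothetical and exist only for illustration.

```python
import io
from typing import Iterator, List, NamedTuple, Optional, TextIO


class ProteinRecord(NamedTuple):
    protein_id: str   # UniProt ID or DisProt ID taken from the ">" header line
    sequence: str     # amino acid sequence
    annotation: str   # per-residue labels drawn from "0", "1", "2", "X"


VALID_LABELS = set("012X")


def parse_records(handle: TextIO) -> Iterator[ProteinRecord]:
    """Yield one ProteinRecord per group of three non-blank lines."""
    lines = [line.strip() for line in handle if line.strip()]
    if len(lines) % 3 != 0:
        raise ValueError("expected 3 lines per protein")
    for i in range(0, len(lines), 3):
        header, sequence, annotation = lines[i:i + 3]
        if not header.startswith(">"):
            raise ValueError(f"header must start with '>': {header!r}")
        # Assumption: one annotation character per residue.
        if len(sequence) != len(annotation):
            raise ValueError(f"length mismatch in record {header!r}")
        if not set(annotation) <= VALID_LABELS:
            raise ValueError(f"unexpected label in record {header!r}")
        yield ProteinRecord(header[1:].strip(), sequence, annotation)


def training_labels(annotation: str) -> List[Optional[int]]:
    """Map labels to binary targets: "1" -> 1, "0" and "2" -> 0.

    Assumption: "X" (no annotation) becomes None so it can be masked out.
    """
    return [None if c == "X" else int(c == "1") for c in annotation]


# Made-up record, used only to demonstrate the format.
example = io.StringIO(">EXAMPLE_ID\nMKRTAG\n01120X\n")
for record in parse_records(example):
    print(record.protein_id, training_labels(record.annotation))
```

Keeping the raw annotation string in the record, and deriving binary targets separately, leaves the "2" labels available for assessing cross-predictions.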