train.tar.gz: the sequences, disorder predictions and native disorder annotations for the training dataset.
testset_scores.tar.gz: all the scores for our test dataset.
human_scores.tar.gz: results for the human proteome. Because of their large size, the results are split into 10 files, each containing 10% of the human proteins.

Each entry (protein) in the training dataset spans 24 lines:

 1. # ID (protein number in the dataset)
 2. UniProt ID
 3. Protein sequence
 4. Native annotation of disorder
 5. DisEMBL-Remark465 predicted disorder (numerical propensity)
 6. DisEMBL-Remark465 predicted disorder (binary predictions)
 7. DisEMBL-Hot Loops predicted disorder (numerical propensity)
 8. DisEMBL-Hot Loops predicted disorder (binary predictions)
 9. ESpritz-DisProt predicted disorder (numerical propensity)
10. ESpritz-DisProt predicted disorder (binary predictions)
11. ESpritz-NMR predicted disorder (numerical propensity)
12. ESpritz-NMR predicted disorder (binary predictions)
13. ESpritz-X-Ray predicted disorder (numerical propensity)
14. ESpritz-X-Ray predicted disorder (binary predictions)
15. GlobPlot predicted disorder (numerical propensity)
16. GlobPlot predicted disorder (binary predictions)
17. IUPred-long predicted disorder (numerical propensity)
18. IUPred-long predicted disorder (binary predictions)
19. IUPred-short predicted disorder (numerical propensity)
20. IUPred-short predicted disorder (binary predictions)
21. RONN predicted disorder (numerical propensity)
22. RONN predicted disorder (binary predictions)
23. VSL2B predicted disorder (numerical propensity)
24. VSL2B predicted disorder (binary predictions)

Each entry (protein) in the human and test datasets spans 36 lines:

 1. # ID (protein number in the dataset)
 2. UniProt ID
 3. Protein sequence
 4. Native annotation of disorder (set to 0s for the human proteome, where the annotation is unknown)
 5. DisEMBL-Remark465 predicted disorder (numerical propensity)
 6. DisEMBL-Remark465 predicted disorder (binary predictions)
 7. Quality assessment scores for DisEMBL-Remark465
 8. DisEMBL-Hot Loops predicted disorder (numerical propensity)
 9. DisEMBL-Hot Loops predicted disorder (binary predictions)
10. Quality assessment scores for DisEMBL-Hot Loops
11. ESpritz-DisProt predicted disorder (numerical propensity)
12. ESpritz-DisProt predicted disorder (binary predictions)
13. Quality assessment scores for ESpritz-DisProt
14. ESpritz-NMR predicted disorder (numerical propensity)
15. ESpritz-NMR predicted disorder (binary predictions)
16. Quality assessment scores for ESpritz-NMR
17. ESpritz-X-Ray predicted disorder (numerical propensity)
18. ESpritz-X-Ray predicted disorder (binary predictions)
19. Quality assessment scores for ESpritz-X-Ray
20. GlobPlot predicted disorder (numerical propensity)
21. GlobPlot predicted disorder (binary predictions)
22. Quality assessment scores for GlobPlot
23. IUPred-long predicted disorder (numerical propensity)
24. IUPred-long predicted disorder (binary predictions)
25. Quality assessment scores for IUPred-long
26. IUPred-short predicted disorder (numerical propensity)
27. IUPred-short predicted disorder (binary predictions)
28. Quality assessment scores for IUPred-short
29. RONN predicted disorder (numerical propensity)
30. RONN predicted disorder (binary predictions)
31. Quality assessment scores for RONN
32. VSL2B predicted disorder (numerical propensity)
33. VSL2B predicted disorder (binary predictions)
34. Quality assessment scores for VSL2B
35. Combination predicted disorder (binary predictions)
36. Combination high quality residues (binary)
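
The fixed-length record layouts described above (24 lines per entry for the training set, 36 lines per entry for the human and test sets) could be read with a small Python helper along the following lines. This is only a sketch, not a tool shipped with the data. It assumes that the archives have already been extracted to plain-text files, that entries follow each other with no blank or separator lines, and that every entry is complete. The field names (for example "RONN:propensity") and the function names are hypothetical and chosen for illustration. Because the delimiter used inside the score lines is not specified here, each line is kept as a raw string.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List

# Predictor order as listed in the format description.
PREDICTORS = [
    "DisEMBL-Remark465", "DisEMBL-Hot Loops", "ESpritz-DisProt",
    "ESpritz-NMR", "ESpritz-X-Ray", "GlobPlot", "IUPred-long",
    "IUPred-short", "RONN", "VSL2B",
]

# Hypothetical names for the four leading lines of every entry.
HEADER = ["id", "uniprot_id", "sequence", "native_disorder"]


def field_names(with_quality: bool) -> List[str]:
    """Return one (hypothetical) field name per line of an entry.

    with_quality=False gives the 24-line training layout,
    with_quality=True gives the 36-line human/test layout.
    """
    names = list(HEADER)
    for predictor in PREDICTORS:
        names.append(f"{predictor}:propensity")
        names.append(f"{predictor}:binary")
        if with_quality:
            names.append(f"{predictor}:quality")
    if with_quality:
        names.append("Combination:binary")
        names.append("Combination:high_quality")
    return names


def read_entries(lines: Iterable[str],
                 with_quality: bool) -> Iterator[Dict[str, str]]:
    """Yield one dict per entry, mapping field name to the raw line.

    Assumes entries are contiguous (no blank lines between them).
    """
    names = field_names(with_quality)
    stripped = (line.rstrip("\n") for line in lines)
    while True:
        chunk = list(islice(stripped, len(names)))
        if not chunk:
            return  # clean end of input
        if len(chunk) != len(names):
            raise ValueError(
                f"incomplete entry: expected {len(names)} lines, "
                f"got {len(chunk)}"
            )
        yield dict(zip(names, chunk))
```

To use it, open an extracted file and pass the file object to read_entries, with with_quality=False for the training data and with_quality=True for the human and test data. If the real files contain blank lines or other separators between entries, the chunking step would need to be adapted.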