Datasets and Supplements

Protein structure prediction methods (in alphabetical order)

  • 1D protein predictions: datasets Kurgan LA, Miri Disfani F, 2011. Structural Protein Descriptors in 1-Dimension and Their Sequence-Based Predictions. Current Protein and Peptide Science, special issue on Machine Learning Models in Protein Bioinformatics, 12(6):470-489
  • ATPsite: datasets and supplement Chen K, Mizianty MJ, Kurgan L, 2011. ATPsite: Sequence-based Prediction of ATP-binding Residues, Proteome Science, 9(Suppl. 1):S4
  • BEST: datasets and prediction model Gao J, Faraggi E, Zhou Y, Ruan J, Kurgan L, 2012. BEST: improved prediction of B-cell epitopes from antigen sequences. PLoS ONE, 7(6): e40104
  • BETArPred: datasets and prediction model Kedarisetti KD, Mizianty M, Dick S, Kurgan LA, 2011. Improved sequence-based prediction of strand residues. Journal of Bioinformatics and Computational Biology, 9(1):67-89
  • BTcollocation: datasets Campbell K, Kurgan LA, 2008. Sequence-only based prediction of beta-turn location and type using collocation of amino acid pairs. Open Bioinformatics Journal, 2:37-49
  • BTNpred: datasets and prediction model Zheng C, Kurgan LA, 2008. Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments, BMC Bioinformatics, 9:430
  • Consensus-based disorder prediction: datasets Peng Z, Kurgan LA, 2011. On the complementarity of the consensus-based disorder prediction, Proceedings of the Pacific Symposium on Biocomputing (PSB 2012), 17:176-187
  • CRpred: datasets and prediction model Zhang T, Zhang H, Chen K, Shen S, Ruan J, Kurgan LA, 2008. Accurate sequence-based prediction of catalytic residues, Bioinformatics, 24(20):2329-2338
  • CRYSTALP2: datasets and prediction model Kurgan L, Razib A, Aghakhani S, Dick S, Mizianty M, Jahandideh S, 2009. CRYSTALP2: sequence-based protein crystallization propensity prediction, BMC Structural Biology, 9:50
  • DisCon: datasets Mizianty M, Zhang T, Xue B, Zhou Y, Dunker AK, Uversky VN, Kurgan LA, 2011. In-silico prediction of disorder content using hybrid sequence representation. BMC Bioinformatics, 12:245
  • FlexRP: datasets Chen K, Kurgan LA, Ruan J, 2007. Prediction of Flexible/Rigid Regions in Proteins from Sequences Using Collocated Amino Acid Pairs. BMC Structural Biology, 7:25
  • FOKIT: datasets Zhang H, Zhang T, Gao J, Ruan J, Shen S, Kurgan LA, 2012. Determination of Protein Folding Kinetic Types Using Sequence and Predicted Secondary Structure and Solvent Accessibility. Amino Acids, 42:271-283
  • MetaPPCP: datasets Mizianty M, Kurgan LA, 2009. Meta Prediction of Protein Crystallization Propensity. Biochemical and Biophysical Research Communications, 390(1):10-15
  • MODAS: datasets and prediction model Mizianty M, Kurgan LA, 2009. Modular Prediction of Protein Structural Classes from Sequences of Twilight-Zone Identity with Predicting Sequences. BMC Bioinformatics, 10:414
  • OMBBpred: datasets and prediction model Mizianty M, Kurgan LA, 2011. Improved Identification of Outer Membrane Beta Barrel Proteins Using Primary Sequence, Predicted Secondary Structure and Evolutionary Information. Proteins, 79(1):294-303
  • PFR-AF: datasets Gao J, Zhang T, Zhang H, Shen S, Ruan J, Kurgan LA, 2010. Accurate prediction of protein folding rates from sequence and sequence-derived residue flexibility and solvent accessibility. Proteins, 78(9):2114-2130
  • PFRES: datasets Chen K, Kurgan LA, 2007. PFRES: Protein fold classification by using evolutionary information and predicted secondary structure. Bioinformatics, 23(21):2843-2850
  • PPCpred: datasets and supplement Mizianty M, Kurgan LA, 2011. Sequence-based Prediction of Protein Crystallization, Purification, and Production Propensity. Bioinformatics, 27(13):i24-i33
  • PPFR: datasets Jiang Y, Iglinski P, Kurgan LA, 2009. Prediction of Protein Folding Rates from Primary Sequences using Hybrid Sequence Representation. Journal of Computational Chemistry, 30(5):772-783
  • SCEC: datasets Chen K, Kurgan LA, Ruan J, 2008. Prediction of Protein Structural Class Using Novel Evolutionary Collocation Based Sequence Representation. Journal of Computational Chemistry, 29(10):1596-1604
  • SCPRED: datasets and prediction model Kurgan LA, Cios KJ, Chen K, 2008. SCPRED: Accurate Prediction of Protein Structural Class for Sequences of Twilight-zone Similarity with Predicting Sequences. BMC Bioinformatics, 9:226
  • Secondary structure benchmark dataset Zhang H, Zhang T, Chen K, Kedarisetti KD, Mizianty MJ, Bao Q, Stach W, Kurgan LA, 2011. Critical Assessment of High-throughput Standalone Methods for Secondary Structure Prediction. Briefings in Bioinformatics, doi: 10.1093/bib/bbq088
  • Structure-based binding site prediction: datasets and supplement Chen K, Mizianty MJ, Gao J, Kurgan LA, 2011. A Critical Comparative Assessment of Predictions of Protein Binding Sites for Biologically Relevant Organic Compounds. Structure, 19(5):613-621

Other methods (in alphabetical order)

  • Discretization for Naive Bayes: supplement Mizianty M, Kurgan LA, Ogiela M, 2010. Discretization as the Enabling Technique for the Naive Bayes and Semi-Naive Bayes Based Classification. Knowledge Engineering Review, 25(4):421-449