RAPID: Regression-based Accurate Prediction of protein Intrinsic Disorder content

RAPID webserver

RAPID is a server that provides fast and accurate sequence-based prediction of protein disorder content.

Please follow the three steps below to make predictions:

1. Upload a file with protein sequences, or paste them into the text area

The server accepts up to 75 000 FASTA-formatted protein sequences, with a maximum file size of 40MB. Either upload a file or paste the sequences into the following text field, placing each protein on a new line (see Help for details):

2. Provide your email address (required)

Please provide your email address to be notified when results are ready.

3. Predict

Click the button to launch the prediction.

Additional materials

    Datasets:
  • TRAINING dataset - Dataset used to develop the classifier, including feature selection and parameterization using 5-fold cross-validation.
  • TEST dataset - Dataset used to perform out-of-sample evaluation and comparison with existing disorder predictors. The TEST dataset shares low (<25%) pairwise sequence similarity with the TRAINING dataset.
  • Human proteome - Dataset with the complete proteome of Homo sapiens from UniProt, release as of July 2012.
    The format of the above-mentioned files is as follows:
  • Line 1: >protein name
  • Line 2: protein sequence (one letter amino acid code only)
  • Line 3: native disorder content (predicted by RAPID in the case of the Human proteome)
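
The three-line layout above can be read with a few lines of Python. The sketch below is illustrative only and is not part of the RAPID distribution: the names `Record` and `parse_records` are hypothetical, and it assumes that line 3 holds a single number that `float()` can parse, which this page does not state explicitly.

```python
from typing import Iterable, Iterator, NamedTuple


class Record(NamedTuple):
    """One protein entry from a dataset file (hypothetical helper type)."""
    name: str
    sequence: str
    disorder_content: float


def parse_records(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from the three-line-per-protein layout.

    Assumption: the third line of each record is a single numeric value.
    """
    # Ignore blank lines and strip surrounding whitespace.
    cleaned = [ln.strip() for ln in lines if ln.strip()]
    if len(cleaned) % 3 != 0:
        raise ValueError("expected three non-empty lines per protein")
    for i in range(0, len(cleaned), 3):
        header, sequence, content = cleaned[i:i + 3]
        if not header.startswith(">"):
            raise ValueError(f"record {i // 3 + 1}: name line must start with '>'")
        yield Record(header[1:], sequence, float(content))
```

For a file on disk, something like `with open(path) as fh: records = list(parse_records(fh))` would collect all entries, where `path` points to one of the downloaded dataset files.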

Help

RAPID accepts either single or multiple protein sequences; the input is limited to 75 000 protein sequences at a time. Submit the protein sequence(s) in FASTA format.

    The format of the input file is as follows (example):
  • Line 1: >protein name (the server will trim protein names to the first 12 characters)
  • Line 2: protein sequence (one letter amino acid code only)
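
Before submitting a large file, it may help to check it locally against the limits stated on this page. The following is a minimal sketch, not the server's own validation code: `check_input` and the constants are hypothetical names, and several details are assumptions, namely that 40MB means 40 × 1024 × 1024 bytes, that only the 20 standard amino acid letters are accepted, that the 12-character trim excludes the leading `>`, and that each sequence occupies a single line.

```python
from typing import List, Tuple

MAX_SEQUENCES = 75_000             # limit stated on this page
MAX_BYTES = 40 * 1024 * 1024       # 40MB; binary megabytes are an assumption
NAME_LENGTH = 12                   # names are trimmed to the first 12 characters
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: 20 standard residues only


def check_input(text: str) -> List[Tuple[str, str]]:
    """Return (trimmed name, sequence) pairs, or raise ValueError."""
    if len(text.encode("utf-8")) > MAX_BYTES:
        raise ValueError("input exceeds the 40MB size limit")
    # Ignore blank lines; expect alternating name and sequence lines.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("expected two lines per protein: name, then sequence")
    records = []
    for i in range(0, len(lines), 2):
        header, sequence = lines[i], lines[i + 1].upper()
        if not header.startswith(">"):
            raise ValueError(f"protein {i // 2 + 1}: name line must start with '>'")
        unexpected = set(sequence) - AMINO_ACIDS
        if unexpected:
            raise ValueError(
                f"protein {i // 2 + 1}: unexpected characters {sorted(unexpected)}"
            )
        # Mimic the trimming of names (assumed to exclude the '>' character).
        records.append((header[1:NAME_LENGTH + 1], sequence))
    if len(records) > MAX_SEQUENCES:
        raise ValueError("input exceeds the 75 000 sequence limit")
    return records
```

Because names are trimmed, proteins whose names share the same first 12 characters would presumably be hard to tell apart in the results, so it may be worth making names unique within that prefix before uploading.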

Acknowledgments

We acknowledge, with thanks, that the following software was used as a part of this server:

  • IUPred - Prediction of Intrinsically Unstructured Proteins
  • SEG - Application for the prediction of low complexity regions (LCRs)
  • Weka 3 - Data Mining Software in Java