The PDID database provides access to a comprehensive set of putative and native protein-drug interactions in the structural human proteome. The structural human proteome includes about 10,000 human and human-like proteins (the latter having high sequence similarity to human proteins) with known 3-D structures. The database includes data for popular, FDA-approved drugs. The corresponding protein-drug interactions were generated with three predictors, and were also collected from, and linked with, three related databases of known protein-drug interactions.
A tutorial that explains how to use PDID is available here.
The structural human proteome was collected from the Protein Data Bank after removing low-resolution structures (resolution worse than 3 Å). Only proteins whose sequences can be mapped to human proteins in the Ensembl database were used: structures of chains with at least 90% sequence identity (measured using BLAST) to any human protein from release 68 of Ensembl were selected. The list of included proteins (identifiers from the Protein Data Bank) is available at http://biomine.cs.vcu.edu/servers/PDID/files/list_proteome.txt. The structural human proteome will be updated periodically in future releases of PDID as new data are deposited into the Protein Data Bank.
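The two selection criteria above can be illustrated with a minimal sketch. It is not the pipeline used to build PDID: the `Chain` record, its field names, the sample values, and the inclusive handling of the 3 Å and 90% boundaries are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical record; the field names are illustrative, not PDID's schema.
@dataclass(frozen=True)
class Chain:
    chain_id: str                # placeholder identifier, not a real PDB entry
    resolution: float            # in angstroms; larger values mean lower resolution
    best_human_identity: float   # best BLAST % identity to any human protein

MAX_RESOLUTION = 3.0   # assumed inclusive cutoff
MIN_IDENTITY = 90.0    # assumed inclusive cutoff

def select_structural_proteome(chains):
    """Keep chains that pass both the resolution and the identity filter."""
    return [
        c for c in chains
        if c.resolution <= MAX_RESOLUTION and c.best_human_identity >= MIN_IDENTITY
    ]

# Made-up example data.
chains = [
    Chain("AAAA_A", 1.8, 100.0),  # passes both filters
    Chain("BBBB_A", 3.5, 99.0),   # rejected: resolution too low
    Chain("CCCC_B", 2.4, 72.0),   # rejected: not human-like enough
    Chain("DDDD_A", 3.0, 90.0),   # passes under the inclusive assumption
]
selected = select_structural_proteome(chains)
print([c.chain_id for c in selected])
```

Whether the published cutoffs include or exclude the boundary values is not stated in this description, so the comparison operators may need adjusting.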
The database includes the FDA-approved drugs and nutraceuticals found in structures of proteins from the PDB, which were extracted with the help of PDBsum. Structures of protein-drug complexes are required to predict the drug targets that are stored in PDID. The list of included drugs is available at http://biomine.cs.vcu.edu/servers/PDID/files/list_drugs.txt. Additional drugs will be added periodically in future releases of PDID.
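Once one of the list files has been downloaded, it can be read with a few lines of code. The sketch below assumes the file holds one identifier per line, which is not confirmed by this description; the function name `parse_id_list` and the sample text are hypothetical.

```python
def parse_id_list(text):
    """Return identifiers in file order, skipping blank lines and duplicates.

    Assumes one identifier per line; adjust if the real file uses
    another layout (for example, several columns per line).
    """
    ids = []
    seen = set()
    for line in text.splitlines():
        entry = line.strip()
        if not entry or entry in seen:
            continue
        seen.add(entry)
        ids.append(entry)
    return ids

# Made-up content standing in for a downloaded list file.
sample = "entry_one\n\nentry_two\n  entry_one  \nentry_three\n"
ids = parse_id_list(sample)
print(ids)
```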
The protein-drug interactions available in PDID include both known and putative (predicted) interactions.
The known interactions were collected from the DrugBank [1], BindingDB [2], and Protein Data Bank [3] resources. These interactions are annotated in PDID as known and are linked to the corresponding databases.
The putative interactions were predicted with three methods:
1. A customized version of the eFindSite method [4, 5], which predicts targets based on similarity of binding pockets using threading.
2. A customized version of the SMAP method [6], which predicts targets based on similarity of binding pockets and protein fold using profile-profile alignment.
3. The ILbind method [7], which predicts targets using a consensus of 15 support vector machines and combines similarity based on threading and profile-profile alignment.
The proteins are mapped to the UniProt database [8] using UniProt identifiers, which facilitates cross-referencing between PDID, the Protein Data Bank, DrugBank, and BindingDB.
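A shared UniProt identifier makes it straightforward to combine annotations from several sources for the same protein. The following minimal sketch groups interaction records by UniProt identifier and drug; the record layout, the source labels, and the placeholder identifiers are assumptions for illustration and do not reflect PDID's internal schema.

```python
from collections import defaultdict

def group_by_uniprot(records):
    """Map each UniProt identifier to {drug: set of sources reporting it}."""
    grouped = defaultdict(lambda: defaultdict(set))
    for source, uniprot_id, drug in records:
        grouped[uniprot_id][drug].add(source)
    return grouped

# Made-up records: (source, UniProt identifier, drug). All values are placeholders.
records = [
    ("DrugBank", "UNIPROT_1", "drug_A"),
    ("putative", "UNIPROT_1", "drug_A"),
    ("BindingDB", "UNIPROT_1", "drug_B"),
    ("Protein Data Bank", "UNIPROT_2", "drug_A"),
]

grouped = group_by_uniprot(records)
for uniprot_id in sorted(grouped):
    for drug in sorted(grouped[uniprot_id]):
        print(uniprot_id, drug, sorted(grouped[uniprot_id][drug]))
```

Keying on the UniProt identifier rather than on a source-specific identifier lets a known interaction from one database and a putative prediction for the same protein appear side by side.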
1. Wishart DS, Knox C, Guo AC, et al. (2006). DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res 34:D668-72
2. Liu T, Lin Y, Wen X, et al. (2007). BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities. Nucleic Acids Res 35:D198-201
3. Berman HM, Westbrook J, Feng Z, et al. (2000). The Protein Data Bank. Nucleic Acids Res 28:235-42
4. Brylinski M and Feinstein WP (2013). eFindSite: Improved prediction of ligand binding sites in protein models using meta-threading, machine learning and auxiliary ligands. J Comput Aided Mol Des 27(6):551-567
5. Feinstein WP and Brylinski M (2014). eFindSite: Enhanced fingerprint-based virtual screening against predicted ligand binding sites in protein models. Mol Inform 33(2):135-50
6. Xie L and Bourne PE (2008). Detecting evolutionary relationships across existing fold space, using sequence order independent profile-profile alignments. Proc Natl Acad Sci USA 105(14):5441-6
7. Hu G, Gao J, Wang K, et al. (2012). Finding protein targets for small biologically relevant ligands across fold space using inverse ligand binding predictions. Structure 20:1815-22
8. The UniProt Consortium (2014). Activities at the Universal Protein Resource (UniProt). Nucleic Acids Res 42:D191-8