Georgetown University
Publishes on genomics and phylogenetic studies, RNA and protein synthesis mechanisms, and machine learning in bioinformatics. 59 papers and 4.1k citations.
The Protein Identification Resource, which provides the scientific community with an efficient on-line computer system designed for the identification and analysis of protein sequences and their corresponding coding sequences, has been established. The resource consists of an integrated computer system composed of a number of protein and nucleic acid sequence databases and the software necessary to analyze this information effectively.
We observed two unusual patterns of cysteine and glycine residues in transforming growth factor type 1 (TGF1); our computer search found only one other protein, the 19-kilodalton early protein of vaccinia virus, having both patterns. The sequences of epidermal growth factor (EGF) and of the light chains from several components of the blood coagulation system also have one of the patterns, but gaps are required to adjust conserved cysteine and glycine residues in the second pattern. We used several computer analyses to confirm these relationships; the 19-kilodalton protein appears to be related to TGF1 and EGF to the same degree that they are related to each other; all three are more distantly related to the coagulation factors. An evolutionary scheme is presented for these proteins. We suggest that the conservation of cysteine residues, which form the disulfide bonds present in the active EGF molecule, may extend to conservation of disulfide bonds in these other proteins. We also suggest that the structural similarities may be correlated with a protein-binding capability.