Brian Karlak

PANTHER: A Library of Protein Families and Subfamilies Indexed by Function

Paul D. Thomas, Michael J. Campbell, Anish Kejariwal et al.|Genome Research|2003

Cited by 3.1k|Open Access

In the genomic era, one of the fundamental goals is to characterize the function of proteins on a large scale. We describe a method, PANTHER, for relating protein sequence relationships to function relationships in a robust and accurate way. PANTHER is composed of two main components: the PANTHER library (PANTHER/LIB) and the PANTHER index (PANTHER/X). PANTHER/LIB is a collection of "books," each representing a protein family as a multiple sequence alignment, a Hidden Markov Model (HMM), and a family tree. Functional divergence within the family is represented by dividing the tree into subtrees based on shared function, and by subtree HMMs. PANTHER/X is an abbreviated ontology for summarizing and navigating molecular functions and biological processes associated with the families and subfamilies. We apply PANTHER to three areas of active research. First, we report the size and sequence diversity of the families and subfamilies, characterizing the relationship between sequence divergence and functional divergence across a wide range of protein families. Second, we use the PANTHER/X ontology to give a high-level representation of gene function across the human and mouse genomes. Third, we use the family HMMs to rank missense single nucleotide polymorphisms (SNPs), on a database-wide scale, according to their likelihood of affecting protein function.

Wiskott–Aldrich Syndrome Protein, a Novel Effector for the GTPase CDC42Hs, Is Implicated in Actin Polymerization

Marc Symons, Jonathan M.J. Derry, Brian Karlak et al.|Cell|1996

Cited by 855|Open Access

From genes to proteins: High-throughput expression and purification of the human proteome

Joanna S. Albala, Ken Franke, Ian R. McConnell et al.|Journal of Cellular Biochemistry|2000

Cited by 80

The development of high-throughput methods for gene discovery has paved the way for the design of new strategies for genome-scale protein analysis. Lawrence Livermore National Laboratory and Onyx Pharmaceuticals, Inc., have produced an automatable system for the expression and purification of large numbers of proteins encoded by cDNA clones from the IMAGE (Integrated Molecular Analysis of Genomes and Their Expression) collection. This high-throughput protein expression system has been developed for the analysis of the human proteome, the protein equivalent of the human genome, comprising the translated products of all expressed genes. Functional and structural analysis of novel genes identified by EST (Expressed Sequence Tag) sequencing and the Human Genome Project will be greatly advanced by the application of this high-throughput expression system for protein production. A prototype was designed to demonstrate the feasibility of our approach. Using a PCR-based strategy, 72 unique IMAGE cDNA clones have been used to create an array of recombinant baculoviruses in a 96-well microtiter plate format. Forty-two percent of these cDNAs successfully produced soluble, recombinant protein. All of the steps in this process, from PCR to protein production, were performed in 96-well microtiter plates, and are thus amenable to automation. Each recombinant protein was engineered to incorporate an epitope tag at the amino terminal end to allow for immunoaffinity purification. Proteins expressed from this system are currently being analyzed for functional and biochemical properties.

Assessment of Utility of ESTs for Nucleotide Diversity Using Available Assembled Alignments from dbEST, STACK 2.0 and STACK-INDEX

Brian Karlak, Yoshihide Hayashizaki|Proceedings Genome Informatics Workshop/Genome informatics|1998

Cited by 1

Single Nucleotide Polymorphisms (SNPs) in virtual expressed gene fragment alignments represent a potentially significant resource for the detection of both non-coding and coding sequence variations. We have clustered and assembled 767 866 human ESTs into 76 131 alignments localised to specific tissues [1]. In addition, we have clustered and aligned 300 000 consensus sequences and unclustered ESTs to generate a comprehensive human gene index of over 38 000 unique linked virtual transcripts (STACKINDEX) with associated alignments. The resulting dataset is a potentially rich resource for the detection and characterisation of alternate splicing and polymorphisms. Public access to these data will allow investigators to add functional and scientific value to the emerging human gene sequences [2]. We have surveyed the dataset and have developed an initial set of criteria for assessment of possible high-likelihood SNPs. We have studied the protein p53 as a model for the system.

Refined music clustering

Christian Weitenberner, Ullas Gargi, Girum Ibssa et al.|Unknown|2018

Cited by 0

Top publications by citations