Shoshana Brown

Neighborhood House

Publishes on Protein Structure and Dynamics, Enzyme Structure and Function, Microbial Metabolic Engineering and Bioproduction. 42 papers and 3.9k citations.

Top publications by citations

InterPro in 2019: improving coverage, classification and access to protein sequence annotations
Alex Mitchell, Teresa K. Attwood, Patricia C. Babbitt et al. | Nucleic Acids Research | 2018
Cited by 1.5k | Open Access

The InterPro database (http://www.ebi.ac.uk/interpro/) classifies protein sequences into families and predicts the presence of functionally important domains and sites. Here, we report recent developments with InterPro (version 70.0) and its associated software, including an 18% growth in the size of the database in terms of new InterPro entries, updates to content, the inclusion of an additional entry type, refined modelling of discontinuous domains, and the development of a new programmatic interface and website. These developments extend and enrich the information provided by InterPro, and provide greater flexibility in terms of data access. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB, and discuss how our evaluation of residue coverage may help guide future curation activities.

Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies
Alexandra M. Schnoes, Shoshana Brown, Igor Dodevski et al. | PLoS Computational Biology | 2009
Cited by 711 | Open Access

Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available. The manually curated database Swiss-Prot shows the lowest annotation error levels (close to 0% for most families); the two other protein sequence databases (GenBank NR and TrEMBL) and the protein sequences in the KEGG pathways database exhibit similar and surprisingly high levels of misannotation that average 5%-63% across the six superfamilies studied. For 10 of the 37 families examined, the level of misannotation in one or more of these databases is >80%. Examination of the NR database over time shows that misannotation has increased from 1993 to 2005. The types of misannotation that were found fall into several categories, most associated with "overprediction" of molecular function. These results suggest that misannotation in enzyme superfamilies containing multiple families that catalyze different reactions is a larger problem than has been recognized. Strategies are suggested for addressing some of the systematic problems contributing to these high levels of misannotation.

The Structure–Function Linkage Database
Eyal Akiva, Shoshana Brown, Daniel E. Almonacid et al. | Nucleic Acids Research | 2013
Cited by 244 | Open Access

The Structure-Function Linkage Database (SFLD, http://sfld.rbvi.ucsf.edu/) is a manually curated classification resource describing structure-function relationships for functionally diverse enzyme superfamilies. Members of such superfamilies are diverse in their overall reactions yet share a common ancestor and some conserved active site features associated with conserved functional attributes such as a partial reaction. Thus, despite their different functions, members of these superfamilies 'look alike', making them easy to misannotate. To address this complexity and enable rational transfer of functional features to unknowns only for those members for which we have sufficient functional information, we subdivide superfamily members into subgroups using sequence information, and lastly into families, sets of enzymes known to catalyze the same reaction using the same mechanistic strategy. Browsing and searching options in the SFLD provide access to all of these levels. The SFLD offers manually curated as well as automatically classified superfamily sets, both accompanied by search and download options for all hierarchical levels. Additional information includes multiple sequence alignments, tab-separated files of functional and other attributes, and sequence similarity networks. The latter provide a new and intuitively powerful way to visualize functional trends mapped to the context of sequence similarity.

Leveraging Enzyme Structure–Function Relationships for Functional Inference and Experimental Design: The Structure–Function Linkage Database
Scott C.‐H. Pegg, Shoshana Brown, Sunil Ojha et al. | Biochemistry | 2006
Cited by 151

The study of mechanistically diverse enzyme superfamilies (collections of enzymes that perform different overall reactions but share both a common fold and a distinct mechanistic step performed by key conserved residues) helps elucidate the structure-function relationships of enzymes. We have developed a resource, the Structure-Function Linkage Database (SFLD), to analyze these structure-function relationships. Unique to the SFLD is its hierarchical classification scheme based on linking the specific partial reactions (or other chemical capabilities) that are conserved at the superfamily, subgroup, and family levels with the conserved structural elements that mediate them. We present the results of analyses using the SFLD in correcting misannotations, guiding protein engineering experiments, and elucidating the function of recently solved enzyme structures from the structural genomics initiative. The SFLD is freely accessible at http://sfld.rbvi.ucsf.edu.