S

Stephanie Kim

Seoul National University Bundang Hospital

ORCID: 0000-0002-5116-0798

Publishes on Protein Structure and Dynamics, Machine Learning in Bioinformatics, Epigenetics and DNA Methylation. 46 papers and 3.4k citations.

46Publications
3.4kTotal Citations

Is this you? Claim your profile.

Add your photo, update your bio, and get notified when your ranking changes.

Top publicationsby citations

Fast and accurate protein structure search with Foldseek
Michel van Kempen, Stephanie Kim, Charlotte Tumescheit et al.|Nature Biotechnology|2023
Cited by 2.3kOpen Access

As structure prediction methods are generating millions of publicly available protein structures, searching these databases is becoming a bottleneck. Foldseek aligns the structure of a query protein against a database by describing tertiary amino acid interactions within proteins as sequences over a structural alphabet. Foldseek decreases computation times by four to five orders of magnitude with 86%, 88% and 133% of the sensitivities of Dali, TM-align and CE, respectively.

Fast and accurate protein structure search with Foldseek
Michel van Kempen, Stephanie Kim, Charlotte Tumescheit et al.|bioRxiv (Cold Spring Harbor Laboratory)|2022
Cited by 377Open Access

As structure prediction methods are generating millions of publicly available protein structures, searching these databases is becoming a bottleneck. Foldseek aligns the structure of a query protein against a database by describing the amino acid backbone of proteins as sequences over a structural alphabet. Foldseek decreases computation times by four to five orders of magnitude with 86%, 88% and 133% of the sensitivities of DALI, TM-align and CE, respectively.

AlphaFold2 reveals commonalities and novelties in protein structure space for 21 model organisms
Nicola Bordin, Ian Sillitoe, Vamsi Nallapareddy et al.|Communications Biology|2023
Cited by 99Open Access

Deep-learning (DL) methods like DeepMind's AlphaFold2 (AF2) have led to substantial improvements in protein structure prediction. We analyse confident AF2 models from 21 model organisms using a new classification protocol (CATH-Assign) which exploits novel DL methods for structural comparison and classification. Of ~370,000 confident models, 92% can be assigned to 3253 superfamilies in our CATH domain superfamily classification. The remaining cluster into 2367 putative novel superfamilies. Detailed manual analysis on 618 of these, having at least one human relative, reveal extremely remote homologies and further unusual features. Only 25 novel superfamilies could be confirmed. Although most models map to existing superfamilies, AF2 domains expand CATH by 67% and increases the number of unique 'global' folds by 36% and will provide valuable insights on structure function relationships. CATH-Assign will harness the huge expansion in structural data provided by DeepMind to rationalise evolutionary changes driving functional divergence.

Novel machine learning approaches revolutionize protein knowledge
Nicola Bordin, Christian Dallago, Michael Heinzinger et al.|Trends in Biochemical Sciences|2022
Cited by 72Open Access

Breakthrough methods in machine learning (ML), protein structure prediction, and novel ultrafast structural aligners are revolutionizing structural biology. Obtaining accurate models of proteins and annotating their functions on a large scale is no longer limited by time and resources. The most recent method to be top ranked by the Critical Assessment of Structure Prediction (CASP) assessment, AlphaFold 2 (AF2), is capable of building structural models with an accuracy comparable to that of experimental structures. Annotations of 3D models are keeping pace with the deposition of the structures due to advancements in protein language models (pLMs) and structural aligners that help validate these transferred annotations. In this review we describe how recent developments in ML for protein science are making large-scale structural bioinformatics available to the general scientific community.