GenVarLoader: An accelerated dataloader for applying deep learning to personalized genomics
David Laub (University of California San Diego), A. Ho (Salk Institute for Biological Studies), Jeff Jaureguy (Salk Institute for Biological Studies), Adam Klie (University of California San Diego), Rany M. Salem (University of California San Diego), Graham McVicker (Salk Institute for Biological Studies), Hannah Carter (University of California San Diego)
Cited by 2 | Open Access
Abstract
Deep learning sequence models trained on personalized genomics can improve variant effect prediction; however, applications of these models are limited by the computational requirements for storing and reading large datasets. We address this with GenVarLoader, which stores personalized genomic data in new memory-mapped formats with optimal data locality to achieve ~1,000x faster throughput and ~2,000x better compression compared to existing alternatives.
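
To illustrate the general idea of memory-mapped storage with good data locality, here is a minimal sketch using NumPy's `np.memmap`. It is not GenVarLoader's on-disk format or API; the file name `tracks.bin`, the helper functions `write_tracks` and `read_window`, and the (samples, positions) layout are all assumptions made for illustration. The point is only that when each sample's values are stored contiguously, a window for one sample can be read as a single contiguous slice without loading the whole file.

```python
import os
import tempfile

import numpy as np


def write_tracks(path, data):
    """Write a (samples, positions) array to a flat binary file in row-major order."""
    mm = np.memmap(path, dtype=data.dtype, mode="w+", shape=data.shape)
    mm[:] = data
    mm.flush()  # make sure the bytes reach the file
    del mm


def read_window(path, shape, dtype, sample, start, end):
    """Read one sample's [start, end) window; only the touched pages are loaded."""
    mm = np.memmap(path, dtype=dtype, mode="r", shape=shape)
    # Row-major layout keeps this slice contiguous on disk.
    return np.array(mm[sample, start:end])  # copy out of the mapping


# Hypothetical toy data: 4 samples, 1000 positions each.
n_samples, n_positions = 4, 1000
data = np.arange(n_samples * n_positions, dtype=np.float32).reshape(
    n_samples, n_positions
)

path = os.path.join(tempfile.mkdtemp(), "tracks.bin")
write_tracks(path, data)

window = read_window(path, data.shape, data.dtype, sample=2, start=100, end=110)
```

In this layout, reading one sample's window is a sequential read, whereas reading the same positions across all samples would be strided; which axis to keep contiguous depends on the expected access pattern.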