A

Ali Bashir

École Centrale de Lyon

ORCID: 0000-0002-5240-9604

Publishes on Genomics and Phylogenetic Studies, Genomic variations and chromosomal abnormalities, Genomics and Rare Diseases. 103 papers and 10.1k citations.

103Publications
10.1kTotal Citations

Is this you? Claim your profile.

Add your photo, update your bio, and get notified when your ranking changes.

Top publicationsby citations

An integrated map of structural variation in 2,504 human genomes
Cited by 2.6kOpen Access

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association. The Structural Variation Analysis Group of The 1000 Genomes Project reports an integrated structural variation map based on discovery and genotyping of eight major structural variation classes in 2,504 unrelated individuals from across 26 populations; structural variation is compared within and between populations and its functional impact is quantified. The Structural Variation Analysis Group of The 1000 Genomes Project reports an integrated structural variation map based on discovery and genotyping of eight major structural variation classes in genomes for 2,504 unrelated individuals from across 26 populations. They characterize structural variation within and between populations and quantify its functional effect. The authors further create a phased reference panel that will be valuable for population genetic and disease association studies.

Multi-platform discovery of haplotype-resolved structural variation in human genomes
Mark Chaisson, Ashley D. Sanders, Xuefang Zhao et al.|Nature Communications|2019
Cited by 1kOpen Access

The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.

Origins of the <i>E. coli</i> Strain Causing an Outbreak of Hemolytic–Uremic Syndrome in Germany
David A. Rasko, Dale R. Webster, Jason W. Sahl et al.|New England Journal of Medicine|2011
Cited by 830Open Access

BACKGROUND: A large outbreak of diarrhea and the hemolytic-uremic syndrome caused by an unusual serotype of Shiga-toxin-producing Escherichia coli (O104:H4) began in Germany in May 2011. As of July 22, a large number of cases of diarrhea caused by Shiga-toxin-producing E. coli have been reported--3167 without the hemolytic-uremic syndrome (16 deaths) and 908 with the hemolytic-uremic syndrome (34 deaths)--indicating that this strain is notably more virulent than most of the Shiga-toxin-producing E. coli strains. Preliminary genetic characterization of the outbreak strain suggested that, unlike most of these strains, it should be classified within the enteroaggregative pathotype of E. coli. METHODS: We used third-generation, single-molecule, real-time DNA sequencing to determine the complete genome sequence of the German outbreak strain, as well as the genome sequences of seven diarrhea-associated enteroaggregative E. coli serotype O104:H4 strains from Africa and four enteroaggregative E. coli reference strains belonging to other serotypes. Genomewide comparisons were performed with the use of these enteroaggregative E. coli genomes, as well as those of 40 previously sequenced E. coli isolates. RESULTS: The enteroaggregative E. coli O104:H4 strains are closely related and form a distinct clade among E. coli and enteroaggregative E. coli strains. However, the genome of the German outbreak strain can be distinguished from those of other O104:H4 strains because it contains a prophage encoding Shiga toxin 2 and a distinct set of additional virulence and antibiotic-resistance factors. CONCLUSIONS: Our findings suggest that horizontal genetic exchange allowed for the emergence of the highly virulent Shiga-toxin-producing enteroaggregative E. coli O104:H4 strain that caused the German outbreak. More broadly, these findings highlight the way in which the plasticity of bacterial genomes facilitates the emergence of new pathogens.

Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding
Kevin McKernan, Heather E. Peckham, Gina L. Costa et al.|Genome Research|2009
Cited by 524Open Access

We describe the genome sequencing of an anonymous individual of African origin using a novel ligation-based sequencing assay that enables a unique form of error correction that improves the raw accuracy of the aligned reads to >99.9%, allowing us to accurately call SNPs with as few as two reads per allele. We collected several billion mate-paired reads yielding approximately 18x haploid coverage of aligned sequence and close to 300x clone coverage. Over 98% of the reference genome is covered with at least one uniquely placed read, and 99.65% is spanned by at least one uniquely placed mate-paired clone. We identify over 3.8 million SNPs, 19% of which are novel. Mate-paired data are used to physically resolve haplotype phases of nearly two-thirds of the genotypes obtained and produce phased segments of up to 215 kb. We detect 226,529 intra-read indels, 5590 indels between mate-paired reads, 91 inversions, and four gene fusions. We use a novel approach for detecting indels between mate-paired reads that are smaller than the standard deviation of the insert size of the library and discover deletions in common with those detected with our intra-read approach. Dozens of mutations previously described in OMIM and hundreds of nonsynonymous single-nucleotide and structural variants in genes previously implicated in disease are identified in this individual. There is more genetic variation in the human genome still to be uncovered, and we provide guidance for future surveys in populations and cancer biopsies.