Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS--the 1000 Genomes pilot alone includes nearly five terabases--make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
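To make the separation between analysis logic and data management concrete, the following is a minimal Python sketch of a MapReduce-style coverage calculator. It is not the GATK API (the GATK itself is a Java framework), and every name here (`Read`, `traverse_loci`, `map_depth`, `reduce_sum`, `mean_coverage`) is a hypothetical stand-in chosen for illustration. The "framework" side walks the loci of a region and hands each one to the tool together with the reads overlapping it; the "tool" side supplies only a map function (depth at one locus) and a reduce function (a running sum).

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class Read:
    """Hypothetical aligned read: 0-based start and length on a single contig."""
    start: int
    length: int

    @property
    def end(self) -> int:
        # Exclusive end coordinate of the alignment.
        return self.start + self.length


def traverse_loci(
    reads: List[Read], region_start: int, region_end: int
) -> Iterator[Tuple[int, List[Read]]]:
    """Framework side (assumed): yield each position with its overlapping reads."""
    for pos in range(region_start, region_end):
        yield pos, [r for r in reads if r.start <= pos < r.end]


def map_depth(pos: int, pileup: List[Read]) -> int:
    """Tool side, map step: depth of coverage at one locus."""
    return len(pileup)


def reduce_sum(total: int, depth: int) -> int:
    """Tool side, reduce step: accumulate depth across loci."""
    return total + depth


def mean_coverage(reads: List[Read], region_start: int, region_end: int) -> float:
    """Run the map and reduce steps over a half-open region [start, end)."""
    depths = (
        map_depth(pos, pileup)
        for pos, pileup in traverse_loci(reads, region_start, region_end)
    )
    total = reduce(reduce_sum, depths, 0)
    return total / (region_end - region_start)


if __name__ == "__main__":
    example_reads = [Read(0, 4), Read(2, 4), Read(3, 2)]
    print(mean_coverage(example_reads, 0, 6))
```

Because the per-locus map calls are independent and the reduce step is an associative sum, a traversal engine could in principle split the region into shards and combine partial totals. This is the property that makes this style of tool amenable to the parallelization the abstract describes. The naive overlap scan in `traverse_loci` is for clarity only; a real engine would stream sorted reads rather than rescan the list at every position.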
Transfusion-dependent β-thalassemia (TDT) and sickle cell disease (SCD) are severe monogenic diseases with severe and potentially life-threatening manifestations. BCL11A is a transcription factor that represses γ-globin expression and fetal hemoglobin in erythroid cells. We performed electroporation of CD34+ hematopoietic stem and progenitor cells obtained from healthy donors, with CRISPR-Cas9 targeting the BCL11A erythroid-specific enhancer. Approximately 80% of the alleles at this locus were modified, with no evidence of off-target editing. After undergoing myeloablation, two patients (one with TDT and the other with SCD) received autologous CD34+ cells edited with CRISPR-Cas9 targeting the same BCL11A enhancer. More than a year later, both patients had high levels of allelic editing in bone marrow and blood, increases in fetal hemoglobin that were distributed pancellularly, transfusion independence, and (in the patient with SCD) elimination of vaso-occlusive episodes. (Funded by CRISPR Therapeutics and Vertex Pharmaceuticals; ClinicalTrials.gov numbers, NCT03655678 for CLIMB THAL-111 and NCT03745287 for CLIMB SCD-121.)
Mutations in HBB that cause TDT [4] result in reduced (β+) or absent (β0) β-globin synthesis and an imbalance between the α-like and β-like globin (e.g., β, γ, and δ) chains of hemoglobin, which causes ineffective erythropoiesis. Sickle hemoglobin is the result of a point mutation in HBB that replaces glutamic acid with valine at amino acid position 6. Polymerization of deoxygenated sickle hemoglobin causes erythrocyte deformation, hemolysis, anemia, painful vaso-occlusive episodes, irreversible end-organ damage, and a reduced life expectancy. Treatment options primarily consist of transfusion and iron chelation in patients with TDT [7] and pain management, transfusion, and hydroxyurea in those with SCD [8]. Recently approved therapies, including luspatercept [9] and crizanlizumab [10], have reduced transfusion requirements in patients with TDT and the incidence of vaso-occlusive episodes in those with SCD, respectively, but neither treatment addresses the underlying cause of the disease nor fully ameliorates disease manifestations. Allogeneic bone marrow transplantation can cure both TDT and
The recent discovery of mutations in metabolic enzymes has rekindled interest in harnessing the altered metabolism of cancer cells for cancer therapy. One potential drug target is isocitrate dehydrogenase 1 (IDH1), which is mutated in multiple human cancers. Here, we examine the role of mutant IDH1 in fully transformed cells with endogenous IDH1 mutations. A selective R132H-IDH1 inhibitor (AGI-5198) identified through a high-throughput screen blocked, in a dose-dependent manner, the ability of the mutant enzyme (mIDH1) to produce R-2-hydroxyglutarate (R-2HG). Under conditions of near-complete R-2HG inhibition, the mIDH1 inhibitor induced demethylation of histone H3K9me3 and expression of genes associated with gliogenic differentiation. Blockade of mIDH1 impaired the growth of IDH1-mutant--but not IDH1-wild-type--glioma cells without appreciable changes in genome-wide DNA methylation. These data suggest that mIDH1 may promote glioma growth through mechanisms beyond its well-characterized epigenetic effects.
A number of human cancers harbor somatic point mutations in the genes encoding isocitrate dehydrogenases 1 and 2 (IDH1 and IDH2). These mutations alter residues in the enzyme active sites and confer a gain-of-function in cancer cells, resulting in the accumulation and secretion of the oncometabolite (R)-2-hydroxyglutarate (2HG). We developed a small molecule, AGI-6780, that potently and selectively inhibits the tumor-associated mutant IDH2/R140Q. A crystal structure of AGI-6780 complexed with IDH2/R140Q revealed that the inhibitor binds in an allosteric manner at the dimer interface. The results of steady-state enzymology analysis were consistent with allostery and slow-tight binding by AGI-6780. Treatment with AGI-6780 induced differentiation of TF-1 erythroleukemia and primary human acute myelogenous leukemia cells in vitro. These data provide proof-of-concept that inhibitors targeting mutant IDH2/R140Q could have potential applications as a differentiation therapy for cancer.