Replicated microarray data
cDNA microarrays permit us to study the expression of thousands of genes simultaneously. They are now used in many different contexts to compare mRNA levels between two or more samples of cells. Microarray experiments typically give us expression measurements on a large number of genes, say 10,000-20,000, but with few, if any replicates for each gene. Traditional methods using means and standard deviations to detect differential expression are not completely satisfactory in this context, and so a different approach seems desirable. In this paper we present an empirical Bayes method for analysing replicated microarray data. Data from all the genes in a replicate set of experiments are combined into estimates of parameters of a prior distribution. These parameter estimates are then combined at the gene level with means and standard deviations to form a statistic B which can be used to decide whether differential expression has occurred. The statistic B avoids the problems of using averages or t-statistics. The method is illustrated using data from an experiment comparing the expression of genes in the livers of SR-BI transgenic mice with that of the corresponding wild-type mice. In addition we present the results of a simulation study estimating the ROC curve of B and three other statistics for determining differential expression: the average and two simple modifications of the usual t-statistic. B was found to be the most powerful of the four, though the margin was not great. The data were simulated to resemble the SR-BI data.
Keywords: cDNA microarray, differential expression, empirical Bayes, replication, ROC curve, t-statistic
Department of Mathematics, Uppsala University. Correspondence should be addressed to Ingrid Lönnstedt, telephone/fax +46-18-4712842/4713201, e...
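
The abstract above notes that averages and t-statistics each have problems when a gene has only a few replicates. The following is a minimal Python sketch of that issue, not the paper's method: it computes the average, the ordinary t-statistic, and one assumed modification of t that adds a constant to the standard deviation. The function name `gene_statistics`, the offset `a0` and its value, and the two example genes are all hypothetical; the abstract does not say which two modifications of the t-statistic were studied, and the statistic B itself is not reproduced here.

```python
import math
import statistics


def gene_statistics(log_ratios, a0):
    """Return (average, t, penalized t) for one gene's replicate log-ratios.

    a0 is a hypothetical offset added to the sample standard deviation so
    that a gene with a very small SD does not get an inflated statistic.
    This is an illustrative modification, not necessarily one from the paper.
    """
    n = len(log_ratios)
    mean = statistics.fmean(log_ratios)
    sd = statistics.stdev(log_ratios)  # sample SD, needs n >= 2
    # Ordinary one-sample t-statistic; infinite if all replicates are equal.
    t = mean / (sd / math.sqrt(n)) if sd > 0 else math.copysign(math.inf, mean)
    # Same statistic with the (assumed) offset a0 added to the SD.
    penalized_t = mean / ((a0 + sd) / math.sqrt(n))
    return mean, t, penalized_t


# Two made-up genes with four replicates each.
gene_a = [0.10, 0.11, 0.09, 0.10]  # tiny effect, tiny spread
gene_b = [1.2, 0.8, 1.5, 0.9]      # large effect, moderate spread

stats_a = gene_statistics(gene_a, a0=0.2)
stats_b = gene_statistics(gene_b, a0=0.2)

for name, (mean, t, pen_t) in (("gene_a", stats_a), ("gene_b", stats_b)):
    print(f"{name}: mean={mean:.3f} t={t:.2f} penalized_t={pen_t:.2f}")
```

In this toy example the ordinary t-statistic ranks `gene_a` above `gene_b` only because its sample standard deviation happens to be very small, while the average and the offset version both rank `gene_b` first. The average, in turn, ignores variability altogether.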
Comparison of clustering tools in R for medium-sized 10x Genomics single-cell RNA-sequencing data
Background: The commercially available 10x Genomics protocol to generate droplet-based single-cell RNA-seq (scRNA-seq) data is enjoying growing popularity among researchers. Fundamental to the analysis of such scRNA-seq data is the ability to cluster similar or same cells into non-overlapping groups. Many competing methods have been proposed for this task, but there is currently little guidance with regards to which method to use. Methods: Here we use one gold standard 10x Genomics dataset, generated from the mixture of three cell lines, as well as three silver standard 10x Genomics datasets generated from peripheral blood mononuclear cells to examine not only the accuracy but also robustness of a dozen methods. Results: We found that some methods, including Seurat and Cell Ranger, outperform other methods, although performance seems to be dependent on the complexity of the studied system. Furthermore, we found that solutions produced by different methods have little in common with each other. Conclusions: In light of this, we conclude that the choice of clustering tool crucially determines interpretation of scRNA-seq data generated by 10x Genomics. Hence practitioners and consumers should remain vigilant about the outcome of 10x Genomics scRNA-seq analysis.
Comparison of clustering tools in R for medium-sized 10x Genomics single-cell RNA-sequencing data
Background: The commercially available 10x Genomics protocol to generate droplet-based single cell RNA-seq (scRNA-seq) data is enjoying growing popularity among researchers. Fundamental to the analysis of such scRNA-seq data is the ability to cluster similar or same cells into non-overlapping groups. Many competing methods have been proposed for this task, but there is currently little guidance with regards to which method to use. Methods: Here we use one gold standard 10x Genomics dataset, generated from the mixture of three cell lines, as well as multiple silver standard 10x Genomics datasets generated from peripheral blood mononuclear cells to examine not only the accuracy but also running time and robustness of a dozen methods. Results: We found that Seurat outperformed other methods, although performance seems to be dependent on many factors, including the complexity of the studied system. Furthermore, we found that solutions produced by different methods have little in common with each other. Conclusions: In light of this we conclude that the choice of clustering tool crucially determines interpretation of scRNA-seq data generated by 10x Genomics. Hence practitioners and consumers should remain vigilant about the outcome of 10x Genomics scRNA-seq analysis.
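
The finding that solutions from different methods "have little in common with each other" presupposes some way of scoring the agreement between two clusterings of the same cells. One widely used score is the adjusted Rand index. The Python sketch below shows how it can be computed from two label vectors. The abstract does not name the agreement measure used, so this choice is an assumption made for illustration, and the function name and toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two clusterings of the same items.

    1.0 means identical partitions (label names do not matter); values
    near 0 mean agreement no better than expected by chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Number of item pairs that share a cluster in both, in A, and in B.
    both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    in_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = in_a * in_b / comb(n, 2)  # chance-level agreement
    maximum = (in_a + in_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both
    return (both - expected) / (maximum - expected)


# Toy example: cluster labels for six cells from two hypothetical methods.
method_1 = [0, 0, 0, 1, 1, 1]
method_2 = ["x", "x", "y", "y", "z", "z"]
print(adjusted_rand_index(method_1, method_2))
```

Because the index depends only on which cells are grouped together, it can compare outputs from tools that number or name their clusters differently.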
Familial testicular cancer and second primary cancers in testicular cancer patients by histological type
C Dong, Ingrid Lönnstedt, Kari Hemminki|European Journal of Cancer|2001

Physical fitness, but not muscle strength, is a risk factor for death in amyotrophic lateral sclerosis at an early age
Peter Mattsson, Ingrid Lönnstedt, Ingela Nygren et al.|Journal of Neurology Neurosurgery & Psychiatry|2010
BACKGROUND: Amyotrophic lateral sclerosis (ALS) is a rare neurodegenerative disorder mainly characterised by motor symptoms. Extensive physical activity has been implicated in the aetiology of ALS. Differences in anthropometrics, physical fitness and isometric strength measured at 18-19 years were assessed to determine if they are associated with subsequent death in ALS. METHOD: Data on body weight and height, physical fitness, resting heart rate and isometric strength measured at conscription were linked with data on death certificates in men born in 1951-1965 in Sweden (n=809 789). Physical fitness was assessed as a maximal test on an electrically braked bicycle ergometer. Muscle strength was measured as the maximal isometric strength in handgrip, elbow flexion and knee extension in standardised positions, using a dynamometer. Analyses were based on 684 459 (84.5%) men because of missing data. A matched case control study within this sample was performed. The population was followed until 31 December 2006, and 85 men died from ALS during this period. RESULTS: Weight adjusted physical fitness (W/kg), but not physical fitness per se, was a risk factor for ALS (OR 1.98, 95% CI 1.32 to 2.97), whereas resting pulse rate, muscle strength and other variables were not. CONCLUSIONS: Physical fitness, but not muscle strength, is a risk factor for death at early age in ALS. This may indicate that a common factor underlies both fitness (W/kg) and risk of ALS.