Chang Han

Hunan Normal University

ORCID: 0000-0002-6282-1256

Publishes on Cosmology and Gravitation Theories, Bioinformatics and Genomic Networks, Genomics and Phylogenetic Studies. 10 papers and 1.1k citations.

10 Publications
1.1k Total Citations
#7 in Base Editing

Top publications by citations

Majorbio Cloud: A one‐stop, comprehensive bioinformatic platform for multiomics analyses
Yi Ren, Yu Guo, Caiping Shi et al. | iMeta | 2022
Cited by 847 | Open Access

The platform consists of three modules: pre-configured bioinformatic pipelines, cloud toolsets, and online omics courses. The pre-configured bioinformatic pipelines not only combine analytic tools for metagenomics, genomes, transcriptome, proteomics and metabolomics, but also provide users with powerful and convenient interactive analysis reports, which allow them to analyze and mine data independently. As a useful supplement to the bioinformatics pipelines, a wide range of cloud toolsets can further meet the needs of users for daily biological data processing, statistics, and visualization. The rich online multi-omics courses also provide researchers with a state-of-the-art platform for interactive communication and knowledge sharing.

Majorbio Cloud 2024: Update single‐cell and multiomics workflows
Chang Han, Caiping Shi, Linmeng Liu et al. | iMeta | 2024
Cited by 249 | Open Access

Majorbio Cloud (https://cloud.majorbio.com/) is a one-stop online analytic platform aiming at promoting the development of bioinformatics services, narrowing the gap between wet and dry experiments, and accelerating the discoveries for the life sciences community. In 2024, three single-omics workflows, two multiomics workflows, and extensions were newly released to facilitate omics data mining and interpretation. Advances in high-throughput multiomics technologies have significantly influenced life science and basic medical research, specifically based on multiomics data, including genomic/transcriptomic sequencings and proteomic/metabolomic mass spectra, paving the way for the discovery of novel predictive biomarkers for predicting treatment response from diverse dimension levels. The state-of-the-art multiomics technologies have enabled researchers to understand biological processes and molecular functions in health and disease. The emerging novel omics strategies and instruments continue to evolve toward higher throughput and lower detection costs. The evergrowing quantity of multiomics data needed to have access to the resources and be analyzed in an easy, fast, and accurate way. The requirement for the development and application of appropriate bioinformatic tools and pipelines to interpret these data is urgent. Two key elements of omics are automatic data analysis and data visualization. Bioinformatics analysis platforms, such as Cell Ranger [1], MetaboAnalyst [2], GEPIA2 [3], and iNAP [4] provide web interfaces to access the data and computational results. However, these interaction-friendly web services are designed for a single type of omics. Majorbio Cloud (https://cloud.majorbio.com/) offers an easy and powerful approach to profiling the bulk transcriptome, single-cell transcriptome, proteome, metabolome, metagenome, and other omics data. 
It facilitates researchers to analyze complex multiomics data and infer the biological meaning of integrated omics data. Since Majorbio Cloud's first publication in iMeta, it has attracted the attention of researchers around the world and has been widely used by researchers who are not specialists in omics or bioinformatics [5]. Furthermore, it is an interactive communication and omics knowledge dispersion platform. Single-cell RNA sequencing is an emerging technology for high-throughput sequencing analysis of genetic material at the level of individual cells [6]. It has been widely applied in immunology, developmental biology, oncology, cardiology, and neurobiology. The single-cell transcriptomics workflow is an easy-to-use and effective pipeline for high-dimensional single-cell transcriptome data mining, including the following six steps: (1) data preprocessing; (2) cell filtration; (3) batch effect removal and sample merging; (4) clustering; (5) marker gene identification; and (6) downstream analysis. The detailed process is as follows: Reads are processed using the Cell Ranger (v7.1.0) with default parameters. FASTQ files generated by the Illumina sequencer are aligned to the genome. The Seurat package was used for cell normalization and regression based on the unique molecular identifier counts for each sample and mito % to obtain the scaled data, which was normalized by the function NormalizeData for further analysis. The function FindVariableGenes was used to calculate highly variable genes across the single cells. Unsupervised cell cluster results were generated based on the principal component analysis's (PCA's) top 30 principal components by applying the graph-based cluster method (resolution 0.8) in the Seurat package. For subclustering, we applied the same procedure of scaling, dimensionality reduction, and clustering to a specific set of data (usually restricted to one cell type). 
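The filtering and normalization steps described above can be illustrated with a minimal sketch. It assumes that normalization follows the log-normalization that Seurat's NormalizeData applies by default (counts divided by the cell total, multiplied by a scale factor of 10,000, then log1p-transformed) and that mitochondrial genes are recognized by the human "MT-" prefix. The function names and the prefix are illustrative assumptions, not part of the Majorbio Cloud code.

```python
import math

def mito_fraction(counts, mito_prefix="MT-"):
    """Fraction of a cell's UMI counts that come from mitochondrial genes.

    The "MT-" prefix is an assumption (human gene naming convention).
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    mito = sum(c for gene, c in counts.items() if gene.startswith(mito_prefix))
    return mito / total

def log_normalize(counts, scale_factor=10_000):
    """Log-normalize one cell: log1p(count / cell_total * scale_factor)."""
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: math.log1p(c / total * scale_factor)
            for gene, c in counts.items()}

# One hypothetical cell, given as gene -> UMI count.
cell = {"MT-CO1": 2, "ACTB": 6}
print(mito_fraction(cell))
print(log_normalize(cell))
```

A cell whose mitochondrial fraction exceeds a chosen cutoff would be dropped before normalization; the cutoff itself is not stated in the text and is left to the user here.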
For each cluster, we used the Wilcoxon rank-sum test to find significantly differentially expressed genes relative to the remaining clusters. SingleR [7] and known marker genes were used to identify cell types. Downstream analysis, such as differential gene expression and pathway enrichment of different cell types, pseudo-time analysis, and cell communication analysis, could be used to reveal the functions, states, and interactions of various types of cells in a sample (Figure 1). The proteomics workflow is a user-friendly, comprehensive pipeline for data-independent acquisition mass spectrometry-based, label-free quantitation (LFQ), and Tandem mass tag-based quantitative proteomics data processing, analysis, and interpretation (Figure 2). The standard proteomics workflow consists of seven main modules: data processing, protein expression and functional annotation, statistical analysis, protein set analysis, weighted gene correlation network analysis (WGCNA), gene set enrichment analysis (GSEA), and time-series data analysis. An additional module, biomarker discovery and model development, is provided for medical cohort research. The proteomics data processing module filters low-quality data and estimates missing values. The protein expression and functional annotation module includes Venn, PCA, correlation analysis, and functional annotations based on databases or software. Paired/unpaired t test, analysis of variance, Kruskal–Wallis test, and post-hoc test are provided for identifying proteins with statistically significant differences. A protein set is a protein list that is related to the phenotype of the research object according to the protein expression profile, functional annotation, biological pathway enrichment, and research background. Users can generate protein sets of interest and interpret the data via clustering, protein–protein interaction, pathway analysis, functional enrichment, and so on. 
LASSO-Logistic/Cox regression, Random Forest [8], and SVM [9] can be used for disease risk prediction, early diagnosis, prognosis monitoring, and response to treatment. Metabolomics research is primarily based on the use of liquid/gas chromatography-mass spectrometry and nuclear magnetic resonance spectroscopy to detect, identify, and quantify small molecule metabolites in organisms [10]. Metabolomics data is large and complex, often requiring specialized data analysis software as well as extensive knowledge of cheminformatics, bioinformatics, and statistics. To enable users to perform metabolomics data analysis easily and quickly, we provide a comprehensive solution for metabolomics workflow (Figure S1). The standard metabolomics workflow consists of five steps: (1) Data preprocessing: the methods mainly include filtering the missing values of the original data, missing value estimation, data normalization, quality control verification, and data transformation. (2) Sample comparison analysis: multivariate statistical analysis was performed by PCA and partial least squares discriminant analysis (PLS-DA); (3) metabolite annotation: metabolites were annotated in kyoto encyclopedia of genes and genomes (http://www.genome.jp/kegg/) and human metabolome database (https://hmdb.ca/) databases; (4) differential expression metabolites analysis: a combination of multidimensional analysis and single-dimensional analysis was used to screen differential metabolites between groups; and (5) metabolite set analysis: analysis and visualization of the key or differential expression metabolites, such as metabolite clustering, correlation analysis, and so on. Moreover, we also provide some advanced analyses to reveal the mysteries of biological processes, such as biomarker discovery by random forest, support vector machine (SVM), and so on. 
The multiomics technologies facilitate researchers to uncover underlying mechanistic insights into disease pathophysiology and delineate the landscape of clinical phenotypes. Multiomics provides an integrated perspective across multiple levels, while single omics data can only partially explain one aspect of complex biological processes [11]. The transcriptomic and proteomic data combined analysis pipeline supports differential expression analysis, correlation between messenger RNA (mRNA) and protein abundance, functional annotation and enrichment, GSVA [12], and interactive visualization including Venn, quadrant diagram, nine quadrant diagram, bubble plot, box plot, and donut plot. The pipeline enables a combined, complementary insight, which improves a comprehensive understanding of biological molecular processes from mRNA to protein. The microbiome and metabolome association analysis workflow can be used to analyze the association between species/function and metabolites so as to help establish the logical association between “species/function—metabolite—phenotype/target organ.” The results systematically delineate the regulatory mechanisms of biological processes of different dimensions. To facilitate the intuitive presentation of scientific findings, the workflow provides a broad diversity of analyses. The main analysis contents are as follows: (1) annotation and abundance (species, KEGG orthology genes, and metabolic species) of the single-omics feature set; (2) procrustes and orthogonal partial least squares discriminant analysis (O2PLS) are used to analyze the synergy between microbial communities and metabolites and to screen the species and metabolites that contributed the most to distinguishing different groups of samples. 
(3) The association between key flora and metabolites can be explained by HCLUST correlation analysis, Mantel test network heatmap, expression correlation heatmap and chord map, expression correlation network, linear regression analysis, MaAsLin analysis, and canonical correlation analysis. (4) The microbiome and metabolome data are used to form a combinatorial marker panel, and four integrated machine learning algorithms, including random forest, SVM, least absolute shrinkage and selection operator (LASSO), and logistic regression, are used to efficiently screen predictive biomarkers. In addition, MIMOSA2 (a metabolic network-based tool for inferring mechanism-supported relationships in microbiome-metabolome data) [13], mmvec [14], and WGCNA are available for further analyses to interpret the possible interactions between microorganisms and metabolites. (5) Metabolite detection technology is used to detect intermediate metabolites. In combination with the metabolic pathways predicted by metagenome data, the downstream metabolic pathways can be reconstructed to obtain a complete microbial metabolic pathway. To improve the user experience and expand the depth of analysis, we have developed a completely new interactive analysis mode. Taking the eukaryotic reference transcriptome analysis pipeline as an example, users can select the data table generated in the pipeline and set parameters in extension tools to complete more in-depth data mining. The intermediate data generated by the workflow is extracted into JavaScript Object Notation (JSON) format parameters and transmitted in encrypted form to a specific tool via base64, and the parameters are parsed in the tool (Figure S2). 
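The parameter hand-off between a pipeline and an extension tool can be sketched roughly as follows. This is only an illustration of serializing parameters to JSON and wrapping them in base64; the field names are hypothetical, and the actual Majorbio Cloud payload format is not given in the text. Base64 is an encoding rather than a cipher, so whatever encryption layer the platform applies is not shown here.

```python
import base64
import json

def encode_params(params: dict) -> str:
    """Serialize parameters to JSON and wrap them in URL-safe base64."""
    raw = json.dumps(params, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")

def decode_params(token: str) -> dict:
    """Reverse of encode_params: base64-decode, then parse the JSON."""
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    return json.loads(raw.decode("utf-8"))

# Hypothetical payload: which table to pass on and how to plot it.
payload = {"table_id": "diff_exp_001", "tool": "volcano", "fc_cutoff": 2.0}
token = encode_params(payload)
print(token)
print(decode_params(token) == payload)
```

The URL-safe alphabet is used so that the token could travel in a query string without further escaping, which is a design choice of this sketch rather than something stated in the source.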
Twenty-eight eukaryotic reference transcriptome analysis pipeline-specific extension tools are available for users, including integrative genomics viewer visualization, multiref-genome blast, differential expression genes radar chart, hyperbolic curve volcano chart, circos chart, single gene GSEA, multipathways GSEA, and so on. Since October 2016, more than 150,000 scientific and clinical research users, involving over 9000 well-known universities and institutions, have completed more than 600,000 omics data mining tasks on the Majorbio Cloud platform. In 2024, 2015 journal articles cited Majorbio Cloud in their methods. 20, 62, and 393 research articles have been published with the facilitation of the single-cell transcriptomics workflow, proteomics workflow, and metabolomics workflow for data mining, respectively. We will constantly update and iterate the platform to make our users delve more deeply into the omics data. Jichen Han, Chang Han, Caiping Shi, and Linmeng Liu conceived the platform and idea. Linmeng Liu, Caiping Shi, and Wenyao Fu implemented the MIST main code. Chang Han, Qianqian Yang, Yan Wang, and Xiaodan Li designed the graphical user interface. Chang Han, Yan Wang, Xiaodan Li, and Qianqian Yang wrote the manuscript. Chang Han was responsible for editing and revising the manuscript. All authors contributed to the development of Majorbio Cloud. All authors have read the final manuscript and approved it for publication. The authors acknowledge Dr. Boya Liao for the advice on this manuscript. This work was supported by a grant from the Shanghai Science and Technology Little Giant Project (220HX001400). The authors declare no conflict of interest. Supporting Information (graphical abstract, Supporting Information Table) may be found in the online DOI or iMeta Science http://www.imeta.science/. Figure S1: Metabolomics workflow. Figure S2: “Pipeline + Extensions” interactive analysis mode. 
Please note: The publisher is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.

MIST: A microbial identification and source tracking system for next‐generation sequencing data
Minghui Song, Chang Han, Linmeng Liu et al. | iMeta | 2023
Cited by 5 | Open Access

The Professional Committee of Microbiology of the National Pharmacopoeia Commission organized the drafting of the Technical Guidelines for Microbial Whole Genome Sequencing (WGS), aiming to standardize the method process and technical indicators of microbial WGS and ensure the accuracy of sequencing and identification. On the basis of the Guidelines, we developed an integrated microbial identification and source tracking (MIST) system, which could meet the needs of microbial identification and contamination investigation in food and drug quality control. MIST integrates three analysis pipelines: 16S/18S/internal transcribed spacer amplicon-based microbial identification, WGS-based microbial identification, and single-nucleotide polymorphism-based microbial source tracking. MIST can analyze sequence data in a variety of formats, such as Fasta, base call file, and FASTQ. It can be connected to a high-throughput sequencing instrument to acquire sequencing data directly. We also developed a publicly accessible web server for MIST (http://syj.i-sanger.cn). Microbial identification is of great value for clinical, epidemiological, food, and pharmaceutical research [1]. Traditionally, microbes have been identified based on their morphological, physical, and biochemical properties [2]. However, many prokaryotic microbes are difficult to culture using traditional methods [3] and thus cannot be detected by traditional methods. These unculturable microbes harbor a potential source of novel metabolites and are essential components of natural metabolic networks [4]. Moreover, traditional methods also fail to detect novel culturable microbes and have problems in detecting unusual microbes that have not been comprehensively evaluated [5]. High-throughput sequencing technology (HTS) has enabled sequence-based genomics to become one of the routine and promising methods for microbial identification [6]. 
HTS-based methods can be subdivided into two categories: amplicon sequencing [7], which amplifies conserved sequences in microbes (e.g., 16S ribosomal RNA [rRNA] for bacteria and 18S ribosomal DNA [rDNA]/internal transcribed spacer [ITS] region for fungi), and whole genome sequencing (WGS) [8], which sequences the whole genome of a microbe after isolation. The 16S rDNA-based amplicon sequencing is an efficient method to investigate all bacteria in a sample because this region has been recognized as the conventional marker for prokaryotic identification [9]. The community has accumulated a large amount of well-characterized 16S rDNA sequences in large databases, such as Ribosomal Database Project [10] and SILVA [11]. A major limitation of amplicon sequencing is its lack of discrimination among closely related species [12]. WGS-based bacterial identification provides higher discriminatory power and allows bacterial identification at species or even at strain level. It also provides a powerful way of investigating functional genes, such as antibiotic resistance genes (ARGs) [13, 14] and virulence factor genes (VFGs) [15]. Furthermore, the multilocus sequence type (MLST) [16] and single-nucleotide polymorphism (SNP) [17-19] enable source tracking of genetically closely related bacteria that were isolated from different sources. Such analysis enables WGS-based applications in multiple fields, such as forensic investigations, strain identification, and outbreak tracking [20]. Currently, there are some web services and tools for microbial identification, for example, BacWGSTdb [21], ImageGP [22], Bacterial Analysis Pipeline (CGE) (https://cge.cbs.dtu.dk/services/cge/) [23], Qiime2 [24], EasyAmplicon [25], GCType (GCM Type Strain Sequencing project), and rANOMALY [26]. Each website has its own unique strengths and limitations. For example, BacWGSTdb offers MLST-based and whole-genome-based bacterial genotyping but only accepts assembled genome files as inputs. 
CGE provides various tools for genome-based phenotyping, phylogeny, and annotation of ARGs and VFGs. However, users must upload their FASTQ data to each of these tools separately due to the lack of an integrated backend. Furthermore, all web-based tools require a fast and consistent internet connection to upload raw sequence files, which can have sizes of hundreds to thousands of MBs [8]. With the development of NGS technology, the downstream bioinformatics analysis is challenging, and more software and systems need to be developed [27, 28]. Here, we present a system for the classification and identification of microbes. It implements sophisticated pipelines for both amplicon sequencing data, which enable efficient profiling of unculturable microbes, and WGS data, which enable accurate genotyping of cultured microbes. The system also implements pipelines for MLST, SNP-based source tracking, and ARG or VFG annotation from WGS data. The system consists of three pipelines: (1) amplicon-based microbial identification, such as 16S rDNA/18S rDNA/ITS genes, (2) WGS-based microbial identification, and (3) SNP-based source tracking. To initiate the analysis, users only need to upload to the server sequencing files in base call file or FASTQ format generated by an Illumina sequencer, or Fasta-formatted sequence files (such as assembled genomes or 16S sequences). Then, users can create a task by selecting a pipeline and setting corresponding parameters. Finally, sequencing data and parameters are submitted to the server and trigger the analytic pipelines (Figure 1A). The system provides mainstream reference databases for microbial identification and functional annotation (Figure 1B). We also have a data management system that is responsible for monitoring the processing tasks and managing the database, such as input and output files (Figure 1D). 
Users can view the task results on the online interactive analysis report interface and download the results for further use (Figure 1C). This pipeline can be used to identify microbes, cultured or uncultured, using 16S/18S rDNA and ITS regions. The pipeline contains “Quality Control,” “Primer Removal,” “Denoising,” “Annotation,” and “Evaluation” functional components. In short, Fastp v0.23.4 [29] was used to perform quality control and clean the paired-end (PE) FASTQ reads by trimming and filtering reads based on their quality and length. The reads were truncated at any site receiving an average quality score of <20 over a 50 bp sliding window, and the truncated reads shorter than 50 bp were discarded; reads containing ambiguous characters were also discarded. The resulting reads were subjected to the server for merging the pair-end reads, followed by primer removal by a homemade Python script, duplicate removal by vsearch v2.22.1 [30], and denoising by deblur v1.1.1 [31]. The procedure above generates a set of amplicon sequence variants (ASVs), which were each treated as a taxonomic unit. Each ASV was then aligned to a reference genome database using BLASTn v2.11.0 [32]. The taxonomic classification of ASV was estimated by best-hits matches in the reference database. Phylogenetic tree was constructed by the maximum likelihood (ML) method. The workflow is illustrated in Figure 2A. We selected dozens of bacterial species from two different habitats, the human gut and marine, and generated corresponding simulated sequencing data based on the V3–V4, V4, and V4–V5 regions of 16S ribosomal gene. On the basis of the simulated data, the performance of the amplicon identification program was tested. All the bacteria were identified correctly on the genus level (Table S1). The WGS has been increasingly used in basic research and clinical diagnostics. 
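The read-cleaning rule stated for the amplicon pipeline above (truncate at the first 50 bp window whose mean quality falls below 20, then discard reads shorter than 50 bp or containing ambiguous bases) can be sketched as below. This is a plain-Python illustration of the rule as worded, not the Fastp implementation; the function name and the choice to cut at the start of the failing window are assumptions.

```python
def trim_read(seq, quals, window=50, min_quality=20, min_length=50):
    """Apply the sliding-window rule to one read.

    Returns (trimmed_seq, trimmed_quals), or None if the read is discarded.
    quals holds one Phred score per base.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    # Reads with ambiguous characters (anything other than A/C/G/T) are dropped.
    if any(base not in "ACGT" for base in seq.upper()):
        return None
    cut = len(seq)
    if len(seq) >= window:
        window_sum = sum(quals[:window])
        for start in range(len(seq) - window + 1):
            if start > 0:
                # Roll the window one base to the right.
                window_sum += quals[start + window - 1] - quals[start - 1]
            if window_sum / window < min_quality:
                cut = start  # assumption: truncate where the bad window begins
                break
    if cut < min_length:
        return None
    return seq[:cut], quals[:cut]

# Small demonstration with a shortened window so the effect is visible.
print(trim_read("ACGTACGTAC", [30] * 6 + [10] * 4, window=4, min_length=3))
```

With the defaults, the function mirrors the thresholds quoted in the text; the shortened window in the demonstration exists only to keep the example readable.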
In our system, we used housekeeping genes and Average Nucleotide Identity (ANI) to identify microbial species and infer their phylogenetic relationships with others. The pipeline contains six modules: Quality control, Assembly, Gene prediction, ANI calculation, Annotation, and MLST. Fastp was used for quality control and cleaning the PE FASTQ reads. In the assembly process, SPAdes v3.11 [33] was used to assemble the genome; for contaminated samples, metaSPAdes v3.10 [34] was used instead. BUSCO v5.1 [35] was used to evaluate the completeness and contamination of the genomes. We used Prodigal to predict the open reading frames and then translated them into protein products. HMMER v3.1b [36] was used to find the 31 single-copy housekeeping genes (for the gene list, see genome database curation) in the genome. The databases CARD v3.1.3 and carbohydrate-active enzymes (CAZy) (202001 updated) [37] were used separately to identify possible ARGs and carbohydrate-active enzymes, with an e-value cutoff of 1e-5. The virulence factor database (VFDB) 2022 is used to identify potential virulence factors for the identified pathogen strain. The sequences of the single-copy housekeeping genes were extracted from the predicted genes after an HMM search against the profiles of the 31 single-copy housekeeping genes. Each housekeeping gene was then searched with BLAST against the 31 single-copy housekeeping genes database, and the top 200 results for each gene were kept under an e-value cutoff of 1e-5 with the same score and identity. For each species in the database, we then counted the number of housekeeping genes that included the species in the blast results and ranked species based on that number. By default, the pipeline filtered out species supported by fewer than 15 housekeeping genes, but this value can be modified by users. Our strategy can therefore identify not only cultured individual microbes but also contaminated samples. 
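The species-ranking step described above can be sketched as a simple counting procedure. The sketch assumes the BLAST results have already been reduced to, for each housekeeping gene, the list of species names among its retained hits; the data layout and function name are illustrative and not taken from the MIST code.

```python
from collections import Counter

def rank_species(blast_hits, min_genes=15):
    """Rank candidate species by how many housekeeping genes support them.

    blast_hits maps a housekeeping gene name to the species names found
    among its retained BLAST hits. A species counts once per gene, however
    many hits it has for that gene. Species supported by fewer than
    min_genes genes are dropped (15 is the default stated in the text).
    """
    support = Counter()
    for species_list in blast_hits.values():
        for species in set(species_list):
            support[species] += 1
    ranked = [(sp, n) for sp, n in support.items() if n >= min_genes]
    # Most-supported species first; ties broken alphabetically.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked

# Toy input with three genes and a lowered threshold for illustration.
hits = {
    "rpoB": ["Staphylococcus aureus", "Staphylococcus aureus", "Staphylococcus epidermidis"],
    "rplA": ["Staphylococcus aureus"],
    "tsf": ["Staphylococcus aureus", "Staphylococcus epidermidis", "Bacillus subtilis"],
}
print(rank_species(hits, min_genes=2))
```

A sample in which two unrelated species both pass the threshold would hint at contamination, which is how the text motivates the approach.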
The ANI value was calculated between the genome of the sample and each genome of the species selected from the above method, and only the maximum ANI value of a species was reported. For some species that contained too many strains, we chose up to 1000 strains for ANI calculation. Barrnap v0.9 (https://github.com/tseemann/barrnap) was used to predict 16S rDNA. The phylogenetic tree of 16S rDNA and housekeeping genes was built using IQ-TREE v1.6.12 [38]. Further, if the species identified were included in the PubMLST database (http://pubmlst.org) [39], the molecular typing of the sample was analyzed automatically. The workflow is illustrated in Figure 2C. This workflow was applied to analyze a sample, downloaded from the National Center for Biotechnology Information Short Read Archive database under accession number: SRR12560292. The sample data contained 1,418,820 reads, which produced 46 scaffolds, and the length of the assembly was 2.76 Mbp. The 31 single-copy housekeeping genes were enriched in Staphylococcus aureus, and the S. aureus S3 was the most related strain in the database. The MLST type was ST22, and a total of 142 genes were identified as having a role in the resistance to various antibiotics in CARD and 462 virulence factors in this sample (Figure 3). The genomes of 560 ATCC standard strains were downloaded to test the accuracy of our identification procedure. There were only five genomes whose identification results were inconsistent with their own names. Through careful analysis, it was found that three of them were caused by the naming error of the reference species in the database (GTDB database has corrected their names based on WGS). The other two had disputes about the nomenclature of the representative strains. However, all of our identifications came from the highest-scoring genomes in the database (Tables S2–S4). 
In practice, in addition to microbial species identification, we also need to analyze the evolutionary relationship between different isolates of a certain species. For example, in a pharmaceutical factory environment, we can determine the source of strain contamination by analyzing the evolutionary distance between isolates. Two modes for microbial traceability by SNP phylogeny are integrated into the system, which are implemented through the software EToki v1.2 [40] and kSNP v3.0 [18], respectively. In the EToki mode, SNPs are called by comparing genomes to a reference genome, and the derived consensus sequence file is used to create an ML phylogeny. The kSNP is a program for SNP identification and phylogenetic analysis without genome alignment or the requirement for reference genomes, which is more useful when the concerned microorganisms are unculturable or have a large intraspecies evolutionary distance. In addition, a phylogenetic tree view is provided in both modes. The workflow is illustrated in Figure 2B. SILVA v138 and UNITE v8.0 [41] are integrated as the source of the amplicon reference database used in the microbial identification by 16S rDNA/18S rDNA/ITS pipeline and the microbial community diversity analysis pipeline. Details of the reference database are described in Table 1. In addition, we built a housekeeping gene database covering 223,491 bacterial RefSeq [42] genomes for fast and accurate profiling of microbial identification in the WGS workflow. Genes with the same name or product of 31 single-copy housekeeping genes (dnaG, frr, infC, nusA, pgk, pyrG, rplA, rplB, rplC, rplD, rplE, rplF, rplK, rplL, rplM, rplN, rplP, rplS, rplT, rpmA, rpoB, rpsB, rpsC, rpsE, rpsI, rpsJ, rpsK, rpsM, rpsS, smpB, and tsf) were extracted from each genome to construct the full database, which contains 6,855,279 amino acid sequences in total. The 31 single-copy housekeeping genes database was used to identify probable species in the WGS pipeline. 
WGS, amplicon sequencing, and metagenomic sequencing are increasingly used in research to produce complicated environmental sequence data sets, which paved the way for a cultivation-independent genetic content assessment and exploitation of the entire communities of organisms [4, 42-44]. Therefore, it is urgent to develop WGS and amplicon-based microbial species identification pipelines in the field of food safety and drug control. Here, we provide a system to analyze the WGS, amplicon sequences for microbial identification, MLST typing, and SNP source tracking. In our system, one important potential use of the WGS microbial identification pipeline is to identify contaminated sequences or metagenome samples. Simultaneously, it has great value in speeding up pathogen detection in clinical laboratories, while the existing identification and taxonomy methods may be unreliable with contaminated samples. Meicheng Yang, Feng Qin, and Yi Ren conceived the system and idea. Linmeng Liu and Hao Gao implemented the MIST main code. Chang Han and Dan Zhang designed the graphical user interface. Minghui Song, Chang Han, and Linmeng Liu wrote the manuscript. Yi Ren, Chang Han, Qiongqiong Li, and Yiling Fan were responsible for editing and revising the manuscript. All authors contributed to the development of MIST. We are grateful to Zhuo Yang for the graphical user interface development. This work was supported by the grants from the Science and Technology Commission of Shanghai Municipality (22142201600 and 20DZ2293600), the Open Fund Project of NMPA Key Laboratory for Testing Technology of Pharmaceutical Microbiology (2021-WSW-01), and the Standard Improvement Project of Chinese Pharmacopoeia Commission (2022Y21 and 2023Y36). The authors declare no conflict of interest. Supplementary materials (tables, scripts, graphical abstracts, slides, videos, Chinese translated version, and update materials) may be found in the online DOI or iMeta Science http://www.imeta.science/. 

Machine Learning-Constrained Semi-Analysis Model for Efficient Bathymetric Mapping in Data-Scarce Coastal Waters
Qifei Wang, Xianliang Zhang, Zhongqiang Wu et al. | Remote Sensing | 2025
Cited by 4 | Open Access

Nearshore bathymetry is critical for coastal management and ecology. While airborne hyperspectral remote sensing provides high-resolution image data, obtaining rapid and accurate bathymetric inversion in coastal areas lacking in situ data remains challenging. The widely used Hyperspectral Optimization Process Exemplar (HOPE) achieves high accuracy but suffers from computational inefficiency, making it impractical for large-scale, high-resolution datasets. By contrast, HOPE-Pure Water (HOPE-PW) offers computational efficiency but exhibits limitations in capturing fine-scale spatial patterns of bottom reflectance (ρ), and its applicability in transitional waters between Case I and II types requires further validation. Against this background, we employed machine learning-based substrate classification (support vector machine, random forest, maximum likelihood) in Wenchang coastal waters, China, to constrain ρ estimation in HOPE-PW, with validation using ICESat-2 data that extends its conventional application scenarios. Results demonstrate that when constrained by the optimal classifier (random forest), HOPE-PW achieves comparable accuracy to HOPE in shallow water while reducing runtime by 56% and memory usage by 68%. However, HOPE-PW exhibits slight underestimation in deeper areas, likely because simplification reduces sensitivity to water optical properties. Future research will focus on this issue. This study proposes an efficient and reliable framework for monitoring and evaluating water depth in areas lacking in situ data, offering a practical solution for integrated coastal zone management.

Constraining inflation with nonminimal derivative coupling with the Parkes Pulsar Timing Array third data release
Chang Han, Liyang Chen, Zu-Cheng Chen et al. | Physical Review D | 2025
Cited by 3

We study an inflation model with nonminimal derivative coupling that features a coupling between the derivative of the inflaton field and the Einstein tensor. This model naturally amplifies curvature perturbations at small scales via gravitationally enhanced friction, a mechanism critical for the formation of primordial black holes and the associated production of potentially detectable scalar-induced gravitational waves. We derive analytical expressions for the primordial power spectrum, enabling efficient exploration of the model parameter space without requiring computationally intensive numerical solutions of the Mukhanov-Sasaki equation. Using the third data release of the Parkes Pulsar Timing Array (PPTA DR3), we constrain the model parameters characterizing the coupling function: $\phi_c = 3.7^{+0.3}_{-0.5}\,M_{\mathrm{P}}$, $\log_{10}\omega_L = 7.1^{+0.6}_{-0.3}$, and $\log_{10}\sigma = -8.3^{+0.3}_{-0.6}$ at 90% confidence level. Our results demonstrate the growing capability of pulsar timing arrays to probe early Universe physics, complementing traditional cosmic microwave background observations by providing unique constraints on inflationary dynamics at small scales.
