L

Liwei Peng

Henan University

Publishes on Sepsis Diagnosis and Treatment, Traumatic Brain Injury and Neurovascular Disturbances, Oral and Maxillofacial Pathology. 82 papers and 3.9k citations.

82Publications
3.9kTotal Citations

Is this you? Claim your profile.

Add your photo, update your bio, and get notified when your ranking changes.

Top publicationsby citations

StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
Wei Wang, Bin Bi, Ming Yan et al.|arXiv (Cornell University)|2019
Cited by 100Open Access

Recently, the pre-trained language model, BERT (and its robustly optimized version RoBERTa), has attracted a lot of attention in natural language understanding (NLU), and achieved state-of-the-art accuracy in various NLU tasks, such as sentiment classification, natural language inference, semantic textual similarity and question answering. Inspired by the linearization exploration work of Elman [8], we extend BERT to a new model, StructBERT, by incorporating language structures into pre-training. Specifically, we pre-train StructBERT with two auxiliary tasks to make the most of the sequential order of words and sentences, which leverage language structures at the word and sentence levels, respectively. As a result, the new model is adapted to different levels of language understanding required by downstream tasks. The StructBERT with structural pre-training gives surprisingly good empirical results on a variety of downstream tasks, including pushing the state-of-the-art on the GLUE benchmark to 89.0 (outperforming all published models), the F1 score on SQuAD v1.1 question answering to 93.0, the accuracy on SNLI to 91.7.

EFLOPS: Algorithm and System Co-Design for a High Performance Distributed Training Platform
Jianbo Dong, Zheng Cao, Tao Zhang et al.|Unknown|2020
Cited by 69

Deep neural networks (DNNs) have gained tremendous attractions as compelling solutions for applications such as image classification, object detection, speech recognition, and so forth. Its great success comes with excessive trainings to make sure the model accuracy is good enough for those applications. Nowadays, it becomes challenging to train a DNN model because of 1) the model size and data size keep increasing, which usually needs more iterations to train; 2) DNN algorithms evolve rapidly, which requires the training phase to be short for a quick deployment. To address those challenges, distributed training platforms have been proposed to leverage massive server nodes for training, with the hope of significant training time reduction. Therefore, scalability is a critical performance metric to evaluate a distributed training platform. Nevertheless, our analysis reveals that traditional server clusters have poor scalability for training due to the traffic congestions within the server and beyond. The intra-server traffic on the I/O fabric can result in severe congestions and skewed quality of service as high performance devices are competing with each other. Moreover, the traffic congestions on the Ethernet for inter-server communication could also incur significant performance degradation. In this work, we devise a novel distributed training platform, EFLOPS, that adopts an algorithm and system co-design methodology to achieve good scalability. A new server architecture is proposed to alleviate the intra-server congestions. Moreover, a new network topology, BiGraph, is proposed to divide the network into two separate parts, so that there is always a direct connection between any nodes from different parts. Finally, accompany with BiGraph, a topology-aware allreduce algorithm is proposed to eliminate the traffic congestion on the direct connection. The experimental results show that eliminating the congestions on network interface can gain up to 11.3xcommunication speedup. The proposed algorithm and topology can provide further improvement up to 6.08x. The overall performance of ResNet-50 training achieves near-linear scalability, and is competitive to the top-rankings of MLPerf results.