PLA Rocket Force University of Engineering
Publishes on Advanced Image and Video Retrieval Techniques, Machine Learning in Materials Science, Advanced Graph Neural Networks. 74 papers and 1.2k citations.
The Transformer architecture has become a dominant choice in many domains, such as natural language processing and computer vision. Yet it has not achieved competitive performance on popular leaderboards for graph-level prediction compared to mainstream GNN variants, so it has remained unclear how Transformers can perform well for graph representation learning. In this paper, we address this question by presenting Graphormer, which is built upon the standard Transformer architecture and attains excellent results on a broad range of graph representation learning tasks, especially on the recent OGB Large-Scale Challenge. Our key insight for applying Transformers to graphs is that the structural information of a graph must be effectively encoded into the model. To this end, we propose several simple yet effective structural encoding methods to help Graphormer better model graph-structured data. In addition, we mathematically characterize the expressive power of Graphormer and show that, with our ways of encoding the structural information of graphs, many popular GNN variants can be covered as special cases of Graphormer.
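The abstract does not spell out its structural encodings, so the following is only a minimal sketch of one plausible form: adding a bias indexed by shortest-path distance to the attention logits before the softmax. The function names, the bias table, and the toy path graph are assumptions made for illustration, not the paper's implementation.

```python
import math
from collections import deque


def shortest_path_lengths(adj):
    """All-pairs hop distances via BFS; -1 marks unreachable node pairs."""
    n = len(adj)
    dist = [[-1] * n for _ in range(n)]
    for source in range(n):
        dist[source][source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[source][v] < 0:
                    dist[source][v] = dist[source][u] + 1
                    queue.append(v)
    return dist


def biased_attention(scores, dist, bias_by_distance, unreachable_bias=-1e9):
    """Row-wise softmax over (score + bias), where the bias depends on hop distance.

    bias_by_distance[d] is a (normally learned) scalar for distance d;
    distances beyond the table share its last entry.
    """
    max_d = len(bias_by_distance) - 1
    weights = []
    for i, row in enumerate(scores):
        logits = []
        for j, score in enumerate(row):
            d = dist[i][j]
            bias = unreachable_bias if d < 0 else bias_by_distance[min(d, max_d)]
            logits.append(score + bias)
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(logit - peak) for logit in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Toy path graph 0 - 1 - 2 with uniform content scores.
adj = [[1], [0, 2], [1]]
dist = shortest_path_lengths(adj)
scores = [[0.0] * 3 for _ in range(3)]
attn = biased_attention(scores, dist, bias_by_distance=[0.0, -1.0, -2.0])
```

With uniform content scores, the attention of node 0 differs across nodes only because of the distance bias, which is how graph structure enters an otherwise structure-blind attention layer in this sketch.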
In this paper, we propose a new video interaction model called adaptive fast-forwarding to help people quickly browse videos with predefined semantic rules. This model is designed around the metaphor of scenic car driving, in which the driver slows down near areas of interest and speeds through unexciting areas. Results from a preliminary user study of our video player, SmartPlayer, suggest the following: (1) the player should adaptively adjust the current playback speed based on the complexity of the present scene and predefined semantic events; (2) the player should learn user preferences about predefined event types as well as a suitable playback speed; (3) the player should fast-forward the video continuously with a playback rate acceptable to the user to avoid missing any undefined events or areas of interest. Furthermore, our user study results suggest that, for certain types of video, SmartPlayer yields a better user experience in browsing and fast-forwarding videos than the interaction models of existing video players.
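The three findings can be illustrated with a small sketch. Everything here is hypothetical: the function names, the [0, 1] scales for scene complexity and event interest, the linear mapping, and the update step are assumptions chosen for illustration, not SmartPlayer's actual rules.

```python
def playback_rate(scene_complexity, event_interest, preferred_max=4.0, normal=1.0):
    """Pick a playback rate between normal speed and the user's preferred maximum.

    Both inputs are assumed to lie in [0, 1]. The player slows down for
    whichever signal is stronger, and never exceeds preferred_max, so the
    video is always played continuously rather than skipped.
    """
    for value in (scene_complexity, event_interest):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must lie in [0, 1]")
    slowdown = max(scene_complexity, event_interest)
    return preferred_max - (preferred_max - normal) * slowdown


def update_interest(interest, user_slowed_down, step=0.2):
    """Nudge the learned interest in an event type toward the user's behaviour."""
    target = 1.0 if user_slowed_down else 0.0
    return interest + step * (target - interest)


calm_rate = playback_rate(0.0, 0.0)    # nothing of interest: fastest acceptable rate
event_rate = playback_rate(0.0, 1.0)   # highly interesting event: normal speed
busy_rate = playback_rate(0.5, 0.0)    # moderately complex scene: in between
learned = update_interest(0.5, user_slowed_down=True)
```

Taking the maximum of the two signals mirrors the driving metaphor: either a busy scene or a known point of interest is enough reason to slow down.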
Normalization is known to help the optimization of deep neural networks. Curiously, different architectures require specialized normalization methods. In this paper, we study which normalization is effective for Graph Neural Networks (GNNs). First, we adapt existing methods from other domains to GNNs and evaluate them. InstanceNorm achieves faster convergence than BatchNorm and LayerNorm. We explain this by showing that InstanceNorm serves as a preconditioner for GNNs, whereas this preconditioning effect is weaker with BatchNorm due to the heavy batch noise in graph datasets. Second, we show that the shift operation in InstanceNorm degrades the expressiveness of GNNs on highly regular graphs. We address this issue by proposing GraphNorm, which uses a learnable shift. Empirically, GNNs with GraphNorm converge faster than GNNs using other normalization methods. GraphNorm also improves the generalization of GNNs, achieving better performance on graph classification benchmarks.
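A minimal sketch of a normalization with a learnable shift may help make the idea concrete. It assumes the shift takes the form of a per-feature coefficient `alpha` that scales the mean before subtraction, with `gamma` and `beta` as the usual affine parameters; the exact formulation should be checked against the paper. The constant-feature example is a simplified stand-in for the regular-graph case.

```python
import math


def graph_norm(node_features, alpha, gamma, beta, eps=1e-5):
    """Normalize the node features of ONE graph, one feature dimension at a time.

    alpha[j] = 1 recovers an InstanceNorm-style full mean subtraction;
    other values keep part of the mean in the output.
    """
    n = len(node_features)
    dims = len(node_features[0])
    out = [[0.0] * dims for _ in range(n)]
    for j in range(dims):
        column = [row[j] for row in node_features]
        mean = sum(column) / n
        shifted = [h - alpha[j] * mean for h in column]
        # Scale is computed from the shifted values, plus eps for stability.
        std = math.sqrt(sum(s * s for s in shifted) / n + eps)
        for i in range(n):
            out[i][j] = gamma[j] * shifted[i] / std + beta[j]
    return out


# Every node carries the same feature value.
constant = [[2.0], [2.0], [2.0]]
full_shift = graph_norm(constant, alpha=[1.0], gamma=[1.0], beta=[0.0])
partial_shift = graph_norm(constant, alpha=[0.5], gamma=[1.0], beta=[0.0])
```

With `alpha = 1` the constant features are mapped to zero, so whatever they encoded is erased; with `alpha = 0.5` a nonzero signal survives, which is the intuition behind letting the network learn how much of the mean to subtract.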
Building LEGO sculptures requires accounting for the target object's shape, colors, and stability. In particular, finding a good layout of LEGO bricks that prevents the sculpture from collapsing under its own weight is usually challenging, and it becomes increasingly difficult as the target object becomes larger or more complex. We devise a force-based analysis for estimating the physical stability of a given sculpture. Unlike previous techniques for Legolization, which typically use heuristic-based metrics for stability estimation, our force-based metric gives 1) an ordering by strength, so that we know which structure is more stable, and 2) a threshold for stability, so that we know which one is stable enough. In addition, our stability analysis identifies the weak portion of the sculpture. Building atop our stability analysis, we present a layout refinement algorithm that iteratively improves the structure around the weak portion, allowing for automatic generation of a LEGO brick layout from a given 3D model while accounting for color information, required workload (in terms of the number of bricks), and physical stability. We demonstrate the success of our method with real LEGO sculptures built from a wide variety of 3D models, and compare against previous methods.