Zhang Le (张乐)

I am a Professor at the University of Electronic Science and Technology of China, where I work on deep learning.

Email  /  Google Scholar  /  LinkedIn


PhD: Nanyang Technological University, Singapore (2012-2016).

MSc: Nanyang Technological University, Singapore (2011-2012).

BEng: University of Electronic Science and Technology of China (2007-2011).

Working Experience

Professor (06/2021-) : School of Information and Communication Engineering, University of Electronic Science and Technology of China (UESTC).

Scientist (09/2018-06/2021) : Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR).

Postdoc (02/2016-09/2018) : Advanced Digital Sciences Center.

Professional Services

Associate Editor: IET Biometrics.
Guest Editor: "Ensemble Deep Learning" in Pattern Recognition.
Guest Editor: "Deep Learning for Human Activity Recognition" in Neurocomputing.

IJCAI2019: 1st workshop on "Deep Learning for Human Activity Recognition"
IJCAI2020: 2nd workshop on "Deep Learning for Human Activity Recognition"


EE2073-Introduction to EEE Design and Project @ NTU


02/2020: We are organizing the second workshop on "Deep Learning for Human Activity Recognition" in IJCAI2020.

02/2020: I am serving as an Associate Editor of IET Biometrics.

01/2019: We organized a workshop on "Deep Learning for Human Activity Recognition" at IJCAI2019. Selected papers (or extensions) may be published in a special issue on "Deep Learning for Human Activity Recognition" in the Elsevier journal Neurocomputing.

10/2018: We organized a special issue on "Ensemble Deep Learning" in Pattern Recognition.

09/2018: I joined I2R as a scientist.

05/2018: In OMG-Emotion Challenge 2018, our ADSC team's submissions ranked 1st for vision-only arousal/valence prediction and 2nd for overall valence prediction!


I'm interested in machine learning, deep learning, computer vision, image processing and their applications.

Selected Papers

Locality-Aware Crowd Counting
Joey Tianyi Zhou, Le Zhang* , Jiawei Du, Xi Peng, Zhiwen Fang, Zhe Xiao and Hongyuan Zhu
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI) , 2021.
Project Page / Bibtex

We tackle the long-standing problem of imbalanced data distribution by proposing a simple yet effective locality-based learning paradigm. The method is locality-aware in two aspects. First, we introduce a locality-aware data partition (LADP) approach that groups the training data into different bins via locality-sensitive hashing, from which a more balanced data batch is constructed. Second, to further reduce the training bias and enhance the collaboration with LADP, we propose a new data augmentation method called locality-aware data augmentation (LADA), in which image patches are adaptively augmented based on the loss. We demonstrate the versatility of the proposed method by applying it to crowd counting and adversarial defense.

Two-Stream Convolution Augmented Transformer for Human Activity Recognition
Bing Li, Wei Cui, Wei Wang, Le Zhang, Zhenghua Chen and Min Wu
AAAI , 2021.
Project Page / Bibtex

We propose a novel Two-stream Convolution Augmented Human Activity Transformer (THAT) model, which uses a two-stream structure to capture both time-over-channel and channel-over-time features and a multi-scale convolution augmented transformer to capture range-based patterns.

Ordered or Orderless: A Revisit for Video based Person Re-Identification
Le Zhang, Joey Tianyi Zhou, Ming-Ming Cheng, Yun Liu, Jia-Wang Bian, Zeng Zeng, Chunhua Shen
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI) , 2020.
Project Page / Bibtex

We first show that the common practice of employing recurrent neural networks (RNNs) to aggregate temporal-spatial features for video-based person re-identification (VPRe-id) may not be optimal. Specifically, with a diagnostic analysis, we show that the recurrent structure may be less effective at learning temporal dependencies than expected and implicitly yields an orderless representation. Based on this observation, we then present a simple yet surprisingly powerful approach for VPRe-id, where we treat VPRe-id as an efficient orderless ensemble of image-based person re-identification problems. More specifically, we divide videos into individual images and re-identify persons with an ensemble of image-based rankers. Under the i.i.d. assumption, we provide an error bound that sheds light on how we could improve VPRe-id. Our work also presents a promising way to bridge the gap between video-based and image-based person re-identification.

GMS: Grid-Based Motion Statistics for Fast, Ultra-robust Feature Correspondence
Jia-Wang Bian, Wen-Yan Lin, Yun Liu, Le Zhang , Sai-Kit Yeung, Ming-Ming Cheng, Ian Reid
International Journal of Computer Vision (IJCV) , 2020.
Project Page / Opencv_contrib / Open Access / Youtube

We present a fast correspondence selection method that effectively separates true correspondences from false ones by leveraging the motion smoothness constraint.

Attention-Driven Loss for Anomaly Detection in Video Surveillance
Joey Tianyi Zhou, Le Zhang* , Zhiwen Fang, Jiawei Du, Xi Peng, Xiao Yang
IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), 2020.
Project Page / PDF

We provide a simple solution to alleviate the foreground-background imbalance problem in video anomaly detection.

Using Reinforcement Learning to Minimize the Probability of Delay Occurrence in Transportation
Zhiguang Cao, Hongliang Guo, Wen Song, Kaizhou Gao, Zhenghua Chen, Le Zhang, Xuexi Zhang
IEEE Transactions on Vehicular Technology (T-VT) , 2020.

We design a novel and practical Q-learning approach in which the converged Q-values can be interpreted as the actual probabilities of arriving on time, which improves the accuracy of finding the truly optimal path.

Nonlinear Regression Via Deep Negative Correlation Learning
Le Zhang, Zenglin Shi, Ming-Ming Cheng, Yun Liu, Jia-Wang Bian, Joey Tianyi Zhou, Guoyan Zheng, Zeng Zeng
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI) , 2019.
project page / arXiv

We provide a general deep regression framework that mimics ensemble learning with a single model. We demonstrate its effectiveness on several computer vision tasks, including crowd counting, age estimation, apparent personality analysis and image super-resolution.

PersEmoN: A Deep Network for Joint Analysis of Apparent Personality, Emotion and Their Relationship
Le Zhang, Songyou Peng and Stefan Winkler.
IEEE Transactions on Affective Computing (TAC/TAFFC), 2019.
Project Page

A journal extension of our ACM MM paper on the joint analysis of apparent personality, emotion and their relationship.

Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection
Jia-Xing Zhao, Yang Cao, Deng-Ping Fan, Ming-Ming Cheng, Xuan-Yi Li, Le Zhang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , 2019.
Project Page

In this paper, we integrate the contrast prior, which used to be a dominant cue in non-deep-learning-based SOD approaches, into a CNN-based architecture to enhance the depth information.

An Evaluation of Feature Matchers for Fundamental Matrix Estimation
Jia-Wang Bian, Yu-Huan Wu, Ji Zhao, Yun Liu, Le Zhang, Ming-Ming Cheng, Ian Reid
British Machine Vision Conference (BMVC) , 2019.
project page

This paper evaluates the recently proposed local features, correspondence pruning algorithms, and robust estimators using strictly defined metrics in the context of image matching and fundamental matrix estimation. Comprehensive evaluation results on four large-scale datasets provide insights into which datasets are particularly challenging and which algorithms perform well in which scenarios.

Heterogeneous Oblique Random Forest
Rakesh Katuwal, P.N. Suganthan and Le Zhang.
Pattern Recognition (PR) , 2019.

We propose a heterogeneous oblique random forest that employs an oblique linear hyperplane at each node. In a benchmark of 190 classifiers on 121 UCI datasets, the oblique random forests proposed in this paper are the three top-ranked classifiers, with the heterogeneous oblique random forest being statistically significantly better than all other classifiers.

Deep Learning based Human Activity Recognition for Healthcare Services
Zhenghua Chen, Le Zhang, Min Wu, Xiaoli Li.
In the book “Deep Learning for Biomedical Data Analysis: Techniques, Approaches and Applications”, Springer, to be published in 2020.

Light Sensor Based Occupancy Estimation via Bayes Filter with Neural Networks
Zhenghua Chen, Yanbing Yang, Chaoyang Jiang, Jie Hao, Le Zhang
IEEE Transactions on Industrial Electronics (T-IE) , 2019.

A Bayes filter with neural networks is proposed for the optimal estimation of occupancy based on light sensor data.

WiFi CSI Based Passive Human Activity Recognition Using Attention Based BLSTM
Zhenghua Chen, Le Zhang*, Chaoyang Jiang, Zhiguang Cao, and Wei Cui (* indicates the corresponding author)
IEEE Transactions on Mobile Computing (T-MC) , preprint.

An attention-based bidirectional long short-term memory (BLSTM) network for passive human activity recognition using WiFi CSI signals.

Richer Convolutional Features for Edge Detection
Yun Liu, Ming-Ming Cheng, Xiaowei Hu, Jia-Wang Bian, Le Zhang, Xiang Bai and Jinhui Tang
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), preprint.
project page / Blog

An accurate edge detector using richer convolutional features.

Robust Mobile Location Estimation in NLOS Environment Using GMM, IMM, and EKF
Wei Cui, Bing Li, Le Zhang and Wei Meng
IEEE Systems Journal , preprint.

A mobile location estimation scheme for realistically mixed LOS/NLOS/LOS-NLOS environments.

Using FTOC to Track Shuttlecock for the Badminton Robot
Wei Chen, Tingbo Liao, Zhihang Li, Haozhi Lin, Hong Xue, Le Zhang, Jing Guo, Zhiguang Cao
Neurocomputing , preprint.

An omnidirectional mobile badminton robot, which is composed of mechanical, visual and motion control subsystems.

An ensemble of decision trees with random vector functional link networks for multi-class classification
Katuwal, Rakesh, P. N. Suganthan, and Le Zhang
Applied Soft Computing , 2018.

A new ensemble of classifiers, consisting of decision trees and random vector functional link networks, for multi-class classification.

Distilling the Knowledge from Handcrafted Features for Human Activity Recognition
Chen, Zhenghua, Le Zhang*, Zhiguang Cao, and Jing Guo (* indicates the corresponding author)
IEEE Transactions on Industrial Informatics (T-II) , 2018.

A novel knowledge-distilling strategy to improve deep learning with handcrafted features.

Multiscale Multitask Deep NetVLAD for Crowd Counting
Zenglin Shi, Le Zhang, Yibo Sun, and Yangdong Ye
IEEE Transactions on Industrial Informatics (T-II) , 2018.
project page

We introduce a dynamic augmentation technique to train a much deeper CNN for crowd counting. To reduce the over-fitting caused by the limited number of training samples, multitask learning is further employed to learn generalisable representations across similar domains. We also propose to aggregate multi-scale convolutional features extracted from the entire image into a compact single-vector representation amenable to efficient and accurate counting by way of "Vector of Locally Aggregated Descriptors" (VLAD).

Received Signal Strength Based Indoor Positioning Using a Random Vector Functional Link Network
Cui, Wei, Le Zhang*, Bing Li, Jing Guo, Wei Meng, Haixia Wang, and Lihua Xie (* indicates the corresponding author)
IEEE Transactions on Industrial Informatics (T-II) , 2018.

A robust and parallel RVFL for RSS based indoor positioning.

Historical Context-based Style Classification of Painting Images via Label Distribution Learning
Jufeng Yang, Liyi Chen, Le Zhang , Xiaoxiao Sun, Dongyu She, Shao-Ping Lu and Ming-Ming Cheng
ACM Multimedia (MM) , 2018.

A novel knowledge-distilling strategy to assist visual feature learning in convolutional neural networks for painting style classification.

Give Me One Portrait Image, I Will Tell You Your Emotion and Personality
Songyou Peng, Le Zhang , Stefan Winkler and Marianne Winslett
ACM Multimedia (MM) , 2018.

A technical demo. A deep Siamese-like network is introduced to predict one's Big-Five personality and arousal-valence emotion from a single portrait photo.

Bayesian VoxDRN: A Probabilistic Deep Voxelwise Dilated Residual Network for Whole Heart Segmentation from 3D MR Images
Zenglin Shi, Guodong Zeng, Le Zhang , Xiahai Zhuang, Lei Li, Guang Yang, and Guoyan Zheng
International Conference On Medical Image Computing & Computer Assisted Intervention (MICCAI) , 2018.

A probabilistic deep voxelwise dilated residual network to segment the whole heart from 3D MR images.

DEL: Deep Embedding Learning for Efficient Image Segmentation
Yun Liu, Peng-Tao Jiang, Xiaowei Hu, Vahan Petrosyan, Shi-Jie Li, Jia-Wang Bian, Le Zhang, and Ming-Ming Cheng
International Joint Conference on Artificial Intelligence (IJCAI) , 2018.
project page

We train a fully convolutional network to learn the feature embedding space for each superpixel.

Crowd Counting With Deep Negative Correlation Learning
Zenglin Shi, Le Zhang *, Yun Liu, Xiaofeng Cao, Yangdong Ye, Shi-Jie Li, and Guoyan Zheng (* indicates the corresponding author)
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , 2018.
project page / Blog

With no extra parameters, we mimic ensemble learning within a single network.

Kernel Cross-Correlator
Chen Wang, Le Zhang, Lihua Xie, Junsong Yuan
AAAI Conference on Artificial Intelligence (AAAI) , 2018.
project page / Blog

KCC extends KCF to any kernel function and is not limited to a circulant structure on the training data, so it is able to predict affine transformations with customized properties.

Visual Tracking With Convolutional Random Vector Functional Link Network
Le Zhang and Ponnuthurai Nagaratnam Suganthan
IEEE Transactions on Cybernetics (T-Cyber) , 2017.
project page

Ensemble of randomized ConvNets for visual tracking.

Robust visual tracking via co-trained Kernelized correlation filters
Le Zhang and Ponnuthurai Nagaratnam Suganthan
Pattern Recognition (PR) , 2017.
project page

Ensemble of KCFs for visual tracking.

Benchmarking Ensemble Classifiers with Novel Co-Trained Kernel Ridge Regression and Random Vector Functional Link Ensembles
Le Zhang and Ponnuthurai Nagaratnam Suganthan
IEEE Computational Intelligence Magazine (IEEE CIM), 2017.

A benchmark summary of my PhD study.

Robust Human Activity Recognition Using Smartphone Sensors via CT-PCA and Online SVM
Zhenghua Chen, Qingchang Zhu, Yeng Chai Soh and Le Zhang* (* indicates the corresponding author)
IEEE Transactions on Industrial Informatics (T-II) , 2017.

An online SVM for time series signals.

Oblique random forest ensemble via Least Square Estimation for time series forecasting
Xueheng Qiu, Le Zhang, Ponnuthurai Nagaratnam Suganthan, and Gehan A.J. Amaratunga
Information Sciences , 2017.

Extends the oblique random forest to regression problems.

Finding the ‘faster’ path in vehicle routing
Guo, Jing, Yaoxin Wu, Xuexi Zhang, Le Zhang, Wei Chen, Zhiguang Cao, Lu Zhang, and Hongliang Guo
IET Intelligent Transport Systems , 2017.

Improves the ‘faster’ criterion in vehicle routing by extending the bi-delta distribution to the binormal distribution.

Robust Visual Tracking Using Oblique Random Forests
Le Zhang, Jagan Varadarajan, Ponnuthurai Nagaratnam Suganthan, Narendra Ahuja, and Pierre Moulin
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) , 2017.
project page

An incremental oblique random forest.

Robust Multi-Modal Cues for Dyadic Human Interaction Recognition
Trabelsi, Rim, Jagannadan Varadarajan, Yong Pei, Le Zhang, Issam Jabri, Ammar Bouallegue, and Pierre Moulin.
ACM Multimedia Workshop on Multimodal Understanding of Social, Affective and Subjective Attributes , 2017.

We address the problem of dyadic interaction recognition using multi-modal data by combining all three modalities and using both person-centric features (via proxemic descriptors from 3D joints) and holistic features (via FCNN-based color and depth features).

Ensemble classification and regression-recent developments, applications and future directions
Le Zhang*, Ye Ren* and Ponnuthurai Nagaratnam Suganthan (* indicates co-first authors).
IEEE Computational Intelligence Magazine (IEEE CIM) , 2016.

This paper reviews traditional ensemble learning as well as state-of-the-art deep ensemble methods, and can thus serve as an extensive summary for practitioners and beginners.

A Comprehensive Evaluation of Random Vector Functional Link Networks
Le Zhang and Ponnuthurai Nagaratnam Suganthan.
Information Sciences , 2016.

Benchmark evaluation of RVFL.

A Survey of Randomized Algorithms for Training Neural Networks
Le Zhang and Ponnuthurai Nagaratnam Suganthan.
Information Sciences , 2016.

Oblique decision tree ensemble via multisurface proximal support vector machine
Le Zhang and Ponnuthurai Nagaratnam Suganthan.
IEEE Transactions on Cybernetics (T-Cyber) , 2015.

Oblique Random Forest by fast iterative clustering.

Visual tracking with convolutional neural network
Le Zhang and Ponnuthurai Nagaratnam Suganthan.
IEEE International Conference on Systems, Man, and Cybernetics , 2015.

ConvNet for visual tracking.

Random forests with ensemble of feature spaces
Le Zhang and Ponnuthurai Nagaratnam Suganthan.
Pattern Recognition (PR) , 2014.

Source code