As a major component of speech signal processing, speech emotion recognition has become increasingly essential to understanding human communication. Benefitting from deep learning, many researchers have proposed various unsupervised models to extract effective emotional features and supervised models to train emotion recognition systems. In this paper, we utilize semi-supervised ladder networks for speech emotion recognition. The model is trained by minimizing a supervised loss together with an auxiliary unsupervised cost function. The addition of the unsupervised auxiliary task provides powerful discriminative representations of the input features and also acts as a regularizer for the supervised emotion task. We also compare the ladder network with other classical autoencoder structures. The experiments were conducted on the interactive emotional dyadic motion capture (IEMOCAP) database, and the results reveal that the proposed methods achieve superior performance with a small amount of labelled data and outperform other methods.
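The training objective described above, a supervised loss plus an auxiliary unsupervised cost, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the unsupervised cost is a weighted sum of per-layer mean squared reconstruction errors, as is common for ladder networks, and the function names and layer weights are hypothetical.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Supervised loss: mean negative log-likelihood of the true class."""
    eps = 1e-12  # guards against log(0)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + eps)))

def reconstruction_cost(clean, denoised, weights):
    """Unsupervised cost (assumed form): weighted sum of per-layer
    mean squared errors between clean and denoised activations."""
    return float(sum(w * np.mean((c - d) ** 2)
                     for c, d, w in zip(clean, denoised, weights)))

def total_cost(probs, labels, clean, denoised, weights):
    """Combined objective minimized during training."""
    return (cross_entropy(probs, labels)
            + reconstruction_cost(clean, denoised, weights))
```

In practice the supervised term would be computed on the labelled batch only, while the reconstruction term can use both labelled and unlabelled samples.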
Facial emotion recognition is an important aspect of human-machine interaction. Past research on facial emotion recognition has focused on the laboratory environment. However, it faces many challenges in real-world conditions, such as illumination changes, large pose variations and partial or full occlusions. These challenges leave different face areas with different degrees of sharpness and completeness. Inspired by this fact, we focus on the authenticity of predictions generated by different <emotion, region> pairs. For example, if only the mouth area is available and the emotion classifier predicts happiness, how should the authenticity of that prediction be judged? This problem can be recast as measuring the contribution of different face areas to different emotions. In this paper, we divide the whole face into six areas: nose, mouth, eyes, nose to mouth, nose to eyes and mouth to eyes. To obtain more convincing results, our experiments are conducted on three different databases: facial expression recognition + (FER+), real-world affective faces database (RAF-DB) and expression in-the-wild (ExpW) dataset. Through analysis of the classification accuracy, the confusion matrix and the class activation map (CAM), we can establish convincing results. To sum up, the contributions of this paper lie in two areas: 1) We visualize the areas of human faces that matter in emotion recognition; 2) We analyze the contribution of different face areas to different emotions in real-world conditions through experimental analysis. Our findings can be combined with findings in psychology to promote the understanding of emotional expressions.
This paper presents a novel five degrees of freedom (DOF) two-wheeled robotic machine (TWRM) that delivers solutions for both industrial and service robotic applications by enlarging the vehicle's workspace and increasing its flexibility. Designing a two-wheeled robot with five degrees of freedom poses a significant control challenge; therefore, the modelling and design of such a robot should be precise, with a uniform distribution of mass over the robot and the actuators. By employing the Lagrangian modelling approach, the TWRM's mathematical model is derived and simulated in Matlab/Simulink®. To stabilize the system's highly nonlinear model, two control approaches were developed and implemented: proportional-integral-derivative (PID) and fuzzy logic control (FLC) strategies. The performance of the proposed control strategies has been assessed over multiple scenarios with different initial conditions.
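For readers unfamiliar with the PID strategy mentioned above, a textbook discrete-time PID loop can be sketched as below. This is a generic illustration, not the TWRM controller from the paper: the class name, gains and sample time are hypothetical, and the abstract gives no tuning values.

```python
class PID:
    """Textbook discrete PID controller (illustrative gains only)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0      # running integral of the error
        self.prev_error = None   # no derivative on the first sample

    def update(self, setpoint, measurement):
        """Return the control signal for one sample period."""
        error = setpoint - measurement
        self.integral += error * self.dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)
```

A stabilizing loop would call `update` once per sample, for example with the desired tilt angle as the setpoint and the measured tilt angle as the measurement.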
In this paper, a new adaptive hierarchical sliding mode control scheme for a 3D overhead crane system is proposed. A controller is first designed by the use of a hierarchical structure of two first-order sliding surfaces represented by two actuated and un-actuated subsystems in the bridge crane. Parameters of the controller are then intelligently estimated, where uncertain parameters due to disturbances in the 3D overhead crane dynamic model are proposed to be represented by radial basis function networks whose weights are derived from a Lyapunov function. The proposed approach allows the crane system to be robust under uncertainty conditions in which some uncertain and unknown parameters are highly difficult to determine. Moreover, stability of the sliding surfaces is proved to be guaranteed. Effectiveness of the proposed approach is then demonstrated by implementing the algorithm in both synthetic and real-life systems, where the results obtained by our method are highly promising.
This paper proposes an image encryption algorithm LQBPNN (logistic quantum and back propagation neural network) based on chaotic sequences incorporating quantum keys. Firstly, the improved one-dimensional logistic chaotic sequence is used as the basic key sequence. After the quantum key is introduced, it is incorporated into the chaotic sequence by a nonlinear operation. Then the pixel confusion process is completed by the neural network. Finally, two sets of different mixed secret key sequences are used to perform two rounds of diffusion encryption on the confused image. The experimental results show that the randomness and uniformity of the key sequence are effectively enhanced. The algorithm has a secret key space greater than 2^182. The adjacent pixel correlation of the encrypted image is close to 0, and the information entropy is close to 8. The ciphertext image can resist several common attacks such as typical attacks, statistical analysis attacks and differential attacks.
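To make two ingredients of this abstract concrete, the sketch below generates a key stream from the classic logistic map and measures information entropy in bits per byte. It is only an illustration under stated assumptions: it uses the standard map x -> r*x*(1-x) rather than the paper's improved variant, it omits the quantum key and the neural network confusion step, and the function names and parameter values are hypothetical.

```python
import math
from collections import Counter

def logistic_keystream(x0, r, n, burn_in=100):
    """Generate n key bytes from the classic logistic map."""
    x = x0
    for _ in range(burn_in):        # discard the initial transient
        x = r * x * (1.0 - x)
    out = bytearray()
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) % 256)  # quantize the state to one byte
    return bytes(out)

def xor_bytes(data, key):
    """Simple XOR diffusion; applying it twice restores the input."""
    return bytes(d ^ k for d, k in zip(data, key))

def shannon_entropy(data):
    """Information entropy in bits per byte (8 is the maximum)."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())
```

An entropy close to 8 for the ciphertext, as reported in the abstract, means the byte values are nearly uniformly distributed.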
Extracting the three-dimensional (3D) information of a pedestrian, including location and height, is important for vision-based intelligent traffic monitoring systems. This paper tackles the relationship between pixels' actual size and pixels' spatial resolution through a new method named pixel-resolution mapping (P-RM). The proposed P-RM method derives the equations for pixels' spatial resolutions (XY-direction) and an object's height (Z-direction) in the real world, while introducing new tilt angle and mounting height calibration methods that do not require special calibration patterns placed in the real world. Both controlled laboratory and real-world experiments were performed and reported. The tests on 3D mensuration using the proposed P-RM method showed better than 98.7% accuracy overall in laboratory environments and better than 96% accuracy in real-world pedestrian height estimation. The 3D reconstructed images for measured points were also determined with the proposed P-RM method, which shows that the method provides a general algorithm for 3D information extraction.
In this paper, the problem of load transportation and robust mitigation of payload oscillations in uncertain tower-cranes is addressed. This problem is tackled through a control scheme based on the philosophy of Active-Disturbance-Rejection Control (ADRC). Here, a general disturbance model built with two dominant components, polynomial and harmonic, is stated. Then, a disturbance observer is formulated through state-vector augmentation of the tower-crane model. Thus, better estimation of system states and disturbances is achieved. The control law is then formulated to actively reject the disturbances and also to accommodate the closed-loop system dynamics even under system uncertainty. The proposed control scheme is validated via experimentation using a small-scale tower-crane, and compared with other relevant ADRC-based techniques. The experimental results show that the proposed control scheme is robust under parametric uncertainty of the system, and provides improved attenuation of payload oscillations.
Image registration is an indispensable component in multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method that includes an improved multi-scale and multi-direction Harris algorithm and a novel compound feature. Multi-scale circle Gaussian combined invariant moments and a multi-direction gray level co-occurrence matrix are extracted as features for image matching. The proposed algorithm is evaluated on numerous multi-source remote sensing images with noise and illumination changes. Extensive experimental studies show that our proposed method is capable of obtaining a stable and even distribution of key points as well as robust and accurate correspondence matches. It is a promising scheme for multi-source remote sensing image registration.
The aim of this work is to model and analyze the behavior of a new smart nano force sensor. To do so, a carbon nanotube (CNT) has been used as the suspended gate of a metal-oxide-semiconductor field-effect transistor (MOSFET). A variation of the applied force on the CNT generates a variation of the capacitance of the transistor oxide-gate and therefore a variation of the threshold voltage, which allows the MOSFET to act as a capacitive nano force sensor. The sensitivity of the nano force sensor can reach 0.12431 V/nN. This sensitivity is greater than results reported in the literature. We have found through this study that the response of the sensor depends strongly on the geometric and physical parameters of the CNT. The results obtained in this study show that an increase in the applied force leads to an increase in the threshold voltage VTh of the MOSFET. In this paper, we first used artificial neural networks to faithfully reproduce the response of the nano force sensor model. This neural model is called the direct model. Secondly, we designed an inverse model, called an intelligent sensor, which linearizes the response of our developed force sensor.
The objective of this paper is to propose a reduced-order observer for a class of Lipschitz nonlinear discrete-time systems. The conditions that guarantee the existence of this observer are presented in the form of linear matrix inequalities (LMIs). To handle the Lipschitz nonlinearities, the Lipschitz condition and Young's relation are used judiciously to add more degrees of freedom to the proposed LMI. Necessary and sufficient conditions for the existence of the unbiased reduced-order observer are given. An extension to
This paper investigates the necessity of feasibility considerations in a fault tolerant control system using the constrained control allocation methodology where both static and dynamic actuator constraints are considered. In the proposed feasible control allocation scheme, the constrained model predictive control (MPC) is employed as the main controller. This considers the admissible region of the control allocation problem as its constraints. Using the feasibility notion in the control allocation problem provides the main controller with information regarding the actuator's status, which leads to closed loop system performance improvement. Several simulation examples under normal and faulty conditions are employed to illustrate the effectiveness of the proposed methodology. The main results clearly indicate that closed loop performance and stability characteristics can be significantly degraded by neglecting the actuator constraints in the main controller. Also, it is shown that the proposed strategy substantially enlarges the domain of attraction of the MPC combined with the control allocation as compared to the conventional MPC.