Volume 16, Number 4, 2019
Single image super-resolution has attracted increasing attention and has a wide range of applications in satellite imaging, medical imaging, computer vision, security surveillance imaging, remote sensing, object detection, and recognition. Recently, deep learning techniques have emerged and blossomed, producing “the state of the art” in many domains. Owing to their capability in feature extraction and mapping, they are very helpful for predicting the high-frequency details lost in low-resolution images. In this paper, we give an overview of recent advances in deep learning-based models and methods that have been applied to single image super-resolution tasks. We also summarize, compare, and discuss various models from the past and present for comprehensive understanding, and finally provide open problems and possible directions for future research.
In this contribution, we present iHEARu-PLAY, an online, multi-player platform for crowdsourced database collection and labelling, including the voice analysis application (VoiLA), a free web-based speech classification tool designed to educate iHEARu-PLAY users about state-of-the-art speech analysis paradigms. In addition, via this associated speech analysis web interface, VoiLA encourages users to take an active role in improving the service by providing labelled speech data. The platform allows users to record and upload voice samples directly from their browser, which are then analysed in a state-of-the-art classification pipeline. A set of pre-trained models targeting a range of speaker states and traits such as gender, valence, arousal, dominance, and 24 different discrete emotions is employed. The analysis results are visualised so that they are easily interpretable by laypeople, giving users unique insights into how their voice sounds. We assess the effectiveness of iHEARu-PLAY and its integrated VoiLA feature via a series of user evaluations, which indicate that it is fun and easy to use, and that it provides accurate and informative results.
As a major component of speech signal processing, speech emotion recognition has become increasingly essential to understanding human communication. Benefitting from deep learning, many researchers have proposed various unsupervised models to extract effective emotional features and supervised models to train emotion recognition systems. In this paper, we utilize semi-supervised ladder networks for speech emotion recognition. The model is trained by minimizing the supervised loss together with an auxiliary unsupervised cost function. The addition of the unsupervised auxiliary task provides powerful discriminative representations of the input features, and can also be regarded as a regularization of the supervised emotion recognition task. We also compare the ladder network with other classical autoencoder structures. The experiments were conducted on the interactive emotional dyadic motion capture (IEMOCAP) database, and the results reveal that the proposed method achieves superior performance with a small amount of labelled data and outperforms the other methods.
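The training objective described in this abstract, a supervised loss plus an auxiliary unsupervised cost, can be illustrated with a minimal sketch. It assumes a softmax cross-entropy for the supervised part and a per-layer weighted squared reconstruction error for the unsupervised part, which is the usual ladder-network form; the function names, the layer weights, and the data layout are hypothetical and are not taken from the paper.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for one labelled example."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[label]

def mean_squared_error(clean, recon):
    """Mean squared error between clean activations and their reconstruction."""
    return sum((c - r) ** 2 for c, r in zip(clean, recon)) / len(clean)

def ladder_style_cost(labelled, layer_pairs, layer_weights):
    """Supervised loss plus weighted unsupervised reconstruction cost.

    labelled      : list of (logits, label) for the labelled examples
    layer_pairs   : list of (clean_activations, reconstructed_activations),
                    one pair per layer (assumed layout)
    layer_weights : one non-negative weight per layer (assumed hyperparameters)
    """
    supervised = sum(cross_entropy(z, y) for z, y in labelled) / len(labelled)
    unsupervised = sum(
        w * mean_squared_error(clean, recon)
        for w, (clean, recon) in zip(layer_weights, layer_pairs)
    )
    return supervised + unsupervised
```

Because the unsupervised term needs no labels, it can be evaluated on every example, which is what lets such a model benefit from unlabelled speech when labelled data are scarce.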
Extracting the three-dimensional (3D) information of a pedestrian, including location and height, is important for vision-based intelligent traffic monitoring systems. This paper tackles the relationship between pixels' actual size and pixels' spatial resolution through a new method named pixel-resolution mapping (P-RM). The proposed P-RM method derives the equations for pixels' spatial resolutions (XY-direction) and an object's height (Z-direction) in the real world, while introducing new tilt angle and mounting height calibration methods that do not require special calibration patterns placed in the real world. Both controlled laboratory and real-world experiments were performed and reported. The tests on 3D mensuration using the proposed P-RM method showed overall better than 98.7% accuracy in laboratory environments and better than 96% accuracy in real-world pedestrian height estimations. The 3D reconstructed images for measured points were also determined with the proposed P-RM method, which shows that the proposed method provides a general algorithm for 3D information extraction.
With the rapid development of the robotic industry, domestic robots have become increasingly popular. As domestic robots are expected to be personal assistants, it is important to develop a natural language-based human-robot interactive system for end-users who do not necessarily have much programming knowledge. To build such a system, we developed an interactive tutoring framework, named “Holert”, which can automatically translate task descriptions in natural language into machine-interpretable logical forms. Compared to previous works, Holert allows users to teach the robot by further explaining their intentions in an interactive tutor mode. Furthermore, Holert introduces a semantic dependency model to enable the robot to “understand” similar task descriptions. We have deployed Holert on an open-source robot platform, Turtlebot 2. Experimental results show that the system accuracy could be significantly improved by 163.9% with the support of the tutor mode. The system is also efficient: even the longest task session, with 10 sentences, can be handled within 0.7 s.
This paper presents a novel movement planning algorithm for a guard robot in an indoor environment, imitating the job of a human security guard. A movement planner is employed by the guard robot to continuously observe a certain person. This problem differs from the person-following problem, in which the robot continuously follows the target. Instead, the movement planner aims to reduce the robot's movement and energy consumption while keeping the target person within its field of visibility. The proposed algorithm exploits the topological features of the environment to obtain a set of viewpoint candidates, and this set is then optimized by solving a cost-based set covering problem. Both the robot and the target person are modeled using a geodesic motion model which considers the shape of the environment. Subsequently, a particle model-based planner is employed, considering the chance constraints over the robot visibility, to choose an optimal action for the robot. Simulation results using a 3D simulator and experiments in a real environment are provided to show the feasibility and effectiveness of our algorithm.
Pursuit-evasion games involving mobile robots provide an excellent platform to analyze the performance of pursuit and evasion strategies. Pursuit-evasion has received considerable attention from researchers in the past few decades due to its application to a broad spectrum of problems that arise in various domains such as defense research, robotics, computer games, drug delivery, and cell biology. Several methods have been introduced in the literature to compute the winning chances of a single pursuer or single evader in a two-player game. Over the past few decades, proportional navigation guidance (PNG) based methods have proved to be quite effective for the purpose of pursuit, especially for missile navigation and target tracking. However, a performance comparison of these pursuer-centric strategies against recent evader-centric schemes has not been found in the literature for wheeled mobile robot applications. With a view to understanding the performance of each of the evasion strategies against various pursuit strategies and vice versa, four different proportional navigation-based pursuit schemes have been evaluated against five evader-centric schemes for non-holonomic wheeled mobile robots. The pursuer's strategies include three well-known schemes, namely augmented ideal proportional navigation guidance (AIPNG), modified AIPNG, and angular acceleration guidance (AAG), and a recently introduced pursuer-centric scheme called anticipated trajectory-based proportional navigation guidance (ATPNG). The evader-centric schemes are classic evasion, random motion, optical-flow based evasion, Apollonius circle based evasion, and another recently introduced evasion strategy called anticipated velocity based evasion. The performance of each of the pursuit methods was evaluated against the five different evasion methods through hardware implementation. The performance was analyzed in terms of time of interception and the distance traveled by the players. The working environment was obstacle-free, and the maximum velocity of the pursuer was taken to be greater than that of the evader so that the game concludes in finite time. It is concluded that ATPNG performs better than the other PNG-based schemes, and the anticipated velocity based evasion scheme performs better than the other evasion schemes.
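The proportional navigation idea underlying the pursuit schemes compared here can be sketched in its classic planar form, where the commanded heading rate is a navigation gain times the rate of change of the line-of-sight (LOS) angle. This is only the textbook law, not AIPNG, AAG, or ATPNG as defined in the paper; the function names and the default gain of 3 are assumptions made for illustration.

```python
import math

def los_angle(pursuer_xy, evader_xy):
    """Line-of-sight angle from pursuer to evader, in radians."""
    return math.atan2(evader_xy[1] - pursuer_xy[1],
                      evader_xy[0] - pursuer_xy[0])

def pn_turn_rate(los_prev, los_now, dt, nav_gain=3.0):
    """Classic proportional navigation: heading rate = gain * LOS rate.

    The LOS angle difference is wrapped to (-pi, pi] so that crossing the
    +/-pi boundary does not produce a spurious large command.
    """
    delta = math.atan2(math.sin(los_now - los_prev),
                       math.cos(los_now - los_prev))
    return nav_gain * delta / dt
```

On a non-holonomic wheeled robot, a command of this kind would typically be applied as the angular velocity input while the forward speed is held at or below its maximum; that mapping is likewise an assumption here.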
This paper presents a novel five degrees of freedom (DOF) two-wheeled robotic machine (TWRM) that delivers solutions for both industrial and service robotic applications by enlarging the vehicle's workspace and increasing its flexibility. Designing a two-wheeled robot with five degrees of freedom poses a considerable control challenge; therefore, the modelling and design of such a robot should be precise, with a uniform distribution of mass over the robot and the actuators. By employing the Lagrangian modelling approach, the TWRM's mathematical model is derived and simulated in Matlab/Simulink®. To stabilize the system's highly nonlinear model, two control approaches were developed and implemented: proportional-integral-derivative (PID) and fuzzy logic control (FLC) strategies. The performance of the proposed control strategies has been assessed in multiple scenarios with different initial conditions.
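As a reminder of what the PID strategy mentioned above computes, the following is a minimal discrete-time sketch of a single PID loop. It is a generic textbook form, not the TWRM controller from the paper; the class name, the gains, and the fixed sample time are hypothetical.

```python
class PID:
    """Textbook discrete PID controller with a fixed sample time."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0  # assumed zero before the first sample

    def step(self, setpoint, measurement):
        """Return the control output for one sample."""
        error = setpoint - measurement
        self.integral += error * self.dt                   # rectangular integration
        derivative = (error - self.prev_error) / self.dt   # backward difference
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)
```

A multi-DOF machine of this kind would usually need one such loop per controlled variable (for example, tilt angle and wheel position), and a practical implementation would add output saturation and integrator anti-windup, which are omitted here.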
The convergence analysis of the MaxMin-SOMO algorithm is presented. SOM-based optimization (SOMO) is an optimization algorithm based on the self-organizing map (SOM), in which a winner is found in the network. Generally, through a competitive learning process, the SOMO algorithm searches for the minimum of an objective function. The MaxMin-SOMO algorithm is a generalization of SOMO that finds two winning neurons simultaneously, i.e., the first winner stands for the minimum and the second one for the maximum of the objective function. For the convergence analysis, we prove that the distance between neurons decreases at each iteration and finally converges to zero. The work is verified with experimental results.
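The claim that the distance between neurons shrinks at each iteration can be illustrated with a deliberately simplified sketch. It assumes a single winner (the neuron with the lowest objective value), a constant learning rate, and no neighbourhood function or random perturbation, so it is not the MaxMin-SOMO update rule and does not reproduce the paper's proof; all names are hypothetical.

```python
import math

def pull_toward_winner(neurons, objective, rate=0.3):
    """Move every neuron a fraction `rate` of the way toward the current winner.

    The winner is the neuron with the smallest objective value (assumed rule).
    """
    winner = min(neurons, key=objective)
    return [[w + rate * (b - w) for w, b in zip(neuron, winner)]
            for neuron in neurons]

def max_pairwise_distance(neurons):
    """Largest Euclidean distance between any two neurons."""
    return max(math.dist(a, b) for a in neurons for b in neurons)
```

Under this simplified rule, the difference between any two updated neurons is (1 - rate) times their previous difference, so every pairwise distance contracts geometrically toward zero when 0 < rate < 1.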
The problem of robust stabilization for a class of discrete-time switched large-scale systems with parameter uncertainties and nonlinear interconnected terms is considered. By using state feedback and the Lyapunov function technique, a decentralized switching control approach is put forward to guarantee that the solutions of the large-scale systems converge to the origin globally. A numerical example and a corresponding simulation result are used to verify the effectiveness of the presented approach.
The purpose of this paper is to propose a synthesis method for a parametric sensitivity constrained linear quadratic (SCLQ) controller for an uncertain linear time-invariant (LTI) system. System sensitivity to parameter variation is handled through an additional quadratic trajectory parametric sensitivity term in the standard LQ criterion to be minimized. The main purpose here is to find a suboptimal linear quadratic control that explicitly takes the parametric uncertainties into account. The paper's main contribution is threefold: 1) A descriptor system approach is used to show that the underlying singular linear-quadratic optimal control problem leads to a non-standard Riccati equation. 2) A solution to the proposed control problem is then given based on a connection to the so-called Lur'e matrix equations. 3) A synthesis method for multiple parametric SCLQ controllers is proposed to cover the whole parametric uncertainty while degrading as little as possible the intrinsic robustness properties of each local linear quadratic controller. Some examples are presented to illustrate the effectiveness of the approach.