Review
Deep Audio-visual Learning: A Survey
Hao Zhu, Man-Di Luo, Rui Wang, Ai-Hua Zheng, Ran He
doi: 10.1007/s11633-021-1293-0
Abstract:
Audio-visual learning, aimed at exploiting the relationship between audio and visual modalities, has drawn considerable attention since deep learning started to be used successfully. Researchers tend to leverage these two modalities to improve the performance of previously considered single-modality tasks or address new challenging problems. In this paper, we provide a comprehensive survey of recent audio-visual learning development. We divide the current audio-visual learning tasks into four different subfields: audio-visual separation and localization, audio-visual correspondence learning, audio-visual generation, and audio-visual representation learning. State-of-the-art methods, as well as the remaining challenges of each subfield, are further discussed. Finally, we summarize the commonly used datasets and challenges.
Advances in Deep Learning Methods for Visual Tracking: Literature Review and Fundamentals
Xiao-Qin Zhang, Run-Hua Jiang, Chen-Xiang Fan, Tian-Yu Tong, Tao Wang, Peng-Cheng Huang
doi: 10.1007/s11633-020-1274-8
Abstract:
Recently, deep learning has achieved great success in visual tracking tasks, particularly in single-object tracking. This paper provides a comprehensive review of state-of-the-art single-object tracking algorithms based on deep learning. First, we introduce basic knowledge of deep visual tracking, including fundamental concepts, existing algorithms, and previous reviews. Second, we briefly review existing deep learning methods by categorizing them into data-invariant and data-adaptive methods based on whether they can dynamically change their model parameters or architectures. Then, we conclude with the general components of deep trackers. In this way, we systematically analyze the novelties of several recently proposed deep trackers. Thereafter, popular datasets such as Object Tracking Benchmark (OTB) and Visual Object Tracking (VOT) are discussed, along with the performances of several deep trackers. Finally, based on observations and experimental results, we discuss three different characteristics of deep trackers, i.e., the relationships between their general components, exploration of more effective tracking frameworks, and interpretability of their motion estimation components.
A Comprehensive Review of Group Activity Recognition in Videos
Li-Fang Wu, Qi Wang, Meng Jian, Yu Qiao, Bo-Xuan Zhao
doi: 10.1007/s11633-020-1258-8
Abstract:
Human group activity recognition (GAR) has attracted significant attention from computer vision researchers due to its wide practical applications in security surveillance, social role understanding and sports video analysis. In this paper, we give a comprehensive overview of the advances in group activity recognition in videos during the past 20 years. First, we provide a summary and comparison of 11 GAR video datasets in this field. Second, we survey the group activity recognition methods, including those based on handcrafted features and those based on deep learning networks. For better understanding of the pros and cons of these methods, we compare various models from the past to the present. Finally, we outline several challenging issues and possible directions for future research. From this comprehensive literature review, readers can obtain an overview of progress in group activity recognition for future studies.
Research Article
Design and Analysis of a Novel 2T2R Parallel Mechanism with the Closed-loop Limbs
Hai-Rong Fang, Peng-Fei Liu, Hui Yang, Bing-Shan Jiang
doi: 10.1007/s11633-021-1294-z
Abstract:
This paper presents a novel four-degree-of-freedom (DOF) parallel mechanism with closed-loop limbs, which has two translational (2T) and two rotational (2R) DOF. Connecting the proposed parallel mechanism in series with a guide rail yields a 5-DOF hybrid robot system that can be applied to composite material tape laying in the aerospace industry. The analysis in this paper focuses mainly on the parallel module of the hybrid robot system. First, the degrees of freedom of the proposed parallel mechanism are calculated based on screw theory. Then, the inverse kinematics and Jacobian matrix of the parallel mechanism are derived from the closed-loop vector equation. Next, the workspace, stiffness, and dexterity of the parallel mechanism are analyzed based on the constraint equations, static stiffness matrix, and Jacobian condition number. Finally, the correctness of the inverse kinematics and the high stiffness of the parallel mechanism are verified by kinematics and stiffness simulations, which lays a foundation for automatic composite material tape laying.
2D and 3D Palmprint and Palm Vein Recognition Based on Neural Architecture Search
Wei Jia, Wei Xia, Yang Zhao, Hai Min, Yan-Xiang Chen
doi: 10.1007/s11633-021-1292-1
Abstract:
Palmprint recognition and palm vein recognition are two emerging biometrics technologies. In the past two decades, many traditional methods have been proposed for palmprint recognition and palm vein recognition and have achieved impressive results. In recent years, in the field of artificial intelligence, deep learning has gradually become the mainstream recognition technology because of its excellent recognition performance. Some researchers have tried to use convolutional neural networks (CNNs) for palmprint recognition and palm vein recognition. However, the architectures of these CNNs have mostly been developed manually by human experts, which is a time-consuming and error-prone process. To overcome some shortcomings of manually designed CNNs, neural architecture search (NAS) technology has become an important research direction of deep learning. NAS addresses the parameter adjustment problem of deep learning models and is a cross-disciplinary study combining optimization and machine learning. NAS technology represents the future development direction of deep learning. However, up to now, NAS technology has not been well studied for palmprint recognition and palm vein recognition. In this paper, to investigate NAS-based 2D and 3D palmprint recognition and palm vein recognition in depth, we conduct a performance evaluation of twenty representative NAS methods on five 2D palmprint databases, two palm vein databases, and one 3D palmprint database. Experimental results show that some NAS methods can achieve promising recognition results. Remarkably, among the evaluated NAS methods, ProxylessNAS achieves the best recognition performance.
Low-cost Position and Force Measurement System for Payload Transport Using UAVs
Daniel Ceferino Gandolfo, Claudio D. Rosales, Lucio R. Salinas, J. Gimenez, Ricardo Carelli
doi: 10.1007/s11633-021-1281-4
Abstract:
In recent years, multiple applications have emerged in the area of payload transport using unmanned aerial vehicles (UAVs). This has attracted considerable interest in the scientific community, especially in cases involving one or several rotary-wing UAVs. In this context, this work proposes a novel measurement system that can estimate the payload position and the force the payload exerts on the UAV. This measurement system is low cost, easy to implement, and can be used in either indoor or outdoor environments (no sensorized laboratory is needed). The measurement system is validated statically and dynamically. In the first test, the estimates obtained by the system are compared with measurements produced by high-precision devices. In the second test, the system is used in real experiments to compare its performance with that obtained using known procedures. These experiments allowed us to draw interesting conclusions on which future research can be based.
Fuzzy Tuned PID Controller for Envisioned Agricultural Manipulator
Satyam Paul, Ajay Arunachalam, Davood Khodadad, Henrik Andreasson, Olena Rubanenko
doi: 10.1007/s11633-021-1280-5
Abstract:
The implementation of image-based phenotyping systems has become an important aspect of crop and plant science research, which has shown tremendous growth over the years. Accurate determination of features from images requires stable imaging and very precise processing. When a camera is installed on a motor-driven mechanical arm, maintaining accuracy and stability becomes non-trivial. According to the state of the art, external camera shake caused by vibration, which may be induced by the driving motor of the manipulator, is a major concern in capturing accurate images. A stable active controller is therefore required to attenuate the vibration of the manipulator sufficiently. However, very few reports in agricultural practice use control algorithms. Although many control strategies have been used to control vibration in manipulators for various applications, no control strategy with validated stability has been provided to control the vibration of such an envisioned agricultural manipulator using simple low-cost hardware devices with compensation of non-linearities. In this work, the combination of proportional-integral-differential (PID) control with type-2 fuzzy logic (T2-F-PID) is implemented for vibration control. The stability of the controller is validated using Lyapunov analysis. A torsional actuator (TA) is applied to mitigate torsional vibration, which is a new contribution in the area of agricultural manipulators. To demonstrate the effectiveness of the controller, the vibration attenuation results obtained with T2-F-PID are compared with those of conventional PD/PID controllers and a type-1 fuzzy PID (T1-F-PID) controller.
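The conventional PID baseline mentioned in this abstract can be illustrated with a minimal discrete-time sketch. The class name `PID`, the gains, and the time step are assumptions chosen for illustration; the type-2 fuzzy gain tuning and the Lyapunov analysis described by the authors are not reproduced here.

```python
class PID:
    """Textbook discrete-time PID controller (illustrative only, not T2-F-PID)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0      # running integral of the error
        self.prev_error = 0.0    # error from the previous step

    def step(self, error):
        """Return the control output for the current error sample."""
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical gains and error samples, for illustration only.
controller = PID(kp=1.0, ki=0.5, kd=0.0, dt=1.0)
first = controller.step(2.0)
second = controller.step(1.0)
```

A fuzzy-tuned variant would, roughly speaking, adjust `kp`, `ki`, and `kd` at each step from the error and its rate of change instead of keeping them fixed.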
Identification and Classification of Driving Behaviour at Signalized Intersections Using Support Vector Machine
Soni Lanka Karri, Liyanage Chandratilak De Silva, Daphne Teck Ching Lai, Shiaw Yin Yong
doi: 10.1007/s11633-021-1295-y
Abstract:
When drivers approach a signalized intersection at the onset of the yellow signal, they enter a zone of uncertainty in which they must assess whether they can stop or cross the intersection. Any improper decision might lead to a right-angle or back-end crash. To avoid a right-angle collision, drivers brake harshly to stop just before the signalized intersection, but this may lead to a back-end crash when the following driver encounters the leading driver's sudden stop. The situation becomes more complex when the traffic is heterogeneous, containing various types of vehicles. To reduce this issue, the primary objective of this study is to identify the driving behaviour at signalized intersections based on driving features (parameters). The secondary objective is to classify the outcome of driving behaviour (safe stopping and unsafe stopping) at the signalized intersection using a support vector machine (SVM) technique. Turning moments are used to identify the zones and label them accordingly for further classification. Classifying 50 instances with a 70%-30% training-testing split resulted in accuracies of 85% and 86%, respectively. Classification performance is further verified by random sampling using five-fold cross-validation and 30 iterations, which gave accuracies of 97% and 100% for training and testing. These results demonstrate that the proposed approach can help develop a pre-warning system to alert drivers approaching signalized intersections, thus reducing back-end crashes and accidents.
DLA+: A Light Aggregation Network for Object Classification and Detection
Fu-Tian Wang, Li Yang, Jin Tang, Si-Bao Chen, Xin Wang
doi: 10.1007/s11633-021-1287-y
Abstract:
An efficient convolutional neural network (CNN) plays a crucial role in various visual tasks such as object classification and detection. The most common way to construct a CNN is to stack identical convolution blocks or to use complex connections. These approaches may be effective, but the parameter size and computation (Comp) grow explosively. We therefore present a novel architecture called “DLA+”, which obtains features from different stages and, with a newly designed convolution block, achieves better accuracy while reducing the computation by a factor of six compared with the baseline. We design experiments on classification and object detection. On the CIFAR10 and VOC datasets, we obtain better precision and faster speed than other architectures. The lightweight network even allows deployment on low-performance devices such as drones and laptops.
PokerNet: Expanding Features Cheaply via Depthwise Convolutions
Wei Tang, Yan Huang, Liang Wang
doi: 10.1007/s11633-021-1288-x
Abstract:
Pointwise convolution is usually utilized to expand or squeeze features in modern lightweight deep models. However, it takes up most of the overall computational cost (usually more than 90%). This paper proposes a novel Poker module to expand features by taking advantage of cheap depthwise convolution. As a result, the Poker module can greatly reduce the computational cost while generating a large number of effective features to guarantee the performance. The proposed module is standardized and can be employed wherever feature expansion is needed. By varying the stride and the number of channels, different kinds of bottlenecks are designed to plug the proposed Poker module into the network. Thus, a lightweight model can be easily assembled. Experiments conducted on benchmarks reveal the effectiveness of the proposed Poker module. Our PokerNet models reduce the computational cost by 7.1% to 15.6% and achieve comparable or even higher recognition accuracy than previous state-of-the-art (SOTA) models on the ImageNet ILSVRC2012 classification dataset. Code is available at https://github.com/diaomin/pokernet.
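The claim that pointwise convolution dominates the cost can be checked with a back-of-the-envelope count of multiply-accumulate operations (MACs). The sketch below uses the standard MAC formulas for a depthwise-separable block; the feature-map size, channel count, and kernel size are assumed values for illustration and are not taken from the paper, and the Poker module itself is not reproduced.

```python
def pointwise_macs(height, width, c_in, c_out):
    """MACs of a 1x1 (pointwise) convolution over a height x width feature map."""
    return height * width * c_in * c_out


def depthwise_macs(height, width, channels, kernel):
    """MACs of a kernel x kernel depthwise convolution (one filter per channel)."""
    return height * width * channels * kernel * kernel


# Assumed layer shape: 14x14 feature map, 256 channels in and out, 3x3 depthwise kernel.
h = w = 14
c = 256
k = 3

dw = depthwise_macs(h, w, c, k)
pw = pointwise_macs(h, w, c, c)
share = pw / (pw + dw)  # fraction of the block's MACs spent in the pointwise layer
print(f"depthwise: {dw}, pointwise: {pw}, pointwise share: {share:.3f}")
```

For this assumed shape the share reduces to c / (c + k²), so the pointwise layer accounts for well over 90% of the block's MACs, which is the imbalance the abstract describes.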
Fault Classification for On-board Equipment of High-speed Railway Based on Attention Capsule Network
Lu-Jie Zhou, Jian-Wu Dang, Zhen-Hai Zhang
doi: 10.1007/s11633-021-1291-2
Abstract:
The conventional troubleshooting methods for high-speed railway on-board equipment rely too heavily on personnel experience and are characterized by one-sidedness and low efficiency. During high-speed train operation, numerous text-based on-board logs are recorded by on-board computers. Machine learning methods can help technicians judge fault types correctly by making reasonable use of the on-board log. Therefore, a fault classification model of on-board equipment based on attention capsule networks is proposed. This paper presents an empirical exploration of the application of a capsule network with dynamic routing in fault classification. A capsule network can encode the internal spatial part-whole relationship between various entities to identify the fault types. As the importance of each word in the on-board log and the dependencies between them have a significant impact on fault classification, an attention mechanism is incorporated into the capsule network to distill important information. Considering the imbalanced distribution of normal data and fault data in the on-board log, the focal loss function is introduced into the model to handle the data imbalance. The experiments are conducted on the on-board log of a railway bureau and compared with other baseline models. The experimental results demonstrate that our model outperforms the compared baseline methods, proving the superiority and competitiveness of our model.
Skill Learning for Robotic Insertion Based on One-shot Demonstration and Reinforcement Learning
Ying Li, De Xu
doi: 10.1007/s11633-021-1290-3
Abstract:
In this paper, an efficient skill learning framework is proposed for robotic insertion, based on one-shot demonstration and reinforcement learning. First, the robot action is composed of two parts: expert action and refinement action. A force Jacobian matrix is calibrated with only one demonstration, based on which stable and safe expert action can be generated. The deep deterministic policy gradients (DDPG) method is employed to learn the refinement action, which aims to improve the assembly efficiency. Second, an episode-step exploration strategy is developed, which uses the expert action as a benchmark and adjusts the exploration intensity dynamically. A safety-efficiency reward function is designed for the compliant insertion. Third, to improve the adaptability to different components, a skill saving and selection mechanism is proposed. Several typical components are used to train the skill models, and the trained models and force Jacobian matrices are saved in a skill pool. Given a new component, the most appropriate model is selected from the skill pool according to the force Jacobian matrix and directly used to accomplish insertion tasks. Fourth, a simulation environment is established under the guidance of the force Jacobian matrix, which avoids a tedious training process on real robotic systems. Simulations and experiments are conducted to validate the effectiveness of the proposed methods.
Designing an Intelligent Control Philosophy in Reservoirs of Water Transfer Networks in Supervisory Control and Data Acquisition System Stations
Ali Dolatshahi Zand, Kaveh Khalili-Damghani, Sadigh Raissi
doi: 10.1007/s11633-021-1284-1
Abstract:
In this paper, a hybrid neural-genetic fuzzy system is proposed to control the flow and height of water in the reservoirs of water transfer networks. These controls avoid probable water waste in the reservoirs and pressure drops in water distribution networks. The proposed approach combines an artificial neural network, a genetic algorithm, and a fuzzy inference system to improve the performance of supervisory control and data acquisition stations through a new control philosophy for instruments and control valves in the reservoirs of water transfer networks. First, a multi-core artificial neural network model, including a multi-layer perceptron and a radial basis function network, is proposed to forecast the daily water consumption of a reservoir. A genetic algorithm is proposed to optimize the parameters of the artificial neural networks. Then, the online height of water in the reservoir and the output of the artificial neural networks are used as inputs of a fuzzy inference system to estimate the flow rate of the reservoir inlet. Finally, the estimated inlet flow is translated into the input valve position using a transform control unit supported by a nonlinear autoregressive exogenous model. The proposed approach is applied in the Tehran water transfer network. The results of this study show that the proposed approach significantly reduces the deviation of the reservoir height from the desired levels.
STRNet: Triple-stream Spatiotemporal Relation Network for Action Recognition
Zhi-Wei Xu, Xiao-Jun Wu, Josef Kittler
doi: 10.1007/s11633-021-1289-9
Abstract:
Learning comprehensive spatiotemporal features is crucial for human action recognition. Existing methods tend to model the spatiotemporal feature blocks in an integrate-separate-integrate form, such as the appearance-and-relation network (ARTNet) and the spatiotemporal and motion network (STM). However, as blocks stack up, the rear part of the network has poor interpretability. To avoid this problem, we propose a novel architecture called the spatial temporal relation network (STRNet), which can learn explicit information about appearance, motion, and especially temporal relations. Specifically, STRNet consists of three branches, which separate the features into 1) an appearance pathway, to obtain spatial semantics, 2) a motion pathway, to reinforce the spatiotemporal feature representation, and 3) a relation pathway, to capture temporal relation details of successive frames and to explore long-term representation dependency. In addition, STRNet does not simply merge the multi-branch information; instead, we apply a flexible and effective strategy to fuse the complementary information from multiple pathways. We evaluate our network on four major action recognition benchmarks: Kinetics-400, UCF-101, HMDB-51, and Something-Something v1. STRNet achieves state-of-the-art results on the UCF-101 and HMDB-51 datasets, and accuracy comparable with the state-of-the-art methods on Something-Something v1 and Kinetics-400.
EDT Method for Multiple Labelled Objects Subject to Tied Distances
Andre Marasca, Andre Backes, Fabio Favarim, Marcelo Teixeira, Dalcimar Casanova
doi: 10.1007/s11633-021-1285-0
Abstract:
The success of new scientific areas can be assessed by their potential for contributing to new theoretical approaches aligned with real-world applications. The Euclidean distance transform (EDT) has fared well in both cases, providing a sound theoretical basis for a number of applications, such as the medial axis transform, fractal analysis, skeletonization, and Voronoi diagrams. Despite its wide applicability, the discrete form of the EDT includes interesting properties that have not yet been fully exploited in the literature. In this paper, we are particularly interested in the properties of 1) working with multiple objects/labels; and 2) identifying and counting equidistant pixels/voxels from certain points of interest. In some domains (such as dataset classification, texture, and complexity analysis), the result of applying the EDT with different objects, and their respective tied distances, may compromise the performance. To address this, we propose an efficient modification of the method presented in [1], which leads to a novel approach for computing the distance transform in a space with multiple objects, and for counting equidistant pixels/voxels.
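To make the notions of multiple labels and tied distances concrete, here is a brute-force sketch of a labelled distance transform on a small 2D grid. It is not the efficient method proposed in the paper; the function name `labelled_edt`, the seed format, and the tie-breaking rule (first seed in the list wins) are assumptions for illustration. Squared integer distances are used so that ties are detected exactly.

```python
def labelled_edt(height, width, seeds):
    """Brute-force labelled distance transform.

    seeds: list of (row, col, label) points of interest.
    Returns three grids: squared distance to the nearest seed, the label of
    a nearest seed (first in list order on ties), and how many seeds are tied
    at that minimum distance.
    """
    dist2 = [[0] * width for _ in range(height)]
    labels = [[None] * width for _ in range(height)]
    ties = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            candidates = [((r - sr) ** 2 + (c - sc) ** 2, lab) for sr, sc, lab in seeds]
            best = min(d for d, _ in candidates)
            tied = [lab for d, lab in candidates if d == best]
            dist2[r][c] = best
            labels[r][c] = tied[0]
            ties[r][c] = len(tied)
    return dist2, labels, ties


# A 1x3 grid with two labelled seeds at the ends: the middle pixel is equidistant.
dist2, labels, ties = labelled_edt(1, 3, [(0, 0, 1), (0, 2, 2)])
```

In this toy case the middle pixel has a tie count of 2, which is exactly the situation where an arbitrary label assignment could affect downstream analysis.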
Research on Voiceprint Recognition of Camouflage Voice Based on Deep Belief Network
Nan Jiang, Ting Liu
doi: 10.1007/s11633-021-1283-2
Abstract:
The problem of disguised voice recognition based on deep belief networks is studied. A hybrid feature extraction algorithm based on formants, Gammatone frequency cepstrum coefficients (GFCC) and their different coefficients is proposed to extract more discriminative speaker features from the original voice data. Using the mixed features as the input of the model, a disguised voice library is constructed. A disguised voice recognition model based on a deep belief network is proposed. A dropout strategy is introduced to prevent overfitting, which effectively addresses the problems of traditional Gaussian mixture models, such as insufficient modeling ability and low discrimination. Experimental results show that the proposed disguised voice recognition method can better fit the feature distribution and significantly improve the classification performance and recognition rate.
A Spatial Cognitive Model that Integrates the Effects of Endogenous and Exogenous Information on the Hippocampus and Striatum
Jing Huang, He-Yuan Yang, Xiao-Gang Ruan, Nai-Gong Yu, Guo-Yu Zuo, Hao-Meng Liu
doi: 10.1007/s11633-021-1286-z
Abstract:
Reproducing the spatial cognition of animals using computational models that make agents navigate autonomously has attracted much attention. Many biologically inspired models for spatial cognition focus mainly on the simulation of the hippocampus and only consider the effect of external environmental information (i.e., exogenous information) on the hippocampal coding. However, neurophysiological studies have shown that the striatum, which is closely related to the hippocampus, also plays an important role in spatial cognition and that information inside animals (i.e., endogenous information) also affects the encoding of the hippocampus. Inspired by the progress made in neurophysiological studies, we propose a new spatial cognitive model that consists of analogies between the hippocampus and striatum. This model takes into consideration how both exogenous and endogenous information affects coding by the environment. We carried out a series of navigation experiments that simulated a water maze and compared our model with other models. Our model is self-adaptable and robust and has better performance in navigation path length. We also discuss the possible reasons for the results and how our findings may help us understand real mechanisms in the spatial cognition of animals.
Global FLS-based Consensus of Stochastic Uncertain Nonlinear Multi-agent Systems
Jia-Xi Chen, Jun-Min Li
doi: 10.1007/s11633-021-1279-y
Abstract:
Using graph theory, matrix theory, adaptive control, fuzzy logic systems and other tools, this paper studies the leader-follower global consensus of two kinds of stochastic uncertain nonlinear multi-agent systems (MAS). First, a fuzzy logic system is used as a feedforward compensator, in place of a feedback compensator, to describe the uncertain nonlinear dynamics. Second, based on the network topology, all followers are divided into two categories: followers that can obtain the leader signal and followers that cannot. Third, based on the adaptive control method, distributed control protocols are designed for the two types of followers. Fourth, based on matrix theory and stochastic Lyapunov stability theory, the stability of the closed-loop systems is analyzed. Finally, three simulation examples are given to verify the effectiveness of the proposed control algorithms.
Optimal Policies for Quantum Markov Decision Processes
Ming-Sheng Ying, Yuan Feng, Sheng-Gang Ying
doi: 10.1007/s11633-021-1278-z
Abstract:
Markov decision process (MDP) offers a general framework for modelling sequential decision making where outcomes are random. In particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of MDP, namely quantum MDP (qMDP), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and finding optimal policies for qMDPs in the case of finite-horizon. The results obtained in this paper provide some useful mathematical tools for reinforcement learning techniques applied to the quantum world.
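For orientation, finite-horizon dynamic programming for a classical MDP can be sketched as backward induction over the remaining time steps. This is the textbook classical algorithm, not the quantum extension developed in the paper; the function name `backward_induction`, the data layout `P[a][s][t]` and `R[a][s]`, and the two-state example are assumptions for illustration.

```python
def backward_induction(P, R, horizon):
    """Finite-horizon dynamic programming for a classical MDP.

    P[a][s][t]: probability of moving from state s to state t under action a.
    R[a][s]:    immediate reward for taking action a in state s.
    Returns the optimal values at time 0 and a policy with policy[step][s] = action.
    """
    n_actions, n_states = len(P), len(P[0])
    V = [0.0] * n_states           # value after the final step
    policy = []
    for _ in range(horizon):       # sweep from the last decision back to the first
        new_V, decisions = [], []
        for s in range(n_states):
            q = [R[a][s] + sum(P[a][s][t] * V[t] for t in range(n_states))
                 for a in range(n_actions)]
            best = max(range(n_actions), key=lambda a: q[a])
            new_V.append(q[best])
            decisions.append(best)
        V = new_V
        policy.insert(0, decisions)
    return V, policy


# Hypothetical example: action 0 stays put (reward 1 in state 1), action 1 swaps states.
P = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]
R = [[0, 1], [0, 0]]
values, policy = backward_induction(P, R, horizon=2)
```

With two steps to go, the best plan from state 0 is to swap first and then collect the reward in state 1, which the returned policy reflects.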
A Regularized LSTM Method for Predicting Remaining Useful Life of Rolling Bearings
Zhao-Hua Liu, Xu-Dong Meng, Hua-Liang Wei, Liang Chen, Bi-Liang Lu, Zhen-Heng Wang, Lei Chen
doi: 10.1007/s11633-020-1276-6
Abstract:
Rotating machinery is important to industrial production. Any failure of rotating machinery, especially the failure of rolling bearings, can lead to equipment shutdown and even more serious incidents. Therefore, accurate residual life prediction plays a crucial role in guaranteeing machine operation safety and reliability and in reducing maintenance cost. To increase the forecasting precision of the remaining useful life (RUL) of rolling bearings, an approach combining the elastic net with a long short-term memory (LSTM) network is proposed; the new approach is referred to as E-LSTM. The E-LSTM algorithm consists of an elastic net and an LSTM, taking temporal-spatial correlation into consideration to forecast the RUL through the LSTM. To solve the over-fitting problem of the LSTM neural network during training, an elastic net based regularization term is introduced into the LSTM structure. In this way, the change of the output can be well characterized to express the bearing degradation mode. Experimental results on real-world data demonstrate that the proposed E-LSTM method obtains higher stability and relevant values that are useful for RUL forecasting of bearings. Furthermore, these results also indicate that E-LSTM can achieve better performance.
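An elastic net regularization term, in its generic form, adds an L1 and a squared L2 penalty on the weights to the training loss. The sketch below shows only that generic penalty; the function name `elastic_net_penalty`, the coefficients `l1` and `l2`, and the example weights are assumptions, and how the term is attached to the LSTM in E-LSTM is not specified here.

```python
def elastic_net_penalty(weights, l1, l2):
    """Generic elastic net penalty: l1 * sum(|w|) + l2 * sum(w^2)."""
    l1_term = sum(abs(w) for w in weights)
    l2_term = sum(w * w for w in weights)
    return l1 * l1_term + l2 * l2_term


def regularized_loss(data_loss, weights, l1, l2):
    """Total training objective: data-fit loss plus the elastic net penalty."""
    return data_loss + elastic_net_penalty(weights, l1, l2)


# Hypothetical weights and coefficients, for illustration only.
penalty = elastic_net_penalty([1.0, -2.0, 0.0], l1=0.1, l2=0.01)
total = regularized_loss(0.5, [1.0, -2.0, 0.0], l1=0.1, l2=0.01)
```

The L1 part encourages sparse weights while the L2 part shrinks large ones, which is the usual motivation for using the combination against over-fitting.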
Application of Machine Learning for Online Reputation Systems
Ahmad Alqwadri, Mohammad Azzeh, Fadi Almasalha
doi: 10.1007/s11633-020-1275-7
Abstract:
Users on the Internet usually require venues to provide better purchasing recommendations. This can be provided by a reputation system that processes ratings to provide recommendations. The rating aggregation process is a main part of reputation systems and produces global opinions about product quality. Naive methods that are frequently used do not consider consumer profiles in their calculations and cannot discover unfair ratings and trends emerging in new ratings. Other, more sophisticated rating aggregation methods that use a weighted average technique focus on one or a few aspects of consumers' profile data. This paper proposes a new reputation system that uses machine learning to predict the reliability of consumers from their profiles. In particular, we construct a new consumer profile dataset by extracting a set of factors that have a great impact on consumer reliability, which serve as input to machine learning algorithms. The predicted weight is then integrated with a weighted average method to compute the product reputation score. The proposed model has been evaluated on three MovieLens benchmarking datasets using 10-fold cross-validation. Furthermore, the performance of the proposed model has been compared with previously published rating aggregation models. The obtained results are promising and suggest that the proposed approach could be a potential solution for reputation systems. The results of the comparison demonstrate the accuracy of our models. Finally, the proposed approach can be integrated with online recommendation systems to provide better purchasing recommendations and improve the user experience in online shopping markets.
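The weighted average aggregation step can be sketched as follows, with each rating weighted by a predicted consumer reliability. The function name `reputation_score` and the example ratings and weights are hypothetical; in the paper the weights come from a machine learning model trained on consumer profiles, which is not reproduced here.

```python
def reputation_score(ratings, reliabilities):
    """Reliability-weighted average of ratings.

    ratings:       list of numeric ratings for one product.
    reliabilities: list of non-negative weights, one per rating.
    """
    if len(ratings) != len(reliabilities):
        raise ValueError("ratings and reliabilities must have the same length")
    total_weight = sum(reliabilities)
    if total_weight == 0:
        raise ValueError("at least one rating must have a positive weight")
    return sum(r * w for r, w in zip(ratings, reliabilities)) / total_weight


# Hypothetical data: the low rating comes from a consumer judged unreliable.
score = reputation_score([5, 1, 4], [0.9, 0.1, 0.5])
plain_mean = sum([5, 1, 4]) / 3
```

Here the weighted score stays close to the ratings of the reliable consumers, whereas the plain mean is pulled down by the low-reliability rating.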
Research on Transfer Learning of Vision-based Gesture Recognition
Bi-Xiao Wu, Chen-Guang Yang, Jun-Pei Zhong
doi: 10.1007/s11633-020-1273-9
Abstract:
Gesture recognition has been widely used for human-robot interaction. At present, a problem in gesture recognition is that knowledge learned in existing domains is not used to discover and recognize gestures in new domains. For each new domain, a large amount of data must be collected and annotated, and the training of the algorithm does not benefit from prior knowledge, leading to redundant computation and excessive time investment. To address this problem, the paper proposes a method that can transfer gesture data between different domains. We use a red-green-blue (RGB) Camera to collect images of the gestures, and use Leap Motion to collect the coordinates of 21 joint points of the human hand. Then, we extract a set of novel feature descriptors from the two different distributions of data for the study of transfer learning. This paper compares the effects of three classification algorithms, i.e., support vector machine (SVM), broad learning system (BLS) and deep learning (DL). We also compare learning performance with and without the joint distribution adaptation (JDA) algorithm. The experimental results show that the proposed method can effectively solve the transfer problem between RGB Camera and Leap Motion. In addition, we found that when using DL to classify the data, excessive training on the source domain may reduce the recognition accuracy in the target domain.
STEP AP 242 Managed Model-based 3D Engineering: An Application Towards the Automation of Fixture Planning
Remil George Thomas, Deepak Lawrence K., Manu R.
doi: 10.1007/s11633-020-1272-x
Abstract PDF SpringerLink
Abstract:
Fixture design and planning is one of the most important manufacturing activities, playing a pivotal role in deciding the lead time for product development. Fixture design, which affects the part-quality in terms of geometric accuracy and surface finish, can be enhanced by using the product manufacturing information (PMI) stored in the neutral standard for the exchange of product model data (STEP) file, thereby integrating design and manufacturing. The present paper proposes a unique fixture design approach, to extract the geometry information from STEP application protocol (AP) 242 files of computer aided design (CAD) models, for providing automatic suggestions of locator positions and clamping surfaces. Automatic feature extraction software “FiXplan”, developed using the programming language C#, is used to extract the part feature, dimension and geometry information. The information from the STEP AP242 file is deduced using geometric reasoning techniques, which in turn is utilized for fixture planning. The developed software is observed to be adept in identifying the primary, secondary, and tertiary locating faces and locator position configurations of prismatic components. Structural analysis of the prismatic part under different locator positions was performed using commercial finite element method software, ABAQUS, and the optimized locator position was identified on the basis of minimum deformation of the workpiece. The area-ratio (base locator enclosed area (%)/work piece base area (%)) for the ideal locator configuration was observed as 33%. Experiments were conducted on a prismatic workpiece using a specially designed fixture, for different locator configurations. The surface roughness and waviness of the machined surfaces were analysed using an Alicona non-contact optical profilometer. 
The best surface characteristics were obtained for the surface machined under the ideal locator positions having an area-ratio of 33%, thus validating the predicted numerical results. The efficiency, capability and applicability of the developed software are demonstrated for the finishing operation of a sensor cover – a typical prismatic component having applications in the naval industry, under different locator configurations. The best results were obtained under the proposed ideal locator configuration of area-ratio 33%.
Delayed Teleoperation with Force Feedback of a Humanoid Robot
Viviana Moya, Emanuel Slawiñski, Vicente Mut
doi: 10.1007/s11633-020-1267-7
Abstract PDF SpringerLink
Abstract:
Teleoperation systems allow the extension of human capabilities to remote-control devices by providing the operator with conditions similar to those at the remote site through a communication channel that sends information from one site to the other. This article aims to present an analysis of the benefits of force feedback applied to the bilateral teleoperation of a humanoid robot with time-varying delay. As a control scheme, we link adaptive inverse dynamics compensation, balance control, and P+d like controllers. Finally, a test is performed where an operator simultaneously handles the locomotion (forward velocity and turn angle) and arm of a simulated 3D humanoid robot to do a pick-and-place task using two master devices with force feedback, where indexes such as time to complete the task, coordination errors, path tracking error, and percentage of successful tests are reported for different time-delays. We conclude with the results achieved.
Behavior-based Autonomous Navigation and Formation Control of Mobile Robots in Unknown Cluttered Dynamic Environments with Dynamic Target Tracking
Nacer Hacene, Boubekeur Mendil
doi: 10.1007/s11633-020-1264-x
Abstract PDF SpringerLink
Abstract:
While different species in nature have safely solved the problem of navigation in a dynamic environment, this remains a challenging task for researchers around the world. The paper addresses the problem of autonomous navigation in an unknown dynamic environment for a single robot and a group of three-wheeled omnidirectional mobile robots (TWOMRs). The robot has to track a dynamic target while avoiding dynamic obstacles and dynamic walls in an unknown and very dense environment. It adopts a behavior-based controller that consists of four behaviors: “target tracking”, “obstacle avoidance”, “dynamic wall following” and “avoid robots”. The paper considers the problem of kinematic saturation. In addition, it introduces a strategy for predicting the velocity of dynamic obstacles based on two successive measurements of the ultrasonic sensors to calculate the velocity of the obstacle expressed in the sensor frame. Furthermore, the paper proposes a strategy to deal with dynamic walls even when they have U-like or V-like shapes. The approach can also deal with the formation control of a group of robots based on the leader-follower structure and the behavior-based control, where the robots have to get together and maintain a given formation while navigating toward the target, avoiding obstacles and walls in a dynamic environment. The effectiveness of the proposed approaches is demonstrated via simulation.
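The idea of estimating an obstacle's velocity from two successive ultrasonic measurements can be sketched as a finite difference along the sensor axis. This is only an illustrative assumption: the abstract does not give the paper's formulation, and the name `radial_velocity` and the sign convention are hypothetical.

```python
def radial_velocity(d_prev, d_curr, dt):
    """Finite-difference range rate along the sensor axis.

    d_prev, d_curr: two successive distance readings (metres).
    dt: time between the readings (seconds).
    A negative result means the obstacle is approaching the sensor.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    return (d_curr - d_prev) / dt
```

Because the readings are taken in the sensor frame, this value mixes the obstacle's motion with the robot's own motion; separating the two would need the robot's velocity as well.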
Robust Observer-based Control of Nonlinear Multi-Omnidirectional Wheeled Robot Systems via High Order Sliding-mode Consensus Protocol
M. R. Rahimi Khoygani, R. Ghasemi, P. Ghayoomi
doi: 10.1007/s11633-020-1254-z
Abstract PDF SpringerLink
Abstract:
This paper presents a novel observer-based controller for a class of nonlinear multi-agent robot models using the high order sliding mode consensus protocol. In many applications, demand for autonomous vehicles is growing; omnidirectional wheeled robots are suggested to meet this demand. They are flexible, fast, and autonomous, able to find the best direction and can move on an optional path at any time. Multi-agent omnidirectional wheeled robot (MOWR) systems consist of several similar or different robots and there are multiple different interactions between their agents; thus, the MOWR systems have complex dynamics. Hence, designing a robust and reliable controller for nonlinear MOWR operations is considered an important obstacle in control design. A high order sliding mode is selected in this work because it is a suitable technique for implementing a robust controller for models with nonlinear complex dynamics. Furthermore, the proposed method ensures that all signals involved in the multi-agent system (MAS) are uniformly ultimately bounded and that the system is robust against external disturbances and uncertainties. Theoretical analysis of candidate Lyapunov functions has been presented to depict the stability of the overall MAS, the convergence of observer and tracking error to zero, and the reduction of the chattering phenomena. In order to illustrate the promising performance of the methodology, the observer is applied to two nonlinear dynamic omnidirectional wheeled robots. The results display the meritorious performance of the scheme.
Observer-based Multirate Feedback Control Design for Two-time-scale System
Ravindra Munje, Wei-Dong Zhang
doi: 10.1007/s11633-020-1268-6
Abstract PDF SpringerLink
Abstract:
The use of a lower sampling rate for designing a discrete-time state feedback-based controller fails to capture information of fast states in a two-time-scale system, while the use of a higher sampling rate increases the amount of computation considerably. Thus, the use of single-rate sampling for systems with slow and fast states has evident limitations. In this paper, multirate state feedback (MRSF) control for a linear time-invariant two-time-scale system is proposed. Here, multirate sampling refers to the sampling of slow and fast states at different sampling rates. Firstly, a block-triangular form of the original continuous two-time-scale system is constructed. Then, it is discretized with a smaller sampling period and feedback control is designed for the fast subsystem. Later, the system is block-diagonalized and equivalently represented into a system with a higher sampling period. Subsequently, feedback control is designed for the slow subsystem and overall MRSF control is derived. It is proved that the derived MRSF control stabilizes the full-order system. Being the transformed states of the original system, slow and fast states need to be estimated for the MRSF control realization. Hence, a sequential two-stage observer is formulated to estimate these states. Finally, the applicability of the design method is demonstrated with a numerical example and simulation results are compared with the single-rate sampling method. It is found that the proposed MRSF control and observer designs reduce computations without compromising closed-loop performance.
Learning Deep RGBT Representations for Robust Person Re-identification
Ai-Hua Zheng, Zi-Han Chen, Cheng-Long Li, Jin Tang, Bin Luo
doi: 10.1007/s11633-020-1262-z
Abstract PDF SpringerLink
Abstract:
Person re-identification (Re-ID) is the task of finding images of a specific person across non-overlapping camera networks, and has achieved many breakthroughs recently. However, it remains very challenging in adverse environmental conditions, especially in dark areas or at nighttime, due to the imaging limitations of a single visible light source. To handle this problem, we propose a novel deep red green blue (RGB)-thermal (RGBT) representation learning framework for single-modality RGB person Re-ID. Due to the lack of thermal data in prevalent RGB Re-ID datasets, we propose to use a generative adversarial network, trained on existing RGBT datasets, to translate labeled RGB images of persons to thermal infrared ones. The labeled RGB images and the synthetic thermal images make up a labeled RGBT training set, and we propose a cross-modal attention network to learn effective RGBT representations for person Re-ID in day and night by leveraging the complementary advantages of RGB and thermal modalities. Extensive experiments on the Market1501, CUHK03 and DukeMTMC-reID datasets demonstrate the effectiveness of our method, which achieves state-of-the-art performance on all of the above person Re-ID datasets.
Review
Evolutionary Computation for Large-scale Multi-objective Optimization: A Decade of Progresses
Wen-Jing Hong, Peng Yang, Ke Tang
2021,  vol. 18,  no. 2, pp. 155-169,  doi: 10.1007/s11633-020-1253-0
Abstract PDF SpringerLink
Abstract:
Large-scale multi-objective optimization problems (MOPs), which involve a large number of decision variables, have emerged from many real-world applications. While evolutionary algorithms (EAs) have been widely acknowledged as a mainstream method for MOPs, most research progress and successful applications of EAs have been restricted to MOPs with small-scale decision variables. More recently, it has been reported that traditional multi-objective EAs (MOEAs) suffer severe deterioration as the number of decision variables increases. As a result, and motivated by the emergence of real-world large-scale MOPs, investigation of MOEAs in this aspect has attracted much more attention in the past decade. This paper reviews the progress of evolutionary computation for large-scale multi-objective optimization from two angles. From the key difficulties of the large-scale MOPs, the scalability analysis is discussed by focusing on the performance of existing MOEAs and the challenges induced by the increase of the number of decision variables. From the perspective of methodology, the large-scale MOEAs are categorized into three classes and introduced respectively: divide and conquer based, dimensionality reduction based and enhanced search-based approaches. Several future research directions are also discussed.
fMRI-based Decoding of Visual Information from Human Brain Activity: A Brief Review
Shuo Huang, Wei Shao, Mei-Ling Wang, Dao-Qiang Zhang
2021,  vol. 18,  no. 2, pp. 170-184,  doi: 10.1007/s11633-020-1263-y
Abstract PDF SpringerLink
Abstract:
One of the most significant challenges in the neuroscience community is to understand how the human brain works. Recent progress in neuroimaging techniques has validated that it is possible to decode a person's thoughts, memories, and emotions via functional magnetic resonance imaging (fMRI), since it can measure the neural activation of human brains with satisfactory spatiotemporal resolution. However, the unprecedented scale and complexity of the fMRI data have presented critical computational bottlenecks requiring new scientific analytic tools. Given the increasingly important role of machine learning in neuroscience, a great many machine learning algorithms have been presented to analyze brain activities from the fMRI data. In this paper, we provide a comprehensive and up-to-date review of machine learning methods for analyzing neural activities from the following three aspects, i.e., brain image functional alignment, brain activity pattern analysis, and visual stimuli reconstruction. In addition, online resources and open research problems on brain pattern analysis are also provided for the convenience of future research.
Humans and Robots: A Mutually Inclusive Relationship in a Contagious World
Akash Gupta, Anshuman Singh, Deepak Bharadwaj, Amit Kumar Mondal
2021,  vol. 18,  no. 2, pp. 185-203,  doi: 10.1007/s11633-020-1266-8
Abstract PDF SpringerLink
Abstract:
The coronavirus global pandemic has spread faster and more severely than experts had anticipated. While this has presented itself as a great challenge, researchers worldwide have shown ingenuity and dexterity in adapting technology and devising new strategies to combat this pandemic. However, implementing these strategies alone impedes the nature of everyone's daily life. Hence, an intersection between these strategies and the technological advantages of robotics, artificial intelligence, and autonomous systems is essential for near-to-normal operation. In this review paper, different applications of robotic systems and various aspects of modern technologies, including medical imaging, telemedicine, and supply chains, have been covered with respect to the COVID-19 pandemic. Furthermore, concerns over users' data privacy, job losses, and legal aspects of the implementation of robotics are also discussed.
Research Article
Structured Computational Modeling of Human Visual System for No-reference Image Quality Assessment
Wen-Han Zhu, Wei Sun, Xiong-Kuo Min, Guang-Tao Zhai, Xiao-Kang Yang
2021,  vol. 18,  no. 2, pp. 204-218,  doi: 10.1007/s11633-020-1270-z
Abstract PDF SpringerLink
Abstract:
Objective image quality assessment (IQA) plays an important role in various visual communication systems, which can automatically and efficiently predict the perceived quality of images. The human eye is the ultimate evaluator for visual experience, thus the modeling of human visual system (HVS) is a core issue for objective IQA and visual experience optimization. The traditional model based on black box fitting has low interpretability and it is difficult to guide the experience optimization effectively, while the model based on physiological simulation is hard to integrate into practical visual communication services due to its high computational complexity. For bridging the gap between signal distortion and visual experience, in this paper, we propose a novel perceptual no-reference (NR) IQA algorithm based on structural computational modeling of HVS. According to the mechanism of the human brain, we divide the visual signal processing into a low-level visual layer, a middle-level visual layer and a high-level visual layer, which conduct pixel information processing, primitive information processing and global image information processing, respectively. The natural scene statistics (NSS) based features, deep features and free-energy based features are extracted from these three layers. The support vector regression (SVR) is employed to aggregate features to the final quality prediction. Extensive experimental comparisons on three widely used benchmark IQA databases (LIVE, CSIQ and TID2013) demonstrate that our proposed metric is highly competitive with or outperforms the state-of-the-art NR IQA measures.
Prediction of Spatiotemporal Evolution of Urban Traffic Emissions Based on Taxi Trajectories
Zhen-Yi Zhao, Yang Cao, Yu Kang, Zhen-Yi Xu
2021,  vol. 18,  no. 2, pp. 219-232,  doi: 10.1007/s11633-020-1271-y
Abstract PDF SpringerLink
Abstract:
With the rapid increase in the number of vehicles in urban areas, the pollution from vehicle emissions is becoming more and more serious. Precise prediction of the spatiotemporal evolution of urban traffic emissions plays a great role in urban planning and policy making. Most existing methods focus on estimating vehicle emissions at historical or current moments, which cannot meet the demands of future planning well. Recent work has started to pay attention to the evolution of vehicle emissions at future moments using multiple attributes related to emissions; however, these methods are not effective and efficient enough in the combination and utilization of different inputs. To address this issue, we propose a joint framework to predict the future evolution of vehicle emissions based on the GPS trajectories of taxis with a multi-channel spatiotemporal network and the motor vehicle emission simulator (MOVES) model. Specifically, we first estimate the spatial distribution matrices with GPS trajectories through map-matching algorithms. These matrices can reflect the attributes related to the traffic status of road networks such as volume, speed and acceleration. Then, our multi-channel spatiotemporal network is used to efficiently combine three key attributes (volume, speed and acceleration) through the feature sharing mechanism and generate a precise prediction of them in the future period. Finally, we adopt a MOVES model to estimate vehicle emissions by integrating several traffic factors including the predicted traffic states, road networks and the statistical information of urban vehicles. We evaluate our model on the Xi'an taxi GPS trajectories dataset. Experiments show that our proposed network can effectively predict the temporal evolution of vehicle emissions.
Computational Decision Support System for ADHD Identification
Senuri De Silva, Sanuwani Dayarathna, Gangani Ariyarathne, Dulani Meedeniya, Sampath Jayarathna, Anne M. P. Michalek
2021,  vol. 18,  no. 2, pp. 233-255,  doi: 10.1007/s11633-020-1252-1
Abstract PDF SpringerLink
Abstract:
Attention deficit/hyperactivity disorder (ADHD) is a common disorder among children. ADHD often persists into adulthood, unless proper treatments are facilitated to engage self-regulatory systems. Thus, there is a need for effective and reliable mechanisms for the early identification of ADHD. This paper presents a decision support system for the ADHD identification process. The proposed system uses both functional magnetic resonance imaging (fMRI) data and eye movement data. The classification processes contain enhanced pipelines, and consist of pre-processing, feature extraction, and feature selection mechanisms. fMRI data are processed by extracting seed-based correlation features in the default mode network (DMN), and eye movement data are processed using aggregated features of fixations and saccades. For the classification using eye movement data, an ensemble model is obtained with 81% overall accuracy. For the fMRI classification, a convolutional neural network (CNN) is used with 82% accuracy for the ADHD identification. Both models are validated against overfitting.
Image Inpainting Based on Structural Tensor Edge Intensity Model
Jing Wang, Yan-Hong Zhou, Hai-Feng Sima, Zhan-Qiang Huo, Ai-Zhong Mi
2021,  vol. 18,  no. 2, pp. 256-265,  doi: 10.1007/s11633-020-1256-x
Abstract PDF SpringerLink
Abstract:
In the exemplar-based image inpainting approach, there are usually two major problems: the unreasonable calculation of priority and the consideration of only color features in the patch lookup strategy. In this paper, we propose an image inpainting approach based on the structural tensor edge intensity model. First, we use the progressive scanning inpainting method to avoid the image filling order being affected by the priority function. Then, we use the edge intensity model to build the patch similarity function for correctly identifying the local image structure. Finally, the balance operator is used to restrict the excessive propagation of structural information to ensure the correct structural reconstruction. The experimental results show that our approach is comparable and even superior to some state-of-the-art inpainting algorithms.
Camera-based Basketball Scoring Detection Using Convolutional Neural Network
Xu-Bo Fu, Shao-Long Yue, De-Yun Pan
2021,  vol. 18,  no. 2, pp. 266-276,  doi: 10.1007/s11633-020-1259-7
Abstract PDF SpringerLink
Abstract:
Recently, deep learning methods have been applied in many real scenarios with the development of convolutional neural networks (CNNs). In this paper, we introduce a camera-based basketball scoring detection (BSD) method with CNN based object detection and frame difference-based motion detection. In the proposed BSD method, the videos of the basketball court are taken as inputs. Afterwards, the real-time object detection, i.e., you only look once (YOLO) model, is implemented to locate the position of the basketball hoop. Then, the motion detection based on frame difference is utilized to detect whether there is any object motion in the area of the hoop to determine the basketball scoring condition. The proposed BSD method runs in real-time with satisfactory basketball scoring detection accuracy. Our experiments on the collected real scenario basketball court videos show the accuracy of the proposed BSD method. Furthermore, several intelligent basketball analysis systems based on the proposed method have been installed at multiple basketball courts in Beijing, and they provide good performance.
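The frame-difference step described in this abstract can be sketched in a few lines. This is an illustrative assumption, not the paper's implementation: frames are plain nested lists of grayscale values, the hoop box is taken as given (in the paper it comes from a YOLO detector), and the function name and both thresholds are hypothetical.

```python
def motion_in_region(prev_frame, curr_frame, box, pixel_thresh=25, ratio_thresh=0.1):
    """Report whether enough pixels changed inside `box` between two frames.

    prev_frame, curr_frame: 2D lists of grayscale intensities (0-255).
    box: (x0, y0, x1, y1) with x1 and y1 exclusive, e.g. the hoop area.
    pixel_thresh, ratio_thresh: assumed tuning values for illustration.
    """
    x0, y0, x1, y1 = box
    changed = 0
    total = 0
    for y in range(y0, y1):
        for x in range(x0, x1):
            total += 1
            # A pixel counts as "moving" if its intensity changed noticeably.
            if abs(curr_frame[y][x] - prev_frame[y][x]) > pixel_thresh:
                changed += 1
    return total > 0 and changed / total >= ratio_thresh
```

Restricting the difference to the hoop box keeps the per-frame cost proportional to the box area rather than the full image, which fits the real-time goal stated in the abstract.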
Suction-based Grasp Point Estimation in Cluttered Environment for Robotic Manipulator Using Deep Learning-based Affordance Map
Tri Wahyu Utomo, Adha Imam Cahyadi, Igi Ardiyanto
2021,  vol. 18,  no. 2, pp. 277-287,  doi: 10.1007/s11633-020-1260-1
Abstract PDF SpringerLink
Abstract:
Perception and manipulation tasks for robotic manipulators involving highly cluttered objects have become increasingly in demand for achieving more efficient problem solving in modern industrial environments. However, most of the available methods for performing such cluttered tasks fail in terms of performance, mainly due to an inability to adapt to changes in the environment and the handled objects. Here, we propose a new, near real-time approach to suction-based grasp point estimation in a highly cluttered environment by employing an affordance-based approach. Compared to the state-of-the-art, our proposed method offers two distinctive contributions. First, we use a modified deep neural network backbone for the input of the semantic segmentation, to classify pixel elements of the input red, green, blue and depth (RGBD) channel image, which is then used to produce an affordance map, a pixel-wise probability map representing the probability of a successful grasping action in those particular pixel regions. Second, we incorporate a high-speed semantic segmentation into the system, which gives our solution a lower computational time. This approach does not need any prior knowledge or models of the objects, since it removes the step of pose estimation and object recognition entirely, compared to most of the current approaches, and uses an assumption to grasp first then recognize later, which makes it possible to have an object-agnostic property. The system was designed to be used for household objects, but it can be easily extended to any kind of object provided that the right dataset is used for training the models. Experimental results show the benefit of our approach, which achieves a precision of 88.83%, compared to the 83.4% precision of the current state-of-the-art.
A Tracking Registration Method for Augmented Reality Based on Multi-modal Template Matching and Point Clouds
Peng-Xia Cao, Wen-Xin Li, Wei-Ping Ma
2021,  vol. 18,  no. 2, pp. 288-299,  doi: 10.1007/s11633-020-1265-9
Abstract PDF SpringerLink
Abstract:
In order to overcome the defects where the surface of the object lacks sufficient texture features and the algorithm cannot meet the real-time requirements of augmented reality, a markerless augmented reality tracking registration method based on multi-modal template matching and point clouds is proposed. The method first adapts the linear parallel multi-modal LineMod template matching method with scale invariance to identify the texture-less target and obtain the reference image as the key frame that is most similar to the current perspective. Then, we can obtain the initial pose of the camera and solve the problem of re-initialization because of tracking registration interruption. A point cloud-based method is used to calculate the precise pose of the camera in real time. In order to solve the problem that the traditional iterative closest point (ICP) algorithm cannot meet the real-time requirements of the system, Kd-tree (k-dimensional tree) is used under the graphics processing unit (GPU) to replace the part of finding the nearest points in the original ICP algorithm to improve the speed of tracking registration. At the same time, the random sample consensus (RANSAC) algorithm is used to remove the error point pairs to improve the accuracy of the algorithm. The results show that the proposed tracking registration method has good real-time performance and robustness.
Fire Detection Method Based on Depthwise Separable Convolution and YOLOv3
Yue-Yan Qin, Jiang-Tao Cao, Xiao-Fei Ji
2021,  vol. 18,  no. 2, pp. 300-310,  doi: 10.1007/s11633-020-1269-5
Abstract PDF SpringerLink
Abstract:
Recently, video-based fire detection technology has become an important research topic in the field of machine vision. This paper proposes a method of combining the classification model and target detection model in deep learning for fire detection. Firstly, the depthwise separable convolution is used to classify fire images, which saves a lot of detection time under the premise of ensuring detection accuracy. Secondly, the You Only Look Once version 3 (YOLOv3) target regression function is used to output the fire position information for the images whose classification result is fire, which avoids the problem that the accuracy of detection cannot be guaranteed by using YOLOv3 for target classification and position regression. At the same time, the detection time of target regression for images without fire is greatly reduced. The experiments were conducted using a public network database. The detection accuracy reached 98% and the detection rate reached 38 fps. This method not only saves the workload of manually extracting flame characteristics, reduces the calculation cost, and reduces the number of parameters, but also improves the detection accuracy and detection rate.
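The parameter saving attributed to depthwise separable convolution can be checked with a short calculation. The sketch below uses the standard weight counts for a k x k convolution (biases ignored); the layer sizes in the usage note are hypothetical and are not taken from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise
```

For an assumed 3 x 3 layer with 32 input and 64 output channels, the standard form has 18432 weights and the separable form has 2336, a ratio of 1/64 + 1/9, or roughly 0.127.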
Feature Selection and Feature Learning for High-dimensional Batch Reinforcement Learning: A Survey
De-Rong Liu, Hong-Liang Li, Ding Wang
2015,  vol. 12,  no. 3, pp. 229-242,  doi: 10.1007/s11633-015-0893-y
Abstract PDF SpringerLink
Second-order Sliding Mode Approaches for the Control of a Class of Underactuated Systems
Sonia Mahjoub, Faiçal Mnif, Nabil Derbel
2015,  vol. 12,  no. 2, pp. 134-141,  doi: 10.1007/s11633-015-0880-3
Abstract PDF SpringerLink
Genetic Algorithm with Variable Length Chromosomes for Network Intrusion Detection
Sunil Nilkanth Pawar, Rajankumar Sadashivrao Bichkar
2015,  vol. 12,  no. 3, pp. 337-342,  doi: 10.1007/s11633-014-0870-x
Abstract PDF SpringerLink
Grey Qualitative Modeling and Control Method for Subjective Uncertain Systems
Peng Wang, Shu-Jie Li, Yan Lv, Zong-Hai Chen
2015,  vol. 12,  no. 1, pp. 70-76,  doi: 10.1007/s11633-014-0820-7
Abstract PDF SpringerLink
Recent Progress in Networked Control Systems-A Survey
Yuan-Qing Xia, Yu-Long Gao, Li-Ping Yan, Meng-Yin Fu
2015,  vol. 12,  no. 4, pp. 343-367,  doi: 10.1007/s11633-015-0894-x
Abstract PDF SpringerLink
A Wavelet Neural Network Based Non-linear Model Predictive Controller for a Multi-variable Coupled Tank System
Kayode Owa, Sanjay Sharma, Robert Sutton
2015,  vol. 12,  no. 2, pp. 156-170,  doi: 10.1007/s11633-014-0825-2
Abstract PDF SpringerLink
Cooperative Formation Control of Autonomous Underwater Vehicles: An Overview
Bikramaditya Das, Bidyadhar Subudhi, Bibhuti Bhusan Pati
2016,  vol. 13,  no. 3, pp. 199-225,  doi: 10.1007/s11633-016-1004-4
Abstract PDF SpringerLink
An Unsupervised Feature Selection Algorithm with Feature Ranking for Maximizing Performance of the Classifiers
Danasingh Asir Antony Gnana Singh, Subramanian Appavu Alias Balamurugan, Epiphany Jebamalar Leavline
2015,  vol. 12,  no. 5, pp. 511-517,  doi: 10.1007/s11633-014-0859-5
Abstract PDF SpringerLink
Sliding Mode and PI Controllers for Uncertain Flexible Joint Manipulator
Lilia Zouari, Hafedh Abid, Mohamed Abid
2015,  vol. 12,  no. 2, pp. 117-124,  doi: 10.1007/s11633-015-0878-x
Abstract PDF SpringerLink
Bounded Real Lemmas for Fractional Order Systems
Shu Liang, Yi-Heng Wei, Jin-Wen Pan, Qing Gao, Yong Wang
2015,  vol. 12,  no. 2, pp. 192-198,  doi: 10.1007/s11633-014-0868-4
Abstract PDF SpringerLink
Robust Face Recognition via Low-rank Sparse Representation-based Classification
Hai-Shun Du, Qing-Pu Hu, Dian-Feng Qiao, Ioannis Pitas
2015,  vol. 12,  no. 6, pp. 579-587,  doi: 10.1007/s11633-015-0901-2
Abstract PDF SpringerLink
Distributed Control of Chemical Process Networks
Michael J. Tippett, Jie Bao
2015,  vol. 12,  no. 4, pp. 368-381,  doi: 10.1007/s11633-015-0895-9
Abstract PDF SpringerLink
Analysis of Fractional-order Linear Systems with Saturation Using Lyapunov's Second Method and Convex Optimization
Esmat Sadat Alaviyan Shahri, Saeed Balochian
2015,  vol. 12,  no. 4, pp. 440-447,  doi: 10.1007/s11633-014-0856-8
Abstract PDF SpringerLink
Appropriate Sub-band Selection in Wavelet Packet Decomposition for Automated Glaucoma Diagnoses
Chandrasekaran Raja, Narayanan Gangatharan
2015,  vol. 12,  no. 4, pp. 393-401,  doi: 10.1007/s11633-014-0858-6
Abstract PDF SpringerLink
Generalized Norm Optimal Iterative Learning Control with Intermediate Point and Sub-interval Tracking
David H. Owens, Chris T. Freeman, Bing Chu
2015,  vol. 12,  no. 3, pp. 243-253,  doi: 10.1007/s11633-015-0888-8
Abstract PDF SpringerLink
Flexible Strip Supercapacitors for Future Energy Storage
Rui-Rong Zhang, Yan-Meng Xu, David Harrison, John Fyson, Fu-Lian Qiu, Darren Southee
2015,  vol. 12,  no. 1, pp. 43-49,  doi: 10.1007/s11633-014-0866-6
Abstract PDF SpringerLink
Advances in Vehicular Ad-hoc Networks (VANETs): Challenges and Road-map for Future Development
Elias C. Eze, Si-Jing Zhang, En-Jie Liu, Joy C. Eze
2016,  vol. 13,  no. 1, pp. 1-18,  doi: 10.1007/s11633-015-0913-y
Abstract PDF SpringerLink
Finite-time Control for a Class of Networked Control Systems with Short Time-varying Delays and Sampling Jitter
Chang-Chun Hua, Shao-Chong Yu, Xin-Ping Guan
2015,  vol. 12,  no. 4, pp. 448-454,  doi: 10.1007/s11633-014-0849-7
Abstract PDF SpringerLink
Backstepping Control of Speed Sensorless Permanent Magnet Synchronous Motor Based on Slide Model Observer
Cai-Xue Chen, Yun-Xiang Xie, Yong-Hong Lan
2015,  vol. 12,  no. 2, pp. 149-155,  doi: 10.1007/s11633-015-0881-2
Abstract PDF SpringerLink
Extracting Parameters of OFET Before and After Threshold Voltage Using Genetic Algorithms
Imad Benacer, Zohir Dibi
2016,  vol. 13,  no. 4, pp. 382-391,  doi: 10.1007/s11633-015-0918-6
Abstract PDF SpringerLink
A High-order Internal Model Based Iterative Learning Control Scheme for Discrete Linear Time-varying Systems
Wei Zhou, Miao Yu, De-Qing Huang
2015,  vol. 12,  no. 3, pp. 330-336,  doi: 10.1007/s11633-015-0886-x
Abstract PDF SpringerLink
Fault Information Recognition for On-board Equipment of High-speed Railway Based on Multi-Neural Network Collaboration
Lu-Jie Zhou, Jian-Wu Dang, Zhen-Hai Zhang
Accepted Manuscript  doi: 10.1007/s11633-021-1298-8
Abstract PDF
Abstract:
It is of great significance to guarantee the efficient statistics of high-speed railway on-board equipment fault information, which also improves the efficiency of fault analysis. Considering this background, this paper presents an empirical exploration of named entity recognition (NER) of on-board equipment fault information. Based on the historical fault records of on-board equipment, a fault information recognition model based on multi-neural network collaboration is proposed. First, considering the characteristics of Chinese recorded data, a method of constructing semantic features and additional features based on character granularity is proposed. Then, the two feature representations are concatenated and passed into the gated convolutional layer to extract the dependencies from multiple different subspaces and adjacent characters in parallel. Next, the local features are transmitted to the bidirectional long short-term memory (BiLSTM) to learn long-term dependency information. On top of BiLSTM, the sequential conditional random field (CRF) is used to jointly decode the optimized tag sequence of the whole sentence. The model is tested and compared with other representative baseline models. The results show that the proposed model not only considers the language characteristics of on-board fault records, but also has obvious advantages in the performance of fault information recognition.
A Novel Heterogeneous Actor-Critic Algorithm with Recent Emphasizing Replay Memory
Bao Xi, Rui Wang, Ying-Hao Cai, Tao Lu, Shuo Wang
Accepted Manuscript  doi: 10.1007/s11633-021-1296-x
Abstract PDF
Abstract:
Reinforcement learning (RL) algorithms have been demonstrated to solve a variety of continuous control tasks. However, the training efficiency and performance of such methods limit further applications. In this paper, we propose an off-policy heterogeneous actor-critic (HAC) algorithm, which contains a soft Q-function and an ordinary Q-function. The soft Q-function encourages the exploration of a Gaussian policy, and the ordinary Q-function optimizes the mean of the Gaussian policy to improve the training efficiency. Experience replay memory is another vital component of off-policy RL methods. We propose a new sampling technique that emphasizes recently experienced transitions to boost the policy training. Besides, we integrate HAC with hindsight experience replay (HER) to deal with sparse reward tasks, which are common in the robotic manipulation domain. Finally, we evaluate our methods on a series of continuous control benchmark tasks and robotic manipulation tasks. The experimental results show that our method outperforms prior state-of-the-art methods in terms of training efficiency and performance, which validates the effectiveness of our method.
Robust Optimal Higher-order-observer-based Dynamic Sliding Mode Control for VTOL Unmanned Aerial Vehicles
Yashar Mousavi, Amin Zarei, Arash Mousavi, Mohsen Biari
Accepted Manuscript  doi: 10.1007/s11633-021-1282-3
Abstract PDF
Abstract:
This paper investigates the precise trajectory tracking of unmanned aerial vehicles (UAV) capable of vertical take-off and landing (VTOL) subjected to external disturbances. For this reason, a robust higher-order-observer-based dynamic sliding mode controller (HOB-DSMC) is developed and optimized using the fractional-order firefly algorithm (FOFA). In the proposed scheme, the sliding surface is defined as a function of output variables, and the higher-order observer is utilized to estimate the unmeasured variables, which effectively alleviate the undesirable effects of the chattering phenomenon. A neighboring point close to the sliding surface is considered, and as the tracking error approaches this point, the second control is activated to reduce the control input. The stability analysis of the closed-loop system is studied based on the Lyapunov stability theorem. For a better study of the proposed scheme, various trajectory tracking tests are provided, where accurate tracking and strong robustness can be simultaneously ensured. Comparative simulation results validate the proposed control strategy's effectiveness and its superiority over conventional sliding mode controller (SMC) and integral SMC approaches.
Current Issue

2021 Vol.18 No.2

Table of Contents

ISSN 1476-8186

E-ISSN 1751-8520

CN 11-5350/TP

Editors-in-chief
Tieniu TAN, Chinese Academy of Sciences
Guoping LIU, University of South Wales
Huosheng HU, University of Essex