International Journal of Automation and Computing, 2018, Vol. 15, Issue 2: 181-193. DOI: 10.1007/s11633-018-1120-4
Special Issue on Automation and Computing Advancements for Future Industries
A Fuzzy Neural Network Based Dynamic Data Allocation Model on Heterogeneous Multi-GPUs for Large-scale Computations
Chao-Long Zhang1,3, Yuan-Ping Xu1, Zhi-Jie Xu2,3, Jia He2, Jing Wang4, Jian-Hua Adu1
1 School of Software Engineering, Chengdu University of Information Technology, Chengdu 610225, China;
2 School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China;
3 School of Computing & Engineering, University of Huddersfield, Queensgate, Huddersfield, HD1 3DH, UK;
4 Department of Computing, Sheffield Hallam University, Sheffield, S1 2NT, UK
Abstract: The parallel computation capabilities of modern graphics processing units (GPUs) have attracted increasing attention from researchers and engineers conducting studies that demand high computational throughput. However, current single-GPU engineering solutions often struggle to meet real-time requirements. The multi-GPU approach has therefore become a popular and cost-effective way to meet these demands. In such systems, computational load balancing across multiple GPU "nodes" is often the key bottleneck affecting the quality and performance of the real-time system. Existing load balancing approaches mainly assume that all GPU nodes in the same computer framework have equal computational performance, which is often not the case because of cluster design and other legacy issues. This paper presents a novel dynamic load balancing (DLB) model for rapid data division and allocation on heterogeneous GPU nodes, based on an innovative fuzzy neural network (FNN). A 5-state parameter feedback mechanism that defines the overall cluster and node performance is proposed. The corresponding FNN-based DLB model can monitor and predict individual node performance under different workload scenarios. A real-time adaptive scheduler has been devised to reorganize the data inputs to each node when necessary to maintain runtime computational performance. The model has been implemented on two-dimensional (2D) discrete wavelet transform (DWT) applications for evaluation. Experimental results show that the DLB model enables high computational throughput while meeting the real-time and precision requirements of complex computational tasks.
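To make the idea of data division on heterogeneous nodes concrete, the following is a minimal sketch of proportional allocation, not the FNN-based model described in the abstract. It assumes each node has a predicted throughput (rows per second) and splits a workload in proportion to it; a simple exponential moving average stands in for the paper's FNN predictor, which is not reproduced here. The names `allocate_rows`, `update_estimate`, and the smoothing factor `alpha` are hypothetical and chosen only for illustration.

```python
from typing import List


def allocate_rows(total_rows: int, predicted_throughput: List[float]) -> List[int]:
    """Split total_rows across nodes in proportion to predicted throughput.

    Hypothetical helper: a plain proportional split, not the paper's FNN model.
    """
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if not predicted_throughput or any(t <= 0 for t in predicted_throughput):
        raise ValueError("every node needs a positive throughput estimate")

    total = sum(predicted_throughput)
    exact = [total_rows * t / total for t in predicted_throughput]
    shares = [int(x) for x in exact]  # round each share down first

    # Hand the leftover rows to the nodes with the largest fractional parts,
    # so the shares always add up to total_rows.
    leftover = total_rows - sum(shares)
    order = sorted(range(len(exact)), key=lambda i: exact[i] - shares[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def update_estimate(previous: float, rows_done: int, seconds: float,
                    alpha: float = 0.5) -> float:
    """Blend the previous estimate with the newly measured throughput.

    Assumption: exponential smoothing is used here only as a stand-in
    for the feedback-driven predictor mentioned in the abstract.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    measured = rows_done / seconds
    return (1.0 - alpha) * previous + alpha * measured


if __name__ == "__main__":
    estimates = [300.0, 100.0]           # rows/s for a fast and a slow node
    print(allocate_rows(1000, estimates))
    # After one round the slow node turned out faster than predicted.
    estimates[1] = update_estimate(estimates[1], rows_done=250, seconds=1.0)
    print(allocate_rows(1000, estimates))
```

Under these assumptions, a scheduler would call `update_estimate` after each batch and `allocate_rows` before the next one, so that a node whose measured performance drops receives a smaller share of the following batch.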
Keywords: Heterogeneous GPU cluster, dynamic load balancing, fuzzy neural network, adaptive scheduler, discrete wavelet transform
Received: 2017-10-09; Revised: 2018-02-09; Published: 2018-02-09

This work was supported by the National Natural Science Foundation of China (No. 61203172), the SSTP of Sichuan (Nos. 2018YYJC0994 and 2017JY0011), and the Shenzhen STPP (No. GJHZ20160301164521358).

Corresponding Author: Yuan-Ping Xu. E-mail: ypxu@cuit.edu.cn
About the authors:
Chao-Long Zhang received the B. Eng. and M. Sc. degrees in software engineering from Chengdu University of Information Technology, China in 2014 and 2017, respectively. E-mail: chaolong.zhang@hud.ac.uk
Yuan-Ping Xu received the B. Eng. degree in computer science and technology from Southwest Jiaotong University. E-mail: ypxu@cuit.edu.cn
Zhi-Jie Xu received the B. Eng. degree in communication engineering from the Xi'an University of Science and Technology, China in 1991. E-mail: z.xu@hud.ac.uk
Jia He received the B. Eng. and M. Sc. degrees in computer science and technology from Southwest Normal University, China. E-mail: hejia@cuit.edu.cn
Jing Wang received the Ph. D. degree from the University of Huddersfield, UK in 2012. E-mail: jing.wang@shu.ac.uk
Jian-Hua Adu received the B. Sc. degree in applied physics from Minzu University of China. E-mail: adujh@126.com
Cite this article:   
Chao-Long Zhang, Yuan-Ping Xu, Zhi-Jie Xu, Jia He, Jing Wang, Jian-Hua Adu. A Fuzzy Neural Network Based Dynamic Data Allocation Model on Heterogeneous Multi-GPUs for Large-scale Computations[J]. International Journal of Automation and Computing, vol. 15, no. 2, pp. 181-193, 2018.
http://www.ijac.net/EN/10.1007/s11633-018-1120-4 or http://www.ijac.net/EN/Y2018/V15/I2/181
Copyright 2010 by International Journal of Automation and Computing