International Journal of Automation and Computing, 2018, Vol. 15, No. 5: 582-592. DOI: 10.1007/s11633-018-1128-9
Special Issue on Intelligent Control and Computing in Advanced Robotics
Learning to Transform Service Instructions into Actions with Reinforcement Learning and Knowledge Base
Meng-Yang Zhang1,2, Guo-Hui Tian1,2, Ci-Ci Li1,2, Jing Gong1
1. School of Control Science and Engineering, Shandong University, Jinan 253000, China;
2. Shenzhen Research Institute, Shandong University, Shenzhen 518000, China
Abstract: To improve the learning ability of robots, we present a reinforcement learning approach with a knowledge base for mapping natural language instructions to executable action sequences. A simulated platform with a physics engine is built as the interactive environment. Based on the knowledge base, a reward function combining immediate and delayed rewards is designed to handle the sparse-reward problem. In addition, a list of object states is produced by querying the knowledge base and serves as the standard for judging the quality of action sequences. Experimental results demonstrate that our approach produces action sequences with good accuracy.
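To make the reward design described above concrete, the following is a minimal, hypothetical Python sketch of how a reward combining immediate and delayed components could be scored against a knowledge-base-derived list of target object states. All names here (`immediate_reward`, `delayed_reward`, `target`, and the state dictionaries) are illustrative assumptions, not the authors' actual implementation.

```python
# Minimal sketch (not the authors' code): reward shaping against a list of
# target object states retrieved from a knowledge base. An immediate reward
# scores each action's effect on object states; a delayed reward at episode
# end scores the whole action sequence. All names here are hypothetical.
from typing import Dict

def immediate_reward(prev: Dict[str, str],
                     curr: Dict[str, str],
                     target: Dict[str, str]) -> float:
    """Reward state changes that move objects toward their KB-derived targets."""
    r = 0.0
    for obj, goal in target.items():
        if prev.get(obj) != goal and curr.get(obj) == goal:
            r += 1.0      # object reached its target state
        elif prev.get(obj) == goal and curr.get(obj) != goal:
            r -= 1.0      # an earlier gain was undone
    return r

def delayed_reward(final: Dict[str, str], target: Dict[str, str]) -> float:
    """Episode-end bonus: fraction of target object states actually achieved."""
    matched = sum(1 for obj, goal in target.items() if final.get(obj) == goal)
    return 10.0 * matched / len(target)

# Example: an instruction like "heat the water" might map to these KB targets.
target = {"kettle": "on", "water": "hot"}
print(immediate_reward({"kettle": "off", "water": "cold"},
                       {"kettle": "on", "water": "cold"}, target))  # 1.0
print(delayed_reward({"kettle": "on", "water": "hot"}, target))     # 10.0
```

In a scheme like this, the dense immediate term counters the sparse-reward problem the abstract mentions, while the delayed term still judges the complete action sequence against the knowledge-base standard.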
Keywords: natural language, robot, knowledge base, reinforcement learning, object state
Received: 2018-01-19; Published: 2018-03-20
Funding: This work was supported by the National Natural Science Foundation of China (No. 61773239) and the Shenzhen Future Industry Special Fund (No. JCYJ20160331174814755).
Corresponding author: Guo-Hui Tian. E-mail: g.h.tian@sdu.edu.cn
About the authors:
Meng-Yang Zhang: research interests include intelligent space technology, service robots, reinforcement learning, and ontology-based knowledge construction. E-mail: zhangmengyang007@163.com; ORCID: 0000-0003-4267-1761.
Guo-Hui Tian (corresponding author): research interests include service robots, intelligent space, cloud robotics, and brain-inspired intelligent robotics. E-mail: g.h.tian@sdu.edu.cn; ORCID: 0000-0001-8332-3064.
Ci-Ci Li: research interests include home service robots and object cognition. E-mail: 201413043@mail.sdu.edu.cn.
Jing Gong: research interests include home service robots, natural language processing, and cloud robot systems. E-mail: gongjing689@gmail.com.
Cite this article:
Meng-Yang Zhang, Guo-Hui Tian, Ci-Ci Li, Jing Gong. Learning to Transform Service Instructions into Actions with Reinforcement Learning and Knowledge Base[J]. International Journal of Automation and Computing, vol. 15, no. 5, pp. 582-592, 2018.
URL: http://www.ijac.net/EN/10.1007/s11633-018-1128-9 or http://www.ijac.net/EN/Y2018/V15/I5/582
Related articles:
[1] Xia-Li Li, Li-Cheng Wu, Tian-Yi Lan. A 3D-printed Robot Hand with Three Linkage-driven Underactuated Fingers[J]. International Journal of Automation and Computing, 2018, 15(5): 593-602.
[2] Tian-Miao Wang, Yong Tao, Hui Liu. Current Researches and Future Development Trend of Intelligent Robot: A Review[J]. International Journal of Automation and Computing, 2018, 15(5): 525-546.
[3] Huan-Zhao Chen, Guo-Hui Tian, Guo-Liang Liu. A Selective Attention Guided Initiative Semantic Cognition Algorithm for Service Robot[J]. International Journal of Automation and Computing, 2018, 15(5): 559-569.
[4] Tobias Tiemerding, Sergej Fatikow. Software for Small-scale Robotics: A Review[J]. International Journal of Automation and Computing, 2018, 15(5): 515-524.
[5] Tarek Ababsa, Noureddine Djedl, Yves Duthen. Genetic Programming-based Self-reconfiguration Planning for Metamorphic Robot[J]. International Journal of Automation and Computing, 2018, 15(4): 431-442.
[6] Basant Kumar Sahu, Bidyadhar Subudhi, Madan Mohan Gupta. Stability Analysis of an Underactuated Autonomous Underwater Vehicle Using Extended-Routh's Stability Method[J]. International Journal of Automation and Computing, 2018, 15(3): 299-309.
[7] Zhi-Jing Li, Hai-Bin Wu, Jian-Ming Yang, Ming-Hao Wang, Jin-Hua Ye. A Position and Torque Switching Control Method for Robot Collision Safety[J]. International Journal of Automation and Computing, 2018, 15(2): 156-168.
[8] Fusaomi Nagata, Keigo Watanabe, Maki K. Habib. Machining Robot with Vibrational Motion and 3D Printer-like Data Interface[J]. International Journal of Automation and Computing, 2018, 15(1): 1-12.
[9] Sun-Chun Zhou, Rui Yan, Jia-Xin Li, Ying-Ke Chen, Huajin Tang. A Brain-inspired SLAM System Based on ORB Features[J]. International Journal of Automation and Computing, 2017, 14(5): 564-575.
[10] Chao Ma, Hong Qiao, Rui Li, Xiao-Qing Li. Flexible Robotic Grasping Strategy with Constrained Region in Environment[J]. International Journal of Automation and Computing, 2017, 14(5): 552-563.
[11] Saad Kashem, Hutomo Sufyan. A Novel Design of an Aquatic Walking Robot Having Webbed Feet[J]. International Journal of Automation and Computing, 2017, 14(5): 576-588.
[12] Mahmood Mazare, Mostafa Taghizadeh, M. Rasool Najafi. Kinematic Analysis and Design of a 3-DOF Translational Parallel Robot[J]. International Journal of Automation and Computing, 2017, 14(4): 432-441.
[13] A. Mallikarjuna Rao, K. Ramji, B. S. K. Sundara Siva Rao, V. Vasu, C. Puneeth. Navigation of Non-holonomic Mobile Robot Using Neuro-fuzzy Logic with Integrated Safe Boundary Algorithm[J]. International Journal of Automation and Computing, 2017, 14(3): 285-294.
[14] Ming-He Jin, Cheng Zhou, Ye-Chao Liu, Zi-Qi Liu, Hong Liu. Reaction Torque Control of Redundant Free-floating Space Robot[J]. International Journal of Automation and Computing, 2017, 14(3): 295-306.
[15] Hafizul Azizi Ismail, Michael S. Packianather, Roger I. Grosvenor. Multi-objective Invasive Weed Optimization of the LQR Controller[J]. International Journal of Automation and Computing, 2017, 14(3): 321-339.