Citation: Abhijit Guha, Debabrata Samanta. Hybrid Approach to Document Anomaly Detection: An Application to Facilitate RPA in Title Insurance. International Journal of Automation and Computing, vol. 18, no. 1, pp. 55–72, 2021. https://doi.org/10.1007/s11633-020-1247-y
Anomaly detection (AD) is important in many domains, and title insurance (TI) is no exception. Robotic process automation (RPA) is taking over manual tasks in TI business processes, but it has limitations without the support of artificial intelligence (AI) and machine learning (ML). As data dimensionality increases, and in composite population scenarios, detecting anomalies becomes more complex, and AD in automated document management systems (ADMS) remains the least explored domain. Deep learning, the fastest-maturing technology, can be combined with traditional anomaly detectors to facilitate and improve RPA in TI. We present a hybrid model for AD that uses autoencoders (AE) and a one-class support vector machine (OSVM). In the present study, the OSVM receives input features that represent real-time documents from the TI business, orchestrated and reduced in dimension by the AE. The results obtained from multiple experiments are comparable with those of traditional methods and fall within a business-acceptable range in terms of accuracy and performance.
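The hybrid arrangement described in the abstract, an autoencoder that compresses the features followed by a one-class SVM that scores the compressed vectors, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn and NumPy, uses a single-hidden-layer `MLPRegressor` trained to reproduce its input as a stand-in autoencoder, and runs on synthetic Gaussian data in place of real TI document features. The feature count (32), bottleneck size (4), and `nu=0.05` are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-in for document feature vectors (assumption, not TI data):
# 200 "normal" training documents with 32 features each.
X_train = rng.normal(size=(200, 32))
# New documents: 10 drawn like the training data, 10 shifted far away.
X_new = np.vstack([rng.normal(size=(10, 32)),
                   rng.normal(loc=6.0, size=(10, 32))])

scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)

# Stand-in autoencoder: one hidden layer of 4 units, trained to
# reconstruct its own (scaled) input.
autoencoder = MLPRegressor(hidden_layer_sizes=(4,), activation="tanh",
                           max_iter=500, random_state=0)
autoencoder.fit(X_train_s, X_train_s)


def encode(X_scaled):
    """Return the bottleneck activations for already-scaled inputs."""
    hidden = X_scaled @ autoencoder.coefs_[0] + autoencoder.intercepts_[0]
    return np.tanh(hidden)


# The one-class SVM is fitted only on the reduced features of normal data.
Z_train = encode(X_train_s)
detector = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(Z_train)

# predict() returns +1 for inliers and -1 for suspected anomalies.
labels = detector.predict(encode(scaler.transform(X_new)))
print(labels)
```

Fitting the detector on the bottleneck activations rather than on the raw features is the design choice the abstract points to: the one-class SVM works in a lower-dimensional space learned from normal documents only.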