Volume 14 Number 6
December 2017
Merras Mostafa, El Hazzat Soulaiman, Saaidi Abderrahim, Satori Khalid and Gadhi Nazih Abderrazak. 3D Face Reconstruction Using Images from Cameras with Varying Parameters. International Journal of Automation and Computing, vol. 14, no. 6, pp. 661-671, 2017. doi: 10.1007/s11633-016-0999-x

3D Face Reconstruction Using Images from Cameras with Varying Parameters

Author Biography:
  • Soulaiman El Hazzat received the B.Sc. and M.Sc. degrees from SMBA-Fez University in 2003 and 2012, respectively. He is currently a Ph.D. candidate in the LIIAN Laboratory at SMBA-Fez University.
        His research interests include camera self-calibration and 3D reconstruction.
        E-mail: soulaiman.elhazzat@yahoo.fr

    Abderrahim Saaidi received the Ph.D. degree from SMBA-Fez University in 2010. He is currently a professor of computer science at SMBA-Taza University. He is a member of the LIIAN and LSI Laboratories.
        His research interests include camera self-calibration, 3D reconstruction, genetic algorithms, swarm intelligence, 3D modeling, face detection and real-time rendering.
        E-mail: abderrahim.saaidi@usmba.ac.ma

    Khalid Satori received the Ph.D. degree from the National Institute of Applied Sciences (INSA) of Lyon in 1993. He is currently a professor of computer science at SMBA-Fez University. He is the director of the LIIAN Laboratory.
        His research interests include real-time rendering, image-based rendering, virtual reality, biomedical signals, camera self-calibration, genetic algorithms, 3D reconstruction and 3D modeling.
        E-mail: khalidsaorim3i@yahoo.fr

    Abderrazak Gadhi Nazih received the Ph.D. degree from Cadi Ayyad University in 2002. He is currently a professor of applied mathematics at SMBA-Fez University. He is a member of the LIIAN Laboratory.
        His research interests include variational analysis, unconstrained optimization, constrained optimization and operational research.
        E-mail: ngadhi@hotmail.com

  • Corresponding author: Mostafa Merras received the B.Sc. and M.Sc. degrees from SMBA-Fez University in 2006 and 2009, respectively. He is currently a Ph.D. candidate in the LIIAN Laboratory at SMBA-Fez University.
        His research interests include camera calibration and self-calibration, optimization, genetic algorithms, swarm intelligence, 3D reconstruction, 3D modeling and real-time rendering.
        E-mail: merras.mostafa@gmail.com (Corresponding author)
        ORCID iD: 0000-0002-3020-726X
  • Received: 2014-10-29
  • Accepted: 2015-07-24
  • Published Online: 2017-07-25
  • [1] A. N. Ansari, M. Abdel-Mottaleb. 3D face modeling using two orthogonal views and a generic face model. In Proceedings of International Conference on Multimedia and Expo, IEEE, Baltimore, USA, vol. 3, pp. 289-292, 2003.
    [2] B. Achermann, H. Bunke. Classifying range images of human faces with Hausdorff distance. In Proceedings of the 15th International Conference on Pattern Recognition, IEEE, Barcelona, Spain, vol. 2, pp. 809-813, 2000.
    [3] J. J. Atick, P. A. Griffin, A. N. Redlich. Statistical approach to shape from shading: Reconstruction of 3D face surfaces from single 2D images. Neural Computation, vol. 8, no. 6, pp. 1321-1340, 1996.  doi: 10.1162/neco.1996.8.6.1321
    [4] F. Pighin, J. Hecker, D. Lischinski, R. Szeliski, D. H. Salesin. Synthesizing realistic facial expressions from photographs. In Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques, ACM, New York, USA, pp. 75-84, 1998.
    [5] R. Lengagne, P. Fua, O. Monga. 3D face modeling from stereo and differential constraints. In Proceedings of the 3rd IEEE International Conference on Automatic Face and Gesture Recognition, IEEE, Nara, Japan, pp. 148-153, 1998.
    [6] M. Lin, B. Li, Q. H. Liu. Identification of eye movements from non-frontal face images for eye-controlled systems. International Journal of Automation and Computing, vol. 11, no. 5, pp. 543-554, 2014.  doi: 10.1007/s11633-014-0827-0
    [7] L. Wang, R. F. Li, K. Wang, J. Chen. Feature representation for facial expression recognition based on FACS and LBP. International Journal of Automation and Computing, vol. 11, no. 5, pp. 459-468, 2014.  doi: 10.1007/s11633-014-0835-0
    [8] V. Chouvatut, S. Madarasmi, M. Tuceryan. Face reconstruction and camera pose using multi-dimensional descent. World Academy of Science, Engineering and Technology, vol. 60, pp. 730-735, 2009.
    [9] R. H. Liang, Z. G. Pan, C. Chen. New algorithm for 3D facial model reconstruction and its application in virtual reality. Journal of Computer Science and Technology, vol. 19, no. 4, pp. 501-509, 2004.  doi: 10.1007/BF02944751
    [10] W. C. Cheng, C. F. Chang. 3D face reconstruction using the stereo camera and neural network regression. In Proceedings of International Conference on Artificial Intelligence and Applications, Taichung, Taiwan, 2009.
    [11] D. Scharstein, R. Szeliski. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. International Journal of Computer Vision, vol. 47, no. 1-3, pp. 7-42, 2002.
    [12] N. Barbalios, N. Nikolaidis, I. Pitas. 3D human face modeling from uncalibrated images using spline based deformation. In Proceedings of Computer Vision Theory and Applications-VISAPP, pp. 455-459, 2008.
    [13] R. Hassanpour, V. Atalay. Delaunay triangulation based 3D human face modeling from uncalibrated images. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshop, IEEE, Washington DC, USA, pp. 75, 2004.
    [14] S. Y. Ho, H. L. Huang. Facial modeling from an uncalibrated face image using a coarse-to-fine genetic algorithm. Pattern Recognition, vol. 34, pp. 1015-1031, 2001.  doi: 10.1016/S0031-3203(00)00044-3
    [15] N. Amenta. The crust algorithm for 3D surface reconstruction. In Proceedings of the 15th Annual Symposium on Computational Geometry, ACM, New York, USA, pp. 423-424, 1999.
    [16] A. A-Nasser, M. H. Mahoor, M. Abdel-Mottaleb. 3D face mesh modeling for 3D face recognition. State of the Art in Face Recognition, M. I. Chacon, Ed., p. 250, 2009.
    [17] A. Saaidi, H. Tairi, K. Satori. Fast stereo matching using rectification and correlation techniques. In Proceedings of the 2nd International Symposium on Communications, Control and Signal Processing, Marrakech, Morocco, 2006.
    [18] J. H. Holland. Adaptation in Natural and Artificial Systems, Cambridge, USA: MIT Press, 1992.
    [19] M. Merras, N. El Akkad, A. Saaidi, A. G. Nazih, K. Satori. Camera self calibration with varying parameters by an unknown three dimensional scene using the improved genetic algorithm. 3D Research, vol. 6, no. 1, pp. 1-14, 2015.
    [20] C. Harris, M. Stephens. A combined corner and edge detector. In Proceedings of the 4th Alvey Vision Conference, pp. 147-151, 1988.
    [21] C. Schmid, R. Mohr, C. Bauckhage. Comparing and evaluating interest points. In Proceedings of the 6th International Conference on Computer Vision, IEEE, Bombay, India, pp. 230-235, 1998.
    [22] N. Amenta, M. Bern. Surface reconstruction by Voronoi filtering. Discrete and Computational Geometry, vol. 22, no. 4, pp. 481-504, 1999.  doi: 10.1007/PL00009475
    [23] N. Amenta, S. Choi, R. K. Kolluri. The power crust, unions of balls, and the medial axis transform. Computational Geometry, vol. 19, no. 2-3, pp. 127-153, 2001.  doi: 10.1016/S0925-7721(01)00017-7
    [24] X. J. Zhou, Z. X. Zhao. The skin deformation of a 3D virtual human. International Journal of Automation and Computing, vol. 6, no. 4, pp. 344-350, 2009.  doi: 10.1007/s11633-009-0344-8
    [25] E. Amstutz, T. Teshima, M. Kimura, M. Mochimaru, H. Saito. PCA-based 3D shape reconstruction of human foot using multiple viewpoint cameras. International Journal of Automation and Computing, vol. 5, no. 3, pp. 217-225, 2008.  doi: 10.1007/s11633-008-0217-6
    [26] N. Amenta, S. Choi, R. K. Kolluri. The power crust. In Proceedings of the 6th ACM Symposium on Solid Modeling and Applications, ACM, New York, USA, pp. 249-266, 2001.
    [27] Y. Yao, S. Sukumar, B. Abidi, D. Page, A. Koschan, M. Abidi. Automated scene-specific selection of feature detectors for 3D face reconstruction. In Proceedings of the 3rd International Symposium, Lecture Notes in Computer Science, Springer, Lake Tahoe, USA, vol. 4841, pp. 476-487, 2007.
    [28] S. J. Lee, K. R. Park, J. Kim. A SfM-based 3D face reconstruction method robust to self-occlusion by using a shape conversion matrix. Pattern Recognition, vol. 40, no. 7, pp. 1470-1486, 2011.



Abstract: In this paper, we present a new technique for 3D face reconstruction from a sequence of images taken with cameras having varying parameters, without the need for a calibration grid. The method is based on estimating the camera projection matrices from a symmetry property that characterizes the face. These projection matrices are combined with point matches in each pair of images to determine the 3D point cloud; subsequently, a 3D mesh of the face is constructed with the 3D Crust algorithm. Lastly, a 2D image is projected onto the 3D model to generate the texture mapping. The strong point of the proposed approach is that it minimizes the constraints of the calibration system: we calibrate the cameras from the symmetry property of the face, which lets us know some 3D points of the face in a well-chosen global reference frame and formulate a system of linear and nonlinear equations relating these 3D points, their projections in the image plane, and the elements of the projection matrices. To solve these equations, we use a genetic algorithm, which finds the global optimum without requiring an initial estimate and avoids the local minima of the formulated cost function. Our study is conducted on real data to demonstrate the validity and performance of the proposed approach in terms of robustness, simplicity, stability and convergence.

  • Faces play an important role in all types of communication, especially man-machine communication. Nonetheless, human beings still interact with computers mainly through the mouse and keyboard. More advanced technologies, such as touch screens, are emerging: they offer a more intuitive way to communicate with a machine, especially for people not accustomed to computers. We thus find such screens everywhere: in bank ATMs, ticket-selling stations and even at home in graphic tablets. Even though this technology has proven to be remarkably reliable, it does not allow the computer to respond in a similar manner. One of the most natural ways to interact with machines would therefore be to adapt the techniques of human communication based on facial expressions.

    During the last ten years, researchers in computer vision have made significant progress on modeling and understanding faces. Current technologies make it possible to obtain a 3D face model, which can then be analyzed to read and express emotion. Man-machine interaction is not the sole area where the face matters: image processing and artificial intelligence methods can also be used for face identification and recognition [1-7] in security and surveillance.

    In the existing literature, recognition is often achieved using 2D face images, and its success depends on the viewpoint and illumination conditions. Face recognition in 3D is more efficient and accurate, because the 3D face geometry depends neither on the displacement of the cameras nor on the acquisition conditions. At the same time, reconstructing a human face is more difficult than reconstructing most other geometric shapes. It is within this context that our approach operates: it reconstructs a 3D face from a sequence of images taken by cameras that the method itself calibrates.

    In related research, there are two families of approaches to 3D face reconstruction. The first reconstructs the 3D face from a sequence of calibrated images [5, 8-11]; these methods use a calibration grid to estimate the camera parameters. Their limitation stems from the necessity to always accompany the 3D face with the grid, which makes them more sensitive to noise and calibration errors. The second family uses deformable face models [12-14]; the major problem with these methods is that they adapt generic models to the 3D face, which degrades the quality of the reconstruction.

    To rectify these problems, we propose in this paper a new approach to the reconstruction and meshing of the face that needs neither a calibration grid nor a deformable model. We exploit the features of the face, namely the eyes and its symmetry, to formulate a cost function from a few 3D points known in a well-defined landmark. This cost function is minimized by a genetic algorithm (GA) to obtain the optimal elements of the projection matrices. Then, the 3D point cloud reconstruction is performed using the projection matrices and the coordinates of the matched points. We apply the Delaunay triangulation [8] and the 3D Crust algorithm [15] to refine the reconstruction and mesh the face model; finally, a 2D image is projected onto the 3D model to generate the texture mapping. Fig. 1 shows our system for the reconstruction and modeling of the 3D face.

    The remainder of this paper is organized as follows. In Section 2, we review previous work. Section 3 describes our method of 3D face reconstruction. We present experimental results and analysis in Section 4. The conclusion is given in Section 5.

  • Several works address face reconstruction techniques. The authors of [12] propose a face reconstruction technique from un-calibrated images taken by cameras with arbitrary positions and orientations relative to the human face. Two steps are necessary in this approach. First, a robust structure-from-motion algorithm is applied to manually selected image features to estimate their 3D coordinates. These 3D coordinates are then used to deform a generic face model through spline smoothing, so as to adapt it to the characteristics of the human face.

    In [13], the authors propose an algorithm for generating a 3D face model from un-calibrated images by Delaunay triangulation. In this approach, the input images are taken by cameras with a small rotation around a single axis, which may cause degenerate solutions during self-calibration. This degeneracy is described a priori in the camera model; afterwards, a face model is built from the 3D coordinates obtained from the scene reconstruction by Delaunay triangulation.

    In [16], the authors propose a 3D mesh technique for face recognition. This method comprises two stages. The first models the face by three characteristic points; these points are used to align a generic model of the 3D face. Thereafter, each triangle of the aligned mesh model is treated as a planar surface patch whose corresponding 3D coordinates are adapted (deformed) using least-squares fitting. Via subdivision of the triangle vertices, a higher-resolution model is generated from the coordinates of the aligned and adjusted model. Finally, the model and its triangular surfaces are adjusted again to form a smoother mesh model that resembles and captures the surface features of the face. In [14], the authors propose an optimization approach using genetic algorithms for face modeling from un-calibrated images, using flexible generic parameters of the face model. Several other articles discuss the reconstruction of the face from calibrated images [9-11]; their authors use a grid of known points to estimate the camera parameters and reconstruct the 3D face.

  • In the area of 3D face reconstruction by stereo vision, many methods have been proposed [9-11, 13]. The main problem with these approaches is the constraints they impose on the cameras used: namely, the nature of the translation and rotation imposed on the cameras in the case of face reconstruction from un-calibrated images, or the use of a grid to estimate the projection matrices in the case of face reconstruction from calibrated images. These constraints burden the facial reconstruction and influence its accuracy and quality. In this paper, we propose an algorithm for 3D facial reconstruction that requires no calibration grid and works with any type of camera. Our approach incorporates the facial features, together with the symmetry and topological properties of the face, into the calibration process. The main steps of our algorithm are listed below; a high-level sketch of the pipeline follows the list.

    Figure 1.  System overview of 3D face reconstruction from a sequence of images captured with cameras having varying parameters

    Step 1. Acquisition of a sequence of images by cameras with varying parameters.

    Step 2. Estimation of the parameters of the cameras used, from the characteristics of the face (six 3D points are sufficient), using a linear and a nonlinear solution based on a genetic approach.

    Step 3. Dense matching in each pair of images.

    Step 4. 3D point cloud reconstruction using the matched points and the estimated projection matrices.

    Step 5. Facial mesh using the Delaunay triangulation and 3D Crust.

    Step 6. Texture mapping to generate the 3D model.
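    The pipeline can be summarized in a few lines of driver code. The sketch below is purely illustrative: every function name is a hypothetical placeholder for the corresponding step, not an identifier from the paper's implementation.

```python
# Hypothetical end-to-end driver mirroring Steps 1-6.
def reconstruct_face(images, reference_points_3d):
    # Step 2: one projection matrix per image, estimated from known face points.
    projections = [estimate_projection(img, reference_points_3d) for img in images]
    # Step 3: dense matching in each pair of consecutive images.
    matches = [dense_match(a, b) for a, b in zip(images, images[1:])]
    # Step 4: triangulate every match into a 3D point.
    cloud = triangulate_all(matches, projections)
    # Step 5: mesh the cloud (Delaunay triangulation + 3D Crust).
    mesh = crust_mesh(cloud)
    # Step 6: project a 2D image onto the mesh to obtain the texture.
    return texture_map(mesh, images[0], projections[0])
```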

  • The face is a 3D object characterized by a geometric symmetry property that simplifies the camera calibration procedure, removing the need for a grid, which affects the computation time and the estimation quality. In this article, the coordinates of some 3D points can be known in a particular, well-chosen Euclidean reference frame defined on the basis of this symmetry property. We subsequently denote this landmark by $(O, X, Y, Z)$, where $O$ is the center of the segment connecting the eyes (at the flat part of the nose root), and the axes $[OX)$, $[OY)$ and $[OZ)$ coincide respectively with the segments $[OA]$, $[OB]$ and $[OC]$, where the points $A$, $B$ and $C$ are defined in Fig. 2. The coordinates of the points plotted on the 3D face model (see Fig. 2) are as follows:

    Figure 2.  Setting up of a face and the coordinate system of the scene points: (a) Definition of the scene landmark and specification of the 3D points; (b) The 3D points shown on the 3D face model

    \begin{align*} & B(0, 0, 1), \quad A(1, 0, 0), \quad D(2, 0, 0), \quad E(3, 0, 0), \\ & F(-1, 0, 0), \quad G(-2, 0, 0), \quad H(-3, 0, 0), \\ & C(0, 1, 1), \quad I(0, 2, 1), \quad J(0, 3, 1). \end{align*}

    Six points are sufficient to calibrate the cameras used. With the symmetry property that characterizes faces and with the right choice of the scene landmark, one can deduce the coordinates of the other points shown in Fig. 2. One can also approximate the forehead as a plane, some of whose points can be obtained by translating by the same distance along the axis $[OY)$, to increase the number of reference points.
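    For later use, these ten reference points can be collected in an array. The following sketch assumes the coordinates exactly as listed above; the scale unit is arbitrary, since it is fixed only by the choice of the landmark.

```python
import numpy as np

# The ten face reference points of Fig. 2, in the (O, X, Y, Z) landmark.
REF_POINTS = np.array([
    [ 0, 0, 1],   # B
    [ 1, 0, 0],   # A
    [ 2, 0, 0],   # D
    [ 3, 0, 0],   # E
    [-1, 0, 0],   # F
    [-2, 0, 0],   # G
    [-3, 0, 0],   # H
    [ 0, 1, 1],   # C
    [ 0, 2, 1],   # I
    [ 0, 3, 1],   # J
], dtype=float)
```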

    Indeed, knowing the coordinates of the 3D points defined in Fig. 2 in the scene reference $(O, X, Y, Z)$, together with their projections in the image plane (see Fig. 3), allows solving the linear equations (5) to obtain the parameters of the camera projection matrices. To improve the accuracy of the calibration results, we then introduce these equations into a nonlinear cost function that is optimized by a genetic algorithm to find the optimal camera parameters.

    Figure 3.  Detection of some points on the images: interactively in the left image, and automatically in the right image via the fundamental matrix and the ZNCC correlation

  • 1) Linear solution

    To project the 3D face onto the image plane, we use the pinhole camera model, by which a 3D point $(X \;\; Y \;\; Z \;\; 1)^{\rm T}$ of the face is projected onto the image plane by the following equation:

    \begin{align} \gamma \begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = P \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}. \end{align}

    (1)

    Here $(u \;\; v)$ are the coordinates of the image point that is the projection of the scene point $(X \;\; Y \;\; Z)$ onto the image plane, $\gamma$ is a scale factor, and $P$ is the projection matrix, which can be written as

    \begin{align} P=\begin{pmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ p_{31} & p_{32} & p_{33} & p_{34} \end{pmatrix}. \end{align}

    (2)

    Substituting (2) into (1), the projection of the points of the face in image $i$ is obtained by the following formula:

    \begin{align} \gamma_{ik} \begin{pmatrix} u_{ik} \\ v_{ik} \\ 1 \end{pmatrix} = \begin{pmatrix} p_{11}^i & p_{12}^i & p_{13}^i & p_{14}^i \\ p_{21}^i & p_{22}^i & p_{23}^i & p_{24}^i \\ p_{31}^i & p_{32}^i & p_{33}^i & p_{34}^i \end{pmatrix} \begin{pmatrix} X_k \\ Y_k \\ Z_k \\ 1 \end{pmatrix}. \end{align}

    (3)

    From (3) we obtain

    \begin{align} \left\{ \begin{array}{l} \gamma_{ik} u_{ik} = p_{11}^i X_k + p_{12}^i Y_k + p_{13}^i Z_k + p_{14}^i \\ \gamma_{ik} v_{ik} = p_{21}^i X_k + p_{22}^i Y_k + p_{23}^i Z_k + p_{24}^i \\ \gamma_{ik} = p_{31}^i X_k + p_{32}^i Y_k + p_{33}^i Z_k + p_{34}^i. \end{array} \right. \end{align}

    (4)

    Substituting the expression of $\gamma_{ik}$ from the third row of (4) into the first two rows, one obtains for image $i$ the equations:

    \begin{align} \left\{ \begin{array}{*{20}l} p_{11}^i X_k + p_{12}^i Y_k + p_{13}^i Z_k + p_{14}^i- p_{31}^i X_k u_{ik}- p_{32}^i Y_k u_{ik}-\\ \quad p_{33}^i Z_k u_{ik} = p_{34}^i u_{ik} \hfill \\ p_{21}^i X_k + p_{22}^i Y_k + p_{23}^i Z_k + p_{24}^i - p_{31}^i X_k v_{ik} - p_{32}^i Y_k v_{ik} -\\ \quad p_{33}^i Z_k v_{ik} = p_{34}^i v_{ik}. \end{array} \right. \end{align}

    (5)

    The previous two equations (see (5)) hold for each 3D point of the face and are linear in the 12 unknowns (the projection matrix elements). Therefore at least six 3D points are needed to estimate these unknowns. We use the six points defined in the scene reference $(O, X, Y, Z)$ (see Section 3.1.1), whose coordinates are known; their projections are determined interactively in the left image and automatically in the right image via the fundamental matrix and the ZNCC (zero-mean normalized cross-correlation) function [17]. Indeed, the projections of the reference points (the points presented in Fig. 2) in the left image are obtained interactively by clicking on these points with the mouse; their correspondents in the right image are determined by the ZNCC correlation measure [17], and the accuracy of the matching is enforced by the fundamental matrix.
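    The linear solution is the standard direct linear transform (DLT). A minimal sketch follows: each point contributes the two rows of (5), with the right-hand side $p_{34} u_{ik}$ folded in so that the stacked system is homogeneous in the 12 unknowns, and the solution is recovered up to scale as the SVD null vector.

```python
import numpy as np

def dlt_projection_matrix(points_3d, points_2d):
    """Solve the linear system (5) for one camera.

    points_3d: (k, 3) known face points (k >= 6); points_2d: (k, 2) their
    detected projections. Returns the 3x4 matrix P, defined up to scale.
    """
    rows = []
    for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    A = np.asarray(rows, dtype=float)
    # The singular vector of the smallest singular value is the stacked P.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)
```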

    2) Non-linear solution

    Our vision system may contain errors due to the interactive detection of the characteristic points of the face, which calls for an optimization step to refine the calibration results. The non-linear cost function to minimize is

    \begin{align} \min\limits_{P_i, P_j} \sum\limits_{k=1}^m \sum\limits_{i=1}^{n-1} \sum\limits_{j=i+1}^n \left[ (u_{ik}-u_{ik}')^2 + (v_{ik}-v_{ik}')^2 + (u_{jk}-u_{jk}')^2 + (v_{jk}-v_{jk}')^2 \right] \end{align}

    (6)

    where $(u_{ik}' \;\; v_{ik}')$ and $(u_{jk}' \;\; v_{jk}')$ represent the projections of the characteristic points $(X_k \;\; Y_k \;\; Z_k)$ in the two images $i$ and $j$ by the projection matrices $P_i$ and $P_j$ estimated from (5); they can be expressed by the following formulas:

    \begin{align*} & u_{ik}' =\frac{ p_{11}^i X_k + p_{12}^i Y_k + p_{13}^i Z_k + p_{14}^i }{ p_{31}^i X_k + p_{32}^i Y_k + p_{33}^i Z_k + p_{34}^i }\\ & u_{jk}' =\frac{ p_{11}^j X_k + p_{12}^j Y_k + p_{13}^j Z_k + p_{14}^j }{ p_{31}^j X_k + p_{32}^j Y_k + p_{33}^j Z_k + p_{34}^j }\\ & v_{ik}' =\frac{ p_{21}^i X_k + p_{22}^i Y_k + p_{23}^i Z_k + p_{24}^i }{ p_{31}^i X_k + p_{32}^i Y_k + p_{33}^i Z_k + p_{34}^i }\\ & v_{jk}' =\frac{ p_{21}^j X_k + p_{22}^j Y_k + p_{23}^j Z_k + p_{24}^j }{ p_{31}^j X_k + p_{32}^j Y_k + p_{33}^j Z_k + p_{34}^j }. \end{align*}

    Here $(u_{ik} \;\; v_{ik})^{\rm T}$ are the coordinates of the interactively detected points in the left image, and $(u_{jk} \;\; v_{jk})^{\rm T}$ are the coordinates of the points in the right image, detected automatically using the fundamental matrix. $n$ is the number of images and $m$ the number of points.
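    The four ratios above are the usual perspective division: the homogeneous image point $PM$ is divided by its third coordinate $\gamma$. A small helper (a sketch) makes this explicit and is reused below.

```python
import numpy as np

def reproject(P, M):
    """Project a 3D point M = (X, Y, Z) with a 3x4 matrix P, as in the
    formulas for u' and v' above. Returns (u', v')."""
    x = P @ np.append(M, 1.0)        # homogeneous image point
    return x[0] / x[2], x[1] / x[2]  # divide by gamma (third coordinate)
```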

    The non-linear cost function (6) requires an optimization step. The classical optimization methods (Newton, Gauss-Newton, Levenberg-Marquardt) are deterministic, and optimization with these algorithms requires a very careful initialization step: if the initialization is far from the optimum, the algorithm does not converge to an optimal solution, because the cost function is complex and contains many local minima. These facts prompted researchers to try approaches based on genetic algorithms (GA) and particle swarm optimization (PSO) [18]; these approaches are classified as non-linear stochastic optimization methods and are much better suited to finding the global optimum. In this approach we take advantage of the robustness of genetic algorithms to minimize the cost function (see (6)). To optimize the results with the GA, we generate the initial population with individuals drawn randomly within the initial bounds of the parameters of the cameras used. Each individual represents a potential solution to the calibration problem and is composed of genes representing the camera parameters to estimate.

    Since the camera parameters are real numbers bounded by intervals supplied as input, for two images the individuals of the population are encoded, with real coding, as a vector of 20 parameters representing the intrinsic and extrinsic parameters of the left and right cameras. We denote by $h$ the vector consisting of the camera parameters to be optimized:

    \begin{align} h = \left[ f_i, \varepsilon_i, u_{0i}, v_{0i}, \varpi_i, \phi_i, \theta_i, t_{xi}, t_{yi}, t_{zi}, f_j, \varepsilon_j, u_{0j}, v_{0j}, \varpi_j, \phi_j, \theta_j, t_{xj}, t_{yj}, t_{zj} \right]^{\rm T} \end{align}

    (7)

    with

    \begin{align*} & P_r = A_r R_r \left[ \, I_3 \; \big| \; R_r^{\rm T} t_r \, \right], \quad r = i, j \\ & A_r = \begin{pmatrix} f_r & s_r = 0 & u_{0r} \\ 0 & \varepsilon_r f_r & v_{0r} \\ 0 & 0 & 1 \end{pmatrix}, \quad r = i, j \\ & R_r = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos \varpi_r & -\sin \varpi_r \\ 0 & \sin \varpi_r & \cos \varpi_r \end{pmatrix} \begin{pmatrix} \cos \phi_r & 0 & \sin \phi_r \\ 0 & 1 & 0 \\ -\sin \phi_r & 0 & \cos \phi_r \end{pmatrix} \begin{pmatrix} \cos \theta_r & -\sin \theta_r & 0 \\ \sin \theta_r & \cos \theta_r & 0 \\ 0 & 0 & 1 \end{pmatrix} \\ & t_r = \left( t_{xr} \;\; t_{yr} \;\; t_{zr} \right)^{\rm T}, \quad r = i, j. \end{align*}

    $A_r$ and $(R_r \;\; t_r)$ are, respectively, the matrices of the intrinsic and extrinsic parameters of the cameras used.

    Here $f_r$ represents the focal length, $\varepsilon_r$ the scale factor, $(u_{0r} \;\; v_{0r})$ the coordinates of the image center, $(\varpi_r \;\; \phi_r \;\; \theta_r)$ the rotation angles, and $t_r$ the translation vector.
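    Assembling a projection matrix from the ten genes of one camera is direct. The sketch below follows the parameterization above (zero skew, rotation as a product of rotations about the $x$, $y$ and $z$ axes); note that $A_r R_r [I_3 \mid R_r^{\rm T} t_r]$ simplifies to the familiar $A_r [R_r \mid t_r]$.

```python
import numpy as np

def projection_from_params(f, eps, u0, v0, omega, phi, theta, t):
    """Assemble P = A R [I | R^T t] (equivalently A [R | t]) from the ten
    per-camera parameters of the vector h."""
    A = np.array([[f, 0.0, u0],
                  [0.0, eps * f, v0],
                  [0.0, 0.0, 1.0]])           # intrinsics, zero skew
    cw, sw = np.cos(omega), np.sin(omega)
    cp, sp = np.cos(phi), np.sin(phi)
    ct, st = np.cos(theta), np.sin(theta)
    Rx = np.array([[1, 0, 0], [0, cw, -sw], [0, sw, cw]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[ct, -st, 0], [st, ct, 0], [0, 0, 1]])
    R = Rx @ Ry @ Rz                          # rotation from the three angles
    return A @ np.hstack([R, np.reshape(t, (3, 1))])
```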

    For notational convenience, the vector $h$ is written $h = (h_1, h_2, \cdots, h_{20})$, where $h_i$, $i = 1, \cdots, 20$, are the camera parameters defined above. This vector corresponds to a possible solution of the calibration problem and belongs to the set of potential solutions $H = \left\{ h : h_i \in \left[ h_i^-, h_i^+ \right];\ i = 1, 2, \cdots, 20 \right\}$, where $h_i^-$ and $h_i^+$ are the bounds of the variation interval given for each parameter. These bounds are supplied to the GA as input and come from prior knowledge of the cameras (wide bounds can be set for unknown parameters). An optimal $h$ is obtained by minimizing the cost function described in (6) (for more details on our genetic algorithm, see [19]).

    The genetic algorithm performs the optimization by repeating the genetic operations until the stopping criterion is satisfied: the minimum of the cost function remains unchanged for a given number of generations. The best individual is then taken as the optimal solution of the calibration problem.
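    The following is a compact, real-coded GA in the spirit of the one used here; it is a sketch, not the implementation of [19]. The population size, crossover and mutation rates, and the elitism step are illustrative choices. In use, `cost(h)` would split `h` into the two ten-gene camera vectors of (7), assemble $P_i$ and $P_j$ (e.g., with `projection_from_params` above) and return the sum of squared residuals of (6).

```python
import numpy as np

rng = np.random.default_rng(0)

def genetic_minimize(cost, lower, upper, pop=80, gens=300,
                     p_cross=0.8, p_mut=0.1, patience=30):
    """Minimize `cost` over the box [lower, upper] with a real-coded GA:
    tournament selection, arithmetic crossover, uniform mutation within
    the bounds, and the stopping rule described above."""
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    P = rng.uniform(lower, upper, size=(pop, lower.size))
    fit = np.array([cost(h) for h in P])
    best, stalled = fit.min(), 0
    for _ in range(gens):
        nxt = [P[fit.argmin()].copy()]            # elitism: keep the best
        while len(nxt) < pop:
            i, j = rng.integers(pop, size=2)      # tournament selection
            a = P[i if fit[i] < fit[j] else j].copy()
            i, j = rng.integers(pop, size=2)
            b = P[i if fit[i] < fit[j] else j].copy()
            if rng.random() < p_cross:            # arithmetic crossover
                w = rng.random()
                a, b = w * a + (1 - w) * b, w * b + (1 - w) * a
            for c in (a, b):                      # uniform mutation in bounds
                m = rng.random(c.size) < p_mut
                c[m] = rng.uniform(lower[m], upper[m])
                nxt.append(c)
        P = np.array(nxt[:pop])
        fit = np.array([cost(h) for h in P])
        if fit.min() < best:
            best, stalled = fit.min(), 0
        else:
            stalled += 1
        if stalled >= patience:                   # best unchanged: stop
            break
    return P[fit.argmin()]
```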

  • To reconstruct the 3D point cloud, the matching of points of interest between two images is the main key. In this approach, the extraction of matching points is performed in three steps: first, detection of interest points by the Harris algorithm [20, 21]; second, matching of points in each image pair by ZNCC correlation [17]; and last, iterative propagation of the matches to achieve a dense matching of the images.
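    The ZNCC score itself is a short computation. The sketch below scores two equal-sized gray-level patches; the acceptance threshold applied to the score is an implementation choice, not a value from the paper.

```python
import numpy as np

def zncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two equal-size patches.
    Returns a score in [-1, 1] (1 = perfect match up to gain and offset)."""
    a = patch_a - patch_a.mean()
    b = patch_b - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0
```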

    After the matches have been detected and regularized (false matches eliminated), the 3D point cloud of the face can be recovered by triangulation of the points. Two lines in 3D space pass through the optical centers of the cameras and the associated 2D points; they can be expressed from the projection matrices of the cameras and the matched points, and the 3D point is the intersection of the two lines. In practice, due to calibration and matching errors, the two lines generally do not intersect exactly at a point; in this case, the midpoint of the common perpendicular segment of the two lines is selected (see Fig. 4).

    Figure 4.  Intersecting lines defining the 3D location of the point $M$ in the space
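    The midpoint construction of Fig. 4 can be written down directly. The following sketch back-projects each pixel into a ray (assuming the left 3x3 block of $P$ is invertible and the rays are not parallel), finds the closest point on each of the two rays, and returns the midpoint of the connecting segment.

```python
import numpy as np

def triangulate_midpoint(P1, P2, x1, x2):
    """3D point from one match (x1, x2 are (u, v) pixels in the two images),
    as the midpoint of the common perpendicular of the back-projected rays."""
    def ray(P, x):
        M, p4 = P[:, :3], P[:, 3]
        center = -np.linalg.solve(M, p4)           # optical center of the camera
        d = np.linalg.solve(M, np.array([x[0], x[1], 1.0]))
        return center, d / np.linalg.norm(d)       # ray origin, unit direction
    c1, d1 = ray(P1, x1)
    c2, d2 = ray(P2, x2)
    # Closest points c1 + s*d1 and c2 + t*d2: the segment joining them is
    # perpendicular to both directions, giving a 2x2 linear system in (s, t).
    A = np.array([[d1 @ d1, -(d1 @ d2)], [d1 @ d2, -(d2 @ d2)]])
    rhs = np.array([d1 @ (c2 - c1), d2 @ (c2 - c1)])
    s, t = np.linalg.solve(A, rhs)
    return 0.5 * ((c1 + s * d1) + (c2 + t * d2))   # midpoint of the segment
```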

    The mesh is a piecewise linear surface that approximates the continuous surface of the original model. Such an approximation is characterized by geometric information defining the positions of the vertices in Euclidean space and topological information describing the adjacency relationships between the vertices, i.e., the manner in which they are interconnected [22].

    Triangular surface meshes are the most widely used representation of objects in three-dimensional space; thanks to their simplicity and efficiency they are fast becoming the standard representation for modeling geometric objects. They are composed of $k$-simplices, which may be vertices (0-simplices), edges (segments connecting two vertices: 1-simplices) or triangles (2-simplices); such a surface is called simplicial [22-25]. In this approach we use the Crust algorithm, which builds on the Delaunay triangulation [22, 23, 26].

    The Crust algorithm [15, 26] is a method for reconstructing surfaces of arbitrary topology from an unorganized 3D point cloud; its strength is that it reconstructs both closed and open surfaces. In addition, in its basic version, it has no parameters to set. This algorithm is well adapted to our goals and can be summarized in the following steps (a sketch of steps 1)-4) follows the list):

    1) Build the Delaunay triangulation DT (Q) on the point cloud $Q$.

    2) For each point $q$ of the cloud $Q$, we define two poles: the first is the vertex $V(q)$ of the Voronoi cell of $q$ farthest from $q$; the second is the vertex of the Voronoi cell farthest from $q$ on the side opposite the first pole. We recall that the Voronoi diagram is the dual of the Delaunay triangulation.

    3) Construct the Delaunay triangulation of the union of the point cloud and the poles defined in the previous step.

    4) Keep only those triangles for which all three vertices are sample points in $Q$.

    5) Reorient and reorganize the facets obtained in the previous step consistently. In particular, this step enforces the face-adjacency property, which specifies that each edge of the mesh has exactly two adjacent faces if it is interior to the mesh, and one face if it lies on the boundary of the mesh.
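    Steps 1)-4) can be sketched with standard computational-geometry routines. The code below uses SciPy's Voronoi and Delaunay wrappers; it omits step 5) and the special handling of unbounded Voronoi cells, and follows Amenta and Bern's definition of the second pole, so it is an illustration of the idea rather than a full Crust implementation.

```python
import numpy as np
from scipy.spatial import Voronoi, Delaunay

def crust_triangles(Q):
    """Candidate surface triangles (index triples into Q) for an (n, 3)
    point cloud, following steps 1)-4) of the Crust algorithm."""
    n = len(Q)
    vor = Voronoi(Q)
    poles = []
    for k, region_idx in enumerate(vor.point_region):
        verts = [v for v in vor.regions[region_idx] if v != -1]
        if not verts:
            continue                      # unbounded cell: skipped in this sketch
        V = vor.vertices[verts]
        # First pole: the Voronoi vertex of the cell farthest from the sample.
        p1 = V[np.argmax(np.linalg.norm(V - Q[k], axis=1))]
        poles.append(p1)
        # Second pole: farthest cell vertex on the side opposite p1.
        opposite = V[(V - Q[k]) @ (p1 - Q[k]) < 0]
        if len(opposite):
            poles.append(opposite[np.argmax(np.linalg.norm(opposite - Q[k], axis=1))])
    # Delaunay of samples plus poles; keep the triangles whose three
    # vertices are all sample points (step 4).
    dt = Delaunay(np.vstack([Q, np.array(poles)]))
    faces = set()
    for tet in dt.simplices:
        for f in ((0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)):
            tri = tuple(sorted(int(tet[i]) for i in f))
            if tri[2] < n:                # all three indices are samples
                faces.add(tri)
    return list(faces)
```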

  • To demonstrate the robustness of our approach, we used a sequence of 512 $\times$ 512 images of a 3D face captured by a digital CCD camera with varying parameters. Fig. 5 shows a pair of the images used.

    Figure 5.  Two real images of a 3D face captured with two cameras

  • After the interactive extraction of some feature points in each pair of images (at least six points) (see Fig. 3), the estimation of the camera parameters is performed by two methods. The first solves the system of equations (5) linearly; Table 1 shows the obtained projection matrices.

    Table 1.  Projection matrix obtained by the linear solution

    Table 2 shows the optimal projection matrices obtained by minimizing the cost function (6) with the genetic algorithm; the importance of optimization with the GA is that it escapes local minima and requires no initialization step.

    Table 2.  Optimal projection matrices obtained by minimizing the cost function (see (6)) with the genetic algorithm

    Fig. 6 shows the relative error between the linear solution and the nonlinear solution of the projection matrix of the cameras used.

    Figure 6.  Relative error according to the elements of the projection matrix

    We note that the maximum error is 1%, which shows the precision of the presented approach. This makes sense, because the face features we used to estimate the projection matrices are distinct and present in all faces. We can say that this error is a computational error and has no influence on the reconstruction results; the reconstruction obtained demonstrates the robustness and accuracy of our method.

  • In this section, we used the ZNCC correlation measure to match the interest points previously detected by the Harris detector (see Fig. 7), followed by a propagation step which increases the pairings iteratively to obtain a dense correspondence. Fig. 7 shows the results of interest point detection in the images, and Fig. 8 shows the results of the propagation of all matched points.

    Figure 7.  Interest point detection in the image pairs by the Harris detector

    Figure 8.  Dense matching: after the points are matched by ZNCC correlation, the propagation method starts from a set of reliable matches (germs). The germ with the best ZNCC score is removed from the current list of germs, and new matches are searched for in its neighborhood

  • Fig. 9 shows the 3D point cloud reconstruction by the triangulation technique, using the projection matrices and the matched points.

    Figure 9.  Multiple views of the 3D point cloud reconstructed by the triangulation technique

  • In this section, we applied the Crust algorithm to mesh the point cloud; then a 2D image is projected orthogonally onto the 3D model to generate the texture mapping. Fig. 10 shows the results obtained.

    Figure 10.  3D face reconstruction obtained by our approach: (a) Simplified mesh; (b) 3D face model after texture mapping, shown from multiple viewpoints

    To confirm the validity and performance of our approach, we tested our method on another sequence of face images; Fig. 11 shows the results. They show that the presented approach is capable of a robust and accurate reconstruction of the face under different poses and expressions.

    Figure 11.  Result of our approach applied to another sequence of face images: (a) A pair of images; (b) Dense matching; (c) Reconstruction of the point cloud by the triangulation technique and of the mesh by the 3D Crust algorithm; (d) 3D face texturing

  • To evaluate the performance of our approach, we compared the proposed technique with two other approaches [27, 28]. These two approaches were chosen for the following reasons. The method of [27] reconstructs the 3D face from un-calibrated images; we chose it to show the influence of camera parameter optimization on the quality of the reconstruction. The method of [28] reconstructs the face by structure from motion (SFM) using a shape conversion matrix (SCM); the camera parameters and the positions of the 3D points are estimated simultaneously, and this approach has recently become one of the fundamental methods for robust facial reconstruction. We therefore implemented and ran the methods of [27, 28] on our test images and computed the positions of the 3D points. In these methods, only the 3D coordinates of the visible points are reconstructed. We then estimated the reprojection error of the computed 3D points by the following formula:

    \begin{align} e=\frac{1}{mn}\sum\limits_{i=1}^m \sum\limits_{j=1}^n \varepsilon_{ij} \left\| m_{ij} - P_i M_j \right\|^2 \end{align}

    (8)

    where $\varepsilon_{ij}$ is a binary visibility factor, $m$ is the number of images, $n$ is the number of 3D points, $P_i$ is the projection matrix of the $i$-th image, $M_j$ is the $j$-th reconstructed 3D point, and $m_{ij}$ is the observed projection of the point $M_j$ in the $i$-th image.
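    Computing (8) is straightforward once the projection matrices and the cloud are available. A sketch follows; it reads $P_i M_j$ as the dehomogenized pixel, so that the norm compares two 2D points.

```python
import numpy as np

def reprojection_error(obs, vis, Ps, Ms):
    """Mean squared reprojection error e of (8).

    obs[i][j]: observed pixel (u, v) of point j in image i; vis[i][j]: the
    binary visibility factor; Ps: list of 3x4 matrices; Ms: (n, 3) points.
    """
    m, n = len(Ps), len(Ms)
    e = 0.0
    for i in range(m):
        for j in range(n):
            if not vis[i][j]:
                continue
            x = Ps[i] @ np.append(Ms[j], 1.0)     # homogeneous projection
            e += np.sum((np.asarray(obs[i][j]) - x[:2] / x[2]) ** 2)
    return e / (m * n)
```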

    To test the accuracy of our method, we estimated the reprojection error of the computed 3D points as a function of the number of matched points for the three approaches. Fig. 12 shows the results obtained.

    Figure 12.  Reprojection error of the computed 3D points for the three approaches (our method, Yao [27] and Lee [28]) as a function of the number of matching points

    To test the robustness of our approach compared to the other approaches [27, 28], we estimated the reprojection error of the computed 3D points for the three approaches under Gaussian noise. We applied Gaussian noise with zero mean and different standard deviations along the two axes $x$ and $y$ to perturb the projection of the face onto the image plane. Fig. 13 shows the reprojection error of the 3D points against the Gaussian noise for the three methods.

    From Fig. 12, we observe that the reprojection error of the 3D points gradually decreases as the number of matches increases. For our approach, this error remains more or less constant starting at 180 matches and does not exceed 50 pixels; for the approach of [28] the minimal error is 110 pixels, and for the method of [27] the minimal error over the range of matches is 200 pixels, which shows that our approach is more accurate than the two others. Fig. 13 shows that the reprojection error of our approach remains stable and does not exceed 20 pixels up to a noise level of $\sigma = 1$; beyond this value the error increases gradually but does not exceed 200 pixels, unlike the approach of [27], which is very sensitive to noise. We conclude that our method is more robust to noise.

    Figure 13.  Reprojection error of the computed 3D points as a function of the Gaussian noise for the three approaches

    We have developed in this paper a new facial reconstruction approach from two or more images, based on a vision system without any constraints on the cameras used. We have used robust and accurate algorithms in all stages of the reconstruction, implemented and executed in the Java programming language. The good results obtained by our method are due partly to obtaining the 3D point cloud by a reliable triangulation technique with minimized reconstruction error, and partly to the robustness and performance of the calibration method that we developed to optimize the camera parameters. The advantages of the proposed technique are numerous:

    1) The cameras used are characterized by varying parameters, and thus we present a more efficient calibration procedure with no constraints on the cameras.

    2) We used a genetic algorithm to optimize the camera parameters, in order to accelerate convergence and avoid the local minima of the cost function, thereby obtaining a good estimate of the camera parameters.

    3) The presented method allows the reconstruction of faces from images without the need for a calibration grid or a generic model. Indeed, the disadvantage of methods using a generic model is that the results resemble the generic model more than the actual faces.

  • In this paper, we presented a new approach to 3D facial reconstruction from a sequence of images. Our approach rests on two major points. The first is camera calibration from the symmetry property of the face, which avoids the use of a calibration grid. We also used a genetic algorithm to find the optimal projection matrices without the need for an initialization step; this algorithm converges quickly to an optimal solution and avoids the local minima of the cost function, which would otherwise degrade the obtained 3D point cloud. The second is the use of the Crust algorithm to mesh the point cloud: indeed, the point cloud, even when dense, contains holes arising from false matches and optimization errors, which requires meshing of the point cloud and texture mapping. Our method was tested on real faces; the results show that the proposed technique is precise and robust.
