Volume 16 Number 4
August 2019
Atlas Khan, Yan-Peng Qu and Zheng-Xue Li. Convergence Analysis of a New MaxMin-SOMO Algorithm. International Journal of Automation and Computing, vol. 16, no. 4, pp. 534-542, 2019. doi: 10.1007/s11633-016-0996-0

Convergence Analysis of a New MaxMin-SOMO Algorithm

Author Biography:
  • Yan-Peng Qu received the Ph. D. degree in computational mathematics from Dalian University of Technology, China in 2012. He is a lecturer with the Information Science and Technology College at Dalian Maritime University, China.
    His research interests include rough and fuzzy set theory, pattern recognition, neural networks, classification and feature selection.
    E-mail: yanpengqu@dlmu.edu.cn

    Zheng-Xue Li received the Ph. D. degree in mathematics from Jilin University, Changchun, China in 2001. He is currently an associate professor with Dalian University of Technology, China.
    His research interests include nonlinear algorithm analysis and intelligent information processing.
    E-mail: lizx@dlut.edu.cn

  • Corresponding author: Atlas Khan received the B. Sc. and M. Sc. degrees in mathematics from Gomal University, D. I. Khan, Pakistan, in 2005 and 2007, respectively, and the M. Phil. degree in mathematics from Quaid-i-Azam University, Pakistan in 2010. He received the Ph. D. degree from the Department of Applied Mathematics, Dalian University of Technology, China in 2013. Since August 2013, he has been a postdoctoral researcher in bioinformatics with the Department of Computing and Mathematics, University of Sao Paulo, Brazil. He has published a number of papers in international journals and conferences.
    His research interests include bioinformatics, computational biology, neural networks and coding theory.
    E-mail: atlas.khan@ficlrp.usp.br (Corresponding author)
    ORCID iD: 0000-0002-6651-2725
  • Received: 2014-06-03
  • Accepted: 2015-07-03
  • Published Online: 2017-02-21



Abstract: The convergence analysis of the MaxMin-SOMO algorithm is presented. SOM-based optimization (SOMO) is an optimization algorithm built on the self-organizing map (SOM): through a competitive learning process, it finds a winner in the network and searches for the minimum of an objective function. The MaxMin-SOMO algorithm generalizes SOMO to two winners found simultaneously in a single learning process, where the first winner corresponds to the minimum and the second to the maximum of the objective function. In this paper, we prove that the distance between neurons decreases at each iteration and finally converges to zero. The theoretical results are verified by experiments.

  • The self-organizing feature map (SOFM)[1-5] is a biologically-inspired method for constructing a structured representation of data from an input space by prototypes, called weight vectors, in a topologically ordered fashion. Each weight vector is associated with a selected element, a neuron, of an image space in which metric relationships are defined between the elements. For any given data set, the SOFM algorithm selects weight vectors and assigns them to neurons in the network. The weight vectors, viewed as a function of the neuron coordinates, are called the feature map. The applications of the self-organizing map range widely, from simulations used to understand and model computational maps in the brain to subsystems for engineering applications such as speech recognition, vector quantization and cluster analysis[6-9].

    Earlier studies have developed optimization algorithms for solving optimization problems. For continuous and discrete problems, the most common optimization algorithms are genetic algorithms (GAs)[10, 11], evolutionary programming[12], and evolutionary strategies[13]. GAs use reproduction, crossover, and mutation operators to obtain optimal solutions, whereas evolutionary programming does not employ a crossover operator. Compared with the other two, evolutionary strategies take a different view of choosing operators. The particle swarm optimization (PSO) algorithm, introduced in 1995, resembles other evolutionary algorithms in that it is initialized with a population of random solutions, but it is based on social behaviour rather than natural selection[14-16]. Recently, several modifications of PSO have been made and applied in different research fields[17, 18]. Applications of the self-organizing map (SOM) are widespread and range from vector quantization and adaptive equalization to cluster analysis[19]. SOM-related studies have demonstrated the feasibility of solving the travelling salesman problem (TSP)[20, 21].

    Su et al.[22-26] proposed a new optimization algorithm named SOM-based optimization (SOMO) to deal with continuous optimization problems. Recently, Wu and Khan[27] generalized the SOMO algorithm to find $m$ winning neurons in a single learning process. Wu and Khan[28] proposed the MaxMin-SOMO algorithm with multiple winners for simultaneously finding two winners, i.e., the first winner stands for the minimum and the second winner for the maximum of the objective function. Khan et al.[29] established the convergence of SOMO with a distance measure.

    In this paper, the convergence of the MaxMin-SOMO algorithm with multiple winners is addressed. In particular, we prove that, with a specific distance measure, the distance between the winners and the other neurons in the network decreases at every iteration, and that this distance tends to zero as learning progresses. Numerical examples are presented to support the theoretical findings.

    The rest of the paper is organized as follows. In Section 2, a brief introduction of SOMO with multiple winners is given. The convergence of SOMO with multiple winners is presented in Section 3. In Section 4, numerical experiments are carried out to verify the theoretical results. Conclusions are given in Section 5.

  • In this section, we describe the MaxMin-SOMO algorithm for finding one minimum and one maximum of a function simultaneously. It is an easy matter to generalize the algorithm to find two or more minima and maxima, respectively. MaxMin-SOMO has the same training steps as the original SOMO algorithm, except for the weight-updating step, which is carried out with two winners. The MaxMin-SOMO algorithm is as follows:

    Step 1. The initialization of MaxMin-SOMO is the same as that of SOMO, i.e.

    Step 1.1. Given the weight vectors of the four corner neurons, the weight vectors of the neurons on the four boundaries are initialized as follows:

    $ \begin{eqnarray} &\underline{W}_{1, j}= \displaystyle\frac{\underline{W}_{1, N}-\underline{W}_{1, 1}}{N-1}(j-1)+\underline{W}_{1, 1}\nonumber=\\ &\displaystyle\frac{j-1}{N-1}\underline{W}_{1, N}+\displaystyle\frac{N-j}{N-1}\underline{W}_{1, 1}\nonumber\\ &\underline{W}_{M, j}= \displaystyle\frac{\underline{W}_{M, N}-\underline{W}_{M, 1}}{N-1}(j-1)+\underline{W}_{M, 1}\nonumber=\\ & \displaystyle\frac{j-1}{N-1}\underline{W}_{M, N}+\frac{N-j}{N-1}\underline{W}_{M, 1}\nonumber\\ &\underline{W}_{i, 1} = \displaystyle\frac{\underline{W}_{M, 1}-\underline{W}_{1, 1}}{M-1}(i-1)+\underline{W}_{1, 1}\nonumber=\\ &\displaystyle\frac{i-1}{M-1}\underline{W}_{M, 1}+\frac{M-i}{M-1}\underline{W}_{1, 1}\nonumber\\ & \underline{W}_{i, N}= \displaystyle\frac{\underline{W}_{M, N}-\underline{W}_{1, N}}{M-1}(i-1)+\underline{W}_{1, N}\nonumber=\\ &\displaystyle\frac{i-1}{M-1}\underline{W}_{M, N}+\frac{M-i}{M-1}\underline{W}_{1, N} \end{eqnarray} $

    (1)

    where $i=2, \cdots, M-1$, $j=2, \cdots, N-1$ and $M$ and $N$ are some positive integers.

    Step 1.2. Initialization of the remaining neurons

    $ \begin{eqnarray} \underline{W}_{i, j}=\frac{\underline{W}_{i, N}-\underline{W}_{i, 1}}{N-1}(j-1)+\underline{W}_{i, 1} \nonumber\\=\frac{j-1}{N-1}\underline{W}_{i, N}+\frac{N-j}{N-1}\underline{W}_{i, 1} \end{eqnarray} $

    (2)

    where $i=2, \cdots, M-1$, $j=2, \cdots, N-1$.

    Step 1.3. Random noise

    A small amount of random noise is added to each weight so as to keep the weight vectors from being linearly dependent

    $ \begin{eqnarray} \underline{W}_{i, j}=\underline{W}_{i, j}+\underline{t} \end{eqnarray} $

    (3)

    for $1\leq i\leq M$ and $1\leq j\leq N$, where $\underline{t}$ denotes a small random noise.
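
    To make Step 1 concrete, the following Python sketch (an illustration only, not the authors' implementation; the corner weight vectors corners, the noise scale noise_scale and the Gaussian noise model are assumptions) builds the $M\times N$ grid of weight vectors by the linear interpolations (1) and (2) and then adds the random noise of (3).

        import numpy as np

        def initialize_weights(corners, M, N, noise_scale=1e-3, rng=None):
            """Initialize an M x N grid of n-dimensional weight vectors (Steps 1.1-1.3).

            corners maps the keys (1, 1), (1, N), (M, 1), (M, N) to the four
            corner weight vectors (1-based indices as in the paper).
            """
            rng = np.random.default_rng() if rng is None else rng
            n = len(corners[(1, 1)])
            W = np.zeros((M, N, n))
            W[0, 0], W[0, N - 1] = corners[(1, 1)], corners[(1, N)]
            W[M - 1, 0], W[M - 1, N - 1] = corners[(M, 1)], corners[(M, N)]
            # Step 1.1: interpolate the four boundary rows and columns, cf. (1).
            for j in range(1, N - 1):
                W[0, j] = (j / (N - 1)) * W[0, N - 1] + ((N - 1 - j) / (N - 1)) * W[0, 0]
                W[M - 1, j] = (j / (N - 1)) * W[M - 1, N - 1] + ((N - 1 - j) / (N - 1)) * W[M - 1, 0]
            for i in range(1, M - 1):
                W[i, 0] = (i / (M - 1)) * W[M - 1, 0] + ((M - 1 - i) / (M - 1)) * W[0, 0]
                W[i, N - 1] = (i / (M - 1)) * W[M - 1, N - 1] + ((M - 1 - i) / (M - 1)) * W[0, N - 1]
            # Step 1.2: interpolate each interior row between its two boundary columns, cf. (2).
            for i in range(1, M - 1):
                for j in range(1, N - 1):
                    W[i, j] = (j / (N - 1)) * W[i, N - 1] + ((N - 1 - j) / (N - 1)) * W[i, 0]
            # Step 1.3: small random noise keeps the weight vectors from being linearly dependent, cf. (3).
            W += noise_scale * rng.standard_normal(W.shape)
            return W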

    Step 2. This step aims to find two different winners, denoted by $(i^{*}_{1}, j^{*}_{1})$ and $(i^{*}_{2}, j^{*}_{2})$, corresponding to the smallest and the largest objective function values among the neurons. For each neuron $(i, j)$, its corresponding weight $\underline{W}_{i, j}$ is a vector in ${\bf R}^n$, where $1\leq i\leq M$ and $1\leq j\leq N$. For the special input $\underline{x} =(\underline{x}_{1}, \cdots, \underline{x}_{n})^{\rm T}=(1, 1, \cdots, 1)^{\rm T}\in {\bf R}^n$, the first winner, corresponding to the minimum, is defined as

    $ \begin{eqnarray*} (i_{1}^*, j_{1}^*)= \mathop{\arg\min}\limits_{1\leq i\leq M, 1\leq j\leq N} f(\underline{W}_{i, j}\times \underline{x}_{1}, \cdots, \underline{W}_{i, j}\times \underline{x}_{n}) \nonumber=\\ \mathop{\arg\min}\limits_{1\leq i\leq M, 1\leq j\leq N} f(\underline{W}_{i, j}\times 1, \cdots, \underline{W}_{i, j}\times 1) =~~~\\ \mathop{\arg\min}\limits_{1\leq i\leq M, 1\leq j\leq N} f(\underline{W}_{i, j}) \end{eqnarray*} $

    (4)

    and the second winner, corresponding to the maximum, is

    $ \begin{eqnarray*} (i_{2}^*, j_{2}^*)= \mathop{\arg\max}\limits_{1\leq i\leq M, 1\leq j\leq N} f(\underline{W}_{i, j}\times \underline{x}_{1}, \cdots, \underline{W}_{i, j}\times \underline{x}_{n}) \nonumber=\\ \mathop{\arg\max}\limits_{1\leq i\leq M, 1\leq j\leq N} f(\underline{W}_{i, j}\times 1, \cdots, \underline{W}_{i, j}\times 1) \nonumber=~~~\\ \mathop{\arg\max}\limits_{1\leq i\leq M, 1\leq j\leq N} f(\underline{W}_{i, j}). \end{eqnarray*} $

    (5)

    The idea of MaxMin-SOMO training is that, during the iterative training process, the weight vector $\underline{W}_{i_{1}^*, j_{1}^*}$ of the first winner gets closer and closer to the minimum point, and similarly the weight vector $\underline{W}_{i_{2}^*, j_{2}^*}$ of the second winner gets closer and closer to the maximum point.
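
    As a sketch of Step 2, the two winners can be found in one pass over the grid; the helper below (a hypothetical name, with f any objective function on ${\bf R}^n$) returns the index of the neuron with the smallest objective value and the index of the neuron with the largest one.

        import numpy as np

        def find_winners(W, f):
            """Return ((i1, j1), (i2, j2)): the minimizing and maximizing neurons, cf. (4) and (5)."""
            M, N, _ = W.shape
            values = np.array([[f(W[i, j]) for j in range(N)] for i in range(M)])
            i1, j1 = np.unravel_index(np.argmin(values), values.shape)  # first winner: minimum
            i2, j2 = np.unravel_index(np.argmax(values), values.shape)  # second winner: maximum
            return (int(i1), int(j1)), (int(i2), int(j2))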

    Step 3. Weights updating rule

    The weight-updating rule for the winners and their neighbors is as follows:

    For the neurons $(i, j)$ in the neighborhood of the first winner $(i_{1}^{*}, j_{1}^{*})$ satisfying $p_{1}\leq i\leq p_{2}$, $q_{1}\leq j\leq q_{2}$, where

    $ \begin{equation} p_{1}=\max(i_{1}^{*}-R_{1}, 1) \end{equation} $

    (6)

    $ \begin{equation} p_{2}=\min(i_{1}^{*}+R_{1}, M) \end{equation} $

    (7)

    $ \begin{equation} q_{1}=\max(j_{1}^{*}-R_{1}, 1) \end{equation} $

    (8)

    $ \begin{equation} q_{2}=\min(j_{1}^{*}+R_{1}, N) \end{equation} $

    (9)

    and $R_{1}$ is the size of the neighbourhood, the weight-updating rule is as follows:

    $ \begin{eqnarray} & \underline{W}_{i, j}(t+1) = \underline{W}_{i, j}(t)+\eta_{1} \beta_{1}({i_{1}^{*}, j_{1}^{*}, i, j})[\underline{W}_{i_{1}^{*}, j_{1}^{*}}(t)-\nonumber\\ &~~~~~~~~\underline{W}_{i, j}(t)] + \lambda_{1}(1-\beta_{1}({i_{1}^{*}, j_{1}^{*}, i, j}))\underline {p} \end{eqnarray} $

    (10)

    where

    $ \begin{equation} \beta_{1}( {i_{1}^{*}, j_{1}^{*}, i, j})=1-\frac{d(i_{1}^{*}, j_{1}^{*}, i, j)}{\sqrt{M^{2}+N^{2}}}. \end{equation} $

    (11)

    For the rest of the neurons $(i, j)$

    $ \begin{eqnarray} &\underline{W}_{i, j}(t+1) =\underline{W}_{i, j}(t)+\eta_{2} \beta_{2}({i_{2}^{*}, j_{2}^{*}, i, j})[\underline{W}_{i_{2}^{*}, j_{2}^{*}}(t)-\nonumber\\ &\underline{W}_{i, j}(t)] +\lambda_{2}(1-\beta_{2}({i_{2}^{*}, j_{2}^{*}, i, j}))\underline {p} \end{eqnarray} $

    (12)

    where

    $ \begin{equation} \beta_{2}( {i_{2}^{*}, j_{2}^{*}, i, j})=1-\frac{d(i_{2}^{*}, j_{2}^{*}, i, j)}{\sqrt{M^{2}+N^{2}}}. \end{equation} $

    (13)

    The learning rates, $\eta_{1}$, $\eta_{2}$, $\lambda_{1}$ and $\lambda_{2}$, are real-valued parameters which could be either constants pre-defined by the user or time-varying parameters. The vector $\underline {p}$ is called the perturbation vector.
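
    A sketch of the weight-updating rule (10)-(13) is given below, assuming the two winners have already been found and that $d(\cdot)$ is the Euclidean distance between grid coordinates (which the normalization by $\sqrt{M^{2}+N^{2}}$ suggests); the names eta1, eta2, lam1, lam2, R1 and the perturbation vector p are placeholders for the user-chosen quantities above, not the authors' code.

        import numpy as np

        def update_weights(W, w1, w2, eta1, eta2, lam1, lam2, R1, p):
            """One MaxMin-SOMO weight update (Step 3).

            w1: first winner (minimum), w2: second winner (maximum), 0-based indices.
            Neurons inside the R1-neighbourhood of w1 move toward w1, cf. (10)-(11);
            all remaining neurons move toward w2, cf. (12)-(13).
            """
            M, N, _ = W.shape
            scale = np.sqrt(M ** 2 + N ** 2)
            (i1, j1), (i2, j2) = w1, w2
            p1, p2 = max(i1 - R1, 0), min(i1 + R1, M - 1)   # (6), (7)
            q1, q2 = max(j1 - R1, 0), min(j1 + R1, N - 1)   # (8), (9)
            W_new = W.copy()
            for i in range(M):
                for j in range(N):
                    if p1 <= i <= p2 and q1 <= j <= q2:
                        win, eta, lam = w1, eta1, lam1      # neighbourhood of the first winner
                    else:
                        win, eta, lam = w2, eta2, lam2      # the rest of the neurons
                    beta = 1.0 - np.hypot(win[0] - i, win[1] - j) / scale
                    W_new[i, j] = (W[i, j]
                                   + eta * beta * (W[win] - W[i, j])
                                   + lam * (1.0 - beta) * p)
            return W_new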

    Step 4. Go to Step 2 until a pre-specified number of generations is reached or the termination criteria are satisfied.
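
    Putting Steps 1-4 together, a minimal driver loop might look as follows (a sketch built on the helpers above; the $1/(k+1)$ decay of $\lambda_{1k}$, $\lambda_{2k}$, the Gaussian perturbation vector and the fixed generation count are assumptions, not the authors' settings).

        import numpy as np

        def maxmin_somo(f, corners, M=30, N=30, n_generations=100,
                        eta1=0.2, eta2=0.2, lam1=0.05, lam2=0.05, R1=3, rng=None):
            """Run MaxMin-SOMO; return approximate minimizer and maximizer of f."""
            rng = np.random.default_rng() if rng is None else rng
            W = initialize_weights(corners, M, N, rng=rng)          # Step 1
            for k in range(n_generations):                          # Step 4: repeat
                w1, w2 = find_winners(W, f)                         # Step 2
                p = rng.standard_normal(W.shape[-1])                # perturbation vector
                W = update_weights(W, w1, w2, eta1, eta2,           # Step 3
                                   lam1 / (k + 1), lam2 / (k + 1), R1, p)
            w1, w2 = find_winners(W, f)
            return W[w1], W[w2]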

  • In this section, the convergence analysis of SOMO with multiple winners is presented. For convergence, we define the maximum distances between any two neurons, i.e.

    $ d_{k}=\max\limits_{i, j, {i'}, {j'}}||\underline{W}_{i, j}(k)-\underline{W}_{ {i'}, {j'}}(k)|| $

    and

    $ \overline{d_{k}}=\max\limits_{\overline{i}, \overline{j}, \overline{ {i'}}, \overline{ {j'}}} ||\underline{W}_{\overline{i}, \overline{j}}(k)-\underline{W}_{\overline{ {i'}}, \overline{ {j'}}}(k)|| $

    at the $k$-th iteration, where $d_{k}$ and $\overline{d_{k}}$ are the maximum distances between any two neurons in the neighborhood of the first and the second winner, respectively. We prove that both $d_{k}$ and $\overline{d_{k}}$ decrease after each iteration and finally converge to zero, i.e., $d_{k}\rightarrow 0$ and $\overline{d_{k}}\rightarrow 0$ as $k\rightarrow\infty$. We also prove that the function values of the winners in the network decrease after every iteration. The following two theorems constitute the main results of the paper.

    Theorem 1. Let $d_{k}$ be the maximum distance between any two neurons in the neighborhood of the first winner $(i_{1}^{*}, j_{1}^{*})$ and $\overline{{d}_{k}}$ be the maximum distance between any two neurons in the neighborhood of the second winner $(i_{2}^{*}, j_{2}^{*})$ at the $k$-th iteration. Assume that $\frac{d(i_{1}^{*}, j_{1}^{*}, i', j')}{\sqrt{M^{2}+N^{2}}}<1$ and $\frac{d(i_{2}^{*}, j_{2}^{*}, \overline{i'}, \overline{j'})}{\sqrt{M^{2}+N^{2}}}<1$ for all neurons $i'$, $j'$, $\overline{i'}$, $\overline{j'}$ and all iterations, and that $\lambda_{1k}||\underline{p}_{k}||\leq\frac{1}{4}(1-C_{0})d_{k}$, $\lambda_{2k}||\underline{p}_{k}||\leq\frac{1}{4}(1-C_{0})\overline{d_{k}}$. Then $d_{k}\rightarrow 0$ and $\overline{d_{k}}\rightarrow 0$, as $k\rightarrow\infty$.

    Proof. Let $(i_{1}, j_{1})$ and $(i_{2}, j_{2})$ be two neurons in the neighborhood of the first winner $(i_{1}^{\ast}, j_{1}^{\ast})$. The weight updates for $(i_{1}, j_{1})$ and $(i_{2}, j_{2})$ are as follows:

    $ \begin{eqnarray*} \underline{W}_{i_{1}, j_{1}}(k+1)= \underline{W}_{i_{1}, j_{1}}(k)+\eta\beta (i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1})\Big[\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)-\nonumber\\ \underline{W}_{i_{1}, j_{1}}(k)\Big] +\lambda_{1k}(1-\beta( i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))\underline {p}_{k} \end{eqnarray*} $

    (14)

    $ \begin{eqnarray*} \underline{W}_{i_{2}, j_{2}}(k+1)= \underline{W}_{i_{2}, j_{2}}(k)+\eta\beta (i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})\Big[\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)-\nonumber\\ \underline{W}_{i_{2}, j_{2}}(k)\Big] +\lambda_{1k}(1-\beta( i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline {p}_{k}. \end{eqnarray*} $

    (15)

    Let $d_{k}$ be the maximum distance between any two neurons in the network at the $k$-th iteration and $d_{k+1}=\max\limits_{i_{1}, j_{1}, i_{2}, j_{2}}||\underline{W}_{i_{2}, j_{2}}(k+1)-\underline{W}_{i_{1}, j_{1}}(k+1)||$ at the $(k+1)$-th iteration. We have to show that $d_{k+1}<d_{k}$. Subtracting (14) from (15), we obtain

    $ \begin{align} &\underline{W}_{i_{2}, j_{2}}(k+1)-\underline{W}_{i_{1}, j_{1}}(k+1)= \nonumber\\ &\quad \underline{W}_{i_{2}, j_{2}}(k)+\eta\beta (i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})[\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)-\underline{W}_{i_{2}, j_{2}}(k)]- \nonumber\\ &\quad\underline{W}_{i_{1}, j_{1}}(k)-\eta\beta(i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1})[\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)-\underline{W}_{i_{1}, j_{1}}(k)] + \nonumber\\ &\quad \lambda_{1k}(1-\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline{p}_{k}-\lambda_{1k}(1-\beta(i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))\underline{p}_{k} = \nonumber\\ &\quad\underline{W}_{i_{2}, j_{2}}(k)+\eta\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})[\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)-\underline{W}_{i_{2}, j_{2}}(k)] - \nonumber\\ &\quad\underline{W}_{i_{1}, j_{1}}(k)-\eta\beta(i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1})[\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)-\underline{W}_{i_{1}, j_{1}}(k)] + \nonumber\\ &\quad\lambda_{k}((1-\beta( i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline{p}_{k}-(1-\beta(i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))\underline{p}_{k}). \label{W3} \end{align} $

    (16)

    We consider three cases to prove $d_{k+1}<d_{k}$.

    Case 1. If

    $ \begin{equation} \beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})=\beta (i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1})\end{equation} $

    (17)

    then (16) becomes

    $ \begin{align} \underline{W}_{i_{2}, j_{2}}& (k+1) -\underline{W}_{i_{1}, j_{1}}(k+1)=\underline{W}_{i_{2}, j_{2}}(k)- \nonumber\\ & \underline{W}_{i_{1}, j_{1}}(k)- \eta\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}) [\underline{W}_{i_{2}, j_{2}}(k)- \nonumber\\ &\underline{W}_{i_{1}, j_{1}}(k)] +\lambda_{1k}((1-\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline{p}_{k}- \nonumber\\ &(1-\beta(i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1})) \underline{p}_{k})= \nonumber\\ &(1-\eta\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline{W}_{i_{2}, j_{2}}(k)-\underline{W}_{i_{1}, j_{1}}(k) + \nonumber\\ &\lambda_{1k}((1-\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline{p}_{k} -\nonumber\\&(1-\beta( i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))\underline{p}_{k}). \end{align} $

    (18)

    Taking the norm on both sides of the above equation, we have

    $ \begin{align} ||&\underline{W}_{i_{2}, j_{2}} (k+1)-\underline{W}_{i_{1}, j_{1}}(k+1)||\leq \nonumber\\ &(1-\eta\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})) ||\underline{W}_{i_{2}, j_{2}}(k)-\underline{W}_{i_{1}, j_{1}}(k)|| +\nonumber\\ &||\lambda_{1k}((1-\beta( i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline {p}_{k}-(1-\beta( i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))\underline{p}_{k})|| \nonumber\\ & ||\underline{W}_{i_{2}, j_{2}}(k+1)-\underline{W}_{i_{1}, j_{1}}(k+1)||\leq \nonumber\\ &(1-\eta\beta (i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))d_{k} +||\lambda_{1k}((1-\beta( i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline {p}_{k}- \nonumber\\ &(1-\beta( i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))\underline{p}_{k})||. \end{align} $

    (19)

    By assumption

    $ \begin{eqnarray} \frac{d(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})}{\sqrt{M^{2}+N^{2}}}<\alpha<1 \end{eqnarray} $

    (20)

    we have

    $ \begin{eqnarray*} \beta (i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})=1-\frac {d(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})}{\sqrt{M^{2}+N^{2}}}>\overline{\alpha}>0 {\rm ~ for ~ all} ~ i_{2}, j_{2}. \end{eqnarray*} $

    Thus

    $ \begin{align} C_{0}=&~(1-\eta\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))<1-\eta\overline{\alpha}<1 \nonumber\\ \Rightarrow d_{k+1}&\leq C_{0}d_{k}+\lambda_{1k}||((1-\beta( i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline {p}_{k}- \nonumber\\ &(1-\beta( i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))\underline{p}_{k})||\leq \nonumber\\ & C_{0}d_{k}+||\lambda_{1k}(1-\beta( i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline {p}_{k}||+ \nonumber\\ &||\lambda_{1k}(1-\beta(i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))\underline{p}_{k}||.\label{2347} \end{align} $

    (21)

    Again by assumption

    $ \begin{eqnarray} \lambda_{1k}||p_{k}||\leq\frac{1}{4}(1-C_{0})d_{k} \end{eqnarray} $

    (22)

    we have

    $ \begin{align} d_{k+1} \leq&~ C_{0}d_{k}+\frac{1}{4}(1-C_{0})d_{k}+ \nonumber\\ &\frac{1}{4}(1-C_{0})d_{k}= \nonumber\\ &C_{0}d_{k}+\frac{2}{4}(1-C_{0})d_{k}= \nonumber\\ &C_{0}d_{k}+\frac{1}{2}(1-C_{0})d_{k}= \nonumber\\ &\frac{1}{2}(1+C_{0})d_{k} \end{align} $

    (23)

    $ \begin{align} \Rightarrow d_{k+1} \leq&~\frac{1}{2}(1+C_{0})d_{k}. \end{align} $

    (24)

    Case 2. If

    $ \beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})>\beta (i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}) $

    then add and subtract

    $ \eta\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})[\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)-\underline{W}_{i_{1}, j_{1}}(k)] $

    from (16), which follows:

    $ \begin{align*} \underline{W}_{i_{2}, j_{2}}&(k+1)-\underline{W}_{i_{1}, j_{1}}(k+1)= \nonumber\\ \underline{W}_{i_{2}, j_{2}}&(k)+\eta\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}) [\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)-\underline{W}_{i_{2}, j_{2}}(k)]- \nonumber\\ \underline{W}_{i_{1}, j_{1}}&(k)-\eta\beta(i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}) [\underline{W}_{i_{1}^{*}, j_{1}^{*}} (k)-\underline{W}_{i_{1}, j_{1}}(k)]+ \nonumber\\ &\eta\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})[\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)-\underline{W}_{i_{1}, j_{1}}(k)]- \nonumber\\ &\eta\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})[\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)-\underline{W}_{i_{1}, j_{1}}(k)]+ \nonumber\\ \lambda_{1k}((1&-\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline{p}_{k}-(1-\beta( i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))\underline{p}_{k})= \nonumber\\ &\underline{W}_{i_{2}, j_{2}}(k)-\underline{W}_{i_{1}, j_{1}}(k)-\\ &\eta\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})[\underline{W}_{i_{2}, j_{2}}(k)-\underline{W}_{i_{1}, j_{1}}(k)]+ \nonumber\\ \eta(\beta (i_{1}^{*}, &j_{1}^{*}, i_{2}, j_{2})-\beta(i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1})) [\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)-\underline{W}_{i_{1}, j_{1}}(k)]+ \nonumber\\ \lambda_{1k}((1&-\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline{p}_{k}-(1-\beta( i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))\underline{p}_{k}). \end{align*} $

    Taking the norm on both sides of the above equation, we have

    $ \begin{align} ||\underline{W}_{i_{2}, j_{2}}&(k+1)-\underline{W}_{i_{1}, j_{1}}(k+1)||\leq \nonumber\\ &(1-\eta\beta (i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})) ||\underline{W}_{i_{2}, j_{2}}(k)-\underline{W}_{i_{1}, j_{1}}(k)||+ \nonumber\\ &\eta(\beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}) -\beta(i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))||\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)- \nonumber\\ &\underline{W}_{i_{1}, j_{1}}(k)||+||\lambda_{1k}((1-\beta( i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline {p}_{k}-\nonumber\\ &(1-\beta( i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))\underline{p}_{k})||. \end{align} $

    (25)

    $ \begin{align} ||\underline{W}_{i_{2}, j_{2}}&(k+1)-\underline{W}_{i_{1}, j_{1}}(k+1)|| \leq\nonumber\\ (1-&\eta\beta (i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))d_{k} +||\lambda_{1k}((1-\beta( i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline {p}_{k}- \nonumber\\ &(1-\beta( i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))\underline{p}_{k})||. \end{align} $

    (26)

    By assumption

    $ \begin{eqnarray} \frac{d(i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1})}{\sqrt{M^{2}+N^{2}}}<\alpha<1 \end{eqnarray} $

    (27)

    we have

    $ \begin{eqnarray*} \beta (i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1})=1-\frac {d(i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1})}{\sqrt{M^{2}+N^{2}}}>\overline{\alpha}>0 \end{eqnarray*} $

    for all $i_{1}$, $j_{1}$.

    $ \begin{align} \Rightarrow d_{k+1}\leq&~ C_{0}d_{k}+\lambda_{1k}||((1-\beta( i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline {p}_{k}- \nonumber\\ &(1-\beta( i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))\underline{p}_{k})||\leq \nonumber\\ & C_{0}d_{k}+||\lambda_{1k}(1-\beta( i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2}))\underline {p}_{k}||+ \nonumber\\ &||\lambda_{1k}(1-\beta(i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}))\underline{p}_{k}||. \end{align} $

    (28)

    Again by assumption

    $ \begin{eqnarray} \lambda_{1k}||p_{k}||\leq\frac{1}{4}(1-C_{0})d_{k} \end{eqnarray} $

    (29)

    we have

    $ \begin{align} d_{k+1}\leq&~ C_{0}d_{k}+\frac{1}{4}(1-C_{0})d_{k}+\frac{1}{4}(1-C_{0})d_{k}= \nonumber\\ &C_{0}d_{k}+\frac{2}{4}(1-C_{0})d_{k}= \nonumber\\ &C_{0}d_{k}+\frac{1}{2}(1-C_{0})d_{k}= \nonumber\\ &\frac{1}{2}(1+C_{0})d_{k} \end{align} $

    (30)

    $ \begin{align} \Rightarrow d_{k+1}&\leq \frac{1}{2}(1+C_{0})d_{k}. \end{align} $

    (31)

    Case 3. If

    $ \beta(i_{1}^{*}, j_{1}^{*}, i_{2}, j_{2})<\beta (i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1}) $

    then add and subtract $\eta\beta(i_{1}^{*}, j_{1}^{*}, i_{1}, j_{1})[\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)-\underline{W}_{i_{2}, j_{2}}(k)]$ from (16) and the proof of Case 3 is similar to Case 2.

    $ \begin{eqnarray} \Rightarrow d_{k+1}\leq\frac{1}{2}(1+C_{0})d_{k} \end{eqnarray} $

    (32)

    Combining the above three cases, we obtain

    $ \begin{align} d_{k+1}\leq &~\frac{1}{2}(1+C_{0})d_{k}\leq \Big(\frac{1}{2}(1+C_{0})\Big)^{2}d_{k-1}\leq\cdots\leq \nonumber\\ &\Big(\frac{1}{2}(1+C_{0})\Big)^{k}d_{1}. \end{align} $

    (33)

    Hence $d_{k}\rightarrow0$, as $k\rightarrow\infty$.
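
    As a numerical illustration, if the contraction factor is $\frac{1}{2}(1+C_{0})=0.9$ (i.e., $C_{0}=0.8$), the bound gives $d_{k+1}\leq 0.9^{k}d_{1}$, so the maximum distance falls below $10^{-3}d_{1}$ after about $66$ iterations, since $0.9^{66}\approx 9.6\times10^{-4}$.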

    For the rest of the neurons $(i, j)$, let $(\overline{i_{1}}, \overline{j_{1}})$ and $(\overline{i_{2}}, \overline{j_{2}})$ be two neurons in the neighborhood of the second winner $(i_{2}^{\ast}, j_{2}^{\ast})$. The weight updates for $(\overline{i_{1}}, \overline{j_{1}})$ and $(\overline{i_{2}}, \overline{j_{2}})$ are as follows:

    $ \begin{eqnarray*} &\underline{W}_{\overline{i_{1}}, \overline{j_{1}}}(k+1) =\underline{W}_{\overline{i_{1}}, \overline{j_{1}}}(k)+\eta\beta (i_{2}^{*}, j_{2}^{*}, \overline{i_{1}}, \overline{j_{1}})[\underline{W}_{i_{2}^{*}, j_{2}^{*}}(k)-\nonumber\\ &~~~~~\underline{W}_{\overline{i_{1}}, \overline{j_{1}}}(k)] +\lambda_{2k}(1-\beta( i_{2}^{*}, j_{2}^{*}, \overline{i_{1}}, \overline{j_{1}}))\underline {p}_{k}. \end{eqnarray*} $

    (34)

    $ \begin{eqnarray*} \underline{W}_{\overline{i_{2}}, \overline{j_{2}}}(k+1)=\underline{W}_{\overline{i_{2}}, \overline{j_{2}}}(k)+\eta\beta (i_{2}^{*}, j_{2}^{*}, \overline{i_{2}}, \overline{j_{2}})[\underline{W}_{i_{2}^{*}, j_{2}^{*}}(k)-\nonumber\\ ~~~~~~ ~\underline{W}_{\overline{i_{2}}, \overline{j_{2}}}(k)] +\lambda_{2k}(1-\beta( i_{2}^{*}, j_{2}^{*}, \overline{i_{2}}, \overline{j_{2}}))\underline {p}_{k}. \end{eqnarray*} $

    (35)

    Let $\overline{d_{k}}$ be the maximum distance between any two neurons in the neighborhood of second winner at $k$-th iteration and

    $ \overline{d_{k+1}}=\max\limits_{\overline{i_{1}}, \overline{j_{1}}, \overline{i_{2}}, \overline{j_{2}}}||\underline{W}_{\overline{i_2}, \overline{j_2}}(k+1)-\underline{W}_{\overline{i_{1}}, \overline{j_{1}}}(k+1)|| $

    at the $(k+1)$-th iteration. The proof for $\overline{d_{k}}$ is similar to that for $d_{k}$.

    Hence $d_{k}$ and $\overline{d_{k}}$ both approach zero as $k\rightarrow\infty$.

    Theorem 2. Let $\underline{W}_{i, j}(k)$, $1\leq i \leq M$, $1\leq j\leq N$, be the $M\times N$ neurons in the network, and let $\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)$, $\underline{W}_{i_{2}^{*}, j_{2}^{*}}(k)$ be the two winners in the network at the $k$-th iteration. Assume that $f$ is convex, $\lambda_{1k}||\underline{p}_{k}||\leq\frac{1}{4}(1-C_{0})d_{k}$ and $\lambda_{2k}||\underline{p}_{k}||\leq\frac{1}{4}(1-C_{0})\overline{d_{k}}$. Then the function values satisfy $f(\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k+1))<f(\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k))$ and $f(\underline{W}_{i_{2}^{*}, j_{2}^{*}}(k+1))<f(\underline{W}_{i_{2}^{*}, j_{2}^{*}}(k))$, as $k\rightarrow\infty$.

    Proof. The weight updating of the neurons in the neighborhood of first winner $(i_{1}^{*}, j_{1}^{*})$ is as follows:

    $ \begin{align} \underline{W}_{i, j}(k+1)=&~\underline{W}_{i, j}(k)+\eta\beta( {i_{1}^{*}, j_{1}^{*}, i, j})[\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)- \nonumber\\ &~\underline{W}_{i, j}(k)]+\lambda_{1k}(1-\beta( {i_{1}^{*}, j_{1}^{*}, i, j}))\underline {p}_{k}. \label{eq5} \end{align} $

    (36)

    Then

    $ \begin{align} f(\underline{W}_{i, j}&(k+1))=f\big(\underline{W}_{i, j}(k)+\eta\beta( {i_{1}^{*}, j_{1}^{*}, i, j})[\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)-~~ \nonumber\\ &\underline{W}_{i, j}(k)] +\lambda_{1k}(1-\beta( {i_{1}^{*}, j_{1}^{*}, i, j}))\underline {p}_{k}\big)= \nonumber\\ f\big((1-&~\eta\beta( {i_{1}^{*}, j_{1}^{*}, i, j}))\underline{W}_{i, j}(k)+ \eta\beta({i_{1}^{*}, j_{1}^{*}, i, j})\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k) + \nonumber\\ &\lambda_{1k}(1-\beta( {i_{1}^{*}, j_{1}^{*}, i, j}))\underline {p}_{k}\big). \label{bgetw} \end{align} $

    (37)

    By assumption

    $ \begin{equation} \lambda_{1k}||\underline {p}_{k}||\leq\frac{1}{4}(1-C_{0})d_{k}. \end{equation} $

    (38)

    Thus the above equation (37) becomes

    $ \begin{eqnarray} f(\underline{W}_{i, j}(k+1))\leq f((1-\eta\beta( {i_{1}^{*}, j_{1}^{*}, i, j}))\underline{W}_{i, j}(k)+ \nonumber\\ \eta\beta({i_{1}^{*}, j_{1}^{*}, i, j})\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)+\frac{1}{4}(1-C_{0})d_{k}). \end{eqnarray} $

    (39)

    $ \begin{eqnarray} &f(\underline{W}_{i, j}(k+1)) =f((1-\eta\beta( {i_{1}^{*}, j_{1}^{*}, i, j}))\underline{W}_{i, j}(k)+ \nonumber\\ &\qquad\quad\eta\beta({i_{1}^{*}, j_{1}^{*}, i, j})\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)) \nonumber\\ &{\rm (since~by~Theorem~1, ~} d_{k}\rightarrow0). \end{eqnarray} $

    (40)

    $ \begin{align} f(\underline{W}_{i, j}&(k+1))< (1-\eta\beta( {i_{1}^{*}, j_{1}^{*}, i, j})) f(\underline{W}_{i, j}(k))+ \nonumber\\ &\eta\beta( {i_{1}^{*}, j_{1}^{*}, i, j}) f(\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)) \nonumber\\ &\quad{\rm (by~the~convexity~of~} f) \leq \nonumber\\ &(1-\eta\beta({i_{1}^{*}, j_{1}^{*}, i, j}))f (\underline{W}_{i, j}(k))+ \nonumber\\ &\eta\beta( {i_{1}^{*}, j_{1}^{*}, i, j})f(\underline{W}_{i, j}(k))= f(\underline{W}_{i, j}(k))\nonumber\\& \quad({\rm since}\quad f(\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k))\leq f(\underline{W}_{i, j}(k))). \end{align} $

    (41)

    Hence

    $ \begin{eqnarray} f(\underline{W}_{i, j}(k+1))<f(\underline{W}_{i, j}(k)). \end{eqnarray} $

    (42)

    In particular, taking $(i, j)=(i_{1}^{*}, j_{1}^{*})$ in the above inequality, we obtain

    $ \begin{eqnarray} f(\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k+1))<f(\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)). \end{eqnarray} $

    (43)

    For the rest of the neurons $(\overline{i}, \overline{j})$, the weight update is as follows:

    $ \begin{align} \underline{W}_{\overline{i}, \overline{j}}(k+1)=&~\underline{W}_{\overline{i}, \overline{j}}(k)+\eta\beta( {i_{2}^{*}, j_{2}^{*}, \overline{i}, \overline{j}})[\underline{W}_{i_{2}^{*}, j_{2}^{*}}(k)- \nonumber\\ &~\underline{W}_{\overline{i}, \overline{j}}(k)]+\lambda_{2k}(1-\beta( {i_{2}^{*}, j_{2}^{*}, i, j}))\underline {p}_{k} \nonumber\\ &~{\rm for} \quad 1\leq \overline{i} \leq M, 1\leq \overline{j}\leq N. \end{align} $

    (44)

    The proof is the same as above. Hence

    $ \begin{eqnarray} f(\underline{W}_{i_{2}^{*}, j_{2}^{*}}(k+1))<f(\underline{W}_{i_{2}^{*}, j_{2}^{*}}(k)). \end{eqnarray} $

    (45)

    Hence the function values of both winners decrease after every iteration.

    In summary, the neurons are initialized from the four corners, and after every iteration the neurons gather closer and closer around the first and second winners. The above theorems prove that the neurons $(i, j)$, $ 1\leq i \leq M$, $1\leq j\leq N$, converge to two different points, and that the function values of the winners decrease after every iteration.

    We remark that the MaxMin-SOMO algorithm still has the issue of defining different neighbourhoods for the maximum and minimum winners. We intend to investigate this problem in future work.

  • We have carried out simulations with the following functions to investigate the rate of convergence of SOMO with multiple winners.

    1) $U_{1}$ function:

    $ \begin{eqnarray} f(\underline{x}) =300\sin(2\pi x)\sin(2\pi y)- \sin(\pi x)\sin(\pi y) \end{eqnarray} $

    (46)

    2) Giunta function:

    $ \begin{align} f(\underline{x})=&\sum^{30}_{i=1}\left(\sin\Big(\frac{16}{15}x_{i}-1\Big)+\sin^{2}\Big(\frac{16}{15}x_{i}-1\Big)+\right.\nonumber\\ & \left.\frac{1}{50}\sin\Big[4\Big(\frac{16}{15}x_{i}-1\Big)\Big]\right)+0.3. \end{align} $

    (47)
  • Table 1 lists the global minimum, the dimension and the upper bound on the number of iterations for each test function. All simulations were conducted with a network of $30\times30$ neurons; the learning parameter is $\eta=0.2$, and $\lambda_{1k}$ and $\lambda_{2k}$ decrease after each iteration. Fig. 1 shows the graphs of the two test functions.

    Table 1.  Parameters for the test functions

    Figure 1.  Graphs of the two test functions

  • For each function, each algorithm was run 30 times. The best solution found in each run after a pre-specified number of generations was recorded. Table 2 compares the simulation results. The mean and standard deviation columns give the mean and the standard deviation of the best solutions over the 30 runs. In Table 2, the minimum and the maximum of a function are found by running PSO and SOMO twice each, whereas SOMO with multiple winners finds both the minimum and the maximum of a function simultaneously.

    Table 2.  Comparison of MaxMin-SOMO and the original SOMO algorithms for finding minimum and maximum

    Figs. 2 and 3 show the best performance curves for finding the minimum and the maximum of a function, respectively. In Fig. 2, we compare the minima of a function obtained by SOMO and by SOMO with multiple winners; in Fig. 3, we compare the maxima. With SOMO, the minimum and the maximum of a function are found separately by running the algorithm twice.

    Figure 2.  Best performance curves corresponding to SOMO algorithm and MaxMin-SOMO algorithm for finding a minimum of a function

    Figure 3.  Best performance curves corresponding to SOMO algorithm and MaxMin-SOMO algorithm for finding a maximum of a function

    By contrast, SOMO with multiple winners finds the minimum and the maximum of a function simultaneously.

  • The experiments show that the neurons $\underline{W}_{i, j}$, $ 1\leq i \leq M$, $1\leq j\leq N$, gather closer and closer around the first winner $\underline{W}_{i_{1}^{*}, j_{1}^{*}}(k)$ and the second winner $\underline{W}_{i_{2}^{*}, j_{2}^{*}}(k)$ during the iteration process. From Figs. 4 and 5, we observe that the distances between the winners $\big({\rm i.e.}, (i_{1}^{*}, j_{1}^{*})$ and $(i_{2}^{*}, j_{2}^{*})\big)$ and the other neurons decrease exponentially. It can be concluded that the distance between the winners and the other neurons $\underline{W}_{i, j}$, $1\leq i \leq M$, $1\leq j\leq N$, approaches zero as $k\rightarrow\infty$.

    Figure 4.  Value of winners for $U_1$ function

    Figure 5.  Value of winners for the Giunta function
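
    The decay of the distances plotted in Figs. 4 and 5 can be reproduced by logging, at every generation, the largest distance from each winner to any other neuron; a sketch of such a logging helper (a hypothetical name) is given below.

        import numpy as np

        def winner_distances(W, w1, w2):
            """Largest weight-space distance from each winner to any neuron at the current iteration."""
            M, N, _ = W.shape
            d1 = max(np.linalg.norm(W[i, j] - W[w1]) for i in range(M) for j in range(N))
            d2 = max(np.linalg.norm(W[i, j] - W[w2]) for i in range(M) for j in range(N))
            return d1, d2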

  • Convergence results are established for SOMO with multiple winners under a specific distance measure. We defined the maximum distance between neurons and proved that this distance approaches zero, i.e., $d_{k}\rightarrow0$ and $\overline{d_{k}}\rightarrow0$ as $k\rightarrow\infty$. We also showed that all neurons converge toward a winner as the iterations proceed. Two numerical examples are provided to support the theoretical findings and demonstrate that the distance between neurons approaches zero after each iteration.

  • This work was supported by National Natural Science Foundation of China (Nos. 11171367 and 61502068), the Fundamental Research Funds for the Central Universities of China (No. 3132014094), the China Postdoctoral Science Foundation (Nos. 2013M541213 and 2015T80239) and Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP), Brazil (No. 2012/23329-5).

    The authors wish to thank the associate editor and the anonymous reviewers for their helpful and interesting comments. We are very grateful to Professor Wu Wei for extremely helpful discussions and comments that improved the quality of the paper. Thanks to Professor Ricardo Zorzetto Nicoliello Vêncio for his help in improving the text of the paper.
