Determining the Linear Separability of Data Based on Spatial Distribution: Foreign Literature Translation Material
Computational Statistics and Data Analysis 56 (2012) 4290–4300
Separable linear discriminant analysis
Jianhua Zhao a, Philip L.H. Yu b, Lei Shi a,∗, Shulan Li c
a School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming, 650221, China
b Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong
c School of Accountancy, Yunnan University of Finance and Economics, Kunming, 650221, China
Article info
Article history:
Received 20 April 2011
Received in revised form 28 March 2012
Accepted 6 April 2012
Available online 13 April 2012
Keywords:
Linear discriminant analysis
Separable
Two-dimensional data
Face recognition
Abstract
Linear discriminant analysis (LDA) is a popular technique for supervised dimension reduction. Because LDA usually suffers from the curse of dimensionality when applied to 2D data, several two-dimensional LDA (2DLDA) methods have been proposed in recent years. Among these, the Y2DLDA method introduced by Ye et al. (2005) is an important development. The idea is to utilize the underlying 2D data structure to seek an optimal bilinear transformation. However, it is found that the proposed algorithm does not guarantee convergence. In this paper, we show that using a bilinear transformation for 2D data is equivalent to modeling the covariance matrix of the 2D data as a separable covariance matrix. Based on this result, we propose a novel 2DLDA method called separable LDA (SLDA). The main contributions of SLDA are that (1) it provides interesting theoretical relationships between LDA and some 2DLDA methods; (2) it provides a building block for mixture extension; and (3) unlike Y2DLDA, a neat analytical solution can be obtained, as in LDA. Empirical results show that the proposed SLDA achieves better recognition performance than Y2DLDA while being computationally much more efficient.
© 2012 Elsevier B.V. All rights reserved.
1. Introduction
Fisher linear discriminant analysis (LDA) is a popular supervised subspace learning technique that has been widely used in computer vision, pattern recognition and machine learning. It looks for a linear transformation such that, in the transformed subspace, the between-class covariance relative to the within-class covariance is maximized. Since LDA is formulated only for 1D data (in which observations are in vector form), when applying LDA to 2D data such as images (in which observations are in matrix form), one has to convert the 2D matrix data into 1D vectors. Unfortunately, the resulting 1D data easily suffer from the so-called curse of dimensionality, which can deteriorate the performance of LDA.
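To make the vectorization step concrete, here is a minimal sketch in Python with NumPy (the 32 × 32 image size and the code itself are our illustrative assumptions, not the paper's):

```python
import numpy as np

# A hypothetical 32x32 grayscale image: one 2D (matrix-form) observation.
rng = np.random.default_rng(0)
image = rng.random((32, 32))

# Vectorization: the matrix is flattened into a 1024-dimensional vector
# before classical LDA can be applied.
x = image.reshape(-1)
print(image.shape, "->", x.shape)   # (32, 32) -> (1024,)

# With only a few hundred training images, the 1024x1024 within-class
# covariance estimated from such vectors is singular or near-singular --
# the curse of dimensionality referred to above.
```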
In recent years, rather than resorting to vectorization, another group of researchers has suggested performing LDA on 2D data directly. For instance, one-sided two-dimensional LDA (2DLDA) (Liu et al., 1993; Li and Yuan, 2005; Xiong et al., 2005) maximizes a generalized Fisher discriminant criterion that restricts the linear transformation to be a row or column linear transformation. Since this method extracts the discriminant transformation on only one side (row or column) of the 2D data matrix, it typically requires extracting many more features than LDA for recognition and representation. To overcome this disadvantage, a two-sided 2DLDA, which extracts the discriminant transformations on both sides of the 2D data matrix, is suggested in Yang et al. (2005), and a two-stage solution is proposed to find the column and row transformations sequentially. However, this method is an order-dependent algorithm (Inoue and Urahama, 2006). To find the column and row transformations simultaneously, Ye et al. (2005) proposed a formulation that restricts the linear transformation to be a bilinear one, i.e., a Kronecker product of the column and row linear transformations. This formulation is denoted as Y2DLDA. Unfortunately, the convergence of Y2DLDA is not guaranteed, as detailed in Luo et al. (2009) and Inoue and Urahama (2006). To overcome this problem, a new objective function is defined in Luo et al. (2009), but maximizing that function has to resort to numerical methods and the computation is much more complicated than that of Y2DLDA. A separate solution is suggested in Noushath et al. (2006) and Inoue and Urahama (2006), which performs LDA in the column and row directions separately; this formulation is referred to as bidirectional 2DLDA in this paper.

∗ Corresponding author.
E-mail addresses: jhzhao.ynu@gmail.com (J. Zhao), plhyu@hku.hk (P.L.H. Yu), shi_lei65@hotmail.com (L. Shi).
0167-9473/$ – see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.csda.2012.04.003
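Returning to the bilinear formulation: the restriction can be checked numerically. The sketch below (a minimal illustration in Python with NumPy; all dimensions are our arbitrary choices, not taken from the paper) verifies the standard identity vec(L′XR) = (R ⊗ L)′ vec(X), i.e., applying column and row transformations to a 2D datum is the same as applying the Kronecker product of the two transformations to its vectorization:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes, not from the paper: a 5x4 2D datum X,
# a column transformation L (5 -> 2) and a row transformation R (4 -> 3).
X = rng.random((5, 4))
L = rng.random((5, 2))
R = rng.random((4, 3))

def vec(M):
    # Column-stacking vectorization, the convention under which
    # vec(A X B) = (B' kron A) vec(X) holds.
    return M.flatten(order="F")

left = vec(L.T @ X @ R)             # bilinear transform, then vectorize
right = np.kron(R, L).T @ vec(X)    # Kronecker transform of the vectorized datum

print(np.allclose(left, right))     # True
```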
Compared with LDA, an appealing advantage of these 2DLDA methods is that the curse of dimensionality is overcome and the computational cost can be substantially alleviated. Importantly, their empirical results show that these 2DLDA methods can achieve recognition performance competitive with or better than LDA, especially in small-sample-size cases. This advantage should be attributed to the utilization of the underlying 2D data structure.
In this paper, we propose to utilize the underlying 2D data structure not only to restrict the linear transformation but also to model the covariance matrix of the 2D data as a separable covariance matrix (a numerical sketch of this structure is given after the list below). Based on this, we propose a new 2DLDA method called separable LDA (SLDA), as it is LDA under the separable covariance assumption. The main novelties of SLDA include the following:
1. SLDA establishes theoretical relationships between LDA and both one-sided and two-sided 2DLDA. These relationships are of interest in their own right, and several efforts have been made on this issue. For example, Wang et al. (2006) claimed that one-sided 2DLDA is a block-based LDA, and a later work (2008) showed that one-sided 2DLDA is LDA under certain assumptions. These two views only provide partial relationships between LDA and one-sided 2DLDA; to the best of our knowledge, the relationship between LDA and two-sided 2DLDA has remained unknown so far.
2. SLDA can be implemented with different estimation methods. In this paper, we develop SLDA via both maximum likelihood estimation (MLE) and moment estimation (ME), since each method has its own distinctive advantages. SLDA via MLE is new, paralleling the connection between LDA and MLE, and it provides a building block for mixture extensions to handle multimodal or nonlinear data (Viroli, 2011). SLDA via ME is simpler to implement and, more importantly, is shown to be equivalent to bidirectional 2DLDA, so that we provide a theoretical justification for why bidirectional 2DLDA works.
3. Unlike existing iterative solutions such as Y2DLDA (Ye et al., 2005) and that of Luo et al. (2009), SLDA admits a neat analytical solution, just as LDA does. This underlies the computational efficiency of SLDA.
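As promised above, here is a numerical illustration of a separable covariance matrix (a minimal sketch in Python with NumPy; the factor matrices, dimensions and Monte Carlo check are our illustrative assumptions, not the paper's estimators). For matrix-variate data X = AZB′ with Z having i.i.d. standard normal entries, cov(vec(X)) equals the Kronecker product (BB′) ⊗ (AA′):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 3x3 row factor A and 2x2 column factor B for 3x2 matrix data.
A = rng.random((3, 3))
B = rng.random((2, 2))

# Simulate X = A Z B' with Z having i.i.d. N(0, 1) entries, so that
# vec(X) = (B kron A) vec(Z) and hence cov(vec(X)) = (B B') kron (A A').
n = 100_000
Z = rng.standard_normal((n, 3, 2))
X = A @ Z @ B.T                            # batched bilinear transform
V = X.transpose(0, 2, 1).reshape(n, -1)    # column-stacking vec of each sample

emp_cov = np.cov(V, rowvar=False)          # Monte Carlo covariance of vec(X)
sep_cov = np.kron(B @ B.T, A @ A.T)        # separable (Kronecker) covariance

print(np.abs(emp_cov - sep_cov).max())     # small; shrinks as n grows
```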
The rest of this paper is organized as follows. Section 2 briefly reviews LDA and two existing 2DLDA methods: Y2DLDA and bidirectional 2DLDA. Section 3 presents our SLDA, which is implemented via the MLE and ME methods detailed in Section 4. Section 5 gives some theoretical results on the relationships between LDA and some 2DLDA methods. Section 6 conducts an empirical study to compare SLDA via MLE and SLDA via ME with Y2DLDA and several related competitors. We draw some conclusions and discussion in Section 7.
2. LDA and 2DLDA

2.1. LDA
Let $x \in \mathbb{R}^d$ be a random vector, and let $\pi_k$ and $\mu_k$ be the prior probability and the mean of class $L_k$, $k = 1, \ldots, K$. Then the global population mean is $\mu = \sum_{k=1}^{K} \pi_k \mu_k$, and the between-class and within-class covariance matrices are
$$\Sigma_b = \sum_{k=1}^{K} \pi_k (\mu_k - \mu)(\mu_k - \mu)', \qquad \Sigma_w = \sum_{k=1}^{K} \pi_k \, E\left[(x - \mu_k)(x - \mu_k)' \mid x \in L_k\right].$$
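Replacing the population quantities above by sample analogues gives the usual plug-in estimates. A minimal sketch in Python with NumPy (the function name and the empirical priors $\hat{\pi}_k = n_k/n$ are our illustrative choices):

```python
import numpy as np

def scatter_matrices(X, y):
    """Plug-in between-class (Sb) and within-class (Sw) covariance
    matrices for data X (n x d) with class labels y, using the
    empirical priors pi_k = n_k / n."""
    n, d = X.shape
    mu = X.mean(axis=0)                     # global mean
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for k in np.unique(y):
        Xk = X[y == k]
        pi_k = len(Xk) / n                  # empirical prior of class k
        mu_k = Xk.mean(axis=0)              # class mean
        diff = (mu_k - mu)[:, None]
        Sb += pi_k * (diff @ diff.T)
        Sw += pi_k * np.cov(Xk, rowvar=False, bias=True)
    return Sb, Sw
```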
Consider the linear transformation $y = V'x$, where $V$ is a $d \times q$ ($q < d$) matrix. LDA seeks the $V$ that solves
$$\arg\max_V \; \mathrm{tr}\left[(V' \Sigma_w V)^{-1} (V' \Sigma_b V)\right]. \tag{1}$$
The closed-form solution of (1) is given by the leading $q$ eigenvectors of $\Sigma_w^{-1} \Sigma_b$; more details can be found in Fukunaga (1990). The remaining problem is the estimation of $\Sigma_w$ and $\Sigma_b$, which can be obtained via MLE or ME, as described below.
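A minimal sketch of this closed-form solution in Python with NumPy/SciPy (the function name and the small ridge term that keeps an estimated $\Sigma_w$ invertible are our illustrative choices, not part of the paper's method); it solves the equivalent symmetric generalized eigenproblem $\Sigma_b v = \lambda \Sigma_w v$:

```python
import numpy as np
from scipy.linalg import eigh

def lda_projection(Sb, Sw, q, ridge=1e-8):
    """Leading q eigenvectors of Sw^{-1} Sb, obtained from the symmetric
    generalized eigenproblem Sb v = lambda Sw v; the small ridge keeps
    an estimated Sw positive definite."""
    Sw_reg = Sw + ridge * np.eye(Sw.shape[0])
    vals, vecs = eigh(Sb, Sw_reg)      # eigenvalues in ascending order
    return vecs[:, ::-1][:, :q]        # d x q projection matrix V

# Usage with the scatter matrices from the previous sketch:
#   V = lda_projection(Sb, Sw, q=2)
#   Y = X @ V                          # q-dimensional discriminant features
```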
MLE: Assume that all classes follow normal distributions with different means but a common within-class covariance.
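Under this standard assumption, the MLEs are the class sample means and the pooled (bias-uncorrected) within-class covariance. A minimal sketch in Python with NumPy (the helper name is ours):

```python
import numpy as np

def lda_mle(X, y):
    """MLEs under the common-covariance normal model: class means mu_k
    and the pooled within-class covariance (divided by the total n)."""
    n, d = X.shape
    means = {}
    Sw = np.zeros((d, d))
    for k in np.unique(y):
        Xk = X[y == k]
        means[k] = Xk.mean(axis=0)          # MLE of mu_k
        centered = Xk - means[k]
        Sw += centered.T @ centered         # accumulate class scatter
    return means, Sw / n                    # MLE of the common covariance
```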
[Fig. 1. Error rate versus number of iterations using 2DLDA with $q_c = q_r = 10$ on ORL data.]
In Section 4, we extend these two estimation methods to SLDA, where the results of the two estimation methods are no longer identical.
2.2. Review of Y2DLDA

In this section, we briefly review Y2DLDA (Ye et al., 2005).