Subspace Semi-supervised Fisher Discriminant Analysis
In many machine learning and data mining tasks, such as image retrieval and face recognition, we are increasingly confronted with collections of high-dimensional data. This leads us to consider methods of dimensionality reduction that allow us to represent the data in a lower-dimensional space. Techniques for dimensionality reduction have attracted much attention in computer vision and pattern recognition. The most popular dimensionality reduction algorithms include principal component analysis (PCA)[1−2] and Fisher discriminant analysis (FDA)[3].

PCA is an unsupervised method. It projects the original m-dimensional data into a d (d < m)-dimensional subspace in which the data variance is maximized. It computes the eigenvectors of the data covariance matrix and approximates the original data by a linear combination of the leading eigenvectors. If the data are embedded in a linear subspace, PCA is guaranteed to discover the dimensionality of the subspace and to produce a compact representation. Unlike PCA, FDA is a supervised method. In the context of pattern classification, FDA seeks the projection subspace in which the ratio of the between-class scatter to the within-class scatter is maximized. For classification tasks, FDA can achieve significantly better performance than PCA.

Labeled data, however, are often time-consuming and expensive to obtain, as they require the efforts of human annotators[4]. In contrast, in many cases it is far easier to obtain large numbers of unlabeled data. The problem of effectively combining unlabeled data with labeled data is therefore of central importance in machine learning[4]. Learning from labeled and unlabeled data has attracted an increasing amount of attention recently, and several novel approaches have been proposed. Graph-based semi-supervised learning algorithms[4−13], in particular, have attracted considerable attention in recent years.
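To make the contrast concrete, the PCA step described above can be sketched in a few lines of NumPy: center the data, eigendecompose the covariance matrix, and keep the d leading eigenvectors. This is a minimal illustrative sketch, not the paper's implementation; the toy data and the function name `pca_project` are our own.

```python
import numpy as np

def pca_project(X, d):
    """Project the m-dimensional rows of X onto the d leading
    eigenvectors of the data covariance matrix (minimal PCA sketch)."""
    Xc = X - X.mean(axis=0)                 # center the data
    cov = Xc.T @ Xc / (len(X) - 1)          # m x m covariance matrix
    vals, vecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    W = vecs[:, ::-1][:, :d]                # d leading eigenvectors
    return Xc @ W                           # n x d low-dimensional representation

# Toy data lying in a 2-D linear subspace of R^3: PCA recovers it exactly.
rng = np.random.default_rng(0)
Z = rng.normal(size=(100, 2))
X = Z @ rng.normal(size=(2, 3))
Y = pca_project(X, 2)
```

FDA differs only in the matrix being eigendecomposed: instead of the total covariance, it maximizes the generalized ratio of between-class to within-class scatter, which requires the class labels.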
These algorithms take the graph over all the data as a priori knowledge to guide the decision making. The techniques of Belkin[5] and Cai[8] are based on regularization; the regularization-based technique of Cai[8] is closest in spirit to the intuitions of this paper.

In this paper, we aim at dimensionality reduction in the semi-supervised case. To cope with the problem of effectively combining unlabeled data with labeled data, we propose a novel semi-supervised dimensionality reduction algorithm called subspace semi-supervised Fisher discriminant analysis (SSFDA). SSFDA exploits the geometric structure of the labeled and unlabeled data and incorporates it as an additional regularization term. SSFDA intends to find an embedding transformation that respects the discriminant structure inferred from the labeled data and the intrinsic geometrical structure inferred from both the labeled and unlabeled data.

Semi-supervised discriminant analysis (SDA)[8] is the algorithm most relevant to ours. In the following, we list the similarities and the major difference between SDA and our algorithm:

1) Both SDA and our algorithm are graph-based approaches. Both use a p-nearest-neighbor graph to model the relationship between nearby data points and incorporate the geometric structure of the labeled and unlabeled data as an additional regularization term.

2) There is one major difference between SDA and our algorithm. In SDA, the weight matrix of the p-nearest-neighbor graph is constructed from the relationship between nearby points in the original data space, without considering the labels of the labeled data. In our algorithm, we first find a projection subspace by applying FDA to the labeled data and embed the labeled and unlabeled data into this subspace. The weight matrix of the p-nearest-neighbor graph is then constructed from the relationship between nearby data points in this subspace, as well as from the labels of the labeled data.

The rest of this paper is organized as follows. In Section 1, we provide a brief review of FDA. The proposed SSFDA algorithm for dimensionality reduction is introduced in Section 2. The experimental results are presented in Section 3. Finally, we conclude the paper in Section 4.
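The p-nearest-neighbor graph that both SDA and SSFDA rely on can be sketched as follows. This is a hedged sketch under simplifying assumptions: a binary (0/1) weight matrix and plain Euclidean distances; the paper's variants differ in the space the distances are computed in (original space for SDA, the FDA subspace plus label information for SSFDA), and may use other edge weights.

```python
import numpy as np

def pnn_weight_matrix(X, p):
    """Symmetric 0/1 weight matrix of a p-nearest-neighbor graph over
    the rows of X (illustrative sketch; edge weights assumed binary)."""
    n = len(X)
    # pairwise squared Euclidean distances
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)            # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :p]     # p nearest neighbors of each point
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), p), idx.ravel()] = 1.0
    return np.maximum(W, W.T)               # symmetrize: edge if either is a neighbor

# four points on a line; with p = 1 each point links to its closest neighbor
W = pnn_weight_matrix(np.array([[0.0], [1.0], [2.0], [10.0]]), p=1)
```

In SSFDA the same construction would be applied to the FDA-projected data rather than to the raw vectors, with edges between labeled points additionally constrained by their class labels.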
3 Experimental results
In this section, the performance of the proposed SSFDA for face recognition is investigated. A face recognition task is handled as a multi-class classification problem. Each test image is mapped to a low-dimensional subspace via the embedding learned from the training data, and then it is classified by the nearest-neighbor criterion.
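The evaluation protocol just described (embed, then 1-nearest-neighbor) can be sketched as below. The embedding matrix `embed` stands for whatever transformation was learned (SSFDA or a baseline); the helper name and toy data are our own, not from the paper.

```python
import numpy as np

def classify_1nn(embed, X_train, y_train, X_test):
    """Project train/test images through a learned embedding matrix and
    label each test image by its nearest training neighbor."""
    Z_train = X_train @ embed               # training images in the subspace
    Z_test = X_test @ embed                 # test images in the same subspace
    d2 = ((Z_test[:, None, :] - Z_train[None, :, :]) ** 2).sum(axis=-1)
    return y_train[np.argmin(d2, axis=1)]   # label of the closest gallery image

# toy 2-D example with the identity "embedding"
X_train = np.array([[0.0, 0.0], [5.0, 5.0]])
y_train = np.array([0, 1])
preds = classify_1nn(np.eye(2), X_train, y_train,
                     np.array([[1.0, 0.0], [4.0, 5.0]]))
```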
3.1 Datasets and compared algorithms
In our experiments, we use the CMU PIE (Pose, Illumination, and Expression) database[16] to evaluate our proposed SSFDA algorithm for face recognition. The CMU PIE face database contains more than 40 000 facial images of 68 people, captured under varying pose, illumination, and expression. From the subset that contains five near-frontal poses (C05, C07, C09, C27, and C29), we randomly select 20 persons and 80 images per person for the experiments.

In all the experiments, preprocessing was applied to locate the faces. The original images were normalized (in scale and orientation) such that the two eyes were aligned at the same position. Then, the facial areas were cropped into the final images for matching. The size of each cropped image is 32 × 32 pixels, with 256 gray levels per pixel; thus, each image is represented by a 1 024-dimensional vector in image space.

The image set is then partitioned into gallery and probe sets of different sizes. For ease of representation, Gm/Ll/Pn means that m images per person are randomly selected for training and the remaining n images are used for testing. Among these m images, l images are randomly selected and labeled, which leaves the other (m − l) images unlabeled. We compare the performance of SSFDA with Fisherface[17] (PCA followed by FDA) and Laplacianface[9].
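The Gm/Ll/Pn partition can be sketched as a simple per-person random split. This is an assumed helper written for illustration, not the authors' experiment code; the function name and seed are ours.

```python
import numpy as np

def gallery_probe_split(image_indices, m, l, rng):
    """Gm/Ll/Pn split for one person: m training images, of which l are
    labeled and (m - l) unlabeled; the rest form the probe (test) set."""
    perm = rng.permutation(image_indices)
    train, probe = perm[:m], perm[m:]       # m for training, n for testing
    labeled, unlabeled = train[:l], train[l:]
    return labeled, unlabeled, probe

# e.g. 80 images per person, G30/L5/P50
rng = np.random.default_rng(0)
labeled, unlabeled, probe = gallery_probe_split(np.arange(80), m=30, l=5, rng=rng)
```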