You are here

Practical Implementations of the Active Set Method for Support Vector Machine Training with Semi-definite Kernels

Download pdf | Full Screen View

Date Issued:
2014
Abstract/Description:
The Support Vector Machine (SVM) is a popular binary classification model due to its superior generalization performance, relative ease-of-use, and applicability of kernel methods. SVM training entails solving an associated quadratic programming (QP) that presents significant challenges in terms of speed and memory constraints for very large datasets; therefore, research on numerical optimization techniques tailored to SVM training is vast. Slow training times are especially of concern when one considers that re-training is often necessary at several values of the model's regularization parameter, C, as well as associated kernel parameters.The active set method is suitable for solving SVM problem and is in general ideal when the Hessian is dense and the solution is sparse-the case for the l1-loss SVM formulation. There has recently been renewed interest in the active set method as a technique for exploring the entire SVM regularization path, which has been shown to solve the SVM solution at all points along the regularization path (all values of C) in not much more time than it takes, on average, to perform training at a single value of C with traditional methods. Unfortunately, the majority of active set implementations used for SVM training require positive definite kernels, and those implementations that do allow semi-definite kernels tend to be complex and can exhibit instability and, worse, lack of convergence. This severely limits applicability since it precludes the use of the linear kernel, can be an issue when duplicate data points exist, and doesn't allow use of low-rank kernel approximations to improve tractability for large datasets. The difficulty, in the case of a semi-definite kernel, arises when a particular active set results in a singular KKT matrix (or the equality-constrained problem formed using the active set is semi-definite). Typically this is handled by explicitly detecting the rank of the KKT matrix. Unfortunately, this adds significant complexity to the implementation; and, if care is not taken, numerical instability, or worse, failure to converge can result. This research shows that the singular KKT system can be avoided altogether with simple modifications to the active set method. The result is a practical, easy to implement active set method that does not need to explicitly detect the rank of the KKT matrix nor modify factorization or solution methods based upon the rank. Methods are given for both conventional SVM training as well as for computing the regularization path that are simple and numerically stable. First, an efficient revised simplex method is efficiently implemented for SVM training (SVM-RSQP) with semi-definite kernels and shown to out-perform competing active set implementations for SVM training in terms of training time as well as shown to perform on-par with state-of-the-art SVM training algorithms such as SMO and SVMLight. Next, a new regularization path-following algorithm for semi-definite kernels (Simple SVMPath) is shown to be orders of magnitude faster, more accurate, and significantly less complex than competing methods and does not require the use of external solvers. Theoretical analysis reveals new insights into the nature of the path-following algorithms. Finally, a method is given for computing the approximate regularization path and approximate kernel path using the warm-start capability of the proposed revised simplex method (SVM-RSQP) and shown to provide significant, orders of magnitude, speed-ups relative to the traditional (")grid search(") where re-training is performed at each parameter value. Surprisingly, it also shown that even when the solution for the entire path is not desired, computing the approximate path can be seen as a speed-up mechanism for obtaining the solution at a single value. New insights are given concerning the limiting behaviors of the regularization and kernel path as well as the use of low-rank kernel approximations.
Title: Practical Implementations of the Active Set Method for Support Vector Machine Training with Semi-definite Kernels.
33 views
19 downloads
Name(s): Sentelle, Christopher, Author
Georgiopoulos, Michael, Committee Chair
Anagnostopoulos, Georgios, Committee CoChair
Kasparis, Takis, Committee Member
Stanley, Kenneth, Committee Member
Young, Cynthia, Committee Member
University of Central Florida, Degree Grantor
Type of Resource: text
Date Issued: 2014
Publisher: University of Central Florida
Language(s): English
Abstract/Description: The Support Vector Machine (SVM) is a popular binary classification model due to its superior generalization performance, relative ease-of-use, and applicability of kernel methods. SVM training entails solving an associated quadratic programming (QP) that presents significant challenges in terms of speed and memory constraints for very large datasets; therefore, research on numerical optimization techniques tailored to SVM training is vast. Slow training times are especially of concern when one considers that re-training is often necessary at several values of the model's regularization parameter, C, as well as associated kernel parameters.The active set method is suitable for solving SVM problem and is in general ideal when the Hessian is dense and the solution is sparse-the case for the l1-loss SVM formulation. There has recently been renewed interest in the active set method as a technique for exploring the entire SVM regularization path, which has been shown to solve the SVM solution at all points along the regularization path (all values of C) in not much more time than it takes, on average, to perform training at a single value of C with traditional methods. Unfortunately, the majority of active set implementations used for SVM training require positive definite kernels, and those implementations that do allow semi-definite kernels tend to be complex and can exhibit instability and, worse, lack of convergence. This severely limits applicability since it precludes the use of the linear kernel, can be an issue when duplicate data points exist, and doesn't allow use of low-rank kernel approximations to improve tractability for large datasets. The difficulty, in the case of a semi-definite kernel, arises when a particular active set results in a singular KKT matrix (or the equality-constrained problem formed using the active set is semi-definite). Typically this is handled by explicitly detecting the rank of the KKT matrix. Unfortunately, this adds significant complexity to the implementation; and, if care is not taken, numerical instability, or worse, failure to converge can result. This research shows that the singular KKT system can be avoided altogether with simple modifications to the active set method. The result is a practical, easy to implement active set method that does not need to explicitly detect the rank of the KKT matrix nor modify factorization or solution methods based upon the rank. Methods are given for both conventional SVM training as well as for computing the regularization path that are simple and numerically stable. First, an efficient revised simplex method is efficiently implemented for SVM training (SVM-RSQP) with semi-definite kernels and shown to out-perform competing active set implementations for SVM training in terms of training time as well as shown to perform on-par with state-of-the-art SVM training algorithms such as SMO and SVMLight. Next, a new regularization path-following algorithm for semi-definite kernels (Simple SVMPath) is shown to be orders of magnitude faster, more accurate, and significantly less complex than competing methods and does not require the use of external solvers. Theoretical analysis reveals new insights into the nature of the path-following algorithms. Finally, a method is given for computing the approximate regularization path and approximate kernel path using the warm-start capability of the proposed revised simplex method (SVM-RSQP) and shown to provide significant, orders of magnitude, speed-ups relative to the traditional (")grid search(") where re-training is performed at each parameter value. Surprisingly, it also shown that even when the solution for the entire path is not desired, computing the approximate path can be seen as a speed-up mechanism for obtaining the solution at a single value. New insights are given concerning the limiting behaviors of the regularization and kernel path as well as the use of low-rank kernel approximations.
Identifier: CFE0005251 (IID), ucf:50600 (fedora)
Note(s): 2014-05-01
Ph.D.
Engineering and Computer Science, Electrical Engineering and Computer Science
Doctoral
This record was generated from author submitted information.
Subject(s): Support Vector Machine -- Active Set Method -- Regularization Path Following Method -- Approximate Regularization Path -- Approximate Kernel Path -- Revised Simplex
Persistent Link to This Record: http://purl.flvc.org/ucf/fd/CFE0005251
Restrictions on Access: public 2014-05-15
Host Institution: UCF

In Collections