Practical Implementations of the Active Set Method for Support Vector Machine Training with Semidefinite Kernels
 Date Issued:
 2014
 Abstract/Description:
 The Support Vector Machine (SVM) is a popular binary classification model due to its superior generalization performance, relative ease of use, and applicability of kernel methods. SVM training entails solving an associated quadratic programming (QP) problem that presents significant challenges in terms of speed and memory constraints for very large datasets; therefore, research on numerical optimization techniques tailored to SVM training is vast. Slow training times are especially of concern when one considers that retraining is often necessary at several values of the model's regularization parameter, C, as well as associated kernel parameters. The active set method is suitable for solving the SVM problem and is in general ideal when the Hessian is dense and the solution is sparse, as is the case for the l1-loss SVM formulation. There has recently been renewed interest in the active set method as a technique for exploring the entire SVM regularization path; it has been shown to obtain the SVM solution at all points along the regularization path (all values of C) in not much more time than it takes, on average, to perform training at a single value of C with traditional methods. Unfortunately, the majority of active set implementations used for SVM training require positive definite kernels, and those implementations that do allow semidefinite kernels tend to be complex and can exhibit instability and, worse, lack of convergence. This severely limits applicability, since it precludes the use of the linear kernel, can be an issue when duplicate data points exist, and does not allow use of low-rank kernel approximations to improve tractability for large datasets. The difficulty, in the case of a semidefinite kernel, arises when a particular active set results in a singular KKT matrix (or the equality-constrained problem formed using the active set is semidefinite). Typically this is handled by explicitly detecting the rank of the KKT matrix.
Unfortunately, this adds significant complexity to the implementation, and, if care is not taken, numerical instability or, worse, failure to converge can result. This research shows that the singular KKT system can be avoided altogether with simple modifications to the active set method. The result is a practical, easy-to-implement active set method that does not need to explicitly detect the rank of the KKT matrix nor modify factorization or solution methods based upon the rank. Methods are given for both conventional SVM training and for computing the regularization path that are simple and numerically stable. First, an efficient revised simplex method is implemented for SVM training with semidefinite kernels (SVM-RSQP) and shown to outperform competing active set implementations in terms of training time, as well as to perform on par with state-of-the-art SVM training algorithms such as SMO and SVMlight. Next, a new regularization path-following algorithm for semidefinite kernels (Simple SVM-Path) is shown to be orders of magnitude faster, more accurate, and significantly less complex than competing methods, and it does not require the use of external solvers. Theoretical analysis reveals new insights into the nature of path-following algorithms. Finally, a method is given for computing the approximate regularization path and approximate kernel path using the warm-start capability of the proposed revised simplex method (SVM-RSQP); it is shown to provide significant, orders-of-magnitude speedups relative to the traditional "grid search," where retraining is performed at each parameter value. Surprisingly, it is also shown that, even when the solution for the entire path is not desired, computing the approximate path can serve as a speedup mechanism for obtaining the solution at a single value.
New insights are given concerning the limiting behaviors of the regularization and kernel paths, as well as the use of low-rank kernel approximations.
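To make concrete why a semidefinite kernel troubles a naive active set solver, the following is a minimal, illustrative sketch (it is not the dissertation's SVM-RSQP implementation, and the variable names are invented for illustration): with a linear kernel, the dual Hessian has rank at most the input dimension, so once the free set of the l1-loss SVM dual grows large enough, the KKT matrix of the equality-constrained subproblem is necessarily singular.

```python
# Hedged sketch: shows how a semidefinite (linear) kernel yields a
# rank-deficient KKT matrix in the active set subproblem for the SVM dual.
import numpy as np

rng = np.random.default_rng(0)
d = 2                                   # input dimension
n = 6                                   # free support vectors in the active set
X = rng.normal(size=(n, d))             # synthetic data (assumption: generic)
y = np.array([1, -1, 1, -1, 1, -1], dtype=float)

# Linear kernel: K = X X^T has rank <= d, so it is only positive SEMIdefinite.
K = X @ X.T
Q = (y[:, None] * y[None, :]) * K       # Hessian of the l1-loss SVM dual

# KKT matrix of the equality-constrained subproblem on the free set:
#   [ Q    y ]
#   [ y^T  0 ]
KKT = np.block([[Q, y[:, None]],
                [y[None, :], np.zeros((1, 1))]])

rank = np.linalg.matrix_rank(KKT)
print(f"KKT size: {KKT.shape[0]}, rank: {rank}")
# rank <= rank(Q) + 2 <= d + 2, so with n > d + 1 free variables the KKT
# matrix cannot be full rank and a naive factorization breaks down.
```

Running this with n = 6 and d = 2 gives a 7 x 7 KKT matrix of rank at most 4; this is precisely the singular system that, per the abstract, is conventionally handled by explicit rank detection and that the proposed modifications avoid.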

Name(s): 
Sentelle, Christopher, Author
Georgiopoulos, Michael, Committee Chair
Anagnostopoulos, Georgios, Committee Co-Chair
Kasparis, Takis, Committee Member
Stanley, Kenneth, Committee Member
Young, Cynthia, Committee Member
University of Central Florida, Degree Grantor

Type of Resource:  text  
Date Issued:  2014  
Publisher:  University of Central Florida  
Language(s):  English  
Identifier:  CFE0005251 (IID), ucf:50600 (fedora)  
Note(s): 
2014-05-01; Ph.D.; Engineering and Computer Science, Electrical Engineering and Computer Science; Doctoral. This record was generated from author-submitted information.

Subject(s):  Support Vector Machine  Active Set Method  Regularization Path Following Method  Approximate Regularization Path  Approximate Kernel Path  Revised Simplex  
Persistent Link to This Record:  http://purl.flvc.org/ucf/fd/CFE0005251  
Restrictions on Access:  public 2014-05-15  
Host Institution:  UCF 