Global ETD Search

1	TUNING OPTIMIZATION SOFTWARE PARAMETERS FOR MIXED INTEGER PROGRAMMING PROBLEMS
Sorrell, Toni P, 01 January 2017
The tuning of optimization software is of key interest to researchers solving mixed integer programming (MIP) problems. The efficiency of the optimization software can be greatly affected by the solver's parameter settings and by the structure of the MIP. A designed-experiment approach is used to fit a statistical model that suggests the parameter settings providing the largest reduction in the primal integral metric. Using tuning exemplars with six and 59 factors (parameters) of the optimization software, experiments are run on three classes of MIPs: survivable fixed telecommunication network design, a formulation of the support vector machine with the ramp loss and L1-norm regularization, and node packing for coding theory graphs. This research presents and demonstrates a framework for tuning on a portfolio of MIP instances, not only to obtain good parameter settings for future instances of the same class of MIPs, but also to gain insight into which parameters and parameter interactions are significant for that class. The framework is also used to benchmark solvers with tuned parameters on a portfolio of instances. A group screening method reduces the number of factors in a design and so shortens the tuning process. Portfolio benchmarking provides performance information about optimization solvers on a class of instances with similar structure.
Keywords: Parameter tuning; experimental design; optimization solvers; group screening; Design of Experiments and Sample Surveys; Operational Research
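The abstract names the primal integral metric without defining it. The sketch below is not taken from the thesis; it assumes the commonly used definition, in which the primal gap is 1 until the first incumbent is found, then the relative distance between the incumbent objective and the best known objective, and the primal integral is the area under that step function up to a time limit. The function names and the sample data are hypothetical.

```python
def primal_gap(obj, best):
    """Relative gap in [0, 1] between an incumbent objective and the best known one."""
    if obj == best:
        return 0.0
    if obj * best < 0:  # opposite signs: gap is defined as 1
        return 1.0
    return abs(obj - best) / max(abs(obj), abs(best))


def primal_integral(incumbents, best, t_max):
    """Area under the primal gap step function on [0, t_max].

    incumbents: list of (time_found, objective) pairs sorted by time.
    best: best known (or optimal) objective value for the instance.
    """
    total, prev_t, gap = 0.0, 0.0, 1.0  # gap is 1 before any solution exists
    for t, obj in incumbents:
        if t > t_max:
            break
        total += gap * (t - prev_t)
        prev_t, gap = t, primal_gap(obj, best)
    total += gap * (t_max - prev_t)  # last gap holds until the time limit
    return total


# Hypothetical run: incumbents found at 2 s (objective 120) and 5 s (objective 100).
value = primal_integral([(2.0, 120.0), (5.0, 100.0)], best=100.0, t_max=10.0)
```

Under this definition a smaller value means good solutions were found earlier, which is why a tuning experiment would aim to reduce it.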
2	HIGHER ORDER OPTIMIZATION TECHNIQUES FOR MACHINE LEARNING
Sudhir B. Kylasa (5929916), 09 December 2019
First-order methods such as Stochastic Gradient Descent are the methods of choice for solving non-convex optimization problems in machine learning. These methods rely primarily on the gradient of the loss function to estimate a descent direction. However, they have a number of drawbacks, including convergence to saddle points (as opposed to minima), slow convergence, and sensitivity to parameter tuning. In contrast, second-order methods, which use curvature information in addition to the gradient, have been shown theoretically to achieve faster convergence rates. In machine learning applications, they offer faster (quadratic) convergence, stability with respect to parameter tuning, and robustness to problem conditioning. In spite of these advantages, first-order methods are commonly used because of their simplicity of implementation and low per-iteration cost. The need to generate and use curvature information in the form of a dense Hessian matrix makes each iteration of a second-order method more expensive.

In this work, we address three key problems associated with second-order methods: (i) what is the best way to incorporate curvature information into the optimization procedure; (ii) how do we reduce the operation count of each iteration in a second-order method while maintaining its superior convergence property; and (iii) how do we leverage high-performance computing platforms to significantly accelerate second-order methods. To answer the first question, we propose and validate the use of Fisher information matrices in second-order methods to significantly accelerate convergence. The second question is answered through the use of statistical sampling techniques that suitably sample matrices to reduce per-iteration cost without impacting convergence. The third question is addressed through the use of graphics processing units (GPUs) in distributed platforms to deliver state-of-the-art solvers.

Through our work, we show that our solvers are capable of significant improvement over state-of-the-art optimization techniques for training machine learning models. We demonstrate improvements in terms of training time (over an order of magnitude in wall-clock time), generalization properties of learned models, and robustness to problem conditioning.
Keywords: Computer Engineering; Applied Computer Science; Computer Software; Distributed Computing; Optimisation; machine learning methods; optimization process; Distributed Computing System; GPU implementations; Convex optimization solvers; Non-Convex Optimization
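The contrast between first-order and second-order updates described above can be illustrated on a toy problem. The sketch below is not the thesis's solver and uses neither Fisher information nor sampling; it assumes a hypothetical two-variable quadratic loss f(x) = 0.5 xᵀHx with an ill-conditioned Hessian H, and compares plain gradient descent with an exact Newton step. All names are illustrative.

```python
def grad(H, x):
    """Gradient of f(x) = 0.5 * x^T H x for a 2x2 symmetric H."""
    return [H[0][0] * x[0] + H[0][1] * x[1],
            H[1][0] * x[0] + H[1][1] * x[1]]


def solve2(H, g):
    """Solve the 2x2 linear system H d = g by Cramer's rule."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return [(H[1][1] * g[0] - H[0][1] * g[1]) / det,
            (H[0][0] * g[1] - H[1][0] * g[0]) / det]


def gradient_descent(H, x, step, iters):
    """First-order update: move against the gradient with a fixed step."""
    for _ in range(iters):
        g = grad(H, x)
        x = [x[0] - step * g[0], x[1] - step * g[1]]
    return x


def newton(H, x, iters=1):
    """Second-order update: rescale the gradient by the inverse Hessian."""
    for _ in range(iters):
        d = solve2(H, grad(H, x))
        x = [x[0] - d[0], x[1] - d[1]]
    return x


# Condition number 1000: curvature differs strongly between the two axes.
H = [[1.0, 0.0], [0.0, 1000.0]]
x0 = [1.0, 1.0]
x_gd = gradient_descent(H, x0, step=1.0 / 1000.0, iters=100)  # step limited by the steep axis
x_nt = newton(H, x0)                                          # one step reaches the minimizer
```

On this quadratic, the step size safe for the steep direction leaves gradient descent almost stationary along the flat direction after 100 iterations, while a single Newton step lands on the minimizer at the origin. The price is the linear solve with H, which is the per-iteration cost that the abstract proposes to reduce.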
