Customized StepSize control in unconstrained gradient-descent optimization
I have a type of minimization problem I frequently solve that involves warping a large 2D triangle-mesh (~25,000 vertices) to fit a model. In the mesh, vertices carry the empirical measurements (each of which has a predicted position in the model, which is a continuous field). The potential/objective function for the system is something like this:
potentialFn[{X0_List, triangles_List}, Xideal_List, X_List] := With[
  {elens0 = EdgeLengths[X0, triangles],
   elens = EdgeLengths[X, triangles],
   angs0 = AngleValues[X0, triangles],
   angs = AngleValues[X, triangles]},
  Plus[
   Total[(elens0 - elens)^2],
   Total[(angs0 - angs)^2],
   Total[1 - Exp[-(X - Xideal)^2]],
   0.25*Total[1 + (angs0/angs)^2 - 2*(angs0/angs)]]];
The idea is that the potential is the sum of the squared deviations in the distances between neighboring vertices (the edges), plus the sum of the squared deviations in the angles of the mesh, plus the goal function, which is 0 when the vertices are perfectly aligned with the model and otherwise increases monotonically with distance from an ideal fit, and finally a term designed to make the potential approach infinity as any triangle approaches inversion. That last term (which is similar in shape to models of the van der Waals potential) constrains the potential such that triangle ABC would have to cross a singularity in order to become triangle ACB (in terms of a counterclockwise ordering of vertices). Additionally, I have well-tested analytical functions that calculate the gradients (but nothing for the Hessians). I've confirmed that the gradient descent works correctly by, among other things, running a gradient-descent search with a very small step size for a very long time.
Most of those details are irrelevant to my question, however. What is important is that, for any set of vertex coordinates that are valid (i.e., no triangles have been inverted), I can calculate a maximum step size S such that for any actual step size s (0 < s < S) the resulting vertex coordinates will also be valid; as long as my step sizes always follow this rule, I can guarantee that no triangles will invert. The problem I have is that there doesn't seem to be a way for me to provide this information to the Mathematica line-search algorithm in functions like FindMinimum.
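For concreteness, here is a sketch of how such a bound can be computed (a Python illustration, not my actual code): for vertices moving along X + s*D, each triangle's signed area is a quadratic in s, so the largest safe step is the smallest positive root of any of those quadratics, shrunk by a safety margin.

```python
import numpy as np

def max_safe_step(X, D, triangles, margin=0.95):
    """Largest step s such that X + s*D keeps every triangle's signed
    area positive (no inversions), scaled by a safety margin.

    X, D      : (n, 2) arrays of vertex positions and step directions
    triangles : (m, 3) integer array of CCW vertex indices
    """
    a, b, c = (triangles[:, k] for k in range(3))
    # Triangle edge vectors and their per-unit-step displacements.
    e1, e2 = X[b] - X[a], X[c] - X[a]
    f1, f2 = D[b] - D[a], D[c] - D[a]
    cross = lambda u, v: u[:, 0] * v[:, 1] - u[:, 1] * v[:, 0]
    # Twice the signed area along the step: A(s) = A0 + A1*s + A2*s^2
    A0 = cross(e1, e2)
    A1 = cross(e1, f2) + cross(f1, e2)
    A2 = cross(f1, f2)
    s_max = np.inf
    for a2, a1, a0 in zip(A2, A1, A0):
        if abs(a2) > 1e-14:
            roots = np.roots([a2, a1, a0])
        elif abs(a1) > 1e-14:
            roots = np.array([-a0 / a1])
        else:
            roots = np.array([])           # area constant along this step
        pos = [r.real for r in roots
               if abs(r.imag) < 1e-12 and r.real > 0]
        if pos:
            s_max = min(s_max, min(pos))
    return margin * s_max
```

For a single triangle (0,0), (1,0), (0,1) with only the apex moving straight down, the area hits zero at s = 1, so the function returns 0.95 with the default margin.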
Initially, I thought that something like this would be the solution:
FindMinimum[
  potentialFn[{X0, triangles}, Xideal, X],
  {X, X0},
  Gradient :> potentialGradient[{X0, triangles}, X],
  Method -> {
    "ConjugateGradient",
    "StepControl" -> {
      "LineSearch",
      "MaxRelativeStepSize" :> CalculateMaxRelativeStepSize[{X0, triangles}, X]}}]
This, however, gives me an error (FindMinimum::lsmrss, with message "-- Message Text not found -- (CalculateMaxRelativeStepSize[{X0, triangles}, X])") that I can only assume is due to FindMinimum's inability to interpret the delayed rule. I've spent a lot of time looking through the Mathematica documentation on conjugate-gradient and related unconstrained search methods and have found nothing indicating that I can control the step size beyond setting a fixed step-size length relative to the norm of the total gradient. That is fairly useless in this kind of case, unfortunately: I can use it, but it results in a very slow search.
My question is this: are there existing (undocumented?) methods for providing Mathematica's line-search method with a way to calculate a maximum gradient step-size?
Note: I realize I haven't provided a minimal working example of this problem. I can do so, but this would be quite an undertaking as there is a lot of context around the specific problem instances --- if anyone believes that they can help me with this kind of optimization given an example, I will work on this, but I'd appreciate some indication that the work of translating the problem into a compact instance won't be for naught before I do it.
K2-Think
To address your need for customized step size control in Mathematica's gradient descent optimization, you can implement a custom line search function that dynamically enforces your maximum valid step size. Here's a structured approach:
Approach Overview
- Define Helper Functions: Encapsulate your potential function, gradient, and maximum step size calculation within a scope accessibl...
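As a sketch of that idea (in Python for illustration; the objective and gradient callables are placeholders, not the asker's actual functions), a custom backtracking line search can simply clamp its initial trial step to the precomputed validity bound S before applying an Armijo sufficient-decrease test:

```python
import numpy as np

def line_search_capped(f, grad, x, d, s_max, c1=1e-4, shrink=0.5, max_iter=50):
    """Backtracking (Armijo) line search whose first trial step is
    clamped to s_max, the largest step known to keep the mesh valid.

    f, grad : objective and its gradient (callables on flat arrays)
    x, d    : current point and descent direction
    """
    g = grad(x)
    slope = float(np.dot(g, d))   # directional derivative; < 0 for descent
    s = min(1.0, s_max)           # never start beyond the validity bound
    fx = f(x)
    for _ in range(max_iter):
        # Armijo condition: sufficient decrease relative to the slope.
        if f(x + s * d) <= fx + c1 * s * slope:
            return s
        s *= shrink
    return s
```

Because every trial step is at most s_max, and backtracking only shrinks it, no candidate point can invert a triangle.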
GPT-5
Short answer: No—FindMinimum’s line-search does not accept a point‑dependent (dynamic) bound on the step length. The "MaxRelativeStepSize" suboption must be a fixed numeric value (or Automatic). RuleDelayed will not be reevaluated at each iteration with the current X, so your attempt triggers a “non‑numeric MaxRelativeStepSize” error (the lsmrss message you saw). There is no documented or supporte...
Claude Opus 4
Based on your question, you're trying to implement a custom step-size control in Mathematica's optimization routines to prevent triangle inversions in your mesh optimization problem. Unfortunately, Mathematica's built-in optimization functions like FindMinimum don't provide a direct way to dynamically control the step size based on problem-specific constraints during the line search.
Here are s...
o3 Pro
In the code inside FindMinimum, all options that belong to a method are read once only, at the moment the call is made. Whatever you put into the option value must therefore already be an explicit number (or a list of numbers) and cannot depend on the point that the algorithm is visiting later on. For this reason
"MaxRelativeStepSize" :> CalculateMaxRelativeStepSize[ … ]
does ...
o3
FindMinimum (and the other numerical minimizers built on the same kernel code, e.g. NMinimize, FindMaximum, …) reads the option settings only once, before the minimization loop starts. Consequently, every sub-option used by the internal step-size/line-search machinery has to be a number (or one of the recognised keywords such as Automatic). A delayed rule (:> …) is...
DeepSeek R1
The key issue is that the built-in line search in FindMinimum does not seem to accept a user...
Gemini 2.5 Pro
Excellent question. You've hit upon a common and important limitation of "black-box" optimization routines: the hooks for injecting problem-specific knowledge (like your maximum safe step size) are often limited or non-existent.
Your analysis is correct. FindMinimum evaluates the Method options once at the beginning of the call. It expects a number for "MaxRelativeStepSize", not a function ...
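As a minimal illustration of that workaround (a Python sketch; the max_step callable is a stand-in for your CalculateMaxRelativeStepSize, and the quadratic test objective is obviously not your mesh potential), a hand-rolled steepest-descent loop can re-evaluate the step bound at every iteration:

```python
import numpy as np

def descend(f, grad, max_step, x0, tol=1e-8, max_iter=10_000):
    """Steepest descent with a per-iteration step-size cap.

    max_step(x, d) must return an upper bound S such that x + s*d
    remains valid (no inverted triangles) for all 0 < s < S.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        d = -g
        s = min(1.0, 0.9 * max_step(x, d))   # stay strictly inside the bound
        # Simple backtracking until the objective actually decreases.
        while s > 1e-16 and f(x + s * d) >= f(x):
            s *= 0.5
        x = x + s * d
    return x
```

Swapping the search direction for a conjugate-gradient or L-BFGS update is straightforward; the essential point is only that the cap is recomputed from the current point each iteration, which FindMinimum's option mechanism cannot do.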
o4 mini
Mathematica's line-search machinery in FindMinimum (and friends) simply isn't set up to call back out to you for a per-iteration step-size bound. The only "step-size" option you can hand it under Method -> {"ConjugateGradient", "StepControl" -> {"LineSearch", …}} is
"MaxRelativeStepSize"→constant
and that constant is read once when the search is initialized. There is no hidden option or hook that ...
Claude 3.7 Sonnet
Controlling Step Size in Mathematica Optimization
Your problem is quite interesting - implementing a custom step size control to prevent triangle inversions during mesh optimization. Unfortunately, Mathematica doesn't provide straightforward support for dynamic step size calculation in the way you're attempting.
Why Your Approach Doesn't Work
The error you're seeing with the delayed rule (...