Hi,
MNE minimizes the mean squared error between the data and the dipole model, taking the noise
covariance into account. You can see this as an L2 norm on the whitened data (see e.g. Engemann et al. 2015: https://www.semanticscholar.org/paper/Automated-model-selection-in-covariance-estimation-Engemann-Gramfort/34b659eeb98727fc37b7e60c0d652fcb20ccb870).
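To make the "L2 norm on the whitened data" idea concrete, here is a minimal NumPy sketch. It is not the MNE code: the helper names (make_whitener, whitened_sq_error) and the random toy covariance are made up for illustration, and it assumes a full-rank noise covariance.

```python
import numpy as np


def make_whitener(noise_cov):
    """Return C^(-1/2) for a full-rank, symmetric noise covariance C."""
    eigvals, eigvecs = np.linalg.eigh(noise_cov)
    # Scale each eigenvector by 1/sqrt(eigenvalue), then rotate back.
    return (eigvecs / np.sqrt(eigvals)) @ eigvecs.T


def whitened_sq_error(data, model, whitener):
    """Squared L2 norm of the whitened residual."""
    resid = whitener @ (data - model)
    return float(np.sum(resid ** 2))


# Toy example with 5 "sensors" (hypothetical values).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
noise_cov = A @ A.T + 5 * np.eye(5)  # symmetric positive definite
data = rng.standard_normal(5)
model = rng.standard_normal(5)

W = make_whitener(noise_cov)
err = whitened_sq_error(data, model, W)

# Same quantity written as a covariance-weighted squared error.
resid = data - model
weighted = float(resid @ np.linalg.solve(noise_cov, resid))
```

Here err and weighted agree up to rounding, which is why the covariance-weighted error and the plain L2 norm after whitening are the same criterion.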
First, it tries to find a good initial position for the dipole by considering only a discrete regular grid.
Then it iterates to refine the dipole by making small changes in location
and orientation. The amplitude is easier to optimize once the location and orientation are fixed.
You can look at the MNE code at:
https://github.com/mne-tools/mne-python/blob/main/mne/dipole.py#L829
You will see that it uses SciPy's fmin_cobyla function for the model optimization.
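The two steps above (grid initialization, then local refinement) can be sketched on a toy problem as follows. This is only an illustration under strong assumptions, not what dipole.py does line by line: the forward model (a current dipole in an infinite homogeneous medium, constants dropped), the random sensor layout, the 0.8 radius, and the names gain, cost and inside are all hypothetical. As a simplification, the sketch solves for the whole moment vector (orientation and amplitude together) by linear least squares at each candidate position, so only the position goes through fmin_cobyla.

```python
import numpy as np
from scipy.optimize import fmin_cobyla

rng = np.random.default_rng(42)

# Toy sensors on a unit sphere (hypothetical layout, not a real helmet).
sensors = rng.standard_normal((32, 3))
sensors /= np.linalg.norm(sensors, axis=1, keepdims=True)
max_radius = 0.8  # the dipole should stay inside this sphere


def gain(pos):
    """Toy forward model: one column per dipole moment component."""
    diff = sensors - pos
    dist = np.linalg.norm(diff, axis=1, keepdims=True)
    return diff / dist ** 3  # shape (n_sensors, 3)


def cost(pos, data):
    """Squared residual after fitting the best moment at this position."""
    G = gain(pos)
    moment = np.linalg.lstsq(G, data, rcond=None)[0]
    return float(np.sum((data - G @ moment) ** 2))


def inside(pos):
    """COBYLA constraint: non-negative when pos is inside the sphere."""
    return max_radius - np.linalg.norm(pos)


# Simulate noise-free data from a known dipole.
pos_true = np.array([0.21, -0.13, 0.37])
moment_true = np.array([1.0, 0.5, -0.2])
data = gain(pos_true) @ moment_true

# Step 1: initial guess from a coarse regular grid.
ticks = np.linspace(-0.6, 0.6, 5)
grid = np.array([[x, y, z] for x in ticks for y in ticks for z in ticks])
grid = grid[np.linalg.norm(grid, axis=1) < max_radius]
pos_init = grid[np.argmin([cost(p, data) for p in grid])]

# Step 2: local refinement starting from the best grid point.
pos_fit = fmin_cobyla(cost, pos_init, [inside], args=(data,), consargs=(),
                      rhobeg=0.1, rhoend=1e-6, maxfun=2000, disp=0)
```

With real data you would whiten both the data and the gain matrix before computing the residual, and the constraint would come from the head model rather than a simple sphere.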
Alex