What is the advantage of using GMM for estimation? Why don’t researchers simply choose the parameters that minimize the sum of squared errors (SSE) between the real data and the simulated data? The latter seems better to me, because the estimate would not depend on which moments we pick.
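
To make the comparison concrete, here is how I would write the two objectives. The notation is my own assumption: $y_t$ is the observed data, $\tilde{y}_t(\theta)$ is the data simulated at parameter $\theta$, $m(\cdot)$ is the vector of chosen moment functions, and $W$ is a positive definite weighting matrix. Strictly speaking, the first objective is the simulated version of GMM.

$$
\hat{\theta}_{\text{GMM}} = \arg\min_{\theta} \; g(\theta)' \, W \, g(\theta),
\qquad
g(\theta) = \frac{1}{T}\sum_{t=1}^{T} m(y_t) - \frac{1}{T}\sum_{t=1}^{T} m\big(\tilde{y}_t(\theta)\big)
$$

$$
\hat{\theta}_{\text{SSE}} = \arg\min_{\theta} \; \sum_{t=1}^{T} \big(y_t - \tilde{y}_t(\theta)\big)^2
$$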

My second question: is the estimate obtained by minimizing the SSE equivalent to the MLE?
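
For context, the only case where I know the two coincide is the textbook one. Assume, purely for illustration, a model $y_t = f(x_t, \theta) + \varepsilon_t$ with $\varepsilon_t$ i.i.d. $N(0, \sigma^2)$, where $f$ and $x_t$ are placeholder names of my own. The log-likelihood is then

$$
\ell(\theta, \sigma^2) = -\frac{T}{2}\log\big(2\pi\sigma^2\big) - \frac{1}{2\sigma^2}\sum_{t=1}^{T}\big(y_t - f(x_t, \theta)\big)^2
$$

so for any fixed $\sigma^2$, maximizing $\ell$ over $\theta$ is the same as minimizing the SSE. I am not sure whether this still holds when the errors are not Gaussian, or when the model prediction is itself simulated.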

