Automatic Differentiation of DSGE models

I’m wondering if there are any tools for doing automatic differentiation through the solution of a DSGE model. I want to compute gradients with respect to the parameters of something like SW 2007 for optimization. What are my options, and how fast would they be (i.e., are numerical gradients fast enough)? Thanks,

Cameron

In the unstable version, you can use analytic_derivation together with particular optimizers. See tests/analytic_derivatives/fs2000_analytic_derivation.mod · master · Dynare / dynare · GitLab

Note that in our experience it does not yield a speed gain. Rather, it is a bit more accurate and less prone to failing when you hit corners of the parameter space where finite differences cannot be computed.
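
For intuition, here is a minimal Python sketch (purely illustrative, not Dynare code) of the kind of corner case meant above: an objective with a parameter bound where the finite-difference step leaves the admissible region while the analytic derivative is still well defined.

```python
import numpy as np

# Toy objective with a boundary at theta = 0 (think of a variance parameter).
def loglik(theta):
    return np.log(theta) - 0.5 * theta      # only defined for theta > 0

def loglik_grad(theta):
    return 1.0 / theta - 0.5                # analytic derivative

theta = 1e-9    # parameter sitting right next to the boundary
h = 1e-6        # a typical finite-difference step size

# The central finite difference steps outside the admissible region
# (theta - h < 0), so the numerical gradient is NaN, while the analytic
# derivative remains perfectly well defined.
fd = (loglik(theta + h) - loglik(theta - h)) / (2 * h)
print("finite difference:", fd)             # nan
print("analytic gradient:", loglik_grad(theta))
```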


Are you referring to the way Dynare computes the derivatives needed for the perturbation solution? There we use analytic derivatives computed by our C++ preprocessor (similar to, say, Matlab’s symbolic toolbox). We don’t use automatic differentiation techniques for that.
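
To make the distinction concrete, here is a small Python sketch (purely illustrative, using sympy and JAX rather than anything from Dynare): symbolic differentiation builds a closed-form derivative expression once, which is conceptually what the preprocessor does, while automatic differentiation never builds an expression and only produces derivative values at a point.

```python
import sympy as sp
import jax

# Symbolic differentiation: build a closed-form expression for the
# derivative once, then evaluate it wherever needed. The residual below
# is a toy expression, not a real model equation.
a, k = sp.symbols('a k')
resid = a * k**sp.Rational(3, 10) - k
print(sp.diff(resid, k))        # a closed-form expression in a and k

# Automatic differentiation: no derivative expression is ever built; the
# chain rule is applied numerically while the function is evaluated at a
# point, which is what tools like TensorFlow, PyTorch, or JAX do.
def resid_fn(k, a=2.0):
    return a * k**0.3 - k

print(jax.grad(resid_fn)(1.5))  # derivative value at k = 1.5
```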

Or, as Johannes mentioned, do you want to provide an analytically computed gradient to the optimizer? analytic_derivation is then the way to go. What we do under the hood is explained for a first-order approximation in Ratto and Iskrev (2011); the relevant code is in get_perturbation_params_derivs.m, which can also compute parameter derivatives of the perturbation solution matrices up to a third-order approximation. I will soon release a technical note explaining how we do that.


And also, welcome to the Dynare community :grinning:

Thanks for the welcome, and thanks Johannes for the information! Maybe this is helpful: I actually tried both automatic differentiation and finite differences, and AD is much faster. I think the reason you don’t notice a difference is that the dynare_estimation function calls fmincon, which does a bunch of other work such as line search, BFGS updates, and solving a quadratic programming problem to get an optimal step size. If you skip all that and just run plain-vanilla SGD with a predefined step size, finite differences take around 230 seconds to run 1000 steps, while AD takes much less than a second for the same 1000 steps (which is not out of the realm of possibility based on TensorFlow and PyTorch performance). I could have made a mistake here, as I’m a little skeptical the speedup is that significant.
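
For illustration, a comparison along these lines can be sketched in a few lines of Python; the objective below is a toy stand-in for the likelihood, and the parameter dimension and step size are arbitrary placeholders, so no timing claim is made here.

```python
import time
import numpy as np
import jax
import jax.numpy as jnp

# Toy stand-in for the likelihood: any smooth scalar function of the
# parameter vector (the real thing would be the SW 2007 likelihood).
def objective(theta):
    return jnp.sum(jnp.sin(theta) ** 2) + 0.5 * jnp.sum(theta ** 2)

def fd_gradient(f, theta, h=1e-6):
    """Central finite differences: 2 * len(theta) function evaluations."""
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        e = np.zeros_like(theta)
        e[i] = h
        grad[i] = (f(theta + e) - f(theta - e)) / (2 * h)
    return grad

theta0 = 0.1 * np.ones(36)              # a few dozen parameters, roughly a medium-scale model
ad_grad = jax.jit(jax.grad(objective))  # reverse-mode AD: whole gradient in one backward pass

# Plain-vanilla gradient descent with a fixed step size, once per gradient type.
for name, grad_fn in [("finite differences", lambda t: fd_gradient(objective, t)),
                      ("automatic differentiation", lambda t: np.asarray(ad_grad(t)))]:
    start, theta = time.time(), theta0.copy()
    for _ in range(1000):
        theta = theta - 0.01 * grad_fn(theta)
    print(name, "took", round(time.time() - start, 2), "seconds for 1000 steps")
```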

One more question: can automatic differentiation handle the particle filter? I know that, at the very least, the resampling step can make it problematic.

Edit: Removed my edit as I solved the problem

No, it cannot. Only at order=1 do we know how to construct the likelihood analytically. That does not work at higher orders…
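
To see why the resampling step mentioned above is the sticking point, here is a minimal sketch (plain NumPy, not Dynare’s particle filter code): the resampled particle indices are draws from a categorical distribution, so they depend on the weights in a piecewise-constant way and there is nothing smooth to differentiate through.

```python
import numpy as np

# Minimal multinomial resampling step from a bootstrap particle filter.
# The resampled indices are integer draws from a categorical distribution,
# so a small change in the parameters (and hence the weights) either leaves
# the draws unchanged or flips them discretely. That is why naive AD cannot
# propagate gradients through this step.
def resample(weights, rng):
    weights = weights / weights.sum()
    return rng.choice(len(weights), size=len(weights), p=weights)

rng = np.random.default_rng(0)
print(resample(np.array([0.1, 0.2, 0.3, 0.4]), rng))   # array of particle indices
```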

So it turns out I made a coding error in my gradient descent code, and AD versus FD don’t actually seem that different in terms of running time.

If you are interested, a couple thoughts I’ve had on speeding up estimation of models:

  1. Regarding analytical derivatives of higher-order models: I was able to do AD through a particular second-order model in TensorFlow, and it agreed with FD. I also backpropagated through a particle filter, though that part is not as well tested and verified. Generalizing this process would be daunting, though. I think Jesus FV is working on something like this for Julia.

  2. Related to particle-filter backpropagation: there is probably a way to do variational inference via finite differences, but I think it would still require both TensorFlow and Dynare, so it won’t be easy. I plan to work on this at some point, but it’s going to be a lot of work, and I need to learn Dynare much better. Let me know if you or anyone you know is interested in these things, and I’ll send along what I currently have; feel free to draw your own conclusions from there.

Thanks for the feedback. I am not sure that Matlab is the right program to do this. But it may be interesting for Dynare’s Julia package. Maybe @frederic.karame is interested.

Yea that’s probably true. Didn’t know there was a Julia version of Dynare. Will check it out.