I just wanted to ask what your thoughts are on using LLMs such as ChatGPT or Claude to help with DSGE modelling.
I’ve been experimenting with them recently and have to admit they’re surprisingly useful for spotting coding mistakes or rewriting solving routines to make them more efficient. I’m currently testing whether I can replicate a published model and so far it’s done quite well. That said, you definitely need prior experience in DSGE modelling to ask the right questions and to handle debugging properly.
When it comes to implementing more “advanced mechanisms” (e.g. endogenous default), things get trickier. In many cases, doing it manually might take just as long. I’ve also noticed that for smaller models, the LLM often manages to compute the steady state correctly, but for larger or more complex models I’m sceptical that it would succeed.
I’d love to hear if anyone here uses a different or more reliable LLM for DSGE modelling and for what specific tasks, e.g. inputting model equations, solving for the steady state, plotting IRFs and so on.
Would be great to hear how others are using (or not using!) LLMs in their DSGE workflows.
Well, first of all, you want to be careful. Ideally, you should solve the steady state yourself and then use the LLM to transform the steady-state conditions into code, for example if you already have them in LaTeX form. I remember once making a stupid mistake with brackets that altered the steady state, which probably would not have happened had I used an LLM.
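To make that concrete, here is a minimal sketch of the kind of mechanical "steady-state conditions to code" step where an LLM is genuinely useful and where a bracket slip is easy to make by hand. The model (a textbook RBC steady state) and the parameter values are my own illustrative assumptions, not something from this thread:

```python
# Illustrative example: closed-form steady state of a textbook RBC model,
# translated from the algebraic conditions into code.
# Parameter values are standard calibration choices, assumed for illustration.
alpha = 0.33   # capital share
beta = 0.99    # discount factor
delta = 0.025  # depreciation rate

# Euler equation in steady state: 1 = beta * (r + 1 - delta),
# with r = alpha * (K/N)^(alpha - 1), hence:
r = 1 / beta - 1 + delta                 # steady-state rental rate of capital
k_n = (alpha / r) ** (1 / (1 - alpha))   # capital-labor ratio K/N
y_n = k_n ** alpha                       # output per unit of labor
i_n = delta * k_n                        # investment per unit of labor
c_n = y_n - i_n                          # consumption per unit of labor

print("r =", r, " K/N =", k_n, " C/N =", c_n)
```

A useful habit either way is to verify the result mechanically, e.g. check that the computed K/N actually satisfies the Euler condition you started from.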
If you have a solver for a non-linear steady state, you could use the LLM to help make it more efficient or to check it. I would be careful about relying on it to solve the steady state outright: it might work for smaller models but I am not sure it would scale well to larger ones. This is especially important when you are new to DSGE modelling and still benefiting from the learning effect of working through what is needed to solve a model and simulate or estimate it. Otherwise, it all becomes a black box and you have no idea what is going on, which will make debugging very difficult.
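As a sketch of the "check your solver" idea: a plain Newton iteration on a nonlinear steady-state condition, verified against a known closed form. The model, parameters, and function names are illustrative assumptions (a textbook RBC Euler equation), not anything specific from this thread, but the pattern of benchmarking a numerical solver against an analytical special case carries over to larger models:

```python
# Hedged sketch: verify a nonlinear steady-state solver against a closed form.
# The Euler-equation residual f(k) = 0 pins down steady-state capital in a
# textbook RBC model; parameter values are assumed for illustration.
alpha, beta, delta = 0.33, 0.99, 0.025

def euler_residual(k):
    """Steady-state Euler equation: beta * (MPK + 1 - delta) - 1 = 0."""
    return beta * (alpha * k ** (alpha - 1) + 1 - delta) - 1

def euler_residual_prime(k):
    """Analytical derivative of the residual, used by the Newton step."""
    return beta * alpha * (alpha - 1) * k ** (alpha - 2)

def newton(f, fprime, x0, tol=1e-12, max_iter=100):
    """Plain Newton iteration; raises if it fails to converge."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton did not converge")

k_newton = newton(euler_residual, euler_residual_prime, x0=10.0)

# Closed-form benchmark for this particular model:
k_closed = (alpha / (1 / beta - 1 + delta)) ** (1 / (1 - alpha))
print(k_newton, k_closed)
```

For a model without a closed form you obviously cannot do this comparison directly, but you can still check the residuals of every equation at the solver's candidate solution, which is exactly the kind of audit worth doing whether the solver was written by you or by an LLM.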
I would call myself a heavy LLM user (usually Claude within Cursor). It assists me mostly with writing tasks and with creating ideas for visualizations and coding them, so in general it helps me with “straightforward” coding and writing.
My experience with LLMs for actual modelling is rather bad. Sure, every once in a while you get a working mod file for a fairly standard model, but you can never be sure unless you check it yourself. It can come up with documentation and derivations for a model, which might be helpful to get started, but these are rarely correct or super useful. It can, however, assist you greatly along your DSGE and Dynare journey if you use it step by step, e.g.: “these are my nonlinear model equations, how do I compute the log-linearized New Keynesian Phillips Curve?”, “what is the interpretation or intuition behind the Frisch elasticity?”, “please read the Dynare documentation and tell me how I invoke IRF matching, and which important options I should play around with”, “what is the meaning of this command or option?”, and similar questions.
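For reference, the log-linearized New Keynesian Phillips Curve mentioned above has, in its standard textbook Calvo-pricing form with constant returns to labor:

```latex
\pi_t = \beta\,\mathbb{E}_t\,\pi_{t+1} + \kappa\,\tilde{y}_t,
\qquad
\kappa = \frac{(1-\theta)(1-\beta\theta)}{\theta}\,(\sigma + \varphi)
```

where $\theta$ is the Calvo price-stickiness parameter, $\beta$ the discount factor, $\sigma$ the inverse intertemporal elasticity of substitution, $\varphi$ the inverse Frisch elasticity, and $\tilde{y}_t$ the output gap. A quick sanity check on an LLM-produced derivation is whether it reproduces this slope coefficient for the baseline model; richer models will of course modify $\kappa$.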
All in all, I am extremely excited about LLMs as a helpful assistant, but you need to use them wisely.
Thanks @wmutschl for your comment, I think the same. It is a great tool if used correctly. As you said, it might one day produce a working mod file, but you would still have to check whether it is correct, and often this takes roughly the same amount of time as deriving it yourself.
Your last sentence is spot on. I suppose that when you throw complex DSGE models with intricate mechanisms at LLMs, trying to push the research frontier even slightly, the LLM will, by construction, struggle. I could be wrong here, though, as ChatGPT now offers a “research-grade” subscription, but even then…