Hi

I have estimated a large DSGE model, but I am struggling to obtain “nice” posterior draws (I use the MH sampler). I have tried 10 chains, each with more than 3 million draws (over 30 million draws in total). Despite the large number of draws, the posterior distribution surfaces are non-smooth and the autocorrelations are high (heavy thinning and a long burn-in didn’t help). The mean log likelihood is a bit lower than at the mode. The result isn’t very satisfactory.

Then I tried something very different. I used 700 much shorter chains, about 5 million draws in total. I concatenated them into 25 long chains, and the result is really, really good: the surfaces are smooth, the autocorrelation is low, and so on. Furthermore, the distributions seem to be very close to what the 10-chain method would converge to (if I had let that run for another 10 million draws).

My intuition tells me that having many chains makes the sampler explore larger areas of the likelihood function, because the 700 chains start from widely dispersed points, whereas the 10 chains just get stuck in a few regions.
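
To make this concrete, here is a minimal sketch (not my actual estimation code) of one check I have in mind: a Gelman-Rubin style comparison of between-chain and within-chain variance for a single parameter. The function name `gelman_rubin` and the toy draws are assumptions made up for illustration; the real input would be the post-burn-in draws from each chain.

```python
import random
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for one scalar parameter.

    chains: list of equal-length lists of draws, burn-in already removed.
    (Hypothetical helper for illustration only.)
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    chain_vars = [statistics.variance(c) for c in chains]  # sample variances
    within = statistics.fmean(chain_vars)            # W: mean within-chain variance
    between = n * statistics.variance(chain_means)   # B: between-chain variance
    pooled = (n - 1) / n * within + between / n      # pooled variance estimate
    return (pooled / within) ** 0.5


# Toy data (assumption): independent normal draws stand in for sampler output.
rng = random.Random(0)

# Four chains that all cover the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]

# Four chains stuck in two different regions.
stuck = [[rng.gauss(mu, 1.0) for _ in range(2000)] for mu in (0.0, 0.0, 3.0, 3.0)]

r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
print("R-hat, chains covering the same region:", round(r_mixed, 3))
print("R-hat, chains stuck in different regions:", round(r_stuck, 3))
```

If the chains really are stuck in different regions, the ratio should be well above 1. Values close to 1 are consistent with the chains covering the same distribution, though they do not prove it.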

Is my intuition correct? Can I trust the 700 chains approach?

Any thoughts would be highly appreciated!

Thanks!!