Basic question about computing second moments in the data

I want to calibrate my model using GMM/SMM, and I’m computing second moments from real data, but some problems have arisen. Here is how I’ve proceeded: 1) downloaded the relevant raw time series (most of which have a strong seasonal component); 2) deseasonalized them with X-13ARIMA-SEATS (using the default settings, without controlling for anything in particular); 3) converted the deseasonalized non-rate series (i.e. everything except interest rates and employment/unemployment rates) to logs; 4) extracted the cyclical component with a two-sided HP filter with \lambda=1600, since my data is quarterly; 5) computed the Pearson correlation matrix between all series. From here on I only use the cyclical component from this process, and after all of this I end up with 56 observations. Here are some of my problems/doubts:
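For concreteness, this is roughly what the pipeline looks like in code (a sketch in Python/statsmodels; the file name, column names, and the seasonal-adjustment step are placeholders, since X-13 needs the external binary and I run it separately):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Placeholder: quarterly raw series, one column per variable
df = pd.read_csv("quarterly_data.csv", index_col=0, parse_dates=True)
rate_series = {"nominal_interest_rate", "unemployment_rate"}  # left in levels, not logged

cycles = {}
for name, series in df.items():
    # 2) seasonal adjustment would go here, e.g. via
    #    sm.tsa.x13_arima_analysis(series, x12path=...).seasadj
    sa = series  # assume already seasonally adjusted in this sketch
    # 3) logs for non-rate series
    x = sa if name in rate_series else np.log(sa)
    # 4) two-sided HP filter, lambda = 1600 for quarterly data
    cycle, trend = sm.tsa.filters.hpfilter(x, lamb=1600)
    cycles[name] = cycle

# 5) Pearson correlation matrix of the cyclical components
corr_matrix = pd.DataFrame(cycles).corr()
print(corr_matrix.round(2))
```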

  1. Some of the correlations I’m getting are at odds with other papers that compute similar statistics (worth mentioning that those papers use somewhat older data, about 8 years older). Still, some results seem really implausible, e.g. a positive correlation of real M1, M2, and M3 with the nominal interest rate, or a negative correlation between GDP and the CPI level together with a positive one between GDP and CPI inflation, or positive correlations between GDP, interest rates, and inflation.

  2. I’d also like to compute the standard deviation of the variables in percent. How should I calculate it? I thought it was just another name for the coefficient of variation, but since I’m working with the cycle the mean is approximately zero, so computing \text{sd}(X_{cycle})/\text{mean}(X_{cycle}) gives me huge numbers (e.g. -4.484304e+16), which are nothing like the statistics commonly reported in papers (roughly between 0 and 3).
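Just to show what I mean by the blow-up (a toy sketch with made-up numbers, not my actual data):

```python
import numpy as np

rng = np.random.default_rng(0)
cycle = rng.normal(0.0, 0.01, size=56)  # stand-in for an HP cycle
cycle -= cycle.mean()                   # the HP cycle has (essentially) zero mean

cv = cycle.std(ddof=1) / cycle.mean()   # division by ~0
print(cv)                               # absurdly large number, like the one I reported
```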

I’ve tried to be as rigorous as possible with the data, but I’m not sure whether I’ve made a mistake somewhere in the procedure above. I’d be very grateful for any hint on whether I’m doing this right or where I should look again, and also for advice on how to compute the real-data second moments for GMM/SMM and how to check that they are “correct”.

Thanks!

PS: In case it’s helpful: applying the HP filter as described above to aggregate household consumption gives me a cycle that looks like this:

which seems “too smooth” to me, and each cycle lasts very long (?). Is that normal? I also noticed that the data got a bit too smooth after my seasonal adjustment; on the other hand, using the data source’s own seasonally adjusted series gives me a very similar cycle, just much noisier.

I am not sure I am completely following. My thoughts:

  1. Your data seems to be really short. This is a problem with the HP filter. See e.g. Limited number of observations - #2 by jpfeifer
  2. I am not sure that correlations in such a short sample are really meaningful. It’s not even a full cycle.
  3. Standard deviations in percent are (approximately) the standard deviations of the logged cyclical values. Once you have taken logs, any division by the mean is wrong. When you apply the HP filter to logged values, the cycle is already the percentage deviation from trend, which is what you want (see the sketch below this list).
  4. The cyclical component indeed looks extremely smooth, but the problem mostly seems to be with the original data, which looks like it has already been filtered. There is not much high-frequency movement.
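A minimal sketch of that computation (Python/statsmodels; the consumption series here is a dummy placeholder for your logged data):

```python
import numpy as np
import statsmodels.api as sm

# Dummy stand-in for a logged, seasonally adjusted quarterly series
log_c = np.log(np.linspace(100.0, 130.0, 56)) + 0.01 * np.sin(np.arange(56))

cycle, trend = sm.tsa.filters.hpfilter(log_c, lamb=1600)

sd_percent = 100 * cycle.std(ddof=1)  # std of log deviations x 100 = std in percent
print(sd_percent)                     # a log std of 0.012 corresponds to 1.2 here
```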

Thanks for your answer!

  1. Given the limited number of observations and the HP-filter drawbacks you mention in that case, would I be better off using a one-sided HP filter? Or, if a different filter altogether, which one would you recommend? (I sketched below, after this list, how I would implement the one-sided version, in case that is the right direction.)

  2. Indeed. The issue is that one of the key series (from a household survey that has only been run since 2007) is available for a very short period. I thought moment matching was my best option here, also because a previous article on which I rely heavily has a similar limitation and uses the same strategy. It also occurs to me that I could compute each second moment using the maximum number of observations available for it, e.g. compute the cross moments of the standard national-accounts variables with the longest sample I can get (I think around 80 observations), and compute the moments involving my limited series with just 56 observations. Would that be right?

  3. Okay, so the moments I get are already in percentage terms, which confirms that my statistics are at odds with previous studies. Just to make sure: if, say, I compute the standard deviation of the cyclical component of \ln x_t and get 0.012, is it already in percent or do I have to multiply it by 100? That is, is the standard deviation 0.012% or 1.2%?

  4. Indeed. I’ll check my seasonal adjustment and perhaps try the already seasonally adjusted series from the department of statistics.
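In case it matters, this is how I would compute the one-sided version: run the standard two-sided HP filter on expanding samples and keep only the end-of-sample cyclical value each period (a rough sketch; as far as I understand this should coincide with the Kalman-filter-based one-sided HP filter up to the treatment of the first observations, but please correct me if that’s wrong):

```python
import numpy as np
import statsmodels.api as sm

def one_sided_hp_cycle(x, lamb=1600, min_obs=12):
    """'Real-time' HP cycle: re-filter the expanding sample each period and
    keep only the last cyclical value (first min_obs periods are left as NaN)."""
    x = np.asarray(x, dtype=float)
    cycle = np.full(x.shape, np.nan)
    for t in range(min_obs, len(x) + 1):
        c, _ = sm.tsa.filters.hpfilter(x[:t], lamb=lamb)
        cycle[t - 1] = c[-1]  # keep only the end-of-sample value
    return cycle
```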

Thanks!