19 October 2011

Dealing with data

Having gained some experience with small and medium-sized datasets, I should admit that most of the data I have encountered originated in Greece, where the practice of econometrics is still immature. As a result, the data quality is questionable and the time series tend to be quite messy.

Since datasets drawn from various sources usually need to be cleaned and processed before they are useful and interpretable, software that eases the task of cleaning the data is quite useful.

Recently, I came across Google Refine, a free tool provided by Google. Its goal is precisely to clean datasets so that they form a database that is easier to work with and to interpret. I openly recommend it: it is the main pre-analysis tool I use to tidy the data and fix errors before applying econometric methods, and it has become my newest friend in my daily data-treatment adventures. It runs locally on your own computer and is used through a web browser such as Google Chrome, so no data need to be uploaded online, which preserves the privacy and confidentiality of any research.

Once the datasets are clean, I move on to Stata, the statistical software I was taught to use (I just bought version 12). In my opinion, Stata is the most complete, powerful, and dynamic statistical program on the market.