0_make_dataset

Run instructions

To run all code in this directory, execute ./0_make_dataset.sh.

Please note, the raw data used here is not publicly available.

Folder structure

climate - code and shapefiles for constructing, cleaning and assembling climate data

coded_issues - encoded issues and documentation used to construct reporting regimes, clean the energy consumption data, and construct the climate data

energy_load - code for cleaning raw energy consumption data

pop_and_income - code for cleaning population and income data

merged - code for cleaning the merged dataset

Because the raw data is currently unavailable, code in climate, energy_load, and pop_and_income cannot be run and is only present for reference.

Directory Master Scripts

The code in this folder accomplishes the following tasks:

  • Construct an intermediate dataset including population, energy consumption, climate, and income data.
    • /DATA/regression/IEA_merged_long_GMFD.dta:
    • We clean IEA_merged_long.dta in two different ways to produce regression-ready datasets for our main specification (Methods Equation 2; Appendix Equation C.4) and for our robustness models.
  • Construct regression ready data:
    • /DATA/regression/GMFD_TINV_clim_EX_regsort.dta: used for estimating the excluding imputed data model (Appendix I.2)
    • /DATA/regression/GMFD_TINV_clim_regsort.dta: used for estimating the main model
  • Save information on each country-year's income and climate covariates, which is used as an input to plotting code
    • /DATA/regression/break_data_TINV_clim_EX.dta: used for plotting output for the excluding imputed data model
    • /DATA/regression/break_data_TINV_clim.dta: used for plotting output for the main model
  • Test for the existence of unit roots in our outcome variable, which motivates the use of first-differenced variables in our empirical analysis

Constructing Intermediate Dataset (IEA_merged_long.dta)

1_construct_dataset_from_raw_inputs.do produces IEA_merged_long.dta through the following steps:

  1. Clean and construct population, income, climate and energy consumption datasets
  2. Merge population, income, climate and energy data by country and year
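
The merge in step 2 can be pictured as a join on the (country, year) key. The sketch below is illustrative Python, not the repository's Stata code. The column names (`pop`, `gdppc`, `cdd`) and all values are invented for the example, and it keeps only country-years present in every input, which may differ from what the .do file does.

```python
def merge_by_country_year(first, *others):
    """Inner-join lists of row dicts on the (country, year) key."""
    merged = {(row["country"], row["year"]): dict(row) for row in first}
    for table in others:
        keyed = {(row["country"], row["year"]): row for row in table}
        # Keep only keys present on both sides, then combine the columns.
        merged = {key: {**merged[key], **keyed[key]}
                  for key in merged if key in keyed}
    return [merged[key] for key in sorted(merged)]


# Hypothetical toy inputs; real column names and values will differ.
population = [{"country": "ITA", "year": 2000, "pop": 57.0},
              {"country": "ITA", "year": 2001, "pop": 57.1}]
income = [{"country": "ITA", "year": 2000, "gdppc": 30.0},
          {"country": "ITA", "year": 2001, "gdppc": 30.5}]
climate = [{"country": "ITA", "year": 2000, "cdd": 600.0}]

panel = merge_by_country_year(population, income, climate)
print(panel)
```

Because the climate table has no row for 2001 in this toy example, the inner join drops that country-year from the merged panel.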

Code Inputs:

  • unavailable raw data

Code Outputs:

  • /DATA/regression/IEA_merged_long_GMFD.dta

Constructing Regression Ready Dataset (GMFD_TINV_clim_*_regsort.dta)

2_construct_regression_ready_data.do can produce both GMFD_TINV_clim_EX_regsort.dta and GMFD_TINV_clim_regsort.dta through the following steps:

  1. Construct reporting regimes and drop data according to encoded data issues
  2. Match product-specific climate data to each product
    • Climate data is product specific because of the encoded data issues. See climate/README.md for more information on this topic.
  3. Find income spline knot location to model a nonlinear effect of income on energy temperature sensitivity
  4. Perform final cleaning steps before constructing the first differenced interacted variables
    • Classify each country into one of 13 UN regions. These regions are used to construct one of the fixed effects used in the analysis
    • Classify countries into income deciles and groups, merging the income groups constructed in step 3 into the main dataset
  5. Construct First Differenced Interacted Variables used in the analysis section
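
Step 5 can be sketched as follows. This is illustrative Python under stated assumptions, not the repository's Stata code: the interaction is formed first and then differenced within each country series, and the names `temp`, `income`, `d_interaction` and the toy values are hypothetical. The .do file defines the actual variables and the order of operations.

```python
from itertools import groupby


def first_difference_interaction(rows, x_name, z_name):
    """Return year-over-year changes in x * z within each country.

    Rows are dicts with "country" and "year". A row is dropped when the
    previous calendar year is missing for that country.
    """
    ordered = sorted(rows, key=lambda r: (r["country"], r["year"]))
    out = []
    for country, group in groupby(ordered, key=lambda r: r["country"]):
        previous = None
        for row in group:
            product = row[x_name] * row[z_name]
            # Difference only consecutive years so gaps are not bridged.
            if previous is not None and row["year"] == previous[0] + 1:
                out.append({"country": country, "year": row["year"],
                            "d_interaction": product - previous[1]})
            previous = (row["year"], product)
    return out


# Hypothetical toy panel with a gap in the ITA series (2002 is missing).
rows = [
    {"country": "ITA", "year": 2000, "temp": 10.0, "income": 2.0},
    {"country": "ITA", "year": 2001, "temp": 12.0, "income": 2.0},
    {"country": "ITA", "year": 2003, "temp": 11.0, "income": 3.0},
    {"country": "FRA", "year": 2000, "temp": 8.0, "income": 1.0},
    {"country": "FRA", "year": 2001, "temp": 9.0, "income": 1.0},
]
diffs = first_difference_interaction(rows, "temp", "income")
print(diffs)
```

Differencing within each country, and only across consecutive years, keeps a change from being computed across a gap in the series or across two different countries.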

Note: at the top of 2_construct_regression_ready_data.do set the global macro model to TINV_clim to produce regression ready data for the main model and to TINV_clim_EX to produce regression ready data for the excluding imputed data model.
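
The effect of the `model` switch on output file names can be summarised with a small lookup. This is an illustrative Python sketch of the naming pattern listed in this README, not code from the repository; the function name `outputs_for_model` and the dictionary keys are hypothetical.

```python
VALID_MODELS = ("TINV_clim", "TINV_clim_EX")


def outputs_for_model(model, root="/DATA/regression"):
    """Return the output paths this README lists for a given model name."""
    if model not in VALID_MODELS:
        raise ValueError(f"model must be one of {VALID_MODELS}, got {model!r}")
    return {
        "regression_ready": f"{root}/GMFD_{model}_regsort.dta",
        "covariates": f"{root}/break_data_{model}.dta",
    }


for name in VALID_MODELS:
    print(name, outputs_for_model(name))
```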

Code Inputs:

  • /DATA/regression/IEA_merged_long.dta

Code Outputs:

  • /DATA/regression/GMFD_TINV_clim*_regsort.dta

Constructing Covariate Intermediate Datasets (break_data_TINV_clim_*.dta)

As well as producing the regression ready datasets, 2_construct_regression_ready_data.do can produce both break_data_TINV_clim.dta and break_data_TINV_clim_EX.dta. These are intermediate datasets that are output for 3x3 array plotting. They contain covariate information for each country-year, including:

  • Income:

    • Decile of overall income distribution of our observations (gpid)
    • Tercile of overall income distribution of our observations (tpid)
    • Income groupings based on the location of the knot (largegpid_*). Note that these vary by product.
      • See the Paper Appendix Section C.3 for discussion of what these knots are.
    • Average values of the long run income covariate, within each CDD tercile (avgInc_tgpid)
    • Maximum values of the long run income covariate within each income group (maxInc_largegpid_other_energy and maxInc_largegpid_electricity)
  • Climate

    • Tercile of the distribution of long run CDDs (tpid)
    • Average value of the long run HDD covariate, within each income tercile (avgHDD_tpid)
    • Average value of the long run CDD covariate, within each income tercile (avgCDD_tpid)
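
How tercile-based covariates of this kind can be computed is sketched below in Python. This is an illustration only: the rank-based cut rule, the function names, and the toy values are assumptions, and the Stata code may break ties or place cut points differently.

```python
def assign_quantile_groups(values, n_groups):
    """Assign each value a group 1..n_groups by rank (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    for rank, index in enumerate(order):
        groups[index] = rank * n_groups // len(values) + 1
    return groups


def mean_by_group(values, groups):
    """Average `values` within each group label."""
    totals = {}
    for value, group in zip(values, groups):
        total, count = totals.get(group, (0.0, 0))
        totals[group] = (total + value, count + 1)
    return {group: total / count
            for group, (total, count) in sorted(totals.items())}


# Hypothetical long-run covariates for six country-years.
income = [1.0, 9.0, 4.0, 2.0, 8.0, 5.0]
cdd = [100.0, 900.0, 300.0, 200.0, 700.0, 500.0]

income_tercile = assign_quantile_groups(income, 3)
avg_cdd_by_income_tercile = mean_by_group(cdd, income_tercile)
print(income_tercile)
print(avg_cdd_by_income_tercile)
```

Calling `assign_quantile_groups(income, 10)` on a larger sample would give decile labels in the same way.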

Note: at the top of 2_construct_regression_ready_data.do set the global macro model to TINV_clim to produce this covariate information for the main model and to TINV_clim_EX to produce it for the excluding imputed data model.

Code Inputs:

  • /DATA/regression/IEA_merged_long.dta

Code Outputs:

  • /DATA/regression/break_data_TINV_clim.dta
  • /DATA/regression/break_data_TINV_clim_EX.dta

Testing for the existence of unit roots in our outcome variable

This test motivates the first differencing performed in Step 5 of Constructing Regression Ready Dataset.

3_unit_root_test_and_plot.do takes the regression ready dataset created in 2_construct_regression_ready_data.do, and tests for the existence of unit roots in the load_pc variable.

  • The code implements the tests described in Appendix Section A.1 of the paper.
  • The figures it outputs appear in the paper as Appendix Figure A.2.
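
For intuition about what a unit root test checks, here is a plain Dickey-Fuller regression in Python using only the standard library. It is not the procedure from Appendix A.1, which this README does not spell out, and the simulated series are made up. The statistic has to be compared with Dickey-Fuller critical values (roughly -2.86 at the 5% level for large samples with an intercept), not with the usual t table.

```python
import math
import random


def dickey_fuller_stat(series):
    """t-statistic on rho in: diff(y)_t = alpha + rho * y_{t-1} + error."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(diffs)
    mean_x = sum(lagged) / n
    mean_y = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, diffs))
    rho = sxy / sxx
    alpha = mean_y - rho * mean_x
    residual_ss = sum((y - alpha - rho * x) ** 2
                      for x, y in zip(lagged, diffs))
    # Standard OLS slope standard error with n - 2 degrees of freedom.
    standard_error = math.sqrt(residual_ss / (n - 2) / sxx)
    return rho / standard_error


rng = random.Random(0)
shocks = [rng.gauss(0.0, 1.0) for _ in range(500)]

white_noise = shocks              # stationary series: no unit root
random_walk = []
level = 0.0
for shock in shocks:              # cumulative sum of shocks: has a unit root
    level += shock
    random_walk.append(level)

stat_noise = dickey_fuller_stat(white_noise)
stat_walk = dickey_fuller_stat(random_walk)
print("white noise:", stat_noise)
print("random walk:", stat_walk)
```

A strongly negative statistic, as for the white-noise series, is evidence against a unit root. A statistic closer to zero, as is typical for a random walk, is consistent with one, and that is the situation in which first differencing is appropriate.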

4_plot_ITA_other_energy_regimes_timeseries.R takes the regression ready dataset created in 2_construct_regression_ready_data.do and plots a simple visualisation of the other fuels time series for Italy.

  • The figure outputted is in the paper as Appendix Figure A.1

Code Inputs:

  • /DATA/regression/GMFD_TINV_clim_regsort.dta

Code Outputs:

  • /OUTPUT/figures/fig_Appendix-A1_ITA_other_fuels_time_series_regimes.pdf
  • /OUTPUT/figures/fig_Appendix-A2_Unit_Root_Tests_p_val_hists_electricity.pdf