Step-by-Step: How the Coincident Consumption Index Was Made

A coincident index provides a gauge for the movement of a key economic variable through the use of related data series. In this case, the coincident consumption index mirrors one of the largest components of GDP, Private Consumption. The index is divided into components, which are weighted in accordance with their relevance to the variable at hand, and estimated using the latest available data for each. You can read more about the index (methodology, usefulness, interpretation) [here](LINK AL NUEVO INSIGHT).

| Component | Weight | Indicators |
| --- | --- | --- |
| Food and Beverages | 22.7% | Baked Goods, Meats, Fruits and Vegetables, Dairy, Beverages, Sweets and Candy |
| Housing and Utilities | 14.5% | Real Estate and Housing Services, Electricity, Heating, Running Water |
| Transportation | 14.5% | Vehicle purchases, Fuels, Transportation Services, Airfare |
| Recreation and Culture | 8.6% | Related goods |
| Clothing and Footwear | 6.8% | Clothing and Footwear production |
| Restaurants and Hotels | 6.6% | Restaurants and Hotels activity |
| Healthcare | 6.4% | Healthcare activity |
| Home Equipment | 5.4% | Appliances, Furniture and Mattresses |
| Communications | 5.3% | Activity in Communications |
| Other Services | 4.3% | Financial Services, Personal and Professional Services |
| Education | 3.2% | Education activity |
| Alcohol and Tobacco | 1.9% | Wine, Cigarettes |

Rather than focusing on what the index looks like and what to use it for, this post will focus on how the Alphacast pipelines engine can be used to build it in a completely automated manner. This means that, as soon as new data is uploaded to the Alphacast site, the index will be updated to reflect it. The first step, then, is identifying the data. As shown above, the index is divided into twelve components, weighted by their importance to private consumption, from Food and Beverages down to Alcohol and Tobacco. Each component has one or more indicators attached, and each indicator comes from a dataset on the site.


The first step to creating any pipeline is clicking on the "create new" button on the site and selecting "pipeline". Then, select the repository in which the pipeline will be "housed" and give the pipeline a name. For the name, I'd recommend using the same one as the final dataset, to avoid confusion (and because the pipeline name is publicly displayed). The repository depends on who you want to have access to it - in a public repository, everyone would be able to modify the pipeline, while a private one would allow for more control over modifications.


The next step is fetching a dataset. Since the index uses many different ones, it doesn't particularly matter which one you begin with - just search for it by name. For example, to use the Construya Index as a proxy for construction activity, you can just type "Construya Index" in the search bar and pick the desired dataset from the results. It's inadvisable to choose the "raw data" version of a dataset, because the data is, as the name would suggest, raw - for instance, if you wanted a seasonally adjusted variable, you would have to do the adjustment manually. Don't forget to click "save" after selecting the dataset!

The next step is merging the dataset with all the other relevant ones. Click "add step below" and look for "merge" in the list. Selecting the dataset to merge with works exactly the same as fetching it: look up its name, type it, select it.

The remaining step is "matching entities". Every dataset has at least one kind of entity and at most two. The one they all have is "date" - the date of each data point. So when matching entities, choose "date" and "date" (the columns might have different names, like date and year, but the principle is the same). The other entity type is "countries" - what the data refers to. These are usually countries, but they can be other kinds of categories too. Either way, if they are available, match them; if not, it's not a problem. The dataset for construction employment, for example, doesn't have this second entity - and the pipeline works just fine.
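For readers who think in code, matching entities is essentially a join on the date column. The sketch below is a rough pandas analogue, not the Alphacast engine itself; the dataset names, column names, and values are all made up for illustration.

```python
import pandas as pd

# Hypothetical dataset with both entities: a date and a country.
wine = pd.DataFrame({
    "Date": pd.to_datetime(["2021-01-01", "2021-02-01", "2021-03-01"]),
    "country": ["Argentina"] * 3,
    "wine_sales": [100.0, 104.0, 98.0],
})

# Hypothetical dataset with only a date entity, under a different name.
car_sales = pd.DataFrame({
    "period": pd.to_datetime(["2021-01-01", "2021-02-01", "2021-03-01"]),
    "car_sales": [30.0, 33.0, 36.0],
})

# Match the two date entities even though the column names differ,
# then drop the now-redundant second date column.
merged = wine.merge(car_sales, left_on="Date", right_on="period", how="left")
merged = merged.drop(columns="period")

print(merged)
```

The left join keeps every row of the first dataset, so a missing month in the second one would show up as an empty value rather than silently dropping the date.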

You can merge all the datasets consecutively or add them one at a time. Either way, they don't all contain useful information - some variables are not relevant and clutter the process (especially in later steps). Regardless of the choice, selecting columns is a must. You can do this while selecting a dataset, or after merging through the "select variables" option. Both alternatives work the same way: they drop the variables that won't be needed from the dataset. Be mindful of the difference between seasonally adjusted and unadjusted variables, and between variables in levels and their MoM/YoY changes!
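As a minimal illustration of column selection, the pandas sketch below keeps one variable out of several look-alikes. The column names are invented, and the choice of the seasonally adjusted level is only an assumption for the example.

```python
import pandas as pd

# Hypothetical merged table with several versions of the same indicator.
merged = pd.DataFrame({
    "Date": pd.to_datetime(["2021-01-01", "2021-02-01"]),
    "food_production": [120.0, 118.0],
    "food_production_sa": [119.0, 119.5],
    "food_production_yoy": [0.031, 0.027],
    "food_production_mom": [0.004, -0.017],
})

# Keep the date plus the one version of the variable that is needed.
keep = ["Date", "food_production_sa"]
selected = merged[keep]

# Simple guard: flag any rate-of-change column picked by accident.
suspicious = [c for c in selected.columns if c.endswith(("_yoy", "_mom"))]

print(selected)
print("Rate-of-change columns selected:", suspicious)
```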


The full list of variables is as follows, and it's fairly simple to figure out which is which within each dataset.

| Component | Weight | Indicators |
| --- | --- | --- |
| Food and Beverages | 22.7% | Food production, Egg production |
| Housing and Utilities | 14.5% | Real Estate and Housing Services, and Public Services activity |
| Transportation | 14.5% | New car purchases, used car purchases, motorcycle purchases, Fuel Production, Air Traffic |
| Recreation and Culture | 8.6% | Related goods production |
| Clothing and Footwear | 6.8% | Clothing and Footwear production |
| Restaurants and Hotels | 6.6% | Restaurants and Hotels activity |
| Healthcare | 6.4% | Healthcare activity |
| Home Equipment | 5.4% | Home Appliances production, Electronics production, and Furniture and Mattresses production |
| Communications | 5.3% | Communications activity |
| Other Services | 4.3% | Financial Services and Personal and Professional Services activity |
| Education | 3.2% | Education activity |
| Alcohol and Tobacco | 1.9% | Wine and Cigarettes production |

The next step is calculating the variables. This was done in two stages. The first is converting all of the variables into a base=100 index, simply to prevent their different scales from muddying up the numbers - for instance, a variable that is measured in millions of pesos would have a thousand times less weight than one measured in thousands. A simple way of doing this is picking a base date (the earliest date at which all variables are present - for instance, in the case of the investment index, it's January 2016), dividing each value of a variable by its value on that date, and multiplying by a hundred. Then, each component is calculated as a weighted sum of its subcomponent indices.
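The rebasing and the within-component weighted sum can be sketched in pandas as follows. This is only an illustration of the arithmetic: the indicator names, the values, and the equal subcomponent weights are assumptions, since the post does not list the weights used inside each component.

```python
import pandas as pd

dates = pd.to_datetime(["2016-01-01", "2016-02-01", "2016-03-01"])

# Hypothetical indicators measured on very different scales.
raw = pd.DataFrame(
    {
        "appliances": [2000.0, 2100.0, 2200.0],  # e.g. units produced
        "furniture": [50.0, 50.0, 55.0],         # e.g. millions of pesos
    },
    index=dates,
)


def rebase(df: pd.DataFrame, base_date: str) -> pd.DataFrame:
    """Divide each column by its value on base_date and multiply by 100."""
    base_values = df.loc[pd.Timestamp(base_date)]
    return df.div(base_values) * 100


indexed = rebase(raw, "2016-01-01")

# Assumed subcomponent weights (not from the post); they sum to 1.
sub_weights = pd.Series({"appliances": 0.5, "furniture": 0.5})

# Component index = weighted sum of the rebased subcomponent indices.
home_equipment = indexed.mul(sub_weights, axis=1).sum(axis=1)

print(indexed)
print(home_equipment)
```

Because every rebased series equals 100 on the base date and the weights sum to 1, the component index also equals 100 on that date.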


The next step is calculating the Consumption Index itself, which is done by multiplying each subindex by its weight and adding up the results. Since they all use base 100 variables by construction, their base values don't need to be adjusted. After this is done, the remainder is simple: filter out the input variables (Wine, Meats, Car Sales, etc.) to have a cleaner presentation of the final dataset.
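A rough pandas sketch of this aggregation, using the component weights from the table, might look like the following. The component values are hypothetical. Note that the published weights are rounded and add up to 100.2, so this sketch rescales them to sum to 1; whether the original pipeline does the same is not stated, so treat that step as an assumption.

```python
import pandas as pd

# Component weights in percent, as listed in the table above.
weights = pd.Series({
    "Food and Beverages": 22.7,
    "Housing and Utilities": 14.5,
    "Transportation": 14.5,
    "Recreation and Culture": 8.6,
    "Clothing and Footwear": 6.8,
    "Restaurants and Hotels": 6.6,
    "Healthcare": 6.4,
    "Home Equipment": 5.4,
    "Communications": 5.3,
    "Other Services": 4.3,
    "Education": 3.2,
    "Alcohol and Tobacco": 1.9,
})

# Assumption: rescale the rounded weights so they sum to exactly 1.
weights = weights / weights.sum()

# Hypothetical component indices, all equal to 100 on the base date.
dates = pd.to_datetime(["2016-01-01", "2016-02-01"])
components = pd.DataFrame(100.0, index=dates, columns=weights.index)
components.loc[dates[1], "Food and Beverages"] = 110.0

# Consumption index = weighted sum of the component indices.
consumption_index = components.mul(weights, axis=1).sum(axis=1)

print(consumption_index)
```

In this toy example only Food and Beverages moves, so the change in the overall index is that component's 10-point rise scaled by its share of the total weight.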

You can also apply transformations, so users can have access to a more "complete" dataset: seasonal adjustment, YoY change, % of GDP, constant prices, whatever. Select the variables within the list (this is important if there are multiple transformations) and click save.
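To make one of these transformations concrete, here is a small pandas sketch of a year-over-year change on a monthly series. The series is invented, and the sketch only shows the arithmetic; it is not how the Alphacast transformation step is implemented.

```python
import pandas as pd

# Hypothetical monthly index: 100 for twelve months, then 105 for twelve more.
dates = pd.date_range("2016-01-01", periods=24, freq="MS")
level = pd.Series([100.0] * 12 + [105.0] * 12, index=dates, name="consumption_index")

# YoY change in percent: each month against the same month one year earlier.
yoy = (level / level.shift(12) - 1) * 100

# MoM change in percent: each month against the previous month.
mom = (level / level.shift(1) - 1) * 100

print(yoy.dropna().head())
```

The first twelve months have no year-earlier value to compare against, so their YoY change is empty. This is worth keeping in mind when the transformed variables appear to start later than the index itself.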


The final step is simple: publish to dataset. Pick a repository for the dataset to be listed, and pick a name for the dataset (it's best if it keeps with the general Alphacast formula of Topic - Country - Dataset Name - Frequency). Click "save", run the pipeline, and enjoy the results! If there are any errors, check for unsaved changes, bad formulas, or even typos. Voila!

Written by Maia Mindel

Macroeconomic analyst at Alphacast. Following inflation, activity, and trade.

Published in

My Public Repo
