Step-by-Step: How the Coincident Investment Index Was Made

A coincident index provides a gauge for the movement of a key economic variable through the use of related data series. In this case, the coincident investment index mirrors one of the two largest components of GDP, Private Fixed Investment. The index is divided into components, which are weighted in accordance with their relevance to the variable at hand, and estimated using the latest available data for each. You can read more about the index (methodology, usefulness, interpretation) here.

| Component | Weight | Indicators |
| --- | --- | --- |
| Construction | 47.2% | Construction Activity, Construction Employment |
| Machinery and Equipment | 41.4% | Machinery Production, Capital Goods Imports |
| Transportation Equipment | 11.3% | Utility Vehicles Sales, Transportation Imports |

Rather than focusing on what the index looks like and what to use it for, this post will focus on how the Alphacast pipelines engine can be used to build it in a completely automated manner. This means that, as soon as new data is uploaded to the Alphacast site, the index will be updated to reflect it. The first step, then, is identifying the data. As said above, the index is divided into three components, weighted by their importance to private fixed investment: Construction, Machinery and Equipment, and Transportation Equipment. Each component has a number of data sources attached (two in principle, but not necessarily).

The first step to creating any pipeline is clicking on the "create new" button on the site and selecting "pipeline". Then, select the repository in which the pipeline will be "housed" and give it a name. For the name, I'd recommend using the same one as the final dataset, to avoid confusion (and because the pipeline name is publicly displayed). For the repository, it depends on who you want to have access to it - in a public repository, everyone would be able to modify it, while a private one would allow for more control over modifications.

The next step is fetching a dataset. Since the index uses many different ones, it doesn't really matter which one you begin with - just search for it by name. For example, using the Construya Index as a proxy for construction activity, you can just type "Construya Index" in the search bar and look for the desired dataset. It's inadvisable to choose the "raw data" version of this dataset, because the data is, as the name would suggest, raw - for instance, if you wanted a seasonally adjusted variable, you would have to do the adjustment manually. Don't forget to click "save" after selecting the dataset!

The next step is merging the dataset with all the other relevant ones. Click "add step below" and look for "merge" in the list. Selecting the dataset to merge with works exactly the same as fetching it: look up its name, type it, select it.

The remaining step is "matching entities". Every dataset has at least one kind of entity and at most two. The one they all have is "date" - the date of each data point. So when matching entities, choose "date" and "date" (the two might have different names, like date and year, but the principle is the same). The other entity type is "countries" - what the data refers to. These are often countries, but sometimes other kinds of categories too. Either way, if they are available, match them; if not, it's not a problem. The dataset for construction employment doesn't have this second entity - and the pipeline works just fine.
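
For readers who think in code, the idea behind matching on the date entity can be sketched in plain Python. This is only an illustration, not how the Alphacast engine is implemented: the function name `merge_on_date`, the column names, and the `period` key are made up, the join is assumed to keep every date found in either dataset, and only the January 2016 figures (327.5 and 87.3) come from this post.

```python
def merge_on_date(left, right, left_key="date", right_key="date"):
    """Join two lists of row dicts on their date entity.

    Assumption: every date found in either dataset is kept (an outer join).
    The date columns may have different names, hence the two key arguments.
    """
    merged = {}
    for rows, key in ((left, left_key), (right, right_key)):
        for row in rows:
            date = row[key]
            target = merged.setdefault(date, {"date": date})
            target.update({k: v for k, v in row.items() if k != key})
    # ISO-style "YYYY-MM" strings sort chronologically.
    return [merged[date] for date in sorted(merged)]


# Illustrative rows: only the 2016-01 values appear in the post.
activity = [
    {"date": "2016-01", "construction_activity": 327.5},
    {"date": "2016-02", "construction_activity": 310.0},
]
employment = [
    {"period": "2016-01", "construction_employment": 87.3},
    {"period": "2016-03", "construction_employment": 88.0},
]

merged = merge_on_date(activity, employment, right_key="period")
for row in merged:
    print(row)
```

Dates present in only one dataset simply end up with fewer columns, which is why a missing second entity (or a missing month) does not break the merge in this sketch.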

You can merge all the datasets first (in this case, all six of them) and select columns at the end, or you can select columns after each merge. Either way, the merged datasets don't all contain useful information - some variables are not relevant and clutter the process (especially in later steps), so selecting columns is a must. Once the useful datasets are loaded, whether partially or fully (for large datasets like the Industrial Production Index I'd recommend filtering variables after every merge), select the useful variables from the list and click "save". Be mindful of the difference between seasonally adjusted and regular variables, and between levels and MoM/YoY changes!
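
In code terms, selecting columns amounts to keeping a short whitelist of variables plus the date. The sketch below is a hypothetical illustration: the function `select_columns`, the column names, and all the values are invented to show the difference between a level, its seasonally adjusted version, and a MoM change.

```python
def select_columns(rows, keep, key="date"):
    """Keep only the listed columns (plus the date key) in every row."""
    wanted = set(keep) | {key}
    return [{k: v for k, v in row.items() if k in wanted} for row in rows]


# Hypothetical merged rows with made-up column names and values.
merged = [
    {
        "date": "2016-01",
        "Activity": 301.0,                        # unadjusted level
        "Activity - seasonally adjusted": 310.0,  # the one we want
        "Activity - MoM": 1.2,                    # a change, not a level
    },
]

selected = select_columns(merged, keep=["Activity - seasonally adjusted"])
print(selected)
```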

| Indicator | Weight | Alphacast Source |
| --- | --- | --- |
| Construction Activity | 33.2% | Construya Index |
| Construction Employment | 14.2% | EIL - Labor Indicators by Sector |
| National Machinery & Equipment | 15.5% | Industrial Production Index |
| Imported Machinery & Equipment | 25.9% | Monthly Trade Statistics |
| National Transportation Equipment | 7.4% | ADEFA Car Statistics |
| Imported Transportation Equipment | 3.9% | Monthly Trade Statistics |
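
The indicator weights listed above can also be turned into each indicator's share within its own component, which is the figure needed when building the subindices. The sketch below copies the weights from the table; the grouping into components follows the first table in this post, and the function name `within_component_shares` is made up for the example.

```python
# Indicator weights (in % of the total index), grouped by component.
WEIGHTS = {
    "Construction": {
        "Construction Activity": 33.2,
        "Construction Employment": 14.2,
    },
    "Machinery and Equipment": {
        "National Machinery & Equipment": 15.5,
        "Imported Machinery & Equipment": 25.9,
    },
    "Transportation Equipment": {
        "National Transportation Equipment": 7.4,
        "Imported Transportation Equipment": 3.9,
    },
}


def within_component_shares(weights):
    """Rescale indicator weights so they sum to 1 inside each component."""
    shares = {}
    for component, indicators in weights.items():
        total = sum(indicators.values())
        shares[component] = {name: w / total for name, w in indicators.items()}
    return shares


shares = within_component_shares(WEIGHTS)
for component, indicators in shares.items():
    for name, share in indicators.items():
        print(f"{component:26s} {name:36s} {share:6.1%}")
```

For Construction this gives roughly 70% for activity and 30% for employment.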

The full list of variables is there, and it's fairly simple to figure out which is which within each dataset.

After repeating this process for all the relevant datasets (see the table above for which variables to use and where they come from), the next step is mostly for convenience: renaming the variables. This helps because, in some cases, variable names might not be comfortable to use or easy to remember (for larger lists of variables, this is really important). The process is simple: rename the variables you want, and keep the rest the same. It's really important that no input variable shares its name with a final output: for instance, an input variable called "Investment Index" wouldn't make sense to keep, so you'd have to rename it.
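
The renaming rule can be expressed as a small check. This is a hypothetical sketch rather than anything the pipeline engine exposes: `rename_columns`, the `reserved` argument, and the column names are invented to show the idea of refusing input names that collide with a final output.

```python
def rename_columns(rows, mapping, reserved=()):
    """Rename columns, refusing names that clash with final outputs."""
    columns = sorted({name for row in rows for name in row})
    new_names = [mapping.get(name, name) for name in columns]
    clashes = sorted(set(new_names) & set(reserved))
    if clashes:
        raise ValueError(f"input columns named like final outputs: {clashes}")
    if len(set(new_names)) != len(new_names):
        raise ValueError("two columns would end up with the same name")
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]


# Hypothetical long column name shortened to something easier to type.
rows = [{"date": "2016-01", "Activity - seasonally adjusted": 310.0}]
renamed = rename_columns(
    rows,
    {"Activity - seasonally adjusted": "Construction Activity"},
    reserved={"Investment Index"},
)
print(renamed)
```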

The next step is calculating the variables, which is done in two steps. The first is converting all of the variables into a base=100 index, simply to prevent their different scales from muddying up the numbers - for instance, a variable that is measured in millions of pesos would have a thousand times less weight than one measured in thousands. A simple way of doing this is taking the variable's value at the base date (which will be the earliest date at which all variables are present - in the case of the investment index, it's January 2016), dividing each value by that figure, and multiplying by a hundred. For example, the Construya Index has a January 2016 base value of 327.5 and a weight within the construction subindex of 70%, while the employment indicator has a base value of 87.3 and a weight of 30%.
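
As a rough sketch of the arithmetic (not the actual pipeline formula), rebasing and the construction subindex could look like this in Python. The January 2016 base values and the 70%/30% weights come from the paragraph above; the February 2016 values and the function names are made up for illustration.

```python
def rebase_to_100(series, base_date):
    """Divide every value by the base-date value and multiply by 100."""
    base_value = series[base_date]
    return {date: value / base_value * 100 for date, value in series.items()}


def weighted_subindex(indexed_series, weights):
    """Weighted average of base=100 series; weights are assumed to sum to 1."""
    dates = sorted(set.intersection(*(set(s) for s in indexed_series.values())))
    return {
        date: sum(weights[name] * series[date] for name, series in indexed_series.items())
        for date in dates
    }


# 2016-01 values are the base values quoted in the text; 2016-02 is hypothetical.
construya = {"2016-01": 327.5, "2016-02": 343.875}
employment = {"2016-01": 87.3, "2016-02": 87.3}

indexed = {
    "activity": rebase_to_100(construya, "2016-01"),
    "employment": rebase_to_100(employment, "2016-01"),
}
construction = weighted_subindex(indexed, {"activity": 0.7, "employment": 0.3})
print(construction)
```

With these made-up February figures, activity rises 5% and employment is flat, so the subindex moves from 100 to about 103.5.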

The second step is calculating the Investment Index itself, which is done by multiplying each subindex by its weight and adding up the results. Since the subindices are built from base 100 variables, their base values don't need to be adjusted. After this is done, the remainder is simple: filter out the input variables (Construction Activity & Employment, Car Sales, etc.) to have a cleaner presentation of the final dataset.
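
A minimal sketch of this weighted sum follows. The component weights are the ones from the first table in this post; the subindex readings are hypothetical, and dividing by the sum of the weights is an assumption made here because the rounded weights add up to 99.9 rather than 100.

```python
# Component weights as published (rounded, so they add up to 99.9).
COMPONENT_WEIGHTS = {
    "Construction": 47.2,
    "Machinery and Equipment": 41.4,
    "Transportation Equipment": 11.3,
}


def investment_index(subindices, weights=COMPONENT_WEIGHTS):
    """Weighted average of the subindices for a single date.

    Assumption: weights are normalized by their sum so that the index
    equals 100 when every subindex equals 100.
    """
    total = sum(weights.values())
    return sum(weights[name] * subindices[name] for name in weights) / total


base = {name: 100.0 for name in COMPONENT_WEIGHTS}
# Hypothetical subindex readings for some later month.
later = {
    "Construction": 103.5,
    "Machinery and Equipment": 98.0,
    "Transportation Equipment": 110.0,
}

print(round(investment_index(base), 2))
print(round(investment_index(later), 2))
```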

You can also apply transformations, so users can have access to a more "complete" dataset: seasonal adjustment, YoY change, % of GDP, constant prices, and so on. Select the variables to transform from the list (this is important if there are multiple transformations) and click "save".
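
To make one of these transformations concrete, the YoY (or MoM) change of a monthly series is just a percentage change against an earlier observation. The sketch below is a plain-Python illustration with made-up data and a made-up function name; it says nothing about how the pipeline engine computes it internally.

```python
def pct_change(values, lag):
    """Percentage change versus the value `lag` periods earlier.

    Returns None where there is not enough history.
    """
    changes = []
    for i, value in enumerate(values):
        if i < lag:
            changes.append(None)
        else:
            changes.append((value / values[i - lag] - 1) * 100)
    return changes


# Hypothetical monthly index: flat at 100 for a year, then 110.
monthly_index = [100.0] * 12 + [110.0]

yoy = pct_change(monthly_index, lag=12)  # year-over-year for monthly data
mom = pct_change(monthly_index, lag=1)   # month-over-month
print(yoy[-1], mom[-1])
```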

The final step is simple: publish to dataset. Pick a repository in which the dataset will be listed, and pick a name for the dataset (it's best if it follows the general Alphacast formula of Topic - Country - Dataset Name - Frequency). Click "save", run the pipeline, and enjoy the results! If there are any errors, check for unsaved changes, bad formulas, or even typos. Voila!

Written by Maia Mindel

Macroeconomic analyst at Alphacast. Following inflation, activity, and trade.

This repository contains a sample of SEIDO's economic research products, including High Frequency CPI, Macro Weekly Reports, Public Opinion Reports and Monthly Macro Updates.

Part of SEIDO

Specialized in macroeconomic research, high-frequency inflation tracking, and Latin American public opinion surveys
