Hands on tracking Consumption and Investment in real time

alphacast_logo_1.png

Hands on tracking Consumption and Investment in real time

By Maia Mindel (mmindel@alphacast.io)


Read more Alphacast Highlights here


In this hands-on guide, we will show you how to use Alphacast Pipelines to create coincident indeces and monitor key economic variables. What is a coincident index? A coincident index provides a gauge for the movement of a key economic variable through the use of related data series. In this case, the coincident investment index mirrors two of the two largest components of GDP, Private Consumption and Private Fixed Investment.

Since both indeces have a certain complexity, the methodology will be shown separately, though there are major similarities between the two.

How to make the Investment Index

In this case, we will show you how to create the coincident investment index. The index is divided in sections, which are weighed in accordance with their relevance to the variable at hand, and estimated using the latest available data for each.

Given that the index uses the Alphacast pipelines engine and data available on the platform's database, that means that the Index will be updated to reflect it as soon as the data is uploaded in the Alphacast site - in a completely automated manner.

The first step is identifying the data series to use. To do this, it must be taken into account that the index is divided into three components, reflected their importance to private fixed investment: Construction, Machinery and Equipment, and Transport Equipment. Each indicator has a number of data sources attached (in this case two, but not necessarily). The components for each indicator are as detailed below:

ComponentWeightIndicators
Construction47.2%Construction Activity, Construction Employment
Machinery and Equipment41.4%Machinery Production, Capital Goods Imports
Transportation Equipment11.3%Utility Vehicles Sales, Transportation Imports

The sources to be used are:

IndicatorWeightAlphacast Source Source
Construction Activity33.2%Construya Index
Construction Employment14.2%EIL - Labor Indicators by Sector
National Machinery & Equipment15.5%Industrial Production Index
Imported Machinery & Equipment25.9%Monthly Trade Statistics
National Transportation Equipment7.4%ADEFA Car Statistics
Imported Transportation Equipment3.9%Monthly Trade Statistics

The first step, then, is creating a pipeline. To do this, click on the "create" button and select "pipeline". Name the pipeline in any way that is easy to find later (especially since, if it is public, then the name is also publicly displayed), and select a relevant repository for it - in a public repository, everyone would be able to modify it, while a private one would allow for more control over modifications.

image.png

image.png

The next step is fetching a dataset. Since the index utilizes many different ones, it's not particularly relevant which ones to begin with - just search for it by name. For example, using the Construya Index as a proxy for construction activity, you can just look up "Construya Index" on the search bar and look for the desired dataset. It's inadvisable to choose the "raw data" version of this dataset, because the data is, as the name would sugges, raw - for instance, if you wanted a seasonally adjusted variable, you would have to do it manually. If the dataset has more than one column, you can select it during the fetch process. Don't forget to click "save" after selecting the dataset!

image.png

image.png

The next step is merging the dataset with all the other relevant ones. Click "add step below" and look for "merge" in the list. Selecting the dataset to merge with works exactly the same as fetching it: look up its name, type it, select it. If you need to modify the dataset in any way (rename columns, change the date frequency, etc) you can do this by selecting "fetch and modify data before merge - Add step above in new branch".

image.png

The remaining step is "matching entities". All datasets have at most two kinds of entities, at least one. The one they all have is "date" - the date for each data point. So when matching entities, choose "date" and "date" (they might have different names, like date and year, but the principle is the same). The remaining entity type is "countries" - what the data refers to. These are often countries, but sometimes other kinds of data too. Either way, if they are available, match them; if not, it's not a problem. The dataset for construction employment doesn't have an entity - and the pipeline works just fine.

You can merge every dataset consecutively (in this case, all six of them) or you can do it one at a time. But either way, the truth is that they don't all contain useful information - some variables are not relevant and clutter the process (especially in later steps). Regardless of the choice, selecting columns is a must. You can do this while selecting a dataset, or after merging through the "select variables" option. Either alternative works the same, by dropping the variables that won't be needed from the dataset. Be mindful of the difference between seasonally adjusted and regular variables, and between variables and MoM/YoY changes!

image.png

A step to take before calculating the variables is limiting the date range - since not all variables have the same frequency, it could lead to problems. The way to do this is to select the "filter dates" step, then choose the initial date (in this case, January 1st, 2016), and then choose "latest available" as the end date.

image.png

The final step is calculating the variables and the index itself. This is done in two steps. The first is converting all of the variables into a base=100 index, simply to prevent their different scales from muddying up the numbers - for instance, a variable that is measured in millions of pesos would have a thousand times more weight than one measured in thousands. A simple way of doing this is taking the variable for the base date to use (which will be the soonest date at which all variables are present - for instance, in the case of the investment index, it's January 2016), dividing each value by that figure, and multiplying by a hundred. Then, each component is calculated as a weighted sum of the subcomponent indeces.

The second half of the last step is calculating the Consumption Index itself, which is done using all the subindeces and multiplying them by their weight. Since they all use base 100 variables by construction, their base values don't need to be adjusted. After this is done, the remainder is simple: filter out the input variables (Construction Employment, Car Sales, etc) to have a cleaner presentation of the final dataset.

You can publish the dataset in any repository, so the data is available directly to anyone else (and it's best if it keeps with the general Alphacast formula of Topic - Country - Dataset Name - Frequency), or you can use it for, for example, charts straight away. If there are any errors, check for unsaved changes, bad formulas, or even typos. Voila!

You can find the pipeline for the Argentina Coincident Investment Index here.

How to make the Consumption Index

The process to make the Coincident Consumption Index is similar, since it has the samer overall logic, although the higher complexity of the data series involved results in a more nuanced process to make it. Just like the Investment Index, it has several sections with their own individual components, but it also has several components that are made up of multiple data series as well.

The components and data series are:

ComponentWeightIndicators
Food and Beverages22.7%Food production, Egg production
Housing and Utilities14.5%Real Estate and Housing Services, and Public Services activity
Transportation14.5%New car purchases, used car purchases, motorcycle purchases, Fuel Production, Air Traffic
Recreation and Culture8.6%Related goods production
Clothing and Footwear6.8%Clothing and Footwear production
Restaurants and Hotels6.6%Restaurants and Hotels activity
Healthcare6.4%Healthcare activity
Home Equipment5.4%Home Appliances production, Electronics production, and Furniture and Mattresses production
Communications5.3%Communications activity
Other Services4.3%Financial Services and Personal and Professional Services activity
Education3.2%Education activity
Alcohol and Tobacco1.9%Wine and Cigarettes production

The actual technical steps to make the index are very similar: create a pipeline, fetch a first dataset, merge with the rest of the relevant datasets (selecting or filtering only the relevant variables), rename columns for greater clarity. The datasets used for this are listed above, though most variables come from either the Industrial Production Index or the Monthly GDP Estimate (EMAE) - both official data.

image.png

The only difference with the investment index, in terms of methodology, is that the subcategories for each component need to be calculated separately - for example, the "Meat" component of Food and Beverages has to be done before the category, so it can be included. The actual procedure is the same: add up a weighted sum of the variables in a base-100 index.

After calculating each data series, which can be done in the same way as the investment index, one can also publish it in a dataset, transform the data, and/or create charts as desired.

You can find the pipeline for the Argentina Coincident Consumption Index here.

Usefulness and Accuracy

The usefulness of constructing, or using, a coincident index is that one can receive a glimpse of upcoming economic trends a few months in advance, and therefore adjust behaviors in turn. A coincident index can provide an useful guide to both observing and analyzing reality, as well as delineating policy recommendations. For instance, it can be seen that the growth rate of investment declined in recent months, and while it has recovered from the 2020 recession rapidly, it is still roughly 10% below 2018 levels. Additionally, it can be broadly inferred that Construction and Machinery and Equipment have seen a smaller depression and a faster recovery, while Transportation Equipment still has a long ways to go.

As long as the indicator proves reliable in forecasting changes in the economic variable at hand, it has a variety of uses related to forecasting, analysis, and investment. As seen above, both the YoY change and the 3-month moving average annual change are fair approximations of the INDEC investment data, meaning it can be relied upon to present an accurate approximation of the upcoming trends in investment.

Maia Mindel

Written by

Maia Mindel

Macroeconomic analyst at Alphacast. Following inflation, activity, and trade.

Alphacast

Part of

Alphacast

Related insights

  • Read more...

    How to convert a series to the official USD or Blue Chip Swap?

    The pipeline engine "Apply Transform" step incorporates a new transformation that allows changing the source unit: Convert to dollar official or to Blue Chip Swap (for Argentina only).

    The pipeline is separated into Two steps

    1. Select ("Fetch") the dataset and its columns
    2. "Apply transform"
  • Read more...

    How is a Time Series seasonally adjusted?

    Removing seasonality from time series is always complicated and laborious. The standard deseasonalization method is X-13ARIMA-SEATS or some other version of the methodologies maintained by the United States Census Bureau. Denationalizing usually includes using some application such as Eviews, Demetra or Stata or Python, combining it with the files that are downloaded

  • Read more...

    How to merge the content of two datasets?

    Surely in your usual work with data, you needed to join several data sources and if your calculation tool is Excel you may solve it with some combination of the VLOOKUP, HLOOKUP, and/or MATCH formulas. Excel is a great solution in many cases, but it can be difficult in some scenarios.

  • Read more...

    How to calculate a monthly end-of-period series?

    Pipelines are an easy way to apply transformations to datasets that update automatically every time data is updated.

    Suppose we have a daily data series for which we need only the last value of each month. It is possible to do that in Excel. For example, an auxiliary column is added that