How to calculate a variable with the formula editor

The calculate variable step is an all-purpose formula parser to create new variables in a more "Excelish" way. There are dozens of operators, comparisons, and functions that allow endless combinations. For a comprehensive list of functions and operands see here

Video Tutorial (Spanish)

First step. Writing a basic formula

The first step is to create a pipeline (click on "Create new --> Pipeline" on the top right) and select a data source, followed by "add step below --> Calculate Variable"

There are a couple of rules to creating new variables

  • Select a name on the textbox. If you select a variable name that already exists it will be overwritten
  • You can use @ to select variables of the dataset. Start writing the name of the variables and options will show up
  • You can do basic arithmetic operations such as (+, -, *, /, ^): @column1 + 3
  • You can use functions by typing the name followed by parenthesis and commas to separate parameters: round(@column) + @column2. Full list of functions is here: here.
  • The formula will always return the same number of rows as the original datasets, even if it involves calculations that usually reduce the dimensionality such as "Total sum per group". Check group and rolling windows functions below.
@Close + @Volume * 2)

image.png

Row Functions

Row functions are those that are calculated using only information from a single row in the dataset. That is, no values from other periods or entities are used in the calculation. This includes functions such as power, exp, sqrt, log, abs, trunc, round,sin, cos, etc.

log(@Close)+ sqrt(@Volume))

Nested calculations

Calculations can be nested inside other calculations to be used as parameters

log(exp(@Close))

Group Functions

Group functions are used to make calculations for subgroups within the dataset. The data is split by the unique values of the entities defined in the parameters and calculations are done for each separate group. Entities can be the real entities of the dataset or any other variable or even the result of nested calculations.

There are dozens of group functions. See a comprehensive list here

Example: Calculate for each ticker the cumulative price change vs the first price 
@Close / first(@Close, @Ticker)

Example: Calculate the cumulative sales per salesperson
cumsum(@sales, @salesperson)

Example: Calculate for each date the share of sales explained by each product vs the total sales of that date
@sales / sum(@sales, @date)

Rolling Window Functions

Rolling windows are a special class of group function where calculations are made within a group AND within a rolling window. The user has to define the number of periods to be included in the rolling calculation

Example: calculate the 20 day moving average for each ticker
rollingmean(@Close, 20, @Ticker)

Date and time Functions

Date and time functions are performed over DateTime variables (usually the @Date entity of the dataset) and include functions such as year, month or day, week, dao_of_year and more

Example: Extract the year from the date entity
year(@Date)

Luciano Cohan

Written by

Luciano Cohan

Co-Fundador de Alphacast. Ex Subsecretario de Programación Macroeconómica. Data Science. Creando una plataforma para el trabajo colaborativo en economías

Related insights

  • Read more...

    A short guide to Ecuadorian macro and financial data

    Interested in activity, prices, monetary, fiscal, external sector, and financial data for Ecuador? There are loads of datasets, available in both the Central Bank of Ecuador (BCE) and **National Institute of Statistics (

  • Read more...

    A short guide to Chilean macro data

    Interested in activity, prices, monetary, fiscal, external sector, and financial data for Chile? There are loads of datasets, see for example this Repository which has official statistics from INEI and BCRP. However, this short guide will help you find the our "must-see" datasets.

    Chile Country Profile Dashboard

    The starting point is

  • Read more...

    A short guide to Uruguayan macro and financial data

    Interested in activity, prices, monetary, fiscal, external sector, and financial data for Uruguay? There are loads of datasets, see for example this Repository which has oficial statistics from from different sources, including both government statistics and datasets produced by private entities. If you are looking for data about Uruguay you'll

  • Read more...

    A short guide to Mexican macro and financial data

    Interested in activity, prices, monetary, fiscal, external sector, and financial data for Mexico? There are loads of datasets, see for example this Repository which has oficial statistics from from different sources, including both government statistics and datasets produced by private entities. If you are looking for data about Mexico you'll

  • Read more...

    A short guide to Peruvian macro data

    Interested in activity, prices, monetary, fiscal, external sector, and financial data for Peru? There are loads of datasets, see for example this Repository which has official statistics from INEI and BCRP. However, this short guide will help you find the our "must-see" datasets.

    Peru Country Profile Dashboard

    The starting point is