Friday MEGA update - September 10th

We don't want to exaggerate, but this was probably the most productive week in many months. We have so many things to share that we don't know where to start: pipelines, filters, new API, interpolation and resampling, improved search, and more! So "Let's go in parts," said Jack the Ripper.

New Pipelines editor

Pipelines are finally here! You can now create your own datasets derived from any other dataset on the platform. Starting from an input dataset, you can filter and rename variables, rescale the frequency, and combine and transform variables. A new dataset is then created in your repository, and it is updated every time the original dataset is updated.

Pipelines - Luciano.drawio (1).png

The Datasets view now has a new button, Create Pipe, which starts a new pipeline.

new pipe.png

Creating a pipeline involves 5 steps:

  1. Select the input and output repository and datasets (selected by default).
  2. Select the output date frequency. If it differs from the input frequency, rescaling options will appear (see below).
  3. Select the names of the new variables. Only variables with names will be part of your dataset. If required, select the interpolation and rescaling options.
  4. Create linear combinations of variables and choose transformations for your variables from a long list of options.
  5. Save the pipeline. You will be redirected to your new dataset, where you can fill in the metadata while the pipeline is running. That's it!
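To make steps 3 and 4 more concrete, here is a minimal sketch in plain Python of what "keep only the named variables" and "create a linear combination" could look like on a tiny dataset. It is only an illustration: the function names, the variable names, and the row layout are assumptions made for this example and are not the platform's actual implementation.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def select_and_rename(rows: List[Row], names: Dict[str, str]) -> List[Row]:
    """Step 3 (illustrative): keep only variables that received a new name."""
    return [
        {"Date": row["Date"], **{new: row[old] for old, new in names.items()}}
        for row in rows
    ]


def linear_combination(rows: List[Row], output: str, weights: Dict[str, float]) -> List[Row]:
    """Step 4 (illustrative): add a variable equal to a weighted sum of others."""
    return [
        {**row, output: sum(w * row[var] for var, w in weights.items())}
        for row in rows
    ]


# Hypothetical input dataset; "notes" gets no new name, so it is dropped.
input_rows = [
    {"Date": "2021-07-01", "exports_usd": 120.0, "imports_usd": 100.0, "notes": "a"},
    {"Date": "2021-08-01", "exports_usd": 130.0, "imports_usd": 115.0, "notes": "b"},
]

renamed = select_and_rename(
    input_rows, {"exports_usd": "Exports", "imports_usd": "Imports"}
)
result = linear_combination(
    renamed, "Trade balance", {"Exports": 1.0, "Imports": -1.0}
)

for row in result:
    print(row)
```

In this sketch the unnamed "notes" column disappears from the output, which mirrors the rule that only variables with names become part of the new dataset.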

Interpolation & Rescaling

If the output frequency differs from the input frequency, you need to define the rescaling and interpolation options.

Pipelines - Luciano.drawio (2).png

When going from a lower frequency to a higher one (e.g. monthly to daily), you have to select one of the 8 interpolation options: repeat the value, linear, quadratic, cubic, polynomial, piecewise, spline, and Krogh interpolation.
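As a rough illustration of two of these options, the sketch below upsamples a monthly series to daily values using "repeat the value" and linear interpolation. It uses only the Python standard library. The `upsample` function and the sample data are assumptions made for this example, not the platform's code, and the other six methods are not covered.

```python
from datetime import date, timedelta
from typing import List, Tuple

Point = Tuple[date, float]


def upsample(points: List[Point], method: str) -> List[Point]:
    """Fill in daily values between consecutive observations.

    `points` must be non-empty and sorted by date.
    """
    out: List[Point] = []
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        span = (d1 - d0).days
        for i in range(span):
            if method == "repeat":
                value = v0  # carry the last known value forward
            elif method == "linear":
                value = v0 + (v1 - v0) * i / span  # straight line from v0 to v1
            else:
                raise ValueError(f"unsupported method: {method}")
            out.append((d0 + timedelta(days=i), value))
    out.append(points[-1])  # the final observation is kept as-is
    return out


# Hypothetical monthly observations.
monthly = [(date(2021, 7, 1), 100.0), (date(2021, 8, 1), 131.0)]

repeated = upsample(monthly, "repeat")
linear = upsample(monthly, "linear")

print(repeated[15])  # value on 2021-07-16 with "repeat"
print(linear[15])    # value on 2021-07-16 with "linear"
```

With "repeat", every day in July keeps the July 1st value, while "linear" moves gradually toward the August 1st value.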

Alternatively, when going from a higher frequency to a lower one (e.g. monthly to quarterly), there are 6 options: average all the values, sum them, or select the min, max, first, or last value.
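The six downsampling options can be sketched in a few lines of plain Python. The example below groups monthly values by calendar quarter and applies the chosen aggregation. The names `AGGREGATORS` and `to_quarterly` and the sample data are assumptions made for this illustration, not the platform's implementation.

```python
from datetime import date
from statistics import mean
from typing import Callable, Dict, List, Tuple

# One aggregation function per option (names are illustrative).
AGGREGATORS: Dict[str, Callable[[List[float]], float]] = {
    "average": mean,
    "sum": sum,
    "min": min,
    "max": max,
    "first": lambda values: values[0],
    "last": lambda values: values[-1],
}


def to_quarterly(points: List[Tuple[date, float]], method: str) -> Dict[Tuple[int, int], float]:
    """Aggregate (date, value) pairs into (year, quarter) buckets.

    `points` must be sorted by date so that "first" and "last" are meaningful.
    """
    aggregate = AGGREGATORS[method]
    groups: Dict[Tuple[int, int], List[float]] = {}
    for d, value in points:
        key = (d.year, (d.month - 1) // 3 + 1)  # months 1-3 -> Q1, 4-6 -> Q2, ...
        groups.setdefault(key, []).append(value)
    return {key: aggregate(values) for key, values in sorted(groups.items())}


# Hypothetical monthly observations covering two quarters.
monthly = [
    (date(2021, 1, 1), 10.0), (date(2021, 2, 1), 20.0), (date(2021, 3, 1), 30.0),
    (date(2021, 4, 1), 40.0), (date(2021, 5, 1), 50.0), (date(2021, 6, 1), 60.0),
]

for method in AGGREGATORS:
    print(method, to_quarterly(monthly, method))
```

Which option makes sense depends on the variable: flows such as monthly sales are usually summed, while stocks or rates are more often averaged or taken at the end of the period.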

methods.png

This is an example of the different interpolation methods applied to the same input dataset.

New API & Improved filters

The new API is up and running. We covered the API in previous updates, so we will not repeat the details. In summary, you can now use the API to:

  • Create and manage your repositories
  • Create multi-entity datasets (with Date + any number of entities)
  • Upload data and choose whether to append or replace existing data
  • Monitor the status of your processes
  • Download data, either the full dataset or a filtered subset (more on this below)

Check the Postman documentation for further reference.

The new API brings new filtering options, some of which we have already implemented in the dataset views. You can now filter the data you wish to download by date. When "Filter variable" is enabled, just select the start and end dates (or just one of them), and only data between those dates will be downloaded.
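The logic of the date filter, with either bound being optional, can be sketched as follows. This is a local illustration in plain Python, not a call to the API. The function name `filter_by_date` and the sample rows are assumptions, and so is treating both bounds as inclusive.

```python
from datetime import date
from typing import List, Optional, Tuple

Row = Tuple[date, float]


def filter_by_date(
    rows: List[Row],
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> List[Row]:
    """Keep rows whose date lies between start and end.

    A bound left as None is not applied. Bounds are assumed inclusive here.
    """
    return [
        (d, value)
        for d, value in rows
        if (start is None or d >= start) and (end is None or d <= end)
    ]


# Hypothetical dataset rows.
rows = [
    (date(2021, 6, 1), 1.0),
    (date(2021, 7, 1), 2.0),
    (date(2021, 8, 1), 3.0),
    (date(2021, 9, 1), 4.0),
]

both_bounds = filter_by_date(rows, start=date(2021, 7, 1), end=date(2021, 8, 1))
only_start = filter_by_date(rows, start=date(2021, 8, 1))
only_end = filter_by_date(rows, end=date(2021, 6, 30))

print(both_bounds)
print(only_start)
print(only_end)
```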

filter_dates.png

Have a nice week!

Luciano Cohan

Written by

Luciano Cohan

Co-founder of Alphacast. Former Undersecretary of Macroeconomic Programming. Data Science. Building a platform for collaborative work in economics.
