Creating a pipeline: do's and don'ts

With pipelines you can process data within the platform. A pipeline is a sequence of steps, such as transformations, applied to a particular dataset, and it updates automatically every time the underlying data is updated. There are two different ways to create one:

  • From scratch: click Create new in the upper right corner and then Pipeline.
  • From any dataset: you can create a pipeline from any dataset. In the dataset view, you will find the Create Pipe button on the right.

Both options will lead you to the Pipeline Engine. Here you will need to choose the repository where the pipeline will be stored and select a name for it. After that is where the fun begins! First, you will have to choose the dataset you want to work with. Once you save it, you will be able to choose different steps to modify and transform your data as you wish.

As an example, we chose the dataset Inflation - Argentina - INDEC - Consumer Price Index - Groups - Monthly and applied the following steps:

  • Rename columns: to change the name of one of the variables.
  • Change Frequency: to rescale the frequency from monthly to quarterly.
  • Select columns: to keep only the variable we were interested in and drop the others.
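
For readers who think in code, the three steps above can be sketched roughly as a chain of small functions. This is only an illustration in plain Python, not how the Pipeline Engine is implemented: the function names, the column names ("Food", "Transport", "Food and beverages"), the sample values and the choice of averaging the months within each quarter are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical monthly rows; the values are made up for illustration only.
monthly = [
    {"Date": "2021-01-01", "Food": 100.0, "Transport": 200.0},
    {"Date": "2021-02-01", "Food": 110.0, "Transport": 210.0},
    {"Date": "2021-03-01", "Food": 120.0, "Transport": 220.0},
    {"Date": "2021-04-01", "Food": 130.0, "Transport": 230.0},
]


def rename_columns(rows, mapping):
    """Rename columns according to mapping; other columns are kept unchanged."""
    return [{mapping.get(key, key): value for key, value in row.items()}
            for row in rows]


def change_frequency_to_quarterly(rows, date_col="Date"):
    """Average the monthly values of each calendar quarter.

    Averaging is an assumption of this sketch; other aggregations
    (last value, sum) are equally plausible.
    """
    groups = defaultdict(list)
    for row in rows:
        year = int(row[date_col][:4])
        month = int(row[date_col][5:7])
        quarter = (month - 1) // 3 + 1
        groups[(year, quarter)].append(row)

    result = []
    for (year, quarter), members in sorted(groups.items()):
        out = {date_col: f"{year}-Q{quarter}"}
        for col in members[0]:
            if col != date_col:
                out[col] = sum(m[col] for m in members) / len(members)
        result.append(out)
    return result


def select_columns(rows, columns):
    """Keep only the listed columns, in the given order."""
    return [{col: row[col] for col in columns} for row in rows]


def run_pipeline(rows):
    """Apply the three steps in the same order as in the example above."""
    rows = rename_columns(rows, {"Food": "Food and beverages"})
    rows = change_frequency_to_quarterly(rows)
    rows = select_columns(rows, ["Date", "Food and beverages"])
    return rows


quarterly = run_pipeline(monthly)
print(quarterly)
```

Because the steps are stored as a sequence, calling run_pipeline again on updated monthly rows produces an updated quarterly result, which is the same idea behind a pipeline refreshing its output when the source dataset changes.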

Remember to click the Save button each time you finish working on a step.

Once you finish working on your pipeline, click the default step called Publish to dataset to assign a name and repository to your new dataset. The new dataset will be updated automatically every time the original dataset is updated, whether by Alphacast or by yourself, or whenever the pipeline is run. After that, you can save and preview it, or go straight to the new dataset to start charting!

