Datasets FAQ

How can I find and navigate data?

There are several ways to search for data:

  • Use the Search bar in the top menu and type what you are looking for.
  • Go to the Explore section of the top menu. The left bar has filters to narrow your search. Click Sort to order results by name, popularity, last updated or most recent, and click Hide Details to show or hide details.
  • In the Explore section you can also browse the categories, find the most downloaded and popular datasets, and discover data relevant to you based on your history on the site.

The filters in the left column of the Explore section include categories, country, source, and more.

How do I build my list of favorite datasets?

There are two ways to build lists of favorite datasets:

  • You can go to the view of the dataset you want and press the Follow option, located under the publisher. To remove it from the list, simply select Unfollow.
  • In the dataset listings, when browsing categories and tags, you will see the Follow button on the right, next to the update information.

Your selected datasets will then appear on your home page, in the followed datasets section.

How do I upload or build a dataset?

There are four ways to create datasets:

  • Uploading by hand from a CSV or XLS
  • Automatically connecting it to a file in Google Drive
  • As a result of running a pipeline (read more)
  • Using the Python API or library (read more)

For the first two, click Create new at the top right and choose Dataset. You can then drag in the CSV or XLSX file you want to upload, or select the Google Drive file you want to connect.

How do I configure the columns of a dataset when creating it?

Once the file is selected, configure the dataset by defining the Entity columns and the variables. The Entity columns are those needed to uniquely identify a row of the dataset (to see more about entities click here). Every dataset must have at least one Entity column containing the date (to see the accepted date formats click here). Then define the type of each variable (text or number) and indicate which variables you want to ignore. The last step lets you choose the name of the dataset and its repository.

What is an “Entity”?

The Entity columns are those needed to uniquely identify a row of the dataset. Consequently, no combination of Entity values can appear in more than one row.

For example, if your only entity is Date your dataset cannot have repeated dates. If your entities are Date and Country the dates can be repeated, but the combinations of date and country must be unique.

Entities are equivalent to unique indexes in other data frameworks.
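The uniqueness rule can be checked on your own data before uploading. The following is a minimal sketch in plain Python, not part of the platform: the sample rows, the column names and the `duplicated_entities` helper are all hypothetical and only illustrate the rule described above.

```python
from collections import Counter

# Hypothetical rows; "Date" and "Country" play the role of Entity columns.
rows = [
    {"Date": "2023-01-01", "Country": "Argentina", "GDP": 100.0},
    {"Date": "2023-01-01", "Country": "Brazil", "GDP": 250.0},
    {"Date": "2023-02-01", "Country": "Argentina", "GDP": 101.5},
    {"Date": "2023-02-01", "Country": "Argentina", "GDP": 102.0},  # repeats an entity combination
]


def duplicated_entities(rows, entity_columns):
    """Return the entity combinations that appear in more than one row."""
    counts = Counter(tuple(row[col] for col in entity_columns) for row in rows)
    return [combo for combo, n in counts.items() if n > 1]


# With Date and Country as entities, only the repeated pair is flagged.
dupes_date_country = duplicated_entities(rows, ["Date", "Country"])
# With Date as the only entity, every repeated date is flagged.
dupes_date_only = duplicated_entities(rows, ["Date"])

print(dupes_date_country)
print(dupes_date_only)
```

An empty list means the chosen columns identify every row uniquely and can serve as entities.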

What are the accepted formats for date columns when creating a dataset?

When creating a new dataset, select the date column, mark it as an entity and choose its format under data type. YY-MM-DD (year, month, day) is generally used, but you can change it with the Change Date Format button.
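Before uploading, it can help to confirm that every value in the date column follows a single format. This is a minimal sketch using Python's standard `datetime` module; the `first_bad_date` helper is hypothetical, and the default format string `"%Y-%m-%d"` (four-digit year) is an assumption. Use `"%y-%m-%d"` if your file stores two-digit years.

```python
from datetime import datetime


def first_bad_date(values, fmt="%Y-%m-%d"):
    """Return the first value that does not match fmt, or None if all match."""
    for value in values:
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            return value
    return None


clean = ["2023-01-31", "2023-02-28"]
mixed = ["2023-01-31", "31/01/2023"]

print(first_bad_date(clean))
print(first_bad_date(mixed))
```

Because `strptime` also rejects impossible dates such as February 30, the same check catches typos as well as mixed formats.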

Are all datasets public?

Datasets can be public or private. If the repository where a dataset is stored is public, the dataset is freely accessible. If the repository is private, access is restricted to the repository's members. Repository administrators can define the level of access. You can see how to do it by clicking here.

Can I edit a dataset?

Datasets can be modified by selecting the Edit option at the top of the dataset view. Within the metadata section you can change the name, add/modify a description, tags, source and reference links.

Can I delete a dataset?

To delete a dataset, click the Edit button at the upper right of the dataset view and scroll to the bottom. Below the Metadata section, in the Danger zone, press Delete this dataset. In Confirmation, type the name of the dataset and press Delete. The dataset is then permanently deleted.

What is a “Tag” and what is it for?

A tag, or label, is a word attached to a piece of content. Tags group information by topic and make later searches easier.

What are the “Activity” messages that appear to the right of a dataset?

When entering a dataset, the Activity column on the right shows the status of the actions performed on the dataset and the associated pipeline. Four stages may appear:

  • Updated shows the last time new data was added.
  • Processed means that the dataset finished running.
  • Processing means that new information is being loaded.
  • Error means that something failed in the process.

What does the “Sync Now” button do?

Clicking Sync Now triggers the process that automatically fetches the data. If the original source has new data, the dataset should update within a few seconds.

How can I filter data within the same dataset and download it?

From the dataset view, you can download all or part of the data.

To download all the data, click Download and select the download format. By default, the data is downloaded as shown in the Data excerpt. You can also download it transposed, with rows and columns swapped so that the dates and entities become columns.
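To picture the difference between the two layouts, here is a minimal sketch in plain Python. The sample rows, the column names and the `transpose` helper are hypothetical, and the exact layout of the platform's transposed file may differ.

```python
# Hypothetical default layout: one row per date, one column per variable.
rows = [
    {"Date": "2023-01-01", "CPI": 100.0, "Wages": 50.0},
    {"Date": "2023-02-01", "CPI": 102.0, "Wages": 51.0},
]


def transpose(rows, entity_column):
    """Turn each variable into a row and each entity value into a column."""
    variables = [key for key in rows[0] if key != entity_column]
    return [
        {"Variable": var, **{row[entity_column]: row[var] for row in rows}}
        for var in variables
    ]


transposed = transpose(rows, "Date")
for line in transposed:
    print(line)
```

Each output row now holds one variable, with one column per date.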

To filter the data, click Filter. You can filter by variables, transformations, and the period covered.

What are transformations and how are they used?

Transformations are the results of processing a variable using the Pipelines tool. They include changes in the unit of measurement, seasonal adjustments, filters, frequency changes, and more. To see more information about the pipelines and available transformations, click here.

Can I change the data frequency?

To change the frequency of the data, edit the pipeline and add a Change Frequency step. The pipeline identifies the current frequency of the data and asks you to choose the new frequency and the conversion method(s). Going from lower to higher frequency uses interpolation (repeat, linear, quadratic, spline, etc.); going from higher to lower frequency uses aggregation (average, sum, min, last, etc.).
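The arithmetic behind a frequency change can be sketched in plain Python. This is an illustration only, not the pipeline's implementation: the sample series and the helper names (`quarter_of`, `to_quarterly`, `to_monthly_repeat`) are hypothetical. It aggregates monthly values into quarters with three of the methods named above (average, sum, last), then expands a quarterly series back to monthly with the repeat method.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical monthly series keyed by "YYYY-MM".
monthly = {
    "2023-01": 10.0, "2023-02": 20.0, "2023-03": 30.0,
    "2023-04": 40.0, "2023-05": 50.0, "2023-06": 60.0,
}


def quarter_of(month_key):
    """Map "YYYY-MM" to "YYYY-Qn"."""
    year, month = month_key.split("-")
    return f"{year}-Q{(int(month) - 1) // 3 + 1}"


def to_quarterly(series, aggregate):
    """Group monthly values by quarter and reduce each group with `aggregate`."""
    groups = defaultdict(list)
    for month_key in sorted(series):
        groups[quarter_of(month_key)].append(series[month_key])
    return {quarter: aggregate(values) for quarter, values in groups.items()}


def to_monthly_repeat(quarterly):
    """Expand a quarterly series to monthly by repeating each quarter's value."""
    result = {}
    for quarter_key, value in quarterly.items():
        year, quarter = quarter_key.split("-Q")
        first_month = (int(quarter) - 1) * 3 + 1
        for month in range(first_month, first_month + 3):
            result[f"{year}-{month:02d}"] = value
    return result


quarterly_avg = to_quarterly(monthly, mean)
quarterly_sum = to_quarterly(monthly, sum)
quarterly_last = to_quarterly(monthly, lambda values: values[-1])
back_to_monthly = to_monthly_repeat(quarterly_avg)

print(quarterly_avg)
print(quarterly_sum)
print(quarterly_last)
print(back_to_monthly)
```

Choosing between average, sum and last depends on the variable: flows such as sales usually add up, while stocks and prices are usually averaged or taken at period end.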

What transformations are there and what does each one do?

The available transformations depend on the frequency of the dataset. For example, a monthly variable can be transformed into its monthly average variation, annual accumulated variation, variation at constant prices, conversion to dollars, or seasonally adjusted series. You can also change the frequency of the data and then apply transformations. To learn more about each transformation, read here.
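As a rough illustration of what a variation transformation computes, here is a minimal sketch in plain Python. The `pct_change` helper and the sample values are hypothetical, and the platform's own definitions may differ in detail. It assumes the earlier value in each comparison is never zero.

```python
def pct_change(values, periods=1):
    """Percent change versus the value `periods` steps earlier.

    Positions with no earlier value to compare against are returned as None.
    Assumes the earlier value is never zero.
    """
    result = []
    for i, value in enumerate(values):
        if i < periods:
            result.append(None)
        else:
            previous = values[i - periods]
            result.append((value / previous - 1) * 100)
    return result


# Hypothetical monthly index: 13 observations, so the last one has a
# value from 12 months earlier to compare against.
index = [100.0, 102.0, 105.06] + [105.06] * 9 + [110.0]

month_over_month = pct_change(index, periods=1)
year_over_year = pct_change(index, periods=12)

print(month_over_month[:3])
print(year_over_year[-1])
```

With `periods=1` the function gives the month-over-month variation of a monthly series; with `periods=12` it gives the variation against the same month a year earlier.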

Can I freely choose the transformations I want to apply to a dataset?

Yes. Data can undergo as many transformations as you need.

Written by Luciano Cohan

Co-founder of Alphacast. Former Undersecretary of Macroeconomic Programming. Data Science. Building a platform for collaborative work in economics.
