Interacting with the Alphacast API in Python
API References and Authentication
The Alphacast API is organized around REST. It has predictable resource-oriented URLs, accepts form-encoded request bodies, returns JSON-encoded responses or CSV files, and uses standard HTTP response codes, authentication, and verbs.
All API requests must be authenticated with an API key. After you sign up and log in to your account at https://www.alphacast.io, your user name appears on the left side of the screen; from there, go to “Settings”, where the general page shows your API key.
If you have any trouble obtaining your API key, please contact hello@alphacast.io.
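The code examples later in this guide pass the key as the HTTP Basic user name with an empty password. As a minimal sketch of what that amounts to, the block below reads the key from an environment variable and builds the equivalent Authorization header with the standard library only. The variable name ALPHACAST_API_KEY and both helper names are assumptions made for illustration; they are not part of the Alphacast API.

```python
import base64
import os


def load_api_key(env_var: str = "ALPHACAST_API_KEY") -> str:
    """Read the API key from an environment variable (hypothetical name)."""
    key = os.environ.get(env_var, "")
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key


def basic_auth_header(api_key: str) -> dict:
    """Build the HTTP Basic header for user name = api_key and an empty password."""
    # Basic auth encodes "username:password"; the password here is empty.
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder key, used only to show the shape of the header.
print(basic_auth_header("ak_example"))
```

Keeping the key in the environment rather than in the script avoids committing it to version control by accident.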
Downloading and uploading data
Uploading and downloading data through the Alphacast API in Python is straightforward. You only need your API key (see the previous section) and a few basic libraries.
You will need three Python libraries. requests and pandas must be installed; io is part of the Python standard library:
- requests: It lets you replicate a browser request using methods such as GET or POST and set different parameters. Another option would be to use urllib3.
- io: It helps you decode the server response and make it readable.
- pandas: It lets you organize and wrangle your data.
Downloading data
The API offers several ways to download data, depending on how much of it you need.
Get all repositories
To see all the repositories available to your account, you can request them and load them into a dataframe. Here is an example:
import requests
from requests.auth import HTTPBasicAuth
import pandas as pd
import io
API_key = 'ak_lCNKZN26aap61EtCWEZy'
r = requests.get('https://api.alphacast.io/repositories', auth=HTTPBasicAuth(API_key, ""))
df = pd.read_json(io.StringIO(r.content.decode('utf-8')))
This returns a dataframe with all the details (IDs, names, descriptions, etc.) of your repositories.
id | accountId | name | description | privacy | slug |
---|---|---|---|---|---|
315 | 265 | My Public Repo | You can use you first public repository to sha... | Public | public-repo |
317 | 265 | Webinar Repo | Este es el Repositorio de prueba para el webinar | Private | webinar-repo |
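Later steps need a repository ID, so it can be handy to index the listing by slug. The sketch below assumes the response body parses to a list of dictionaries with the keys shown in the table above (for example via r.json()); the sample records copy the table, with descriptions left out, and the helper name is hypothetical.

```python
# Illustrative records mirroring the table above (descriptions omitted).
repositories = [
    {"id": 315, "accountId": 265, "name": "My Public Repo",
     "privacy": "Public", "slug": "public-repo"},
    {"id": 317, "accountId": 265, "name": "Webinar Repo",
     "privacy": "Private", "slug": "webinar-repo"},
]


def repository_ids_by_slug(repos: list) -> dict:
    """Map each repository slug to its numeric id."""
    return {repo["slug"]: repo["id"] for repo in repos}


print(repository_ids_by_slug(repositories)["webinar-repo"])  # 317
```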
Get all datasets
If you have not yet chosen a dataset to download, or you do not know its ID, this request returns a JSON representation of all the datasets available under your account. Here is an example:
import requests
from requests.auth import HTTPBasicAuth
import pandas as pd
import io
API_key = 'ak_lCNKZN26aap61EtCWEZy'
r = requests.get('https://charts.alphacast.io/api/datasets', auth=HTTPBasicAuth(API_key, ""))
df = pd.read_json(io.StringIO(r.content.decode('utf-8')))
Then if you print the resulting dataframe, it will look like:
id | name | description | database | tags |
---|---|---|---|---|
5208 | High Frequency CPI - Argentina - Wide - Weekly | Alphacast Basics: Argentina High Frequency CPI | [{'id': 1795, 'name': 'High Frequency CPI'}, {... | |
5225 | High Frequency CPI - Argentina - Weekly | Alphacast Basics: Argentina High Frequency CPI | [{'id': 1795, 'name': 'High Frequency CPI'}, {... | |
5226 | High Frequency CPI - Argentina - SEIDO vs INDE... | Alphacast Basics: Argentina High Frequency CPI | [{'id': 1795, 'name': 'High Frequency CPI'}, {... | |
5231 | Public Opinion - Latin America | SEIDO: Latin American Public Opinion | [{'id': 1798, 'name': 'Private'}, {'id': 1802,... | |
5236 | Public Opinion - Argentina | SEIDO: Latin American Public Opinion | [{'id': 1802, 'name': 'Public Opinion'}, {'id'... |
Now you can see all the datasets available to you, along with their IDs, descriptions, tags and the repository / database to which they belong. The id column is numeric, and each cell in the tags column contains a list of dictionaries, one per tag, each holding that tag's id and name.
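Because each tags cell is a list of dictionaries, filtering datasets by tag takes a small amount of unpacking. The following is a minimal sketch on plain Python records; the sample data is a shortened, illustrative version of the table above (the tag lists there are truncated), and both helper names are hypothetical.

```python
# Shortened sample records; the real tag lists may contain more entries.
datasets = [
    {"id": 5208, "name": "High Frequency CPI - Argentina - Wide - Weekly",
     "tags": [{"id": 1795, "name": "High Frequency CPI"}]},
    {"id": 5231, "name": "Public Opinion - Latin America",
     "tags": [{"id": 1798, "name": "Private"}, {"id": 1802, "name": "Public Opinion"}]},
]


def tag_names(dataset: dict) -> list:
    """Return the tag names of one dataset record (empty list if it has no tags)."""
    return [tag["name"] for tag in dataset.get("tags") or []]


def datasets_with_tag(records: list, tag: str) -> list:
    """Return the ids of the datasets carrying the given tag name."""
    return [record["id"] for record in records if tag in tag_names(record)]


print(datasets_with_tag(datasets, "Public Opinion"))  # [5231]
```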
Get a specific dataset
If you have already chosen a dataset, take its id and adapt the request URL used in the previous entry. For example, to download the ‘High Frequency CPI - Argentina - Wide - Weekly’ dataset, the id is 5208. The dataset can be downloaded in json, csv or xlsx format. The csv and xlsx formats can be downloaded in long format (default) or wide format.
url = "https://charts.alphacast.io/api/datasets/"
dataset = '5208'
extension = '.csv'
r = requests.get(url + dataset + extension, auth=HTTPBasicAuth(API_key, ""))
df = pd.read_csv(io.StringIO(r.content.decode('utf-8')))
If the format is different from csv, you will need to change the pandas function used to read the response.
This will return a dataframe containing:
- Entity: This is usually the name of the country.
- Year: The date of the observation, expressed as Year-Month-Day. The spacing between dates depends on the frequency of the dataset.
- Series: There will be a column for each series.
Entity | Year | seasonal | core | regulated |
---|---|---|---|---|
Argentina | 2013-03-08 | 1.000916 | 1.001550 | 1.001314 |
Argentina | 2013-03-15 | 1.002491 | 1.002641 | 1.002509 |
Argentina | 2013-03-22 | 1.006532 | 1.005014 | 1.003886 |
Argentina | 2013-03-29 | 1.000876 | 1.006997 | 1.004829 |
Argentina | 2013-04-05 | 1.001603 | 1.007388 | 1.011392 |
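As noted above, formats other than csv need a different pandas reader. One way to sketch that choice is a small dispatcher keyed on the extension. The function name is hypothetical, and it is an assumption that the json and xlsx bodies returned by the API have a layout that pd.read_json and pd.read_excel can load directly; only the csv path is exercised here.

```python
import io

import pandas as pd


def read_dataset_response(content: bytes, extension: str) -> pd.DataFrame:
    """Load a downloaded dataset body with the pandas reader matching its format."""
    if extension == ".csv":
        return pd.read_csv(io.StringIO(content.decode("utf-8")))
    if extension == ".json":
        # Assumes the JSON layout is one pandas can read without extra options.
        return pd.read_json(io.StringIO(content.decode("utf-8")))
    if extension == ".xlsx":
        # Excel files are binary, and pandas needs an engine such as openpyxl.
        return pd.read_excel(io.BytesIO(content))
    raise ValueError(f"Unsupported extension: {extension}")


# Sample body built from the first rows of the table above.
sample = b"Entity,Year,seasonal,core\nArgentina,2013-03-08,1.000916,1.001550\n"
df = read_dataset_response(sample, ".csv")
print(df.columns.tolist())  # ['Entity', 'Year', 'seasonal', 'core']
```

With a real request, the first argument would be r.content and the second the same extension that was appended to the URL.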
Get only one or multiple series from a particular dataset
Through the API, you can also get one or several series of interest from a dataset without downloading it entirely. If you work with a dataset regularly and do not need all of its series, you can download the whole dataset the first time and extract the ids of the series for subsequent downloads. The code below shows how to obtain the series ids for dataset 5208:
url = 'https://charts.alphacast.io/api/datasets/'
dataset = '5208'
r1 = requests.get(url + dataset + '.csv?includeVariableIds=true', auth=HTTPBasicAuth(API_key, ""))
df1 = pd.read_csv(io.StringIO(r1.content.decode('utf-8'))).T
df1.reset_index()[['index', 0]]
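The request URL is similar to the previous one, with the includeVariableIds=true query parameter added. The code transposes the dataframe and resets the index, so the old index (the series ids) is shown next to the first column (the series names).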
index | 0 |
---|---|
Unnamed: 0 | Entity |
VariableId | Year |
143741 | seasonal |
143742 | core |
143743 | regulated |
143744 | general |
143745 | food |
143746 | dollarized |
143747 | non_dollarized |
The first column of the table contains the id of the series and the second one contains the name of the series. This info will help us to select the desired series. In the next example we will obtain only the first two series of the dataset in csv format.
url_base='https://charts.alphacast.io/api/datasets/'
dataset='5208'
series = ['143741','143742']
url = url_base + dataset + '.csv?variableIds=' + ','.join(series) + '&transformationIds=original'
r1 = requests.get(url, auth=HTTPBasicAuth(API_key, ""))
df1 = pd.read_csv(io.StringIO(r1.content.decode('utf-8')))
This will return a dataframe with the ‘seasonal’ and ‘core’ series:
Entity | Year | seasonal | core |
---|---|---|---|
Argentina | 08/03/2013 | 1.000916 | 1.00155 |
Argentina | 15/03/2013 | 1.002491 | 1.002641 |
Argentina | 22/03/2013 | 1.006532 | 1.005014 |
Argentina | 29/03/2013 | 1.000876 | 1.006997 |
Argentina | 05/04/2013 | 1.001603 | 1.007388 |
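The two steps above (look up the series ids, then request only those series) can be combined so that series are selected by name. The sketch below uses only the standard library. It assumes that the includeVariableIds=true CSV carries the ids on its first line and the series names on its second, which is inferred from the transposed output shown earlier; the sample text and helper names are illustrative.

```python
import csv
import io

BASE_URL = "https://charts.alphacast.io/api/datasets/"


def series_ids_by_name(csv_text: str) -> dict:
    """Map series names to ids from the first two lines of the CSV."""
    rows = csv.reader(io.StringIO(csv_text))
    ids = next(rows)    # assumed layout: '', 'VariableId', '143741', ...
    names = next(rows)  # assumed layout: 'Entity', 'Year', 'seasonal', ...
    # The first two columns are Entity and Year, not series.
    return dict(zip(names[2:], ids[2:]))


def series_url(dataset_id: str, series_ids: list) -> str:
    """Build the download URL for a subset of series, as in the example above."""
    joined = ",".join(series_ids)
    return f"{BASE_URL}{dataset_id}.csv?variableIds={joined}&transformationIds=original"


# Illustrative header lines for dataset 5208 (first three series only).
sample = ",VariableId,143741,143742,143743\nEntity,Year,seasonal,core,regulated\n"
ids = series_ids_by_name(sample)
print(series_url("5208", [ids["seasonal"], ids["core"]]))
```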
Uploading data
The process of uploading data to Alphacast’s website through the API consists of three steps:
Step 1: Create a repository
Before uploading data, you must create the repository that will hold it, unless the one you want to use already exists. To create a new repository you need to define a few parameters: the name of the repo, a description, whether it will be public or private, and a slug.
Unlike downloading data, creating a repository uses the POST method instead of GET. The code below shows an example:
url = "https://api.alphacast.io/repositories"
form={
"name": "Example Repo",
"description": "This is a private repository",
"privacy": "Private",
"slug": "test-repo"
}
r = requests.post(url, data=form, auth=HTTPBasicAuth(API_key, ""))
After that, you can retrieve the information for the repo you have just created, along with all your other repositories, by making a GET request:
r = requests.get(url, auth=HTTPBasicAuth(API_key, ""))
r.content
The response provides the “name”, “description” and other parameters of the repo you created, along with its ID, which you will need later to create a dataset inside this repository.
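Since a repository only needs to be created when it does not already exist, the POST and GET calls above can be wrapped in a "get or create" helper. This is a sketch under assumptions, not the official client: session is expected to behave like a requests.Session with authentication already configured, the listing is assumed to parse to a list of dictionaries with "slug" and "id" keys, and the function names are hypothetical.

```python
REPOSITORIES_URL = "https://api.alphacast.io/repositories"


def find_by_slug(repos: list, slug: str):
    """Return the repository record with the given slug, or None."""
    return next((repo for repo in repos if repo.get("slug") == slug), None)


def list_repositories(session) -> list:
    """GET the repository listing and parse it as JSON."""
    response = session.get(REPOSITORIES_URL)
    response.raise_for_status()
    return response.json()


def get_or_create_repository(session, form: dict) -> int:
    """Return the id of the repo with form['slug'], creating it if needed."""
    existing = find_by_slug(list_repositories(session), form["slug"])
    if existing is not None:
        return existing["id"]
    # Create the repository, then list again to read its id.
    session.post(REPOSITORIES_URL, data=form).raise_for_status()
    created = find_by_slug(list_repositories(session), form["slug"])
    if created is None:
        raise RuntimeError("Repository was not found after creation")
    return created["id"]
```

With the requests library, a suitable session could be created with requests.Session() and its auth attribute set to the tuple (API_key, ""), which requests treats as HTTP Basic authentication.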
Step 2: Create a dataset
After creating a repository, you can create a dataset to upload new data into. You need to set two parameters: the name of the new dataset and the ID of the repo in which it will be created. As in the previous step, the POST method is required. The code below shows an example:
url = "https://api.alphacast.io/datasets"
form={
"name": "test_dataset",
"repositoryId": 621
}
r = requests.post(url, auth=HTTPBasicAuth(API_key, ""), data=form)
After you create a new dataset, Alphacast returns its id alongside other details of the dataset.
# r.json() parses the JSON body; a single record needs an explicit index
pd.DataFrame(r.json(), index=[0])
id | name | createdAt | repositoryId |
---|---|---|---|
6825 | test_dataset | 2021-07-27T15:50:08.510165 | 621 |
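The id in this response is what the upload step needs. Because the response body is JSON text, it can be parsed with the standard json module. The sketch below assumes the body is a single JSON object with the fields shown in the table above; the sample string is reconstructed from that table and the helper name is hypothetical.

```python
import json

# Sample body reconstructed from the table above (an assumption about the exact shape).
body = (
    '{"id": 6825, "name": "test_dataset", '
    '"createdAt": "2021-07-27T15:50:08.510165", "repositoryId": 621}'
)


def created_dataset_id(response_text: str) -> int:
    """Parse the creation response and return the new dataset's id."""
    payload = json.loads(response_text)
    return int(payload["id"])


print(created_dataset_id(body))  # 6825
```

A JSON parser also handles values such as true, false and null, which are not valid Python literals.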
You can also retrieve the information for the dataset you have just created, along with all the other datasets in your account, by making a GET request:
url = "https://api.alphacast.io/datasets"
r = requests.get(url, auth=HTTPBasicAuth(API_key, ""))
pd.DataFrame(r.json())
This will return a DataFrame object with all the datasets that belong to your account.
id | name | createdAt | updatedAt | repositoryId |
---|---|---|---|---|
6504 | test_datasets | 2021-07-01T17:11:30 | 2021-07-01T17:11:30 | 317 |
6505 | Fx Premiums - Test | 2021-07-01T17:55:50 | 2021-07-01T17:55:50 | 317 |
6510 | Fx Premiums | 2021-07-01T19:36:57 | 2021-07-01T19:36:57 | 315 |
6515 | 5331 | 2021-07-01T21:12:29 | 2021-07-01T21:12:29 | 317 |
6516 | EMAE | 2021-07-01T22:21:41 | 2021-07-01T22:21:41 | 315 |
Step 3: Uploading information to a dataset
Once a repository and a dataset are created, we can upload new data. Since we will be uploading data in csv format, the request will be made using the PUT method. The data can be stored on our computer, loaded as a DataFrame or available remotely.
The upload endpoint requires two parameters:
- deleteMissingFromDB: True / False. If True, data that is already in the dataset but missing from the new upload is deleted from the database.
- onConflictUpdateDB: True / False. If True, when the new data conflicts with data already in the database, the new data is kept and overwrites the previous values.
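To make the effect of the two flags concrete, here is a small in-memory model of the behaviour described above. It only illustrates the description and is not how the server is implemented. It assumes rows are identified by an (Entity, Year) pair and that, when onConflictUpdateDB is False, the values already stored are kept; neither point is stated by the API description.

```python
def apply_upload(existing: dict, incoming: dict,
                 delete_missing: bool, on_conflict_update: bool) -> dict:
    """Model the stored rows after an upload, following the flag descriptions."""
    result = dict(existing)
    for key, value in incoming.items():
        # New rows are always added; conflicting rows only when updates are on.
        if key not in result or on_conflict_update:
            result[key] = value
    if delete_missing:
        # Drop stored rows that are absent from the new upload.
        result = {key: value for key, value in result.items() if key in incoming}
    return result


stored = {("Argentina", "2013-03-08"): 1.000916, ("Argentina", "2013-03-15"): 1.002491}
upload = {("Argentina", "2013-03-15"): 1.5, ("Argentina", "2013-03-22"): 1.006532}

# Both flags on: the stored data ends up identical to the upload.
print(apply_upload(stored, upload, True, True) == upload)  # True
```

With both flags off, the same call keeps the two stored rows unchanged and only adds the 2013-03-22 row.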
In the example shown below we will use a DataFrame that is already loaded in our current Jupyter Notebook (df_example):
url_base = 'https://api.alphacast.io/datasets/'
dataset_id = '6825'
additional_params = '/data?deleteMissingFromDB=True&onConflictUpdateDB=True'
url3 = url_base + dataset_id + additional_params
files = {'data': df_example.to_csv()}
r3 = requests.put(url3, files=files, auth=HTTPBasicAuth(API_key, ""))
This step takes longer, depending on the size of the file or dataframe. The server returns a response with the status of the upload.
pd.DataFrame(r3.json(), index=[0])
id | status | createdAt | datasetId |
---|---|---|---|
615 | Requested | 2021-07-27T16:05:24.879688 | 6825 |
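The example above uploads a DataFrame that is already in memory. For a CSV file stored on disk, a sketch along the same lines might look like the following. The function name is hypothetical, session is assumed to behave like a requests.Session with authentication already configured, and the URL and the 'data' form field are taken from the example above.

```python
def upload_csv_file(session, dataset_id: str, csv_path: str,
                    delete_missing: bool = True,
                    on_conflict_update: bool = True) -> dict:
    """PUT a local CSV file to a dataset and return the parsed status response."""
    url = (
        f"https://api.alphacast.io/datasets/{dataset_id}/data"
        f"?deleteMissingFromDB={delete_missing}"
        f"&onConflictUpdateDB={on_conflict_update}"
    )
    # Keep the file open only for the duration of the request.
    with open(csv_path, "rb") as handle:
        response = session.put(url, files={"data": handle})
    response.raise_for_status()
    return response.json()
```

The returned dictionary should carry the same fields as the table above, for example a status of Requested right after the upload is accepted.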