Ingest Data from REST API’s to MySQL
This tutorial will be covering an end to end use-case of loading data from a REST api to a table in MySQL. It will be using Guzzle’s ingestion activity which allows to move data from any source dataset to a target dataset applying various validations, transformations and rejections.
Load Dataset from Azure Blob storage to the Guzzle Platform through its Source Section.
Loading or configuring the Source Dataset to a Target Dataset or Table.
Since we use Guzzle as our Data Engineering Workbench for movement and transformation we must ensure it is set up and configured properly. Guzzle supports both on-premise and cloud deployment.
Ensure that the file is saved in an available Source Location. This is essential as the Data will be ingested from REST api’s.
Ensure that the Target Location is available in order to prevent any errors after the Data has been Transformed and Validated.
Guzzle should be configured to support Azure Databricks and MYSQL. This is essential as we will be using a MySQL table in this tutorial. It will also help in exploiting all of Guzzle’s Features.
The first step in the Ingestion Process would be to deal with the Source Section of Guzzle. Guzzle’s Source Section provides many features including specifying the file names, location of the file and the file format.
In the Datastore option choose the Dataset you would like Guzzle to work on. Ensure that the Dataset is there in REST api before choosing it and select the datastore REST.
Give HTTP method requests like GET or POST and also give uri of api from where to fetch data.
If there are any request parameters or request headers, users can expand Configure Request Parameters and Configure Request Headers.
Under the Response Format choose the required File Type. Guzzle provides options from the delimited which is a fixed length system to multi-structured systems XML, JSON.
Guzzle also provides a feature to Sample your Data. This option can be seen in the top right corner of the Interface.
As seen above we have chosen the JSON Format with the json file pattern. The option to Preview Data can also be seen in the Top Right Corner.
After previewing the data can be seen.
We now have to load the Source Dataset to a Target Dataset or Table. Guzzle provides many different Datastores from Local Files, Delta Tables and MySQL. It allows configuring a Target Partition Scheme.
In the Datastore option choose the Target you would like to configure the Source Dataset to according to your requirement. Here we choose the MySQL table as we are ingesting Data into a MySQL Table.
We can now give any name to the Target in the Table Section. We can again Sample the Data through the Sample Data or Preview Data option given in the top right corner.
Here we have chosen a MySQL or JDBC table as our Datastore and assigned the name posts to our Table.