Batch API using Python SDK
Ralph Florent
Updated on Feb 19, 2025
Hi, everybody. My name is Ralph, and in this video, I'm going to show you how to run batch APIs using the Coherent Spark Python SDK. If you're not familiar with batch APIs, they are a series of endpoints that let you process massive amounts of data collectively using dedicated infrastructure. While you could orchestrate this operation manually by following the API documentation, this demo shows how to achieve the same results using the Python SDK.
The Python SDK is published on PyPI under the name cspark, short for Coherent Spark. The package covers a wide range of endpoints, including those for batch APIs. For more information, you can navigate to the main documentation; for today's demo, however, we will jump right in and show you how it works.
We will use the Term Life Quote model as a Spark service to demonstrate the batch API in action. On the left, you will see a series of inputs, which are the fields we will be using. Upon execution, the expected outputs will appear on the right.
Python Environment: We will use Python 3.11 along with all necessary dependencies.
.env File: This file will store the Spark credentials, such as bearer token and base URL.
input.json: This dataset contains about 1000 data points in a tabular format, with headers followed by values.
main.py: This source code contains the logic to run the batch API.
requirements.txt: This file specifies all dependencies needed to run the process, including cspark.
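As an illustration of the tabular layout described above, input.json can be read and inspected with the standard library. The field names below are hypothetical stand-ins; the actual headers come from the Term Life Quote model's input fields.

```python
import json

# Hypothetical sample in the same tabular shape as input.json:
# a header row followed by rows of values.
sample = {
    "inputs": [
        ["age", "gender", "smoker", "term_years"],  # header row (hypothetical fields)
        [35, "F", False, 20],
        [42, "M", True, 10],
    ]
}

with open("input.json", "w") as f:
    json.dump(sample, f)

# Read the file back and rebuild each row as a dict for easier inspection.
with open("input.json") as f:
    data = json.load(f)

headers, *rows = data["inputs"]
records = [dict(zip(headers, row)) for row in rows]
print(len(records), "records; first:", records[0])
```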
Let's open the terminal to see how this works. We will run the process using the command:
python main.py
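Under the hood, a batch run orchestrates several endpoint calls: create a batch pipeline, push chunks of records, pull computed results, and close the pipeline. The sketch below mimics that flow with a stand-in client so the orchestration logic is visible. `MockBatchClient` and its methods are hypothetical, not the cspark SDK's actual API; consult the SDK documentation for the real batch interface.

```python
class MockBatchClient:
    """Stand-in for a batch pipeline client (hypothetical, for illustration only)."""

    def __init__(self):
        self._results = []

    def create_batch(self):
        # Real SDK: would call the endpoint that creates a batch pipeline.
        self._results = []
        return self

    def push(self, chunk):
        # Real SDK: would submit a chunk of records for processing.
        # Here we fabricate the computed outputs for each record.
        self._results.extend({"inputs": rec, "outputs": {"ok": True}} for rec in chunk)

    def pull(self):
        # Real SDK: would retrieve processed results from the pipeline.
        return self._results

    def close(self):
        # Real SDK: would dispose of the batch pipeline.
        pass


def run_batch(records, chunk_size=200):
    """Push records in fixed-size chunks and collect all results."""
    client = MockBatchClient().create_batch()
    for i in range(0, len(records), chunk_size):
        client.push(records[i : i + chunk_size])
    results = client.pull()
    client.close()
    return results


records = [{"id": n} for n in range(1000)]  # stand-in for the 1000 input records
results = run_batch(records)
print(f"processed {len(results)} records")
```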
Once executed, the process starts. We pushed 1000 records to the pipeline, and the execution is fast and efficient because the batch API coordinates a series of endpoint calls to process the records collectively.
Let's take a look at the outputs by opening the output.json file. The file is divided into chunks, with each chunk containing input fields and output fields: the input fields are the records submitted for processing, and the output fields are the results computed for them. The data is in a tabular format, similar to what we saw in the API Tester. We chose to persist the results as a file, but you are free to choose a different persistence layer for your system.
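Based on the description above, the chunked layout of output.json looks roughly like the fragment below. The field names and values are hypothetical placeholders; the actual columns come from the Term Life Quote model's inputs and outputs.

```json
{
  "chunks": [
    {
      "inputs": [["age", "gender"], [35, "F"]],
      "outputs": [["monthly_premium"], [28.4]]
    }
  ]
}
```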
This concludes the batch API demo. I hope you found it informative and enjoyable. See you in the next video!