L&R 300: Python + Spark: Accelerating Automation
Tee
Updated on Mar 27, 2025
Hey everyone, Tee here again to talk about another use case for Spark. Today, we're going to explore how to use Spark and Python together to automate a process.
We'll be using an example model called the Required Minimum Distribution (RMD) Calc. This model requires inputs such as:
Years
Date of birth for the beneficiary and the spouse
Account balance
The outputs will be the current-year RMD amount and the modeled RMD amounts for future years.
Tag your inputs and outputs as usual.
Upload the model by hitting "New Service" and dragging the RMD calculation into Spark.
Spark will convert the spreadsheet's logic, calculations, formulas, and tables into fast C++ code.
The code is hosted on the cloud and made accessible via API.
Once the API is published, you can access the RMD calculation anywhere, from the system of your choice. Here's how to use Python to call the API:
Access the API endpoint and submit your inputs to calculate the RMD amount.
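Below is a minimal sketch of that call using the requests library. The endpoint URL, API key header, and the input/output names (Years, BeneficiaryDOB, SpouseDOB, AccountBalance, CurrentYearRMD, ProjectedRMD) are placeholders: substitute the URL Spark publishes for your service and the names you tagged in your workbook.

```python
import requests

# Placeholder endpoint and credentials -- use the URL and API key that
# Spark publishes for your service.
SPARK_URL = "https://spark.example.com/api/v3/services/rmd-calc/execute"
API_KEY = "your-api-key"

def calculate_rmd(years, beneficiary_dob, spouse_dob, account_balance):
    """Submit one set of inputs to the Spark API and return the outputs."""
    payload = {
        "inputs": {
            # Input names must match the tags in your workbook;
            # these are illustrative.
            "Years": years,
            "BeneficiaryDOB": beneficiary_dob,
            "SpouseDOB": spouse_dob,
            "AccountBalance": account_balance,
        }
    }
    response = requests.post(
        SPARK_URL,
        json=payload,
        headers={"x-api-key": API_KEY},
        timeout=30,
    )
    response.raise_for_status()
    outputs = response.json()["outputs"]
    return outputs["CurrentYearRMD"], outputs["ProjectedRMD"]

current_rmd, projected_rmd = calculate_rmd(
    years=10,
    beneficiary_dob="1955-06-15",
    spouse_dob="1958-02-01",
    account_balance=500_000,
)
print(f"Current-year RMD: {current_rmd}, projected RMD: {projected_rmd}")
```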
For large-scale operations, such as processing 1,000 policies, use Python to automate the whole process: read data from a SQL database, run the calculations through the Spark API, and store the results back into SQL.
Execute the Python script to automate the calculations (a sketch follows the steps below):
Read policy data from the SQL database.
Run calculations through the Spark API.
Output results back into SQL, creating a new table with outputs for each policy.
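Here's a sketch of that end-to-end script, assuming the calculate_rmd helper from the previous snippet plus hypothetical table and column names. It uses pandas and SQLAlchemy with SQLite as a stand-in, but any SQL database and access layer works the same way.

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string and table names -- point these at your
# own database and policy table.
engine = create_engine("sqlite:///policies.db")

# 1. Read policy data from the SQL database.
policies = pd.read_sql(
    "SELECT policy_id, years, beneficiary_dob, spouse_dob, account_balance "
    "FROM policies",
    engine,
)

# 2. Run each policy through the Spark API using the calculate_rmd
#    helper defined in the previous snippet.
results = []
for row in policies.itertuples(index=False):
    current_rmd, projected_rmd = calculate_rmd(
        years=row.years,
        beneficiary_dob=row.beneficiary_dob,
        spouse_dob=row.spouse_dob,
        account_balance=row.account_balance,
    )
    results.append(
        {
            "policy_id": row.policy_id,
            "current_year_rmd": current_rmd,
            "projected_rmd": projected_rmd,
        }
    )

# 3. Write the outputs back into SQL as a new table, one row per policy.
pd.DataFrame(results).to_sql(
    "rmd_results", engine, if_exists="replace", index=False
)
```

Because each policy is an independent API call, this loop is also straightforward to parallelize (for example, with concurrent.futures) if processing 1,000 policies one at a time is too slow.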
This process involves converting Excel logic into an API and running it at scale. Python serves as an excellent orchestration tool to connect data to calculations. While this example focused on RMD calculations, clients also use this method for data transformations, post-model processing, reporting, and reserve calculations. Spark provides full governance and auditability for each calculation.
Thanks for your time today!