Generate a PowerPoint file using Azure Functions and Python

For many years I have been involved with the Azure Sydney User Group, and even though I’m no longer the organiser I still gather updates for Azure in the prior month and prepare a PowerPoint presentation that contains them. My go-to place for the information is the Azure Updates website and I’ve typically just been browsing the site and manually updating an existing PowerPoint template.

Recently I reflected on the amount of time it takes for me to click through the pages and pull out the headlines, and decided this probably wasn’t a great use of my time. Even though it typically takes under 10 minutes, it’s still repetitive, which clearly makes it the perfect candidate to automate!

I’ve used it for other integrations in the past, so I know the Azure Updates website has an RSS feed, which is perfect for automation.

Choosing an implementation approach

There are a few ways I could have built an automation, and my original intention was to use either Azure Logic Apps or Power Automate Flow and integrate into PowerPoint online in Microsoft 365. Unfortunately, it turns out there is no native PowerPoint online connector which means this approach became a no-go!

In the absence of this integration capability, I decided to turn to a code-based solution because I know there are many ways to generate Office documents through SDKs that implement the Office Open XML File Formats standard (ECMA-376).

One of the reasons I also looked at Logic Apps or Power Automate was their serverless pay-on-execution model. Keeping this in mind I turned to my trusty friend, Azure Functions. At this stage the no-brainer for me as a long-term C# developer would have been to implement a .NET-based Function, but as I’ve said a few times before, I’m wanting to push my skills to cover other languages, so I thought I’d have a go with Python.

Azure Functions + Python = ❤️

It turns out there is an excellent PowerPoint library for Python called python-pptx that has everything I needed, and I found a great blog and sample from Matthew Wimberly that had what I needed to read and parse an RSS feed. Now I had these two elements I needed a little bit of Functions magic to tie it together and provide a simple HTTP API I could use to generate my presentation.

The resulting Azure Function (shown below) does all I need in less than 200 lines of code.

import logging
import azure.functions as func
import os
from azure.storage.blob import BlobClient, BlobSasPermissions, generate_blob_sas
from datetime import datetime, timedelta, timezone
from pptx import Presentation
from pptx.util import Pt
import requests # pulling data
from bs4 import BeautifulSoup # xml parsing
# RSS scraping function
# Based mostly on: https://github.com/mattdood/web_scraping_example/blob/master/scraping.py
def get_updates_rss(startDate, endDate):
    article_list = []
    try:
        # fetch the feed and parse it as XML using BS4
        r = requests.get(os.environ["UpdatesURL"])
        soup = BeautifulSoup(r.content, features='xml')
        # select only the "items" I want from the data
        updates = soup.find_all('item')
        # for each "item" I want, parse it into a list
        for a in updates:
            # Get publication date
            published = a.find('pubDate').text
            pubDate = datetime.strptime(published, "%a, %d %b %Y %H:%M:%S Z")
            # only include items falling within our requested date range
            if (pubDate >= startDate and pubDate <= endDate):
                title = a.find('title').text
                link = a.find('link').text
                # basic parse to flag announcement types
                if "preview" in title.lower():
                    announcement_type = "preview"
                else:
                    announcement_type = "GA"
                # create an "article" object with the data
                # from each "item"
                article = {
                    'title': title,
                    'link': link,
                    'published': published,
                    'antype': announcement_type
                }
                # append my "article_list" with each "article" object
                article_list.append(article)
        # after the loop, return the collected articles
        return article_list
    except Exception:
        # falls through and returns None; main() then hits its TypeError handler
        logging.exception("Couldn't scrape the Azure Updates RSS feed")
###
# Generate a section of the final PowerPoint
###
def generate_presentation_section(presentation, layout, articles, item_type):
    # Add first slide and slide notes
    slide = presentation.slides.add_slide(layout)
    slide_notes = slide.notes_slide
    shapes = slide.shapes
    slide_item_count = 0
    total_item_count = 0
    article_count = len(articles)
    slide_count = 1
    for article in articles:
        # Each new slide requires first elements be added differently to the rest.
        if slide_item_count == 0:
            # Insert title for slide
            title_shape = shapes.title
            body_shape = shapes.placeholders[1]
            title_shape.text = item_type + " (" + str(slide_count) + ")"
            # Insert first bullet item
            tf = body_shape.text_frame
            tf.text = article["title"]
            tf.paragraphs[0].font.size = Pt(24)
            # Insert first slide note
            sltf = slide_notes.notes_text_frame
            sltf.text = "- " + article["link"] + " (" + article["published"] + ")"
        else:
            # Insert bullet point
            p = tf.add_paragraph()
            p.font.size = Pt(24)
            p.text = article["title"]
            # Insert slide note
            dotpoint = sltf.add_paragraph()
            dotpoint.text = "- " + article["link"] + " (" + article["published"] + ")"
        slide_item_count += 1
        total_item_count += 1
        # If we hit 5 items on a slide, create a new slide and reset item count
        # If there aren't any items left, don't create a new empty slide
        if slide_item_count == 5 and total_item_count < article_count:
            slide = presentation.slides.add_slide(layout)
            slide_notes = slide.notes_slide
            shapes = slide.shapes
            slide_item_count = 0
            slide_count += 1
###
# Upload generated file to Azure Storage and generate a SAS URL for it
###
def upload_file_to_storage(presentation_file):
    blob_client = BlobClient.from_connection_string(
        conn_str=os.environ["PowerPointAccountConnection"],
        container_name=os.environ["PowerPointContainer"],
        blob_name=presentation_file)
    with open(presentation_file, "rb") as data:
        blob_client.upload_blob(data)
    # Generate a SAS-protected URL for the item which will allow the caller to download the file for 1 hour.
    startTime = datetime.now(tz=timezone.utc)
    endTime = startTime + timedelta(hours=1)
    sas_token = generate_blob_sas(
        os.environ["PowerPointStorageAccount"],
        os.environ["PowerPointContainer"],
        blob_name=presentation_file,
        account_key=os.environ["PowerPointStorageKey"],
        permission=BlobSasPermissions(read=True),
        start=startTime,
        expiry=endTime)
    return ("https://" + os.environ["PowerPointStorageAccount"] + ".blob.core.windows.net/"
            + os.environ["PowerPointContainer"] + "/" + presentation_file + "?" + sas_token)
#####
# Azure Function main entry point
#####
def main(req: func.HttpRequest) -> func.HttpResponse:
    blob_sas_url = ""
    message = ""
    http_status = 200
    try:
        # start date is required
        startParam = req.params.get('start')
        if not startParam:
            message = "Bad request: 'start' query parameter is required in format YYYY-MM-DD."
            http_status = 400
        else:
            # end date is optional, so if not provided use today
            endParam = req.params.get('end')
            if not endParam:
                endParam = datetime.now().strftime("%Y-%m-%d")
            # add 1 day to end date so we include all of the day
            ending = datetime.strptime(endParam, "%Y-%m-%d")
            ending = ending + timedelta(days=1)
            starting = datetime.strptime(startParam, "%Y-%m-%d")
            updatelist = get_updates_rss(startDate=starting, endDate=ending)
            if len(updatelist) > 0:
                prs = Presentation()
                # Initialise default slide layout (bullets)
                bullet_slide_layout = prs.slide_layouts[1]
                preview_items = [item for item in updatelist if item["antype"] == "preview"]
                ga_items = [item for item in updatelist if item["antype"] == "GA"]
                generate_presentation_section(prs, bullet_slide_layout, preview_items, "Preview")
                generate_presentation_section(prs, bullet_slide_layout, ga_items, "GA")
                filename = os.environ["LocalTempFilePath"] + "AzureUpdate-" + datetime.strftime(datetime.now(), "%Y-%m-%d-%H-%M-%S") + ".pptx"
                prs.save(filename)
                blob_sas_url = upload_file_to_storage(filename)
                message = "File created and uploaded to storage. You can <a href='" + blob_sas_url + "'>download it</a> for the next 1 hour."
            else:
                message = "There are no updates for the specified period, so no PowerPoint has been generated."
    except TypeError:
        logging.exception("Type error")
        message = "Check the format of your request and ensure you provide the 'start' query parameter in the format YYYY-MM-DD"
        http_status = 400
    except ValueError:
        # strptime raises ValueError for a malformed date, so report it rather than return an empty page
        logging.exception("Invalid date format")
        message = "Bad request: 'start' and 'end' must be dates in the format YYYY-MM-DD."
        http_status = 400
    return func.HttpResponse(
        mimetype="text/html",
        body=message,
        status_code=http_status
    )
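
The five-items-per-slide bookkeeping in generate_presentation_section can be easier to follow when separated from the python-pptx calls. The following is only a sketch of the same grouping idea, not part of the deployed Function: chunk_articles is a hypothetical helper name, and the titles are made-up placeholders.

```python
def chunk_articles(articles, per_slide=5):
    """Split a list into consecutive groups of at most per_slide items."""
    return [articles[i:i + per_slide] for i in range(0, len(articles), per_slide)]


# Placeholder data standing in for the parsed RSS items (hypothetical).
titles = [f"Update {n}" for n in range(1, 13)]

for slide_number, group in enumerate(chunk_articles(titles), start=1):
    # Mirrors the "Preview (1)", "Preview (2)" slide titles used above.
    print(f"Preview ({slide_number}): {len(group)} items")
```

One difference worth noting: with an empty list this helper yields no groups at all, whereas the Function above always adds a first slide for each section, even when there is nothing to put on it.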

To run it, you invoke the Function via a web browser with a URL similar to:

https://your-func-app.azurewebsites.net/api/GeneratePresentation?code=YOUR-FUNC-KEY&start=2021-06-20&end=2021-06-30
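
If you would rather build that URL from a script than type it by hand, something like the sketch below should work. It assumes the route shown above; build_request_url is a hypothetical helper name, and the app name and key are the same placeholders as in the example URL.

```python
from urllib.parse import urlencode


def build_request_url(app_name, function_key, start, end=None):
    """Assemble the Function URL; 'end' is optional, matching the Function's behaviour."""
    params = {"code": function_key, "start": start}
    if end:
        params["end"] = end
    return f"https://{app_name}.azurewebsites.net/api/GeneratePresentation?{urlencode(params)}"


print(build_request_url("your-func-app", "YOUR-FUNC-KEY", "2021-06-20", "2021-06-30"))
```

Leaving out the end argument drops the end parameter from the query string, in which case the Function falls back to today’s date.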

If the supplied date range is valid and there are updates that fall within it, you receive a simple web page with a link to a downloadable PowerPoint file. The file is held in a private Azure Storage account and is served for a limited period using a SAS-protected URL. The full documentation around how to debug, deploy and execute the Azure Function can be found on the GitHub repository for the solution. Also, here’s a sample of what you can generate.

I have deployed the solution onto a Consumption plan in Azure which means I’m not paying for idle compute, and the PowerPoint takes up so little space that my Storage Account costs will be tiny, especially given this API endpoint can’t be invoked by just anyone. Finally, to save myself even more money, I have a Timer Function that once a week deletes any PowerPoint files sitting in the Storage Account, which won’t be many (if any) for most of the time.
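
The clean-up Timer Function isn’t shown in this post, so the following is only a rough sketch of how that kind of clean-up could look, under a few assumptions: delete_presentations is a hypothetical name, it accepts any object that offers list_blobs() and delete_blob() the way ContainerClient from azure.storage.blob does, and it treats every blob ending in .pptx as a generated presentation.

```python
def delete_presentations(container_client):
    """Delete every .pptx blob in the container and return the names removed.

    container_client is assumed to behave like azure.storage.blob.ContainerClient:
    list_blobs() yields objects with a .name, and delete_blob(name) removes one.
    """
    # Collect the names first so we are not deleting while still listing.
    names = [
        blob.name
        for blob in container_client.list_blobs()
        if blob.name.lower().endswith(".pptx")
    ]
    for name in names:
        container_client.delete_blob(name)
    return names
```

In a real Timer Function you would presumably create the client with ContainerClient.from_connection_string using the same PowerPointAccountConnection and PowerPointContainer settings as above, and attach a weekly schedule such as 0 0 0 * * 0 (an assumed value, not necessarily the one I use).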

I’m pretty happy with the solution as it stands, but in future I might look to use my existing PowerPoint template as the base for the resulting presentation which means there would be even less manual work for me to do. Right now I still need to copy / paste from one PowerPoint to the other, but this is so trivial that I’m not bothered about automating it away … just yet 😉.

Hopefully you find some inspiration in the solution here!

Happy Days! 😎

P.S. The GitHub repository with the solution is here: https://github.com/sjwaight/AzureUpdatesPresentationGen
