Reading and writing binary files with Python with Azure Functions input and output bindings

Ah, Streams… how could I ever forget about manipulating files in code and needing to use Streams to read and write them!

I’ve written a lot of .NET in my life and am used to the way you read and write files with it, but I’ve been increasingly doing a lot of Python (BTW: .NET could learn a lot from Python’s Virtual Environments construct!) and have been having to learn a whole new set of constructs to work with elements such as files.

For an upcoming demo I developed a small Python Azure Function whose job it is to create a thumbnail image of a JPEG file. You can find the source for this Azure Function on GitHub.

Azure Functions Triggers and Bindings

If you aren’t familiar with Azure Functions let me briefly cover a couple of key concepts – Triggers and Bindings.

Azure Functions gives developers a rapid development environment for event-driven solutions, with pre-built integrations that remove a lot of plumbing code from your codebase.

Triggers are how an Azure Function is invoked and can come from many different sources – Timer Triggers (like cron on Linux), a File (Blob) Created Trigger, and more. You define the Trigger type when creating the Function, and the Azure Functions tooling automatically wires up the Trigger, gives you configuration options and adds some parameters to the entry point of your Function that allow you to access context from the Trigger source.

In the sample Function below we have a Blob Trigger that includes a filter on the file type (JPG) that will cause the Trigger to fire. This Trigger also provides some additional parameters we can access – a Stream that contains the binary data of the file along with the “name” of the file which in this case just maps to the filename without the extension.

using System;
using System.IO;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;
using Microsoft.Extensions.Logging;
using Microsoft.WindowsAzure.Storage.Blob;

namespace Siliconvalve.Demo
{
    public static class JpegUploadRouter
    {
        [FunctionName("JpegUploadRouter")]
        [return: Queue("images", Connection = "customserverless01_QUEUE")]
        public static string Run(
            [BlobTrigger("sampleuploads/{name}.jpg", Connection = "customserverless01_STORAGE")] Stream blobContent,
            string name, ILogger log)
        {
            log.LogInformation($"Routing image file: {name}.jpg");
            // just return the filename
            return $"{name}.jpg";
        }
    }
}

This sample Function also uses an Output Binding. You can see it in the sample above where the [return:] attribute defines the target for the output (in this case an Azure Storage Queue).

C# Class Library Functions behave a little differently to other languages because we can define a Binding entirely in code without having to configure it via a function.json file.

My Scenario

I wanted my Python Azure Function to receive a message from an Azure Storage Queue, where the message contains the name of a file (blob) that has been uploaded previously to an Azure Blob Storage Container. The file would be downloaded to the Function host, processed and then written back to Azure Blob Storage at a different location.

The queue message, which is part of the Trigger for the Function, is bound via configuration to the Input and Output Bindings in the function.json as follows. The special {queueTrigger} token means that at runtime the Functions handler will pass the message content to the Input and Output Bindings automatically, which means as a developer I don’t need to wire anything up in my code to achieve this.

{
  "scriptFile": "",
  "bindings": [
    {
      "name": "msg",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "images",
      "connection": "customserverless01_STORAGE"
    },
    {
      "name": "inputblob",
      "type": "blob",
      "dataType": "binary",
      "path": "sampleuploads/{queueTrigger}",
      "connection": "customserverless01_STORAGE",
      "direction": "in"
    },
    {
      "name": "outputblob",
      "type": "blob",
      "path": "thumbnails/{queueTrigger}",
      "connection": "customserverless01_STORAGE",
      "direction": "out"
    }
  ]
}


This configuration is based on the Python Azure Functions documentation, which is great for understanding the general format for creating the bindings, but the sample Function there is very basic and doesn’t explore how you might manipulate the incoming file and write a file to the output binding.

The Solution

As it turns out the solution isn’t that difficult, and I suspect some of my challenge was me trying to understand how to work with file streams in Python.

The resulting Function code (see the full solution on GitHub) is shown below. You can match the ‘inputblob’ and ‘outputblob’ parameters to the function.json configuration shown above. This really demonstrates how clean your code can be when working with Bindings.

import logging

import azure.functions as func
from PIL import Image


def main(msg: func.QueueMessage, inputblob: func.InputStream,
         outputblob: func.Out[func.InputStream]) -> None:
    blob_source_raw_name = msg.get_body().decode('utf-8')
    logging.info('Python queue trigger function processed a queue item: %s',
                 blob_source_raw_name)

    # thumbnail filename: strip the ".jpg" extension and add a suffix
    local_file_name_thumb = blob_source_raw_name[:-4] + "_thumb.jpg"

    # Download the file from Azure Blob Storage to the Function host
    with open(blob_source_raw_name, "w+b") as local_blob:
        local_blob.write(inputblob.read())

    # Use PIL to create a thumbnail
    new_size = 200, 200
    im = Image.open(blob_source_raw_name)
    im.thumbnail(new_size)
    im.save(local_file_name_thumb, quality=95)

    # write the new local file to the output binding (blob storage)
    with open(local_file_name_thumb, "rb") as new_thumbfile:
        outputblob.set(new_thumbfile.read())

The two key statements are the local_blob.write(inputblob.read()) call and the final outputblob.set() call. In the first, we receive the bytes from the input binding and invoke Python’s read() method, which reads all bytes to the end of the stream; this happens inside the write() call on our new local file object. As it turns out this is super clean!

The output binding is similarly clean – we call outputblob.set() to write the file data to that stream, using the local file’s read() method to read the bytes from the new local thumbnail file.
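If you want to experiment with this read/write pattern outside the Functions runtime, you can stand in for the bindings with in-memory streams. The sketch below is purely illustrative: BytesIO plays the role of func.InputStream, and the tiny FakeOut class is a hypothetical stand-in for func.Out – neither is part of the real Functions SDK.

```python
from io import BytesIO


class FakeOut:
    """Hypothetical stand-in for func.Out - NOT part of the Functions SDK."""
    def __init__(self):
        self.value = None

    def set(self, value):
        self.value = value


# Simulated input binding: a stream holding the "uploaded" bytes.
inputblob = BytesIO(b"pretend-jpeg-bytes")
outputblob = FakeOut()

# Same pattern as the Function: read the whole input stream,
# then hand the bytes to the output binding.
outputblob.set(inputblob.read())

assert outputblob.value == b"pretend-jpeg-bytes"
```

This makes it easy to unit test your transformation logic without deploying to Azure.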

Note: there is one minor detail to be aware of. Right now there is a limitation with the Output Binding for Azure Blob Storage – all files are written with a Content Type of ‘application/octet-stream’ which means you can’t easily embed them in web pages (you will be prompted to download the file). There is an open GitHub Issue for this limitation, so hopefully it will be something we can control in future. If you’re not intending to serve the files via the web directly then this is a non-issue.
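Until that limitation is resolved, one workaround is to skip the Output Binding for the final write and upload the thumbnail with the azure-storage-blob SDK, which lets you set the content type explicitly. This is only a sketch under assumptions: it presumes the v12 azure-storage-blob package and a connection string stored in the customserverless01_STORAGE app setting, and the container and blob names are illustrative.

```python
import os


def upload_thumbnail(local_path, blob_name):
    """Upload a local JPEG to the thumbnails container with an explicit
    image/jpeg content type (instead of application/octet-stream)."""
    # Imported lazily so the rest of the module loads without the package.
    from azure.storage.blob import BlobServiceClient, ContentSettings

    service = BlobServiceClient.from_connection_string(
        os.environ["customserverless01_STORAGE"])  # assumed app setting name
    blob = service.get_blob_client(container="thumbnails", blob=blob_name)
    with open(local_path, "rb") as f:
        blob.upload_blob(
            f,
            overwrite=True,
            content_settings=ContentSettings(content_type="image/jpeg"))
```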

So there we are – how you can quickly and cleanly use Azure Functions written in Python to process files. Hopefully this post will save you some time if you’re looking to do something similar in your code!

Happy Days 😎

6 thoughts on “Reading and writing binary files with Python with Azure Functions input and output bindings”

  1. Hi Simon. I am trying to do the same thing, but I am getting an error saying that MyAppStorageConnection, defined in the connection in function.json, does not exist.
    I created it in my Function App, the same place where AzureWebJobs is, but nothing. I do want to use bindings because they are easier and cleaner. Please, can you advise me where did you define:

    1. Fidel – “customserverless01_STORAGE” is the key for an App Setting that contains the Azure Storage Account connection string pointing at the account from which you wish to receive events. When deployed, these values are set and managed via the App Settings blade in the Azure Portal; when debugging locally you can set the value in the local.settings.json file. Make sure not to check that file into source control! Hope this helps.
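    For reference, the local setting lives alongside the standard AzureWebJobsStorage entry. A minimal local.settings.json might look like the following (connection string values elided):

```json
{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "AzureWebJobsStorage": "<storage connection string>",
    "customserverless01_STORAGE": "<storage connection string>"
  }
}
```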

  2. Hello Simon, great article thanks! How do we read multiple files from Blob, do some processing/transformation and write the streams to various Blob outputs? Do we create distinct input and output bindings for each file input and output?

    1. Hi Alan

      Azure Functions has many bindings, some of which can support batching of incoming data – the Azure Service Bus binding for example. The Blob Storage binding doesn’t support reading multiple files in or writing multiple out via a binding.

      The best way to tackle this would be to use a trigger of some sort that notifies your Function that files require processing and then you can write the logic in your Function to use the Azure Storage SDK and read / write files that way.

      My only caution with this method versus using a binding mapping is that the Functions runtime will likely do a better job of concurrent processing and resource utilisation on the host on which it is running when compared to potentially reading / writing multiple files in a single Function and then managing concurrency yourself.

      If you needed to track the processing of the files you could also consider using Durable Functions as you can have an Activity Function that simply reads / writes a single file and allow the Orchestrator Function to feed in the batch of filenames.

      Hope this helps! – Simon.
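      The SDK approach described in this reply can be sketched as follows. This is a hedged illustration, assuming the v12 azure-storage-blob package, a connection string in the customserverless01_STORAGE app setting, and illustrative container names; transform() is a placeholder for your own processing logic.

```python
import os


def transform(data):
    """Placeholder for your own processing logic - returns input unchanged."""
    return data


def process_batch(blob_names):
    """Read each named blob, transform it, and write the result to another
    container, using the Storage SDK instead of bindings."""
    from azure.storage.blob import BlobServiceClient  # lazy import

    service = BlobServiceClient.from_connection_string(
        os.environ["customserverless01_STORAGE"])  # assumed app setting name
    source = service.get_container_client("sampleuploads")
    dest = service.get_container_client("thumbnails")

    for name in blob_names:
        data = source.download_blob(name).readall()
        dest.upload_blob(name, transform(data), overwrite=True)
```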

  3. Hi Simon,
    can you please share an example of reading data (JSON format) from Event Hubs using Python?

    1. Hi Ketan – you can see how to read an Event Hub event in Python in this blog. The event body is extracted as a string, so you can convert it to JSON by calling json.loads(event.get_body().decode('utf-8')). You might want to check the body content before trying to parse it as JSON, but this method should work for you. – Simon.
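      The decode-then-parse step from this reply can be sketched with a defensive check using only the standard library; the event body below is simulated, not a real azure.functions.EventHubEvent.

```python
import json


def parse_event_body(raw):
    """Decode an event body and parse it as JSON; return None if invalid."""
    try:
        return json.loads(raw.decode('utf-8'))
    except (json.JSONDecodeError, UnicodeDecodeError):
        return None


# Simulated event body, standing in for event.get_body()
body = b'{"deviceId": "sensor-01", "temperature": 21.5}'
payload = parse_event_body(body)
assert payload["deviceId"] == "sensor-01"
assert parse_event_body(b"not json") is None
```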
