Using and validating secure data settings in Azure Logic Apps

Appropriate data handling is an important part of any software system design. We often overlook the risks that come with the visibility and accessibility of data passing through our solutions, and rarely think about how secure data handling applies to telemetry and logging. In this post I explore how you can use the built-in input/output obfuscation capabilities of Azure Logic Apps to limit the exposure of data in operational logging, and how you can check that the settings are configured.

Understanding secured inputs/outputs

Many (but not all) triggers and actions that ship with Logic Apps support obfuscation of inputs and/or outputs. Let's start with a simple scenario as shown below.

We have an HTTP trigger that receives an inbound HTTP request and then passes the request contents on to another web API (in this case the echo API from Postman).

I have the Logic App workflow open in the designer and have selected the HTTP trigger. In the panel I then selected Settings and scrolled down to the Security section. As you can see, secured inputs and outputs are currently not selected.

[Image: Default secure settings for a Logic App action or trigger]

If we save this workflow and take a look at the workflow.json it generates, we can see there is no mention of any security settings.

{
  "definition": {
    "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
    "actions": {
      "HTTP": {
        "type": "Http",
        "inputs": {
          "uri": "https://postman-echo.com/get",
          "method": "GET"
        },
        "runAfter": {},
        "runtimeConfiguration": {
          "contentTransfer": {
            "transferMode": "Chunked"
          }
        }
      }
    },
    "contentVersion": "1.0.0.0",
    "outputs": {},
    "triggers": {
      "When_a_HTTP_request_is_received": {
        "type": "Request",
        "kind": "Http"
      }
    }
  },
  "kind": "Stateful"
}

If we run this workflow and open the run history, we can inspect the contents of the inputs and outputs as we wish. While this data is encrypted at rest, as an operator I can still view it, which may not be desirable, and with Application Insights bolted on, the data flows there as well.

[Image: Logic Apps output view with secured data off]

Let's flip on the secure data settings and see what difference it makes.

We get a visual cue on the trigger (or action) showing that the secure settings have been enabled, which is great.

[Image: Secure settings enabled for a Logic App trigger]

Next, let's see what this has done to the JSON. We can clearly see new secureData nodes under runtimeConfiguration that represent what we just enabled.

{
  "definition": {
    "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
    "actions": {
      "HTTP": {
        "type": "Http",
        "inputs": {
          "uri": "https://postman-echo.com/get",
          "method": "GET"
        },
        "runAfter": {},
        "runtimeConfiguration": {
          "contentTransfer": {
            "transferMode": "Chunked"
          },
          "secureData": {
            "properties": ["inputs", "outputs"]
          }
        }
      }
    },
    "contentVersion": "1.0.0.0",
    "outputs": {},
    "triggers": {
      "When_a_HTTP_request_is_received": {
        "type": "Request",
        "kind": "Http",
        "runtimeConfiguration": {
          "secureData": {
            "properties": ["inputs", "outputs"]
          }
        }
      }
    }
  },
  "connectionReferences": {},
  "parameters": {}
}
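To make the difference between the two definitions concrete, here is a minimal Python sketch that reads the secureData properties for each top-level trigger and action. The helper name secured_properties is a hypothetical one chosen for illustration, and the sample is a trimmed-down version of the definition above in which the trigger is deliberately left unsecured so both cases are visible.

```python
import json

# Trimmed-down workflow definition; the trigger is left unsecured on purpose.
WORKFLOW_JSON = """
{
  "definition": {
    "actions": {
      "HTTP": {
        "type": "Http",
        "runtimeConfiguration": {
          "contentTransfer": {"transferMode": "Chunked"},
          "secureData": {"properties": ["inputs", "outputs"]}
        }
      }
    },
    "triggers": {
      "When_a_HTTP_request_is_received": {
        "type": "Request",
        "kind": "Http"
      }
    }
  }
}
"""


def secured_properties(workflow, section):
    """Map each top-level trigger or action name to its secureData properties.

    Hypothetical helper for illustration; nested actions are not followed.
    """
    entries = workflow.get("definition", {}).get(section, {})
    return {
        name: entry.get("runtimeConfiguration", {})
                   .get("secureData", {})
                   .get("properties", [])
        for name, entry in entries.items()
    }


workflow = json.loads(WORKFLOW_JSON)
print(secured_properties(workflow, "actions"))
# {'HTTP': ['inputs', 'outputs']}
print(secured_properties(workflow, "triggers"))
# {'When_a_HTTP_request_is_received': []}
```

An empty list means nothing is obfuscated for that trigger or action, which is the state the first definition in this post was in.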

Finally, let's see what has happened to our run history.

[Image: Secure settings impact on Logic App run history]

Happy Days! 😎 Almost...

Ensuring secure data settings are set

Up until now this has all looked pretty sweet. But there are two limitations in the approach:

1. The setting is not global, and
2. It relies on developers to remember to flip on the settings for every trigger or action... and we know we can rely on developers to do that consistently, don't we?! 😁

These two limitations got me thinking about how best to manage this in a larger team working across a range of integrations using Logic Apps.

The options I saw were:

1. Write a solution to inject the secureData nodes under runtimeConfiguration when they are missing. The big challenge is that not every trigger or action supports these values and, in some cases, such as the commonly used Parse JSON action, only one property (inputs) is used. This quickly gets complicated and is likely to break more workflows than it fixes.
2. Have a way to validate workflows that highlights where secureData is missing and lets developers fix it. This validation can be run as part of a pull request or code review and will quickly highlight any substantial gaps. As the validator doesn't change the workflow.json file, I can be confident I'm not introducing unexpected breaking changes.

As you can probably guess, I proceeded with the second option: building a validator.

Say hello to GitHub Copilot

I've written a few scripts to validate things over the years, and if I'm honest, I didn't really feel like writing another one for this scenario.

So I turned to my helpful pair programmer - GitHub Copilot.

In Visual Studio Code I had a sample Logic Apps Standard project open with the JSON for a workflow with secureData settings in place. I then used the open JSON file and about 20 minutes of prompting with GitHub Copilot to help me build a script that suited my needs.

GitHub Copilot chose Python as the implementation language. As I'm not a heavy-duty Python developer, I was thankful that it could deliver 99% of the working solution shown below.

import argparse
import json
import os
import sys

# Add a global variable to track if any items are not present
items_not_present = False

def find_nodes_in_dict(dict_obj, type_string):
    """
    Recursively searches for actions in a dictionary object and checks if they have secure data properties.

    Args:
        dict_obj (dict): The dictionary object to search in.
        type_string (str): The type of Logic App entity (actions or triggers) to search for.

    Returns:
        None
    """
    global items_not_present

    if type_string in dict_obj:
        for action_name, action_value in dict_obj[type_string].items():
            print(f"Name: {action_name}")
            if 'runtimeConfiguration' in action_value and 'secureData' in action_value['runtimeConfiguration']:
                print("\t'runtimeConfiguration' node with 'secureData' child is present")
                if 'properties' in action_value['runtimeConfiguration']['secureData']:
                    properties = action_value['runtimeConfiguration']['secureData']['properties']

                    if 'inputs' in properties:
                        print("\t\t'inputs' is present in 'properties'")
                    else:
                        print("\033[91m\t\t'inputs' is not present in 'properties'\033[0m")
                        items_not_present = True

                    if 'outputs' in properties:
                        print("\t\t'outputs' is present in 'properties'")
                    else:
                        print("\033[91m\t\t'outputs' is not present in 'properties'\033[0m")
                        items_not_present = True
                else:
                    print("\033[91m\t\t'properties' is not present in 'secureData'\033[0m")
                    # A secureData node without properties secures nothing, so flag it
                    items_not_present = True
            else:
                print("\033[91m\t'runtimeConfiguration' node with 'secureData' child is not present\033[0m")
                items_not_present = True

    for _, value in dict_obj.items():
        if isinstance(value, dict):
            find_nodes_in_dict(value, type_string)
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    find_nodes_in_dict(item, type_string)

# Create the parser
parser = argparse.ArgumentParser(description='Process a directory for JSON files.')
parser.add_argument('DirPath', metavar='dirpath', type=str, help='the path to the directory')
parser.add_argument('--exit-code', action='store_true', help='return a non-zero exit code if any items are not present')

# Execute the parse_args() method
args = parser.parse_args()

# Walk the directory structure
for dirpath, dirnames, filenames in os.walk(args.DirPath):
    for filename in filenames:
        if filename == 'workflow.json':
            filepath = os.path.join(dirpath, filename)
            with open(filepath) as f:
                data = json.load(f)
            print(f"\nProcessing file: {filepath}")
            print("\nActions\n=======")
            find_nodes_in_dict(data, 'actions')
            print("\nTriggers\n========")
            find_nodes_in_dict(data, 'triggers')

if items_not_present and args.exit_code:
    sys.exit(1)

If you want to run this you can provide it with the root folder of a Logic Apps Standard project and it will walk the directory tree and find all the workflow.json files and parse them for you.

python validate-secure-data-settings.py /path/to/project

If you want the script to return a non-zero exit code (something you might want to utilise in a Continuous Integration build) you can call it like this:

python validate-secure-data-settings.py --exit-code /path/to/project
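If your build is driven from Python (a task runner or a test suite, for example), one way to gate on that exit code is sketched below. The function name secure_data_check_passes is a hypothetical one for illustration; the only thing it relies on is the --exit-code flag described above.

```python
import subprocess
import sys


def secure_data_check_passes(script_path, project_dir):
    """Run the validator with --exit-code and report whether it passed.

    Hypothetical wrapper for illustration: a return code of 0 is a pass,
    anything else (including a crash of the script) is a failure.
    """
    result = subprocess.run(
        [sys.executable, script_path, "--exit-code", project_dir],
        check=False,  # inspect the return code ourselves instead of raising
    )
    return result.returncode == 0


# Example use in a build step:
#   if not secure_data_check_passes("validate-secure-data-settings.py", "/path/to/project"):
#       sys.exit(1)
```

In most pipelines simply calling the command above as a build step is enough, because a non-zero exit code normally fails the step on its own.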

Your output will look similar to the sample below. Any errors are highlighted in red on the console, although the colour isn't reproduced here.

Processing file: ./create-single-object/workflow.json

Actions
=======
Name: Condition
        'runtimeConfiguration' node with 'secureData' child is not present
Name: Create_Contact_record
        'runtimeConfiguration' node with 'secureData' child is present
                'inputs' is present in 'properties'
                'outputs' is present in 'properties'
Name: HTTP
        'runtimeConfiguration' node with 'secureData' child is present
                'inputs' is present in 'properties'
                'outputs' is present in 'properties'

Triggers
========
Name: When_a_HTTP_request_is_received
        'runtimeConfiguration' node with 'secureData' child is present
                'inputs' is present in 'properties'
                'outputs' is present in 'properties'

Processing file: ./create-single-sf-object/workflow.json

Actions
=======

Triggers
========
Name: Recurrence
        'runtimeConfiguration' node with 'secureData' child is not present

Is this Python the most elegant and efficient way to achieve this? Maybe not, but I went from an idea to a working solution in about 20 minutes, all in a language I don't use much. If I really wanted to, I could ask GitHub Copilot to convert it to a language I'm more familiar with, but this script works and is a good baseline.

There are still some gaps in this implementation, particularly around those triggers and actions that don't support these settings. These should be excluded to avoid false positives, but for my current situation this script is a good starting point, particularly to use for doing code reviews with my team.
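One possible way to close that gap is to key the check on the type of each trigger or action. The sketch below is not part of the script above, and the names in it are hypothetical. The entries are assumptions for illustration: "If" and "Recurrence" are the types behind the Condition and Recurrence items flagged in the sample output, and the Parse JSON entry reflects the inputs-only behaviour mentioned earlier. Confirm each one against the designer before relying on it.

```python
# Assumed, illustrative values only; verify each entry against the designer.
UNSUPPORTED_TYPES = {"If", "Recurrence"}      # e.g. Condition action, Recurrence trigger
REQUIRED_BY_TYPE = {"ParseJson": {"inputs"}}  # Parse JSON: inputs only
DEFAULT_REQUIRED = {"inputs", "outputs"}


def missing_secure_properties(node):
    """Return the secureData properties a trigger or action still needs.

    Types listed in UNSUPPORTED_TYPES are skipped, so they never produce
    a false positive.
    """
    node_type = node.get("type")
    if node_type in UNSUPPORTED_TYPES:
        return set()
    required = REQUIRED_BY_TYPE.get(node_type, DEFAULT_REQUIRED)
    present = set(
        node.get("runtimeConfiguration", {})
            .get("secureData", {})
            .get("properties", [])
    )
    return required - present


print(missing_secure_properties({"type": "If"}))
# set()
print(sorted(missing_secure_properties({"type": "Http"})))
# ['inputs', 'outputs']
```

Returning the set of missing properties, rather than printing inside the check, would also make it easier to unit test the rules separately from the console output.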

You can find a sample Azure Logic Apps Standard project, along with the Python validation script on GitHub.

Leave feedback and comments below, or via Issues on the GitHub repository with the sample.

Happy Days! 😎