Microsoft recently announced the public preview of the Dynamic Data Masking (DDM) feature for Azure SQL Database that holds a lot of potential for on-the-fly data obfuscation that traditionally would have required either custom business logic or third party systems.
In this post I am going to take the opportunity to walk through how we can set this up for an existing database. For the purpose of this post I am going to utilise the AdventureWorks sample database for Azure SQL Database which you can download from Codeplex.
Our Azure SQL Database Server
Firstly we need to understand our Azure SQL Database Server instance configuration. At time of writing any existing Azure SQL Database Server or one created without specifying a Version flag will use the “2.0” Azure SQL Database engine which does not support DDM.
If this is your scenario the you will need to upgrade your server to v12 using the Azure Management Portal – make sure you follow the planning guidance provided by Microsoft. Note that an upgrade has minimal impact on running databases – there will be brief interruption to service at the very end of the upgrade, but for the rest of the process your database will remain online.
I’m going to setup my server using the following PowerShell, but if you want to specify a server name them you will need to use the preview portal (https://portal.azure.com/). Note the Version parameter I’m providing to specify the new engine release.
# Should return 0.8.14 (Get-Module Azure).Version New-AzureSqlDatabaseServer -Location "West US" -AdministratorLogin bobsyouradmin -AdministratorLoginPassword N0tRe4l? -Version 12.0
Notes:
- At the time of writing there is a known issue with this Cmdlet due to the time it takes to complete exceeding the maximum PowerShell timeout. The server will still be created but you will receive an error.
- Make sure you open up the firewall on the server to allow your client machine to connect. You can do this via the Azure portal or PowerShell.
After this has run I’ll have an Azure SQL Database instance with the v12 engine ready for use. My next step is to deploy the Adventureworks database schema and data. We can do this leveraging the downloaded package (from Codeplex) and then running the following command at an Admin-elevated command prompt.
CreateAdventureWorksForSQLAzure.cmd {servername}.database.windows.net {user}@{servername} {password}
Once this command completes we will have a database setup that we can now query. I’m going to simply use SQL Server Management Studio (SSMS) for this. You will need a version recent enough to support Azure SQL Database (SSMS for SQL Server 2012 and 2014 will do).
This is the test query we’ll use to show how data masking works. I’m going to run the initial query as our admin user (‘bobsyouradmin’).
SELECT TOP 10 BusinessEntityID, FirstName, MiddleName, LastName FROM Person.Person
Which results in the following.
BusinessEntityID FirstName MiddleName LastName ---------------- ------------- ---------- ----------- 285 Syed E Abbas 293 Catherine R. Abel 295 Kim NULL Abercrombie 2170 Kim NULL Abercrombie 38 Kim B Abercrombie 211 Hazem E Abolrous 2357 Sam NULL Abolrous 297 Humberto NULL Acevedo 291 Gustavo NULL Achong 299 Pilar NULL Ackerman (10 row(s) affected)
Getting the Mask On
Now that we’ve gotten everything setup, let’s go and mask the LastName field. Firstly we need to switch on Dynamic Data Masking for our Azure SQL Database instance via the Azure Portal.
Navigate to the Azure SQL Database instance and then select the Auditing & Security tab on the very right.
Then scroll down to the bottom and switch on Dynamic Data Masking as shown below. I’m going to add our current login as a privileged login too (this means it can query the data without a mask).
Now at this stage if I re-run the query above with the same credentials then I will get back exactly what I have above which is expected because I allowed my admin user to be excluded from the masking (and I’m not using a secured connection string).
Let’s go ahead and create another login and grant it access to the database – we’ll just make it a data reader so that it can run the SELECT query but not update anything.
Firstly create a login for the server
CREATE LOGIN untrusteddev WITH PASSWORD = 'N0tL0Lc4t5^' GO
Then a user in the Master and AdventureWorks databases
CREATE USER untrusteddev FOR LOGIN untrusteddev WITH DEFAULT_SCHEMA = [dbo] GO -- Add user to the database data reader role (not needed in Master DB) EXEC sp_addrolemember N'db_datareader', N'untrusteddev' GO
Now let’s close our connection to the Azure SQL Database and create a new one with our new login.
As we don’t trust the developer we’re going to give him a special connection setting to use. Instead of the standard
{servername}.database.windows.net
we will provide the developer with
{servername}.database.secure.windows.net.
We will turn off the ability for clients to connect via the old un-secured hostname as well.
Before we give our developer access to the database let’s make a tweak by adding a mask to our Person table. On the Auditing & Security page for our database I’m going to click “Add Mask” in the bottom menu and then add the mask as shown below.
Now that this is place we can provide our developer with the necessary connection details and they can connect via SSMS like so:
When the developer logs in everything looks pretty much as it always does.
Let’s pop open a new Query Window in our AdventureWorks database and re-run our SQL query on the Person table from above.
SELECT TOP 10 BusinessEntityID, FirstName, MiddleName, LastName FROM Person.Person
So can we see the full Person details as before?
and our result set looks like:
BusinessEntityID FirstName MiddleName LastName ---------------- ------------- ---------- ----------- 285 Syed E XXXXXXXX 293 Catherine R. XXXXXXXX 295 Kim NULL XXXXXXXX 2170 Kim NULL XXXXXXXX 38 Kim B XXXXXXXX 211 Hazem E XXXXXXXX 2357 Sam NULL XXXXXXXX 297 Humberto NULL XXXXXXXX 291 Gustavo NULL XXXXXXXX 299 Pilar NULL XXXXXXXX (10 row(s) affected)
The key take away from this is that we haven’t actually changed the underlying data – the database does still contain the LastName of the Person – it’s just that the user context of the call is such that a configured mask is applied to the data as it is retrieved from the server.
We can use this approach for a lot of different data types and I expect eventually you’ll have a lot more customisation you can apply here. This is very much a “v1” release and will no doubt be added to over time. This is a great feature that may eventually remove some need to perform mass data cleansing operations when populating dev/test data sources with “production” data.
Reblogged this on Kloud Blog.
This looks like an interesting feature.
Do you know what happens when an unprivileged user tries to update a row that contains masked data? Will the masked data overwrite the original contents of will the update action fail on an attempt to update a masked column?
Adam – if the user has the right database role (i.e. db_datawriter) then they can update the value in any field, even if it is masked. Dynamic Data Masking isn’t a security measure and isn’t designed to stop users from updating data that is masked – it is simply designed to obfuscate certain field values when retrieved from the database. If you need to stop a user from changing data you need to rely on existing security permissions that Azure SQL Database (and SQL Server) have.
You set up a dynamic data masking policy in the Azure Preview portal by selecting the Dynamic Data Masking operation in your SQL Database configuration blade.