A new feature in SQL Server 2016 and the Azure SQL database is something called Dynamic data masking.
This is essentially used to mask the data from prying eyes and is helpful for blocking some users or DBAs from seeing confidential types of data like account numbers.
So let see and example of how this works, but first we’ll setup the environment and create a table which will show us all the different types of masking available.
CREATE DATABASE MyDB; USE MyDB; CREATE TABLE Mask ( Phone VARCHAR(20) MASKED WITH (FUNCTION = 'default()') ,Email VARCHAR(150) MASKED WITH(FUNCTION = 'email()') ,BankAccNumber INT MASKED WITH (FUNCTION = 'Random(1,12)') ,UserPassword VARCHAR(25) MASKED WITH (FUNCTION = 'Partial(0,"XxXxXxX",4)') )
There are four different types of masking available, default, email, random & partial, instead of listing the difference I’d advise you to look at this link.
Now lets put some test data into the table.
INSERT INTO Mask VALUES ('01284999654','johnjames@gmail.com',1222236,'Jo786DJa') ,('02088546585','alexjones@outlook.com',19854636,'Al3xJ0123') ,('01618549856','neil@Datadoor.co.uk',1462532,'N9sifoekd')
And then run a quick select statement to see what is returned.
SELECT [Phone] ,[Email] ,[BankAccNumber] ,[UserPassword] FROM [dbo].[Mask]
Now if we look at the results below we can see still see all of the data, this is due to the account security level for my current connection.
So to see the masking correctly we’ll create an account and apply it simple SELECT rights.
CREATE USER Test WITHOUT LOGIN GRANT SELECT ON [dbo].[Mask] TO [Test]
Now if we run the same select statement impersonating the Test user
EXECUTE AS USER = 'Test' SELECT [Phone] ,[Email] ,[BankAccNumber] ,[UserPassword] FROM [dbo].[Mask] REVERT
Now we’ll see the data is being masked correctly
If you want to grant this account rights to see the mask data you’d need to grant the user the UNMASK security rights like below
/*Apply unmask security rights*/ GRANT UNMASK TO [Test] EXECUTE AS USER = 'Test' SELECT [Phone] ,[Email] ,[BankAccNumber] ,[UserPassword] FROM [dbo].[Mask] REVERT
Now we can see the data unmasked.
Problems with data masking
There are two main problems with dynamic data masking
1. Once a user is granted the UNMASK credential that user is then able to see all of the masked data globally in the database, this can’t be restricted to one table or column.
2. For some masking types like default, email or partial, you’re also able to figure out the data in the column by using some T-SQL investigations, for example if I supply the below SQL to firstly revoke the UNMASK rights for the test user and then re-query the data I’m able to narrow down the password letter by letter till I have the complete password.
/*Revoke unmask security rights*/ REVOKE UNMASK TO [Test] EXECUTE AS USER = 'Test' SELECT [Phone] ,[Email] ,[BankAccNumber] ,[UserPassword] FROM [dbo].[Mask] WHERE UserPassword LIKE 'a%' REVERT
As you can see from the result I’ve now discovered there is one account which password starts with A.
Finding all columns where data masking has been applied
If you want to see all of the columns in a database which has masking applied you can see this via some system views like in the query below
SELECT c.name ,tbl.name AS table_name ,c.is_masked ,c.masking_function FROM sys.masked_columns AS c INNER JOIN sys.tables AS tbl ON c.[object_id] = tbl.[object_id] WHERE is_masked = 1;
You can see from the results it shows you which columns are masked and by what masking type, which again I feel could be used to determine how easy your able to unmask the column.
Regardless of the security issues, dynamic data masking is a useful addition to any SQL developers arsenal of functions avaiable to them.