Security is not a feature.

21 Maart 2023

In todays world data is everywhere. And data is valuable. More and more companies are shifting to data driven or AI assisted decisions. All this comes at a price. Depending on the use case companies have to collect a vast amount of, mostly personal data. For example, a recommendation system for an online retailer could consist of creating a set of customer profiles, match that to a user’s individual behavior and generate a list of recommended shopping items. The more data you have, the better the algorithm works.

All this data becomes an asset. And like other valuables data becomes a target for attackers. Companies have to think carefully about how and what data they store to minimize exposure and comply to regulations. Also, the danger is not always on the outside. When employees have access to sensitive data they can use this for their own benefit. Of course, it is not a good starting position to not trust your own staff but at the end of the day there are enough examples where employees sold internal data or accessed confidential records out of personal interest.

With SQL Server 2016 Microsoft introduced Dynamic Data Masking, Row Level Security and Always Encrypted to help companies better control access to and storage of their data.

Dynamic Data Masking

Dynamic Data Masking helps to prevent unauthorized access to sensitive data. For example, instead of exposing a credit card number to the user only the last four digits are shown. Users or applications with appropriate permissions have access to the full credit card number. SQL Server has a few built-in masks for common types of data like social security numbers and email addresses. More information on how to configure this can be found here: https://learn.microsoft.com/en-us/sql/relational-databases/security/dynamic-data-masking

Row Level Security

With Row Level Security you can create an additional filter on data limiting access to record for a user. For example, a doctor in a hospital will only see information for their patients. Or a sales manager can see all individual sales while a sales person can only see their own sales. More information on how to configure this can be found here: https://learn.microsoft.com/en-us/sql/relational-databases/security/row-level-security

Always Encrypted

Always Encrypted uses an encryption key stored in a certificate to encrypt and decrypt data on the client using the client library. This way the actual data is never exposed to the database engine and you can achieve a clear separation of concerns. Users that do not have access to the encryption key will not be able to encrypt the data, including administrators. More information on how to configure this can be found here: https://learn.microsoft.com/en-us/sql/relational-databases/security/encryption/always-encrypted-database-engine One of the benefits of using these features is that they can be used without making changes to the application. At least, not substantial changes. Dynamic Data Masking and Row Level Security are typically implemented based on the logged in user. This means that you need a way to distinguish users on the connection. This would not be a problem in a typical client/server application but becomes a bit harder when using a web application. These can all be solved in one way or another but might require more changes to the application. For Always Encrypted the major change is to specify the feature in the connection string. Another thing to consider it that the datatype for encrypted data differs from decrypted data. For example, when the original datatype is varchar, the returned datatype is varchar when the encryption key is present at the client. When the encryption key is not present at the client the returned datatype is varbinary. The application should be able to handle this scenario.

But what if you don’t have SQL Server 2016? Maybe you need an older version of SQL Server. Or you have a different storage solution. Or your design principle is to not lock into vendor specific features. Even then you can use all this. Because, as the title of this post implies, security is not a feature. Meaning that you should not rely on functionality within a platform when you business requires a certain level of security. Let me give you three examples of how to implement the previous three features in a version prior to SQL Server 2016. Even with maintaining the same level of transparency to the application.

DIY Dynamic Data Masking

This is fairly easy to implement. To make it fully transparent to the application when it is already running you can rename the original table to something different and then create a view with the original name of the table. Let’s imagine you have a table with the following columns: user_id, user_name and user_phone. You want to mask the user phone similar to the SQL Server functionality where the last four characters are visible and an arbitrary number of characters are placed preceding the numbers. The following code snippet could be the definition of the view.

CREATE VIEW dbo.user_info AS
  SELECT
    user_id,
    user_name,
    CASE WHEN user_name() = 'dbo' THEN user_phone ELSE 'xxxxxx' + RIGHT(user_phone,4) END AS user_phone
  FROM
    dbo.user_info_base

DIY Row Level Security

This can be implemented by filtering the result set based on the current logged in user. As with the previous example this can be implemented by creating a view as an abstraction layer. Let’s say a doctor should only see information about their patients the view could be created like this.

CREATE VIEW dbo.patients AS
  SELECT
    …
  FROM
    dbo.patient_base
  WHERE
    doctor_name = user_name()

DIY Always Encrypted

This is a bit harder to implement in SQL Server alone. The best way to implement this is to encrypt and decrypt data within your application and then storing this encrypted data in binary format or with a base64 encoding as text. A benefit of this approach is that you are no longer bound to a specific SQL Server library or version. Also, you do not have to rely on certificates to transport the encryption key but could use another method of sharing this information.

A completely different approach

When your application uses an intermediate layer to access the data, for example through an API you can also handle all this logic within the API. Dynamic Data Masking and Always Encrypted can be directly implemented in the API controllers by masking and encrypting the outgoing or incoming data. Row level security can be implemented in two ways. The first method would be to retrieve an unfiltered set from the database and then filter the results before sending them back to the client. Another method would be to add the filter predicate to the statement send to the database. Now the takeaway from all this should not be to replace features and functionality that are within a product with you own implementations. Whenever you can use these features and they align with your goals you should use them. Security should not be based on product features but on business requirements.