John Magnabosco

SQL Server Development and Data Security

  • Protection for the Data Journey

    Posted Thursday, January 01, 2009 10:44 AM | 1 Comments

    The touch of a human finger to a keyboard begins a journey. A journey in which the data that is born of this minor action must navigate with caution; for there in wait are gremlins, goblins and pirates waiting to disrupt or prematurely terminate this journey. These agony-hounds have an expensive taste. The most valuable and sensitive data that traverse these highways and byways are the ones that they target the most.

    In the attempt to guard the data as it passes through these perilous crossroads the user interface developer may advocate that the protection of the data should reside in the business layer or data layer of the application. The database developer may advocate that the database is the prime location to apply data protection. Others may recommend that a hybrid of these options provides the most secure environment. The process of making this decision can be confounding. Below are some thoughts that may help in this decision:

    Not All Data is Created Equal
    The use and classification of the data stored within the database significantly contributes to the implementation of security. A database that stores aggregated survey results on North America's cola preferences may require only the running of backup routines to guard against data loss. A database that contains a financial institution's customer and transaction data will require much greater and complex protection efforts.

    The beginning of any data protection effort is data classification. This process will identify the data and databases that are most sensitive and worthy of additional efforts.

    The danger of over-securing data is additional maintenance costs and an unnecessary reduction to the database performance. The danger of under-securing data includes data loss, regulatory violation fines, loss of competitive edge, loss of customer confidence, financial ruin for customers and of course loss of your job. Applying security is as much an art as it is a science.

    Data in Action/Data at Rest
    When data is in transit across a network or cached in memory for use it is said to be "in action". When the data is being stored in media for retention it is said to be "at rest". Other dimensions of "in action" and "at rest" include the frequency in which the data is updated.

    The primary goal for implementing data protection at the database level is to protect "data at rest". Transparent Data Encryption (TDE) is a good example of a feature that is designed for that purpose. TDE encrypts the physical data files and transaction logs so that their contents are protected from prying eyes when it is stored on the database server and various backup media. TDE also protects the "data at rest" from becoming "data in action" on a server in which it was not intended by requiring the appropriate keys to be in place prior to recovery.

    This does not mean that there should not be efforts made to protect "data in action" at the database level. An example of this would be Cell-Level Encryption. Any decryption that occurs at the database level is cached into memory prior to use. This presents a level of vulnerability if the cache is being sniffed for goodies by the bad guys. This also means that when the data is pushed out to the user interface it is traveling the network unprotected. The use of the HashBytes method does offer some level of avoidance of these specific vulnerabilities since its cipher text is never decrypted.

    If One Is Good, Two Is Better
    In our relentless pursuit of ultra-performance and efficiency we often fail to remember that redundancy is not necessarily a bad thing. The design of our own bodies utilize redundancy brilliantly: Two eyes, two ears, two lungs... and the list could go on. In the event that one of these items fail there is another that preserves the original functionality of the pair.

    When it comes to protecting data, implementing a hybrid of methods improves security significantly. In this specific case the redundancy is not only a fail-safe measure, it introduces complexity. Not unlike that gag-gift that we all certainly have received at some point where a gift is wrapped in a series of boxes. The implementation of complex security methods increases the skill and knowledge level required to achieve disclosure which reduces access by the unauthorized.

    When data is encrypted at the business or data layer of the user interface its cipher text travels the network to the database. Once received by the database it can further be encrypted. If the database cipher text is broken, the sensitive data remains protected because user interface encryption is still in place. The illustration below demonstrates this dynamic:

    UI
    Plain Text

    UI
    Cipher Text

    DB
    Plain Text

    DB
    Cipher Text

    My Plain Data

    0x172634383

    0x172634383

    0x678262539

    Encrypt -->

    Encrypt -->
    <-- Decrypt

    Encrypt -->
    <-- Decrypt


    <-- Decrypt

    In God We Trust, All Others Pay Cash
    This title of the classic book by Jean Shepherd says it all. The DBA cannot be sure that the sensitive data that is flowing to the database has been sufficiently protected. The role of the DBA is one of great responsibility. If sensitive data is compromised while it is in storage there is no doubt that the DBA will be the first questioned. Applying protection methods regardless of methods used in the user interface ensures that some level of protection exist.

    The road that our data travels is indeed a formidable one. With these considerations in mind the process of implementing protection for these journeys will be more effective.

  • Documenting Data Classification with Extended Properties

    Posted Friday, December 19, 2008 6:03 AM | 0 Comments

    Data classification is an important process of securing data. Without it there cannot be a consistent security policy implemented. At the end of this process each target object is assigned a level the indicates its classification. This classification can cover many aspects; but for the interest of this blog entry I will focusing on data sensitivity classification.

    For data sensitivity classification I would have a set list of options to choose from that indicate an escalating level of sensitivity. For example:

    Public: Low level sensitivity. This data is available for all users to view.
    Internal: Medium level sensitivity. This data would be privy to company personnel.
    Confidential: High level sensitivity. Additional security efforts should be made to protect this data and it is only accessible by a select group of individuals.
    Top Secret: Very high level sensitivity. Extra measures of security are required and very few have access to this data. Printing or e-mailing this data would be prohibited.

    In the not so distant past I often wished that there was a column property available in which I could document its classification. Then I discovered extended properties. Extended properties exist for any schema object within SQL Server. They allow the developer or DBA to add custom properties that denote valuable information. It is perfect for documenting data classification information.

    Extended properties can be accessed via SQL Server Data Management Studio (SSDMS) by right-clicking on the desired object (such as a column within the table) and clicking on "Properties". Inside the properties window there is a tab for "Extended properties". The name of the property and its value can be entered at that location. An alternative is to script the creation of extended properties through the use of the sys.sp_addextendedproperty procedure. This option is very helpful when there is a lot of objects to update. The syntax of adding an extended property is:

    EXEC sys.sp_addextendedproperty
        @name='[Property Name]', @value='[Property Value]' ,
        @level0type='SCHEMA', @level0name='[Target schema]',
        @level1type='TABLE', @level1name='[Target table]',
        @level2type='COLUMN', @level2name='[Target column]'
    GO

    Once the sensitivity classification has been assigned to all of the columns in your database the process of documenting which columns should have extra security efforts made, such as encryption, becomes a much more efficient process. To query the extended properties of an object, use the system function sys.fn_addextendedproperty or the sys.extended_properties catalog view. Below is a sample query that returns all columns in the target table that are classified as "Confidential":

    SELECT 
        objname
    FROM
        sys.fn_listextendedproperty ('[Property Name]',
                                                 'schema', '[Target schema]',  
                                                 'table',
    '[Target table]', 
                                                
    'column', default)
    WHERE
        value = 'Confidential'
        AND objtype = 'COLUMN'

    There may be a time when the value of the extended property will need to be modified. This can also be done through SSDMS as well as through the sys.sp_updateextendedproperty procedure.

    Once these extended properties are captured the user interface, stored procedures and user defined functions can programmatically utilize this information. Adding extended properties to your database objects does take a bit of additional thought and effort; but using this feature for purposes such as this will make life much easier when it moved out of the development stages and into support. It also will make the enforcement and auditing of security policies a much smoother process.

  • Biometric Data Sensitivity

    Posted Monday, December 15, 2008 7:46 PM | 0 Comments

    It is very common to classify data such as a Federal Identification Number as sensitive. It is considered such because it is unique to the individual and is often used to gain access to additional data. Biometrics, in many cases, also uniquely identifies an individual due to being based upon their physical aspects.

    The use of biometric data in combination with other identity verification data is highly valuable. It presents a convenient method of authentication since it is something that we do not need to refer to a post-it note to recall or derive a cryptic combination of our pets names and our favorite celebrity.

    While fingerprints are usually the first biometric method that comes to mind there are many others, including some that may come as a surprise. Below is a short list of biometric data types:
     
    Fingerprint
    Fingerprints have been used to verify identity for thousands of years. There are records that reference the Ancient Chinese and Babylonian cultures utilizing fingerprints as signatures. We leave this calling card everywhere we go. The fingerprint pattern is unique to each person and does not change through their lifetime, barring serious injury to the finger. When a fingerprint is gleaned from an object and compared to a list of fingerprints that are associated to other personal information, such as a name, then personal identification is achieved.

    Hand Geometry
    The distance from the tip of your thumb to the tip of your index finger might be 6.04 inches while another person's might be 5.75 inches. The layout of a person's fingers when a relaxed palm is placed upon a reader may result in a pattern that is identifiable to the person. Jerry Garcia used his unique hand print as a logo that represented his work. These are all examples of hand geometry. While this data may not be as unique as a fingerprint it can be utilized to augment other biometric data.

    Iris Recognition
    Look at anyone's eyes and you will see their iris. The iris consists of varying structures, patterns and hues that are unique to each person. The technology of capturing these tiny details of the eye is a fairly recent achievement compared to other biometric methods. Unlike fingerprints that can be left behind on any object the source of this data remains with the carrier; thus requiring the presence of the person in order to collect the identifying information.
     
    Retina Geometry
    The collection of blood vessels, or more specifically capillaries, that supply the retina the nutrients needed to maintain its viability produce a pattern that is unique to each person and is considered by many as the optimal biometric data source. The ability to scan the retina and compare the results with a stored library of retinal scans is dependent on the retina remaining in tact without obstruction. Severe injury or disease, such as glaucoma, can make this type of data difficult to obtain for comparison.

    Facial Recognition
    The combination of facial features and their relationship to each other presents a form of identification. This is something that we practice naturally in recognizing our friends. Las Vegas has employed this method in their casinos for quite sometime to identify gamblers who are barred from their establishments. Photographs often are a source of data for facial recognition methods.

    Vocal Patterns
    One of my favorite radio personalities is Casey Kasem. His voice is unique and he can be identified without any visual representation. The combination of accent, rate, pitch and, in some cases, vocal ticks such as "umm..." create a unique pattern that can identify us without visual representation. These patterns can be captured and converted into data that is then stored and compared to future vocal recordings for identification purposes.

    Signature
    Believe it or not a signature is considered biometric data. The unique pattern of almost illegible squiggles and jots that make up our signatures is considered a behavioral biometric. It is the result of the action on our hands and wrists that generate these miniature pieces of ourselves. Banks have used items called signature cards for years as a means of identity verification. Credit cards have an area on the back in which a signature is to be placed. This location is intended to be used as a verification device to the signature that is written upon the transaction receipt.

    Keystroke Dynamics
    The interface in which we communicate to computers and our Internet communities is achieved through the use of a keyboard. The tap dance that our finger do on these keyboards differ from person to person. The amount of time that we press down on a key. The amount of time between pressing one key and pressing another key. Whether we favor the left shift key or the right. Our capitalization and punctuation patterns. The frequency of the use of the backspace key. All of these together indicate that we are a different person than the other users of the keyboard. While there are many variables in this behavioral biometric this can be used as a means to verify identification.

    While this list can go on ad nauseam it does represent that biometric data surrounds us on a daily basis and is used in even the most routine transaction. All of these items can be reduced to a binary array that can, and is often, stored within a database and utilized as a source of identity verification.

    Disclosure of biometric data to unauthorized persons could result in an extremely difficult situation to rectify. A person's fingerprints cannot be changed, like credit card numbers, if their unauthorized disclosure results in fraud or misrepresentation. Therefore, careful consideration and understanding of this type of data and its sensitivity is important in any data classification effort.

  • Salting Hashed Values To Improve Security

    Posted Monday, December 08, 2008 5:58 AM | 1 Comments

    In nature the mineral called salt is a preservative. Prior to the age of refrigeration, mankind utilized this valuable resource to slow the decomposition process and increase the opportunity to consume the preserved food. This valuable mineral is also used to enhance the flavor of food. What is a potato chip without salt? When it comes to data security the use of salt is slightly different in execution; although very similar in concept.

    The use of mono-directional encryption methods, such as HashBytes, to secure data is a valuable alternative to bi-directional encryption methods that require key management. The use of the HashBytes method in SQL Server right out of the box does have its vulnerabilities. One such vulnerability is the potential for a dictionary attack. A dictionary attack is one in which a list of common words are hashed and then compared to the cipher text. Once there is a match, the secret has been released.

    To protect hashed values from dictionary attacks the use of a "salt" is invaluable. In layman's terms a salt is simply appending the plain text with a constant value prior to hashing it. Here is an example:

    The plain text value of "My Sensitive Data" when hashed with the SHA1 algorithm returns the value of:
    0xA2D1EF295735857B9D7D674E1FE84B14B21EFA55

    The plain text value of "My Sensitive Data" is prefixed with the value of "Hn45Zz&" and then hashed with the SHA1 algorithm returns the value of:
    0x5D0C2127955BC510384D4DC1EAB4A60F284F98CC

    As you can see, the hash value is very different than not using a salt. When the hashed value becomes subject to a dictionary attack it will become much more difficult to crack.

    If all hashed values in your database or table are salted with the same constant value then you have increased your security to only one level. To "kick it up another notch", consider salting your values with the primary key of the row in which the sensitive data is stored. The benefit of this practice is that while you may have a hundred rows containing the same plain text values their salted hash values will be unique. This results in a much greater effort and cost to the hacker who is attempting to disclose the protected data.

    A vulnerability of any hashing effort is that a phenomena called a "hash collision" might occur. A hash collision occurs when two unique plain text values produce an identical hash value. There is no such thing as total elimination of hash collision possibilities; rather, the effort is to decrease the probability of their occurrence to such a degree that it is a near improbability. Salting the hash values is one method in which the probability is reduced.

    The utilization of HashBytes and other hashing methods are certainly a way to obtain a level of security for sensitive data. Their use should not be ignored simply due to their inherent vulnerabilities. All security measures have vulnerabilities. The key is to understand these vulnerabilities and apply additional layers of complexity so that their functionality is strengthened.

  • Exploring HashBytes

    Posted Monday, December 01, 2008 6:47 AM | 4 Comments

    The word "encryption" is often used to describe the process in which plain text is converted into cipher text and later transformed back into plain text for disclosing the data. For example: Cell-Level Encryption is the term used to describe the process in which a column of data within a table is protected by hashing plain text into cipher text and then returned to plain text through the use of a key or a series of keys.

    Another example is Transparent Data Encryption (TDE) is the name of a feature in SQL Server 2008 in which the plain text that is stored within a data file and transaction logs are hashed into cipher text and then reverted to plain text through a series of keys prior to its use.

    For the ultra-purist, the global use of the word "encryption" in this fashion is not wholly accurate. The definition of "encryption" is the process in which the plain text is converted into cipher text. The process in which a key, or series of keys are used to convert the cipher text into plain text is "decryption".

    Why does this splitting of hairs in regard to the use of the word encryption matter? The real area in which this matters is when a decision in regard to the method of data security that is to be applied to your database. The bi-directional approach, which is using encryption and decryption processes, is commonly the one that comes to mind when considering encryption. There are times when encryption in general is discarded due to the key management requirements... an unfortunate situation indeed.

    Consider the mono-directional approach to encryption. This approach is the hashing of plain text into cipher text without the intention of reverting the data back to plain text. In this approach key management is not required since it is not intended to be decrypted. Searching and comparison of values are accomplished by encrypting the input with the same algorithm and then comparing the input cipher text with the stored cipher text.

    In SQL Server 2005 and 2008 the HashBytes function provides us with the ability to perform the mono-directional approach to encryption. The syntax in which plain text is converted into cipher text is:

    HASHBYTES('SHA1','My Plain Text')

    In this example above, I chose to hash my plain text with the SHA1 algorithm. There are other algorithm options available such as: MD2, MD4, MD5, SHA and SHA1. The maximum number of bytes that accepted in the input argument and that are returned when converted into cipher text is 8,000.

    Much like implementing the Cell-Level Encryption methods the data type of the field in which the cipher text is stored must be varbinary. For example, the cipher text of "My Plain Text"  would be stored as 0x6D99DDF6FE7A32547B6766E0BF88B1F50835F0FF.

    There are vulnerabilities in all security efforts and by nature mono-directional encryption methods are weaker than bi-directional encryption methods. The strength of bi-directional encryption is not always necessary and their key management requirements are not always desired. The consideration of utilizing the HashBytes function in your data security efforts is something to not overlook.

  • TDE: Under The Hood With Disabling TDE

    Posted Friday, November 21, 2008 8:07 AM | 0 Comments

    It is indeed that time of the year when the love of American Football can be reflected through using its terms in metaphors for everything from life, politics, relationships, and in this case... Transparent Database Encryption (TDE).

    Life is not series of running plays
    After careful consideration, evaluation, and research the decision to implement TDE may be reached. You, as the DBA, swiftly execute the commands to create the appropriate keys and certificates. The TDE feature has been enabled and the database is now protected.

    A few weeks later the decision to remove TDE has been mysteriously reached and communicated to you... or immediately after implementing TDE and closer consideration the database that was to be encrypted was not the one in which it has been implemented. Regardless of the cause, it is time to reverse the actions that were taken to implement TDE.

    Call in the back up
    A life-saving practice is to perform a database back up before making any changes to database properties and schema. This practice includes the implementation of TDE. If the decision to reverse the implementation of TDE occurs immediately after its implementation and before any other changes have occurred, including data updates and inserts, the following steps can be followed to cleanly revert your database to its pre-TDE state:

    1. Restore the pre-TDE back up with the RESTORE DATABASE command.
    2. Drop the server certificate used for TDE in the MASTER database with the DROP CERTIFICATE command.*
    3. Drop the Database Master Key (DMK) in the MASTER database with the DROP MASTER KEY command.*
    4. Restart the instance in which these actions have occurred. If there are not any other databases in the instance with TDE enabled, this action will regenerate the TEMPDB database in an unencrypted format. For more information about how TEMPDB is affected by TDE, please refer to my blog entry on that subject.

    * Please note that steps 2 and 3 noted above should not be executed if there are other databases within the instance that have TDE enabled and are to remain encrypted.

    Playing the Audible
    There may be times when a change in plans is called for well after TDE has been implemented and data inserts and updates have occurred. At this point the option to simply restore the pre-TDE back up is no longer a viable solution due to potential data loss. The following steps can be followed to cleanly revert your database to its pre-TDE state:

    1. Set the ENCRYPTION option of the target database to OFF using the ALTER DATABASE command.
    2. Drop the Database Encryption Key (DEK) in the target database with the DROP DATABASE ENCRYPTION KEY command.
    3. Restart the instance in which these actions have occurred. If there are not any other databases in the instance with TDE enabled, this action will regenerate the TEMPDB database in an unencrypted format.

    Please note that the server certificate and the Database Master Key (DMK) in the MASTER database were not dropped in this sequence; although TDE is no longer enabled on the target database there still remains some dependencies on the server certificate. If the server certificate were dropped and the instance were restarted the target database would be marked as "Suspect" and would not be brought on-line. A review of the SQL Server Error Logs would reveal the following error:

              Cannot find server certificate with thumbprint [Binary Certificate Thumbprint]

    This is the same error that is received when attempting to recover a TDE enabled database to an instance without the accompanying server certificate and DMK combination.

    Reviewing the tapes
    Once the above steps are taken it is important to review the results to ensure that TDE has been successfully removed. The following queries can be used to view the results of TDE removal:

    1. SELECT name, database_id, is_encrypted FROM sys.databases
      This query will return all of the databases in your instance and reflect the encryption status. The value of "1" in the is_encrypted column indicates that TDE is enabled. A value of "0" indicates that TDE is not enabled.
    2. SELECT * FROM master.sys.certificates
      This query will return the server certificates that have been created in the MASTER database. Upon execution of the DROP CERTIFICATE command the server certificate will no longer appear.
    3. SELECT * FROM master.sys.symmetric_keys
      This query will return the Database Master Keys (DMK) that have been created in the MASTER database. Upon execution of the DROP MASTER KEY command the DMK will no longer appear.
    4. SELECT * FROM sys.dm_database_encryption_keys
      This query will return all Database Encryption Keys (DEK) that have been created in your instance. Upon execution of the DROP DATABASE ENCRYPTION KEY command the DEK for the target database will no longer appear. If there are no other databases with TDE enabled on the instance, a restart of the instance will remove the DEK for the TEMPDB.
    5. Query any table in the target database to show that data is successfully returned.

    Another option for reviewing the results of the removal of TDE is to make a back up of the database after TDE has been disabled. Once the back up is complete, open the [MyDb].bak file in Notepad, or your favorite text editor, and search for a known value contained in any table or the target database. Since an encrypted back up file will store the data in its binary hash, a plain-text search for a specific value will result in no results; while an unencrypted back up file will locate the value.

  • TDE: Under The Hood With Backup

    Posted Friday, November 14, 2008 9:43 AM | 1 Comments

    If you have worked with SQL Server for a given period of time chances are that you have experienced, witnessed, or at least familiar with the concept of backing up a database. This task is a primary staple in the DBA diet and should always include verification that the backups actually are valid.

    An example of a statement that can be used to backup a SQL Server database is:

    BACKUP DATABASE [DemoTDE]
        TO  DISK = 'C:\Backup\DemoTDE.bak'
        WITH NOFORMAT,
        INIT, 
        NAME = 'DemoTDE-Full Database Backup',
        SKIP,
        NOREWIND,
        NOUNLOAD, 
        STATS = 10
    GO

    When Transparent Data Encryption (TDE) is enabled on a database there are some additional dimensions that must be considered when performing a database backup.

    BACKINGUP TDE ENABLED DATABASES
    In my previous blog entry titled TDE: Under The Hood With TempDB I included the steps that are required to implement TDE. These steps include the creation of a Database Master Key (DMK), which is protected by the Service Master Key (SMK), as well as a Certificate that is protected by the DMK. All of these items reside in the MASTER database and are not included in the backup of the user database in which TDE is enabled or the MASTER database. To perform a backup of these items the following statements must be executed:

    BACKUP SERVICE MASTER KEY
        TO FILE = 'C:\Backup\DemoTDE_SMK.bak'
        ENCRYPTION BY PASSWORD = 'MySMKBackupP@ssWord2009'           
    GO

    BACKUP MASTER KEY
        TO FILE = 'C:\Backup\DemoTDE_DMK.bak'
        ENCRYPTION BY PASSWORD =
    'MyDMKBackupP@ssWord2009'  
    GO

    BACKUP CERTIFICATE MyServerCert
        TO FILE = 'C:\Backup\DemoTDE_CERT.bak'
    GO

    With the execution of the BACKUP DATABASE command and the three statements above you will have all of the files required to successfully recover your TDE enabled database. Please note that backing up the MASTER database or TEMPDB database is not required for recovering a TDE enabled user database.

    The general database backup process often includes transferring the backup files to an external medium and stored in a safe location. When encryption is involved it is recommended to store the key backup files on a separate medium and location from the database backup file. This practice ensures that if the medium that contains the database backup file falls into malevolent hands the contents of the data remains secure since the keys are required to recover and access the data.

    There are no additional backup requirements for the Database Encryption Key (DEK) which was created on the user database in which TDE was enabled. The DEK is actually stored in the user database's boot record and is included in the database backup. When the database is recovered the database boot record is accessed and the DEK is available for reference at that time.

    RECOVERING TDE ENABLED DATABASES
    When it comes time to recover a TDE enabled database there are a few variants in the approach depending upon the level of recovery required. For example: When the entire instance is requiring recovery, or the recreation of an instance is occurring on a separate server, the recovery of the SMK, DMK and Certificate will be required. If the recovery effort is focused on a specific user database within the instance of its origin the recovery of the database backup will often be sufficient.

    If the SMK recovery is required you will need to carefully consider recovery or altering of all items within the instance that are protected by the SMK. The recovery of the SMK is performed by the following statement:

    RESTORE SERVICE MASTER KEY
        FROM FILE = 'C:\Backup\DemoTDE_SMK.bak'                   
        DECRYPTION BY PASSWORD = 'MySMKBackupP@ssWord2009'
      
    GO  

    The recovery of the DMK is performed by executing the following statement in the MASTER database:

    RESTORE MASTER KEY
        FROM FILE = 'C:\Backup\DemoTDE_DMK.bak'
        DECRYPTION BY PASSWORD = 'MyDMKBackupP@ssWord2009'
        ENCRYPTION BY PASSWORD = 'MyNewDMKP@ssWord2009'
    GO

    Please note that we recovered the DMK using the protection of a password (ENCRYPTION BY PASSWORD) rather than the SMK. This allows the DMK to be recovered even in an instance in which the SMK is different from the instance in which the DMK was originally created. To change the DMK back to being protected by the SMK, the following series of statements are required in the MASTER database:

    OPEN MASTER KEY
        DECRYPTION BY PASSWORD = 'MyNewDMKP@ssWord2009'
    GO

    ALTER MASTER KEY
        ADD ENCRYPTION BY SERVICE MASTER KEY
    GO 

    CLOSE MASTER KEY
    GO

    Please note that modification of the DMK requires it to be opened first.

    Certificates are recovered through re-creation rather than recovery. If the Certificate used for TDE previously exists in the MASTER database it will need to be dropped prior to execution the following statement:

    CREATE CERTIFICATE MyServerCert
        FROM FILE = 'C:\Backup\DemoTDE_CERT.bak'   
    GO

    At this point all of the keys that reside in the MASTER database have been recovered and the user database in which TDE has been enabled is ready to be recovered through the following statement:

    RESTORE DATABASE [DemoTDE]
        FROM DISK = 'C:\Backup\DemoTDE.bak'
        WITH  FILE = 1, 
        NOUNLOAD, 
        REPLACE, 
        STATS = 10
    GO

    Once the database has been restored the success of its recovery can be determined through querying a table that resides within the database.

    Disclaimer: All of the SQL Statements offered in this blog entry are provided as examples of their use and syntax. The arguments and their values used may vary depending upon your situation. Additional details regarding these statements can be obtained through SQL Server Books On Line. Always backup and test in a development environment before executing these suggested statements in a production environment.

  • TDE: Under The Hood With TempDB

    Posted Friday, November 07, 2008 8:45 AM | 1 Comments

    The tempdb system database can be found in any SQL Server instance. It is a database that is utilized by all user databases that are created within the instance. This system database is where temporary user objects such as temporary tables, variables, cursors, spools, sorting and row version activities are performed. There is no doubt that this system table is extremely valuable in the functionality of the entire SQL Server instance.

    When Transparent Data Encryption (TDE) is enabled on any user database the primary elements of the user database that is encrypted is the physical database (.mdf) and log (.ldf) files. A byproduct of this feature being implemented is the encryption of the tempdb system database.

    On the surface the automatic encryption of the tempdb system database might be a bit confusing and raise some concerns about how the non-TDE enabled user databases might interact with tempdb. To clarify this, let's start at the beginning:

    On a fresh SQL Server instance, TDE is enabled on a user database by performing the following steps:

    1. Create a Database Master Key (DMK) using the master database.
    2. Create a Certificate using the master database.
    3. Create a Database Encryption Key (DEK) using the target database.
    4. Set Encryption to "ON" using the target database.

    We will explore these four steps in depth in later blog entries; but for now, let's now review some system views that reveal what has transpired as the result of these four steps using the master database.

    SELECT * FROM SYS.SYMMETRIC_KEYS
    The execution of this statement reveals the master keys that exist in the database. The one that we are focusing on, as the result of the steps listed above, is the one that contains the value of "##MS_DatabaseMasterKey##" in the column titled "name". This indicates the DMK that was created in the first step above.

    SELECT * FROM SYS.CERTIFICATES
    The execution of this statement reveals the certificates that exist in the database. The certificate that contains the certificate name that was used in the second step above in the column titled "name" is the one that we are focusing on in this example. Notice that the column titled "pvt_key_encryption_type_desc" contains the value of "ENCRYPTED_BY_MASTER_KEY". This value is in reference to the DMK noted above. Also, note the binary value that is contained within the column titled "thumbprint". We will refer to this in the next query.

    SELECT * FROM SYS.DM_DATABASE_ENCRYPTION_KEYS
    The execution of this statement reveals the database encryption keys (DEK) that exist for the instance. You will notice that even though we created only one DEK there are two listed in the query results. If you cross reference the value within the column titled "database_id" to the sys.databases view you will notice that one of the DEKs is for the target database in which we enabled TDE and the other is for the tempdb system database. The dm_database_encryption_keys view offers some interesting insight to these DEKs; specifically the column titled "encryptor_thumbprint".

    The encryptor_thumbprint column reveals the binary reference to the certificate that protects the DEK. The value that has been captured for the target database in which TDE was enabled can be directly referenced to the "thumbprint" column revealed in the sys.certificates view.

    The encryptor_thumbprint value for the tempdb system database contains the value of "0x", which is the binary equivalent to a zero-length value (not null). This means that the DEK that has been created or the tempdb system database is completely independent from the key hierarchy that was established for the target database in which TDE was enabled. In addition, a review of the columns titled "key_algorithm" and "key_length" contain the values of "AES" and "256" for the tempdb system database regardless of the algorithm utilized for the target database.

    Since the tempdb system database is referenced by all user databases within the instance, regardless of whether TDE as been enabled or not, all databases must perform the encryption and decryption process when using the tempdb system database utilizing its DEK.

    The inherent behavior of the tempdb system database is that when the SQL Server instance is stopped and started that the tempdb is dropped and recreated. This can be verified by performing the following steps:

    1. Stop the SQL Server instance.
    2. Start the SQL Server instance.
    3. Execute "SELECT * FROM SYS.DATABASES" using the master database.

    The result of the third step of this process will reveal that the column titled "create_date" for the tempdb system database will be approximately the date and time of the start of the SQL Server instance. When the dm_database_encryption_keys view is executed, the DEK for the tempdb will still be included in the results and the column titled "create_date" will reflect a recent date and time. This illustrates that when the tempdb system database is recreated so is its DEK.

    There is a lot that can be discussed on this specific aspect of TDE alone and it certainly presents some interesting questions in regard to performance, mirroring, backup and recovery strategies. While TDE's purpose is to provide protection of a specific databases' files there are affects to the instance as a whole and requires careful consideration and testing before implementing into a production environment.

  • TDE: Under The Hood With Database Encryption Key

    Posted Friday, October 31, 2008 10:21 AM | 0 Comments

    There is a scene in the 1955 Looney Tunes short called "Sahara Hare" in which Yosemite Sam is attempting to enter a castle in the desert. After various failed methods of forced entry he encounters a secret entrance in the wall of the castle. When Sam opens the door there is another one immediately behind it. When that door is opened it reveals yet another one and so the scenario repeats until he encounters some strategically placed TNT. This segment of the cartoon reminds me of the encryption key hierarchy minus the destructive ending.

    The series of keys that makes up the encryption key hierarchy protect each other until the key that used for the encrypting/decrypting process is revealed. The outer door being the Service Master Key (SMK) which protects an inner door, the Database Master Key (DMK) , which protects additional doors which are symmetric and asymmetric keys. With SQL Server 2008 and the introduction of Transparent Data Encryption (TDE) a new key was introduced into the mix: The Database Encryption Key (DEK).

    On the surface it may be a little confusing when determining the difference between a Database Master Key (DMK) and a Database Encryption Key (DEK). The similarity between their names certainly add to the confusion. They both are addressing the encryption process and the use of the word "encryption" in its name does not provide any implication to its unique purpose. Under the hood, the differences become much easier to understand:

    Database Master Key
    This symmetric key is used to protect subsequent symmetric keys or asymmetric keys within the database that are utilized in the actual encryption/decryption process of the data. The algorithm used when a DMK is created is Triple DES. The key that protects the DMK is the Service Master Key or a user-supplied password. When the SMK is used to protect the DMK, the opening of the DMK occurs automatically; otherwise the DMK must be opened using the OPEN MASTER KEY statement.

    The CREATE MASTER KEY command is used to create a DMK. The sys.symmetric_keys view will display the DMK for the database.

    Database Encryption Key
    This symmetric key is used only for TDE. The purpose of this key is to perform the encryption/decryption process on the physical files and filegroups of the database. The algorithm used when a DEK is created is determined based upon the WITH ALGORITHM argument and include various AES algorithms as well as Triple DES. The key that protects the DEK is a Certificate (Asymmetric Key) that resides in the MASTER database which is protected by the DMK for the MASTER database.

    Since the DEK is used with the TDE process, the opening of this key is transparent to the end user.

    The CREATE DATABASE ENCRYPTION KEY command is used to create a DEK. The sys.dm_database_encryption_keys view will display the DEK for the database.

  • TDE: Under The Hood With Master Key

    Posted Thursday, October 23, 2008 5:32 AM | 0 Comments

    Recently at an IndyPASS meeting I facilitated a general question and answer session preceding our regularly scheduled feature presentation for the month. There was a lot of great questions covering many aspects of SQL Server; but I was most excited to hear some great questions in regard to Transparent Data Encryption (TDE). In the next few blog entries I hope to cover some of the items that were raised. This entry focuses on the Master Key that is used for TDE.

    If you have an instance of SQL Server that does not contain databases that have TDE enabled, the first step in the process of the implementation of TDE is to create a Database Master Key (DMK) for the MASTER database. An example of this syntax is below:

          USE MASTER
          CREATE MASTER KEY
          ENCRYPTION BY PASSWORD = 'MySampleStrongP@ssword2008'

    In general when a DMK is created for a database it would be created using the database in which we intend to apply the encryption (target database). In the case of TDE we do not create the DMK in our target database, but do so directly in the MASTER database. This placement of the DMK in the MASTER database facilitates the server's handling of the encryption/decryption process automatically without the additional code that cell-level encryption requires.

    Only one DMK can be created for any given database. If the above commands are executed against the MASTER database that already has a DMK defined an error will be returned. To determine if a DMK has been previously created you can execute the sys.symmetric_keys catalog view on the MASTER database. The record with the name column value of "##MS_DatabaseMasterKey##" is your DMK.

    The creation of a DMK in the MASTER database does not imply that the MASTER database has been encrypted. In addition, the creation of a DMK in the MASTER database does not imply that the entire instance has been encrypted. Creating a DMK simply produces an symmetric key that is utilized for the encryption/decryption process.

    As you proceed with the implementation of TDE keep the following items in mind:

    • When the database backups occur, the DMK, or any other key, is not included in the backup. You must backup all keys individually as a separate step to your plan.
    • The DMK is protected by the Service Master Key (SMK). The SMK is created at the time the instance is created. If the SMK is changed or lost, recovery of the DMK to unlock the encrypted database will not be a pleasant experience and will likely result in an unrecoverable database.
    • All databases in a single instance that have TDE implemented will utilize the single DMK that was generated in the MASTER database.

    Certainly as you read this information additional questions will arise. In the following blog entries, I will cover other aspects of TDE that will hopefully answer many questions. Feel free to comment on this blog with your questions or contact me directly if you have more detailed or specific questions in regard to DMK or TDE.

  • Releasing the Pigeons

    Posted Wednesday, October 15, 2008 5:55 AM | 1 Comments

    I receive a daily feed from http://datalossdb.org/latest_incidents.rss which provides very basic information in regard to data loss events. These data loss events are not situations where a database crashes and data is lost, these are events in which valuable or sensitive data is disclosed to unauthorized parties. Examples of these events are:

    • Names, addresses and social security numbers of college students are published on the Internet.
    • A document containing sensitive information is found in a trash can.
    • Sensitive data that is contained on a laptop which is stolen from an employee's or auditor's automobile.
    • A disk containing sensitive information about a business' customers is lost.
    • A hacker gains access to a server containing sensitive data.

    I have been watching this feed for quite sometime and have been amazed at the frequency of these events. These occur on a daily basis and involves some rather significant organizations. It is very interesting to see that the events that are caused by hackers is a small percentage of the lot. Majority of them are the result of misplacement of data or irresponsible disclosure.

    There is a lot that can be done to protect data while it is in storage. There is also a lot that can be done to protect data as it travels from the database to the user interface and back again. Role based security, encryption and other obfuscation methods provide the armored car affect for the data. Once data has been disclosed the imperfect human factor enters the picture.

    A person who is otherwise authorized to view sensitive data might then save it in an Excel Spreadsheet on their laptop which gets stolen from their car. They might attach the information to an e-mail that they accidentally send to the entire company and their book club buddies. The might save that information on their favorite thumb drive which falls out of their pocket as they answer their cell phone... and the examples go on an on.

    It is an excellent practice for the DBA or Developer to question the inclusion of sensitive data in an unprotected format on any vehicle of disclosure. The requestor may not realize that the data that they see every day could be considered sensitive. The requestor may not fully understand the consequences of further disclosing this data to potential unauthorized parties. It is very likely that the sensitive data may not be needed at all except for record identification purposes and an alternative piece of data could be recommended.

    The practice of sending reports, data extracts or user interface displays that contain unprotected sensitive information to their recipients is not unlike attaching a scroll of paper to the leg of a pigeon in the old pigeon post days. While it is a very effective method of delivering data to those in need of it, the path is fraught with falcons and hunters awaiting to intercept the data.

  • Encryption Alternatives

    Posted Friday, October 10, 2008 8:24 AM | 4 Comments

    Databases often contain sensitive information and cell-level encryption is a very effective method to protect this information from those who should not have access to it. Although, encryption is not without its challenges. There is a performance cost. There also is additional administrative tasks required with its introduction into the database. That is why there are some instances where encryption is restricted, by the powers that be, from being utilized.

    When the DBA or Developer finds themselves faced with the requirement of protecting sensitive data while not being able to utilize encryption they must seek alternatives to this feature. There are many alternatives available and some, when appropriately implemented, can be as effective as encryption itself. Below are a few examples of these alternatives:

    Store Only The Data You Need
    In this day and age storage media is relatively inexpensive. This is a good thing, but it does enable lax storage strategies. When determining the schema of your table consider the use of the columns that are being added, especially ones that contain sensitive data. Try to avoid the "We may need it someday" mind set.

    For example: The Social Security Number is an identifying number provided by the Federal Government for individuals in the United States. This piece of data is often found in tables that contain other information about an individual. If this piece of data is not being used functionally by the applications that access the database its inclusion in the schema should be reconsidered and challenged.

    If it is determined that the sensitive data must be stored in the table, consider only storing a portion of the data rather than the entire data. For example: If you must store the Social Security Number, storing the last four digits may suffice for its use in your database.

    Masking Plain Text 
    We are all familiar with the standard method of displaying the text of a password as it is being typed into a text box; all letters are replaced with an asterisk (*). Most are also familiar with the masking methods used on a credit card receipt: the last four digits of the credit card number are displayed in plain text while the remainder of the number are replaced with an asterisk.

    These two examples demonstrate how sensitive data within the database is hidden at the point of disclosure from the eyes of those who should not see it. While this approach is very effective it does not necessarily suggest that the sensitive data that is stored in the database is similarly masked. According to PCI Compliance Standards, Requirement 3.4, credit card numbers (PAN) cannot be stored in a database in its entirety. Masking this information within your database is a valid means of meeting these requirements.

    When implementing masking, please remember that the storage of data that is entirely masked is rather useless since it cannot be read or searched by anyone regardless of the security level. Determining the extent of masking that is to be applied to the sensitive data is a fine balance between maintaining the value of the data for authorized users while rendering it useless to those who are unauthorized for its use.

    Coding Data
    Utilizing numeric or alphanumeric values to represent the real values of data is often a practice that can be found in normalization efforts. Placing a value of "16" to represent the State of Indiana or "01" for the State of Delaware is a very efficient means of storing this data. It also allows for the represented text to change without having to modify all of the rows that are associated with that value.

    This same practice can be utilized to protect sensitive data. Consider a financial planner who maintains records of their client's financial history in a database. Knowing that certain aspects of their information is sensitive the financial planner might store the value of "1250" to indicate that his client has declared bankruptcy or "3427" to indicate the beneficiary of the client's assets. Without access to the data that provides the interpretation of this code, the sensitive data remains a mystery.

    Hashing
    SQL Server offers the HashBytes method. This method uses a specified algorithm to turn plain text into a binary value that then can be stored in the database masking the sensitive data it represents. The specific syntax of using the HashBytes method can be found at:
    http://msdn.microsoft.com/en-us/library/ms174415.aspx

    There is a fine line between hashing data and encrypting data. Encrypting data involves the process turning plain text into cipher text as well as decrypting the data back into plain text when needed. Hashing converts the plain text into a binary value and is only compared with other hashed values to produce validation of plain text; thus removing the key management aspect of encryption as well as a portion of the performance affects.

    Here is a great blog entry that talks about using the HashBytes method to secure sensitive data:
    http://blogs.msdn.com/sqlcat/archive/2005/09/16/469257.aspx

    While these options have their own benefits and vulnerabilities they do offer some creative approaches to protecting the sensitive data that has been entrusted to your organization.

  • Understanding Master Keys

    Posted Friday, October 03, 2008 9:57 AM | 0 Comments

    At the heart of encryption are keys. These allow you, the authorized user, to unlock the subsequent key in a key hierarchy or the cipher text contained within a database. This hierarchical relationship in SQL Server 2008 is illustrated at http://msdn.microsoft.com/en-us/library/ms189586.aspx. For SQL Server 2005, the hierarchy is illustrated at http://msdn.microsoft.com/en-us/library/ms189586(SQL.90).aspx

    Within SQL Server, the head of the encryption family is the Service Master Key. This key is a Symmetric key that is based off of the service account credentials as well as the machine key from the Windows Data Protection API (DPAPI). There is only one Service Master Key per instance of SQL Server and is created at setup of that instance.

    The Database Master Key is a Symmetric key that is unique to each catalog (database) within the SQL Server instance. This key is encrypted using the Service Master Key of that instance. When a catalog is created the Database Master Key is not automatically generated. These keys are created using the CREATE MASTER KEY command.

    Once the Database Master Key is created, additional keys can be created to increase the granularity of of the encrypted data. For example a table or cell can be encrypted with one key while another table or cell could be encrypted with a separate key. Both keys are therefore encrypted by the Database Master Key. These keys can fall into the following types:

    Asymmetric Keys are the combination of a private key and a public key. The plain text is encrypted using the public key, which is distributed to others, and is decrypted by a corresponding private key which is very limited in its distribution. An Asymmetric Key is created using the CREATE ASYMMETRIC KEY command.

    A Symmetric Key is a single key that is used to encrypt and decrypt data. A Symmetric Key can only decrypt the data that was encrypted by itself. As previously noted the Service Master Key and Database Master Key are considered this type of key. A Symmetric Key is created using the CREATE SYMMETRIC KEY command.

    Certificates are private or public keys that are digitally associated with an individual or device. The use of a Certificate is very similar to Asymmetric Key. These can be created externally from SQL Server and can offer expiration management. To create a Certificate within SQL Server you will use the CREATE CERTIFICATE command.

    When a database is backed up through the standard means, the keys that are applied to the instance and subsequent catalogs are not included. Each key must be individually backed up using the BACKUP SERVICE MASTER KEY, BACKUP MASTER KEY or BACKUP CERTIFICATE commands. A best practice is not to store these key backups on the same device as the database backup in the event that the device is compromised.

    While a mile of text could be used to provide more details regarding keys, this cursory review should provide the novice cryptographer with some basic information that will help begin to lift the fog in the implementation of this valuable feature of SQL Server.

  • Data In The Cloud

    Posted Monday, September 29, 2008 9:06 PM | 1 Comments

    Back in 2001, I attended a seminar in which Microsoft introduced the .NET Framework. I distinctly remember the discussion surrounding the concept of software being offered as a service via the Internet. Over the years there has been many offerings in this regard and more potential customers of this option have grown accustomed to and interested in the concept.

    As the .NET Framework matured and the tools to create web applications became more powerful, the appearance of applications being offered on the Internet became more frequent. The straight-forward term of "Software as a Service" became a more visual and marketable "Computing in the Cloud" which could very well draw The Rolling Stones in again for someone's marketing campaign.

    The database world has not been left out of this brave new world. There are many on-line database service offerings that are making their claim in the cloud such as Intuit, Amazon.com, IBM and Google. Most interesting to the SQL Server crowd is Microsoft's offering called SQL Server Data Services (SSDS). This product offers on-line data storage, query and manipulation. While SSDS will not offer the entire suite of features of SQL Server when it is released, there will be many features that will make it a serious contender in the current "database in the cloud" market.

    The marketing information indicates that SSDS will have a "flexible data model" and further describes it as "no schemas required". With elements such as entities, containers and "flat scalar property bags" developing in the SSDS environment may require a bit of a learning curve for the Database Developer; but it is a very exciting opportunity for those willing to go off into the "wild blue yonder".

    There is much more to learn about SSDS and I am looking forward to getting my hands on it. This product is not yet in Beta; but Microsoft is taking applications for participating in the Beta program. You can sign up to participate in the Beta as well as learn more about SSDS at the following link: http://www.microsoft.com/sql/dataservices/default.mspx

  • Does Your Database Contain Sensitive Data?

    Posted Tuesday, September 23, 2008 9:47 PM | 0 Comments

    On the surface, the question of whether or not a database contains sensitive data may seem like a rather simple one to answer. Most people recognize that a Federal identification number or a credit card number is and should be recognized as sensitive data. While these pieces of data get a tremendous amount of attention by the media when data loss is reported there are other pieces of data that are not as easily recognized as being considered as sensitive.

    The following categories of data are considered to be sensitive and should be protected:

    Government Assigned Personal Identification data
    This type of data includes Social Security numbers, tax identification numbers for businesses, driver license numbers and other data that the Federal, State or Local Governments have assigned to an individual or business for the purpose of identification.

    Biometric data
    As biometrics become utilized more often for the purpose of identification verification the importance of protecting this information becomes more critical. This type includes items such as retinal scan images, facial images, fingerprints and signatures.

    Medical data
    The Health Insurance Portability and Accountability Act (HIPAA) protects data in regard to medical and insurance information for patients. This includes notations in regard to conversations with your health care professional, physical and mental medical history as well as payment history of medical care. Unauthorized disclosure could result in civil and criminal penalties.

    Student Education data
    The Federal Educational Rights and Privacy Act (FERPA) protects data in regard to students and their educational records. This includes the student’s name, address, telephone number as well as information specifically regarding their education history. Unauthorized disclosure could affect a school’s Federal funding… not to mention compromising a student’s privacy.

    Employment data
    Items such as salary information, performance reviews, worker’s compensation claims, benefit information and pension plan details fall into this category. Any HR Professional will tell you that the unauthorized disclosure of such information could result in severe consequences.

    Communication data
    E-mail messages, telephone records and recordings, fax documents, text messages are all carriers of information that may contain data that would fall into any of these categories; Therefore, this information should be considered sensitive data.

    Financial data
    Financial data not only discloses information regarding an individual or business’ financial status it also often contains data that is used to gain access to assets. For example: bank account numbers, personal identification numbers and beneficiary information.

    Intellectual Property data
    Items that fall into this category are source code, schematics, details regarding a new product and also creative works such as images and written documents. The unauthorized disclosure of such information could destroy the competitive edge of a business or compromise the copyright claim by an author or artist.

    As the DBA and Developer, we are typically the ones that implement encryption and other security measures in the database. We are often requested to provide extracts of data for use by external systems. We are also often requested to produce printed reports that present data for the use of Business Analysts to review. We are also a target for phishing or social engineering efforts to gain access to sensitive data.

    Once data leaves the protected environment of the database the control of its dissemination becomes nearly impossible. The printed report could be passed around and end up in the hands of a person who will use the information for fraudulent activities. The spreadsheet that is generated by a SSIS Package or query could be stored on a laptop that is not password protected or encrypted which is stolen from the person’s automobile. The information could be attached or typed in an e-mail that was accidentally sent to a mailing list that contains hundreds of people.

    While you cannot control the further disclosure of the data once it is in the requestor's possession the understanding of the data that is being stored in your database will go a long way in protecting your client's privacy, your employer's reputation and ultimately your job.


















<January 2009>
SuMoTuWeThFrSa
28293031123
45678910
11121314151617
18192021222324
25262728293031
1234567