Data Warehouse Security

Security Measures used to protect privacy of information

"In many ways the average data warehouse team still lives in a world of naive innocence. The team is so busy sourcing data and deciding on hardware and software that a comprehensive security plan simply hasn't been done ... In many cases, sensitive information is lying right on the table and hasn't been abused only because no one has tried to grab it yet. The situation is similar to leaving a car unlocked in a shopping center parking lot. You might go for years without having the car broken into just because the thieves have not turned their attention to it yet." – Ralph Kimball

Data warehouses have become an important way for organisations to use digital information efficiently in decision making. Simply put, a data warehouse is a database system that stores data structured specifically for query and analysis (Kimball, 2002). Because the data is organised for analysis rather than for day-to-day transaction processing, queries and analysis tools can run against it efficiently. All of the organisation’s decision-relevant information is held in a format that is convenient to retrieve and use. This can include data drawn from operational systems and other company information, such as product and customer details. Customer details can include names and addresses, bank account numbers and other confidential information. The company can use the data stored in the data warehouse for strategic purposes.

However, because the data warehouse holds private information, concern has arisen about the safety and protection of people’s personal details. Data warehouses are subject-oriented, integrated and generally easily accessible (Inmon and Hackathorn, 1994). These characteristics make them a prime target for unauthorized users, who can obtain information such as human resource, medical and financial data and use it for their own purposes.

Security can be complicated because it has to accommodate the needs of the data warehouse. It must stop unauthorized users from obtaining or changing data within the data warehouse, yet keep the data available to authorized users when they need it, while keeping track of the actions performed on it (Elson and Le Clerc, 2005).

A rising need

When data warehouses were first established in the late 1980s, protection against unauthorized access was simpler. The person in charge of the data warehouse, the database administrator, would know that whoever accessed it worked for the company. Security breaches were limited to dishonest employees who accessed and abused the data, and this was not a common occurrence.

As the Internet gained public interest in the early 1990s, data warehouse security became more complicated. People seeking unauthorized access could now reach the data warehouse electronically without the database administrator’s permission, with their location concealed and their identity anonymous. This new vulnerability meant that new security measures had to be established.

With the rising trend towards flatter, more horizontally structured companies, new security measures are required that let a division within the company view only the data relevant to it, yet let employees at higher corporate levels view the data as a whole.

Governments have recognized the need for privacy, and privacy laws now govern the use of personal information within a data warehouse. It is the company’s responsibility to obey the law, especially for companies that sell data to clients.

Privacy Laws

To reduce concern about the unauthorized use of personal data, national governments have passed laws that set requirements for the management of customer information. Understanding these laws allows an organisation to establish appropriate security measures in its data warehouse environment. The following Acts and regulations are examples.

Australia has the Privacy Amendment (Private Sector) Act 2000 (C’th). This Act regulates the collection, use and disclosure of personal information about individuals by private sector organisations (ComLaw Acts, 2000). It sets a standard that companies must abide by when disclosing personal information.

In the state of California in the United States, there is an act called the Security Database Breach Notification Act. It requires companies that conduct business in California to disclose any breach of data security to the people whose private information was obtained through unauthorized access (California State Bill 1386). An example of the act taking effect involves a data aggregation company called ChoicePoint. ChoicePoint suffered a data security breach and, because it held information about California residents, the act obliged it to disclose the breach. The breach exposed information on more than 140,000 people. Had the act not existed, ChoicePoint might not have disclosed the breach, which could have resulted in more identity theft. The act aims to minimize the damage that results from such a security breach (International Herald Tribune, 2008).

Multinational organisations must also check whether their data warehouse security satisfies the privacy laws of the European Union (EU), an economic and political union located primarily in Europe. One example is Directive 2002/58/EC, a directive on privacy and electronic communications. It directs providers of electronic communications services to take appropriate measures to safeguard the security of their services and, much like the California State Bill, requires that any security breach that will affect customers be reported.

Organisation Security Policy

An organisation’s security policy is the set of security rules that the organisation follows. The policy affects all of the system’s users, so it is defined by the organisation’s management, which makes it unique to each organisation. Managers generally write their own policies based on common sense and experience, along with advice from members of their IT department.

Policies should be general enough that information is allowed to flow, and flexible enough that updates do not become troublesome. However, they should not be so lenient that they fail to prevent breaches from occurring, or fail to prescribe action if a breach happens.

For a data warehouse, the security policy must ensure that adequate security measures are implemented to suit the needs of the company and the way employees use the system. A policy should include a properly planned strategy for backing up data in case data within the warehouse is lost through an accident (e.g. fire), a procedure for managing updates, and a post-incident recovery plan. The policy should define acceptable behavior for how data warehouse systems are implemented and used by the organisation. It should also deal with the organisation’s methods for sustaining the privacy and confidentiality of customer information (Smith, Newton, 2000).

A security policy should also take into account the organisation’s legal obligations under the laws and acts of its government. As private information is more easily shared across the Internet, organisations must make sure that they do not merely reach the minimum levels of security, but also maintain an understanding of these obligations.

Implementation of security

One way to implement security is at the application level of the data warehouse. This protects against the vulnerabilities that are inherent in the code of the data warehouse application. Security is built into the application and tied to the data that the application processes, so the functions of the application can also be protected. Because this security is specific to the data accessed by the application, each application has its own data access rights.

Another way to implement security is at the database level, which means that security is added to the data warehouse itself. At the database level, only certain users can gain access to certain information within the data warehouse. Security is consistent across all applications, with a single point of maintenance (Edwards, Lumpkin, 2005).

We will now look at the security measures aimed at protecting the private information.

External Security Measures

The aim of these security measures is to ensure that only users with authorization have access to the data stored within the data warehouse. Much as the skin is the first line of biological defence against infection, external security measures are the first line of defence against unauthorized access. They do not interact directly with the data within the warehouse; they protect the environment around the data by creating barriers to entry.

One way to implement an external security measure is to control who has access to the data. This involves limiting a particular user’s access based on the identity of, and the privileges granted to, that user. Access can range from a selection of rows or columns in a table to a complete set of tables. With access restricted effectively, users can reach only the parts of the data warehouse for which they hold legitimate privileges. Within a security policy, the principle that controls access is that of ‘least privilege’, which limits access by need and job function. An organisation should ensure that access is appropriate for each user, recognising that different levels of users require different access (Harmon, 1998). As was the case before the Internet became widespread, employees themselves can be a vulnerability for data warehouse security. It is therefore important that access be limited to authorized individuals.
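The idea of restricting a user to particular rows and columns can be illustrated with a minimal Python sketch. The role names, the `customers` table and the `Grant` structure are hypothetical and chosen only for illustration; a real data warehouse would normally enforce such rules through the database's own privilege system rather than in application code.

```python
from typing import Callable, Dict, FrozenSet, List, NamedTuple, Tuple

Row = Dict[str, object]


class Grant(NamedTuple):
    """What one role may read from one table (hypothetical structure)."""
    columns: FrozenSet[str]            # columns the role may see
    row_filter: Callable[[Row], bool]  # rows the role may see


# Hypothetical grants: a divisional analyst sees a slice of the table,
# while an executive sees the data as a whole.
GRANTS: Dict[Tuple[str, str], Grant] = {
    ("nsw_sales_analyst", "customers"): Grant(
        columns=frozenset({"name", "region"}),
        row_filter=lambda row: row["region"] == "NSW",
    ),
    ("executive", "customers"): Grant(
        columns=frozenset({"name", "region", "account_number"}),
        row_filter=lambda row: True,
    ),
}


def read_table(role: str, table: str, rows: List[Row]) -> List[Row]:
    """Return only the rows and columns the role is entitled to see."""
    grant = GRANTS.get((role, table))
    if grant is None:
        # Least privilege: no explicit grant means no access at all.
        raise PermissionError(f"role {role!r} has no access to {table!r}")
    return [
        {col: val for col, val in row.items() if col in grant.columns}
        for row in rows
        if grant.row_filter(row)
    ]


CUSTOMERS: List[Row] = [
    {"name": "A. Citizen", "region": "NSW", "account_number": "0001"},
    {"name": "B. Resident", "region": "VIC", "account_number": "0002"},
]

analyst_view = read_table("nsw_sales_analyst", "customers", CUSTOMERS)
executive_view = read_table("executive", "customers", CUSTOMERS)
```

In this sketch the analyst receives one row without the account number, the executive receives both rows in full, and a role with no grant is refused outright.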

If the data warehouse is housed in a data centre, physical access should be given only to employees whose job requires it. Physical measures to ensure legitimate access to data centres may involve keys, keypads or other automatic devices. The distribution of these devices should therefore be controlled, and the security policy should address the process of granting and revoking employees’ access (Smith, Newton, 2000).

Authentication is a process used to determine whether a user is who he or she claims to be. It can be done through a combination of user IDs, passwords, confidential questions and other information that only the user is meant to know, such as PINs. A security policy should state that a password must not be easily guessed. It may therefore be mandatory for a password to have a minimum number of characters and to combine numbers and letters. A security policy should also require that passwords be changed regularly and not shared. Essentially, common sense should dictate what is deemed a safe password, and the data warehouse manager should monitor whether users are complying with the policy. Once a user has logged in to the warehouse, the session should be logged off automatically after a period of inactivity.

One method of bypassing authentication is brute force, that is, trying many password combinations. This method can be countered by blocking further attempts after a certain number of incorrect ones. However, newer methods such as Trojans exploit Internet vulnerabilities. A Trojan is disguised as, or attached to, a file the user wants. When downloaded, it can release malicious software such as a keystroke logger, which records anything typed on the keyboard of an infected computer. In this way a user’s login credentials can be captured.
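The password rules, the lockout after repeated incorrect attempts and the automatic log-off described above can be sketched in Python as follows. The thresholds (eight characters, three attempts, fifteen minutes) and the `Account` class are assumptions made for illustration and are not taken from any cited policy.

```python
import hashlib
import hmac
import os
import time

# Assumed policy values, chosen only for this sketch.
MIN_LENGTH = 8
MAX_FAILED_ATTEMPTS = 3
IDLE_TIMEOUT_SECONDS = 15 * 60


def is_acceptable_password(password: str) -> bool:
    """Require a minimum length and a mix of letters and digits."""
    return (
        len(password) >= MIN_LENGTH
        and any(ch.isalpha() for ch in password)
        and any(ch.isdigit() for ch in password)
    )


def _hash(password: str, salt: bytes) -> bytes:
    # Store a salted, slow hash rather than the password itself.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class Account:
    """Hypothetical user account with lockout and idle time-out."""

    def __init__(self, password: str, clock=time.monotonic):
        if not is_acceptable_password(password):
            raise ValueError("password does not meet the policy")
        self._salt = os.urandom(16)
        self._digest = _hash(password, self._salt)
        self._clock = clock
        self.failed_attempts = 0
        self.last_activity = None  # None means "not logged in"

    @property
    def locked(self) -> bool:
        return self.failed_attempts >= MAX_FAILED_ATTEMPTS

    def login(self, password: str) -> bool:
        if self.locked:
            return False  # brute-force guessing stops here
        if hmac.compare_digest(_hash(password, self._salt), self._digest):
            self.failed_attempts = 0
            self.last_activity = self._clock()
            return True
        self.failed_attempts += 1
        return False

    def session_active(self) -> bool:
        """Return False once the session has been idle for too long."""
        if self.last_activity is None:
            return False
        if self._clock() - self.last_activity > IDLE_TIMEOUT_SECONDS:
            self.last_activity = None  # automatic log-off
            return False
        return True
```

Note that none of these checks helps against a keystroke logger: a captured password passes every one of them, which is one reason external measures alone are not enough.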

These external barriers do have their limitations: once they are breached, they provide no further protection to the data. By themselves, external security measures are not sufficient for the high level of security that a data warehouse environment needs.

Internal Security Measures

Unlike external security measures, internal measures protect the actual data rather than the entry ways to the data. One critical internal security measure is the encryption of data. Encryption is the process of converting readable text into protected data, so that the text cannot be read under normal circumstances.

Encryption allows data to be moved and stored with far less exposure to unauthorized access. If an encrypted file is stolen, the thief cannot read it without the correct decryption information, that is, the encryption key. Only authorized people are given access to the encryption key; however, managing the safety of these keys poses a security problem of its own (Elson and Le Clerc, 2005). Data within a data warehouse that is often encrypted includes personal information such as credit card numbers.

Aside from confidentiality, encryption also serves other purposes. It can help ensure data integrity: if one person sends an encrypted file to another, the receiver can be sure that the file was not altered. Also, because encryption keys are given only to authorized people, encryption provides a level of authentication, assuring the receiver that the data came from a particular party. A company’s security policy should state that keys be changed after a set period.
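In practice, integrity and origin are usually checked with a keyed message authentication code, or with an encryption mode that includes one, because plain encryption does not by itself reveal tampering. The following minimal sketch uses HMAC-SHA256 from the Python standard library; the key and the record shown are made-up values.

```python
import hashlib
import hmac


def sign(key: bytes, data: bytes) -> bytes:
    """Compute an authentication tag that only key holders can produce."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(key: bytes, data: bytes, tag: bytes) -> bool:
    """Check the tag in constant time; False means altered data or wrong key."""
    return hmac.compare_digest(sign(key, data), tag)


shared_key = b"key-known-only-to-authorized-parties"  # made-up example key
record = b"customer=1042;balance=250.00"               # made-up example record
tag = sign(shared_key, record)

unaltered_ok = verify(shared_key, record, tag)
tampered_ok = verify(shared_key, b"customer=1042;balance=950.00", tag)
wrong_key_ok = verify(b"some-other-key", record, tag)
```

Here the unaltered record verifies, while the tampered record and the wrong key are both rejected, so the receiver learns both that the data is intact and that it came from a holder of the key.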

When data within a data warehouse is encrypted, the structure of the data must be maintained, and the format of the data must remain the same through the encryption. If a numeric segment is encrypted, the encrypted value must remain numeric. Likewise, if a value stored in a particular data format is encrypted, the encrypted value must still be valid in that format. Unless this is done, system errors will occur when the encrypted data is accessed (Harmon, 1998).
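The requirement that an encrypted numeric value stay numeric is what format-preserving encryption addresses. The sketch below is a toy Feistel construction over decimal digits, written only to show the idea; it is not the standardised FF1 algorithm, has not been analysed for security, and the function names are hypothetical. A production system would use a vetted implementation instead.

```python
import hashlib
import hmac


def _round_value(key: bytes, round_no: int, other_half: int, width: int) -> int:
    """Derive a pseudo-random number with at most `width` decimal digits."""
    message = f"{round_no}:{width}:{other_half}".encode()
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % (10 ** width)


def _split(digits: str):
    if not (digits.isascii() and digits.isdigit() and len(digits) >= 2):
        raise ValueError("expected a string of at least two decimal digits")
    left_width = len(digits) // 2
    right_width = len(digits) - left_width
    return int(digits[:left_width]), int(digits[left_width:]), left_width, right_width


def encrypt_digits(key: bytes, digits: str, rounds: int = 8) -> str:
    """Encrypt a digit string into another digit string of the same length."""
    left, right, lw, rw = _split(digits)
    for i in range(rounds):
        if i % 2 == 0:   # even rounds: mix the right half into the left
            left = (left + _round_value(key, i, right, lw)) % 10 ** lw
        else:            # odd rounds: mix the left half into the right
            right = (right + _round_value(key, i, left, rw)) % 10 ** rw
    return f"{left:0{lw}d}{right:0{rw}d}"


def decrypt_digits(key: bytes, digits: str, rounds: int = 8) -> str:
    """Undo encrypt_digits by running the rounds backwards with subtraction."""
    left, right, lw, rw = _split(digits)
    for i in reversed(range(rounds)):
        if i % 2 == 0:
            left = (left - _round_value(key, i, right, lw)) % 10 ** lw
        else:
            right = (right - _round_value(key, i, left, rw)) % 10 ** rw
    return f"{left:0{lw}d}{right:0{rw}d}"


demo_key = b"demo-key-not-for-production"
card_number = "4111111111111111"          # a commonly used test card number
protected = encrypt_digits(demo_key, card_number)
restored = decrypt_digits(demo_key, protected)
```

The protected value is still a 16-digit string, so it fits the same column type and passes the same length checks as the original, and decrypting with the same key restores it.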

In the data warehouse, sensitive data can be encrypted at the table, column or row level, and encryption serves as a defense against passive attack. Encrypting the columns of a table that contain private data is a common and simple approach. A data warehouse may need to encrypt only certain columns, and column encryption is particularly useful because it provides fine-grained security. For example, encrypted columns might include birth dates or credit card numbers. Encrypting rows is less common and useful only in specific cases, for example when certain employees want to hide their phone numbers for privacy reasons. Encrypting whole tables is also rare, because the table must be decrypted with its key before any of it can be accessed, which is awkward and not cost-effective.
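Column-level encryption can be sketched as encrypting only the sensitive fields of each row and leaving the rest readable. In the Python sketch below, the column names are hypothetical, all values are assumed to be strings, and the cipher is a simple stand-in built from HMAC-SHA256 so that the example needs only the standard library. A real deployment would use a vetted cipher such as AES from an established library, or the database's built-in column encryption.

```python
import hashlib
import hmac
import os
from typing import Dict, Union

SENSITIVE_COLUMNS = {"birth_date", "card_number"}  # hypothetical column names
NONCE_SIZE = 16


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream: HMAC-SHA256 over nonce plus a block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt_value(key: bytes, plaintext: str) -> bytes:
    nonce = os.urandom(NONCE_SIZE)  # fresh nonce for every value
    data = plaintext.encode("utf-8")
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def decrypt_value(key: bytes, blob: bytes) -> str:
    nonce, body = blob[:NONCE_SIZE], blob[NONCE_SIZE:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")


def encrypt_row(key: bytes, row: Dict[str, str]) -> Dict[str, Union[str, bytes]]:
    """Encrypt only the sensitive columns; other columns stay queryable."""
    return {
        col: encrypt_value(key, val) if col in SENSITIVE_COLUMNS else val
        for col, val in row.items()
    }


def decrypt_row(key: bytes, row: Dict[str, Union[str, bytes]]) -> Dict[str, str]:
    return {
        col: decrypt_value(key, val) if col in SENSITIVE_COLUMNS else val
        for col, val in row.items()
    }


column_key = os.urandom(32)
customer = {"name": "A. Citizen", "birth_date": "1980-02-29", "card_number": "4111111111111111"}
stored = encrypt_row(column_key, customer)
recovered = decrypt_row(column_key, stored)
```

Only `birth_date` and `card_number` are stored as ciphertext, so a query on `name` still works without the key, which is the granularity advantage of encrypting by column.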

Although encryption provides authenticity, confidentiality and integrity for the data, encrypting and decrypting can consume a lot of the data warehouse server’s processing power. This can reduce system performance and increase overhead costs.

If data is not encrypted properly, security can suffer. Weak encryption methods can give users a false sense of security, because such information can be decrypted without the encryption key. Encryption also affects query speed: information needs to be decrypted before it is used, which increases processing time and can frustrate users.

Nevertheless, encryption is critical for data warehouses. In general it ensures the authenticity, confidentiality and integrity of data in the data warehouse environment, and although it has some restrictive qualities, it provides stronger and more sophisticated protection than external security measures.

References

Kimball, Ralph. The Data Warehouse Toolkit. 2002.

Hackathorn, Richard D. and Inmon, W.H. Using the Data Warehouse. Hardcover, 1994.

Elson and Le Clerc. Security and Privacy Concerns in the Data Warehouse Environment. 2005.

Verweire, Kurt and Berghe, Lutgart. Integrated Performance Management: A Guide to Strategy Implementation. 2004.

Edwards, Kristy Browder and Lumpkin, George. Security and the Data Warehouse. April 2005.

Getting in on the Act: The Review of the Private Sector Provisions of the Privacy Act 1988.

Privacy Amendment (Private Sector) Act 2000 (C'th). 2000.

California State Bill 1386. 2003.

International Herald Tribune. 21 February 2008.

Directive 2002/58/EC of the European Parliament and of the Council of 12 July 2002 concerning the processing of personal data and the protection of privacy in the electronic communications sector (Directive on privacy and electronic communications).

Smith and Newton. A Taxonomy of Organisational Security Policies. 2000.

Harmon, Christopher. Safeguarding the Data Warehouse. Computer Fraud & Security, June 1998.