A bucketing framework for Database security

Authors Avatar

Query-Optimal-Bucketization and Controlled-Diffusion algorithms for privacy in outsourced databases

National University of Singapore, School of Computing

CS5322 Databases Security - 2009/2010 (Semester 2) - Project Report

Rattanak CHHUNG

A0066240N

NUS - SoC, Singapore

Supélec, France


Damien FOREST

A0066246A

NUS - SoC, Singapore

Supélec, France

Yann-Loup PHAN VAN SONG A0066238B

NUS - SoC, Singapore Supélec, France


Monika

A0066244H

NUS - SoC, Singapore

ABSTRACT

For the Databases  Security  project,  we choose to make a Java implementation of the solutions showed in [4].  The context of this paper is the classic paradigm of database outsourcing and privacy threats.  Our paper provides some explanations of [4] and of our choices of implementation.

Keywords

database, security, privacy, encryption, query, index, java, Query-Optimal-Bucketization, Controlled-Diffusion, attack.

1.   THE ORIGINAL PAPER

In this section we will quickly introduce the notions pre- sented in [4].  We advice the reader to refer at least to the table 1:  Notation  for Bucket of [4] page 722.  We  will be more precise in the section 3 where we will detail our imple- mentation.

1.1   Context and preliminaries

In the first assignment we studied [3, 1], where the problem is to assure privacy and confidentiality on untrusted server. [4] deals with  the same problem.  The solutions introduced

 This paper is based on the work of [4]. This paper has 9 pages.

Version of April 12, 2010.


in [1, 3, 4] use indexes to make basic operations on encrypted tuples.

A partition is an ensemble of buckets that are represented by the indexes.  It is used to create privacy on an sensitive column of the  relation  which  are used to  make the  basic operations (usually selections).  An example is given below:

Example 1. 1  Consider the table 1 (page 1) reprensenting the clear-text taple.

Table 1: Representation of the clear-text table

The  sensitive  attribute  consided from the  selection  are Id, One and Two.   On the  server  side there  is  the  encrypted table (cf. table 2 page 1).

Table 2: Representation of the encrypted table

When you look at the encrypted table, the only things that you can use are the distribution of the tuples into the differ-

1 This  example are largely  inspided  from the  example 1 of

[4] page 721 and of the examples of the section 2 of [3] pages

217 and 218.


ent  buckets  and/or  the  relations  between the  partitions  of the columns.

1.2   Algorithms

Two algorithms are presented with their pseudo-code.  The Query-Optimal-Bucketization Algorithm (QOB) is used to make a uniform distribution of the tuple into the buckets. Here the partition is just a cutting of the domain of the sensi- tive attribute.  In the opposition of the partition proposed in [3], where the partition is “equi-weight”, the partition made by QOB is similar to the “equi-depth” partition introduced in [1].

The second algorithm  answer to a lake of privacy from the first  one, particularly when you combine two  indexed  at- tributes.  The Controlled-Diffusion Algorithm  (CD)  creates a partition that is not as simple as a plain cutting of the do- main. Moreover the algorithm  has a random comportment during the filling of its “composite buckets” (CB).

1.3   Experiments

The experiments, conducted in the original paper [4], pointed out some considerations about the balance between perfor- mance (in terms of query selections) and privacy.  For that reasons they had introduced some indicators in order to mea- sure performance and privacy.

We invite you to refer to the subsection 5.2 (page 5) to dis- cover our version of these experiments.

2.   INTRODUCTION

Today the information storage is critical for most businesses and not every company can afford an on-site hosted database. It becomes more and more usual - particularly when the data retrieval are needed from any Internet access point - to call for a external application service provider to outsource the data center operation.

Encryption is usually supported to protect the data and in- dexing is commonly used to make the queries more efficient, however it appears that  the  use of indexation  may imply some information leak.  A challenge is to find a balance be- tween privacy and efficiency.

2.1   Scope

The scope of our paper is,....  we consider ...

by Monika. to check

The scope of our paper is analyzing a software implementa- tion of data  partitioning encryption over a sensitive single attribute and multi attributes.  For the multi attributes, we limit our scope into two sensitive attributes but this model is easily generalizable for more attributes.  Our goal in de- veloping this implementation is inferred from the paper [4] by considering the further aim of designing an algorithm for the privacy and efficiency challenge.

Our scope also includes  the  attack  scenario. Here, we as- sume the attacker may get the information about the par- tition and the  exact distribution  of either  single  attribute data  set or the  two  attributes  data  set.   This  attack  will


be resolved by utilizing the Hilbert Curve order, which has been proposed due to better locality preserving behavior for multidimensional database.

2.2   Objective

Our personal objectives for this project are to create a basic model of a database table and to implement on it the two algorithms presented in [4]. We will try to re-conduct a part of the experiments made in [4].

Join now!

We  also want  to  go a bit further by creating  attacks  on our  encrypted tables.    According  to  the  assumptions  for the attacks given in [4] we will need to use multi-sensitive- attributes, indeed if we just consider one sensitive attribute, i.e.  just one attribute use for the indexation,  we won’t be able to discover more information with our attacks.

Another challenging part of our paper is the generalization of QOB for two attributes by using along other things the Hilbert curve.

3.   OUR IMPLEMENTATION

3.1   Technologies

We have decided to use Java because it ...

This is a preview of the whole essay