Privacy Preserving Database Application Testing
Description
Government agencies and private organizations are accumulating vast amounts
of sensitive information on citizens. These organizations often develop new
applications that use the information stored in their databases. An essential
phase of developing such an application is to test it thoroughly before
deployment. Traditionally, application developers run tests against their
local development databases. A major shortcoming of this approach is that
local databases usually hold only a small number of data samples and
therefore cannot satisfactorily simulate a live environment. Testing
applications against live production databases, on the other hand, is
increasingly undesirable because it may disclose sensitive data to an
unauthorized tester and may incorrectly update the underlying databases.
This project investigates techniques to generate mock databases for
application software testing without revealing any confidential information
held in the live production databases. Cryptographic notions such as
indistinguishability will be used to model privacy preservation as well as
the degree of similarity between a live and a mock database. Success of the
project will open up a new approach to secure application software testing
and produce a prototype system for future deployment.
Research Tasks
- Develop a theoretical model of application performance vs. similarity vs.
information disclosure.
- Develop a threat model for database application testing.
- Extract rules (deterministic and non-deterministic) and statistics
effectively and efficiently from live database resources, including the
database catalog, DDL and ER model.
- Build a rule analyzer to prevent security leakage.
- Generate mock databases efficiently based on the rules and statistics.
- Build a prototype system for privacy preserving database application
testing.
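The extract-then-generate tasks above can be illustrated with a minimal sketch. It is not the project's prototype: the table `patients`, its columns, and the helper names `extract_statistics` and `generate_mock_rows` are hypothetical, and the only "statistics" used are value frequencies and numeric ranges. It assumes an SQLite database purely so that the example is self-contained.

```python
import random
import sqlite3


def extract_statistics(conn, table, categorical, numeric):
    """Collect simple per-column statistics from a (hypothetical) live table.

    Table and column names are interpolated directly into the SQL, so they
    are assumed to be trusted identifiers, as they are in this sketch.
    """
    count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    stats = {"row_count": count}
    for col in categorical:
        # Value -> number of occurrences for a categorical column.
        rows = conn.execute(
            f"SELECT {col}, COUNT(*) FROM {table} GROUP BY {col}"
        ).fetchall()
        stats[col] = {"kind": "categorical", "freq": dict(rows)}
    for col in numeric:
        # Only the observed range is kept for a numeric column.
        lo, hi = conn.execute(
            f"SELECT MIN({col}), MAX({col}) FROM {table}"
        ).fetchone()
        stats[col] = {"kind": "numeric", "min": lo, "max": hi}
    return stats


def generate_mock_rows(stats, columns, n, seed=0):
    """Sample n mock rows column by column from the extracted statistics."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = []
        for col in columns:
            s = stats[col]
            if s["kind"] == "categorical":
                values = list(s["freq"])
                weights = list(s["freq"].values())
                row.append(rng.choices(values, weights=weights, k=1)[0])
            else:
                row.append(rng.randint(s["min"], s["max"]))
        rows.append(tuple(row))
    return rows


def demo():
    """Build a tiny stand-in for a live table and derive mock rows from it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patients (zip TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO patients VALUES (?, ?)",
        [("28223", 34), ("28223", 51), ("28105", 47), ("28269", 29)],
    )
    stats = extract_statistics(conn, "patients", ["zip"], ["age"])
    conn.close()
    return stats, generate_mock_rows(stats, ["zip", "age"], n=10)
```

Calling `demo()` returns the extracted statistics and ten mock rows. Sampling each column independently ignores correlations between columns, and the raw statistics may themselves disclose information about individuals; deciding which rules and statistics are safe to use is the role the task list assigns to the rule analyzer.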
Sponsors
- This project is supported by the National Science Foundation under Grant
No. 0310974 (Sept. 15, 2003 - August 31, 2006, $230,800). Any opinions,
findings, and conclusions or recommendations expressed in this material are
those of the author(s) and do not necessarily reflect the views of the
National Science Foundation.
People
Faculty
Current Graduate Students
- Ying Wu (Ph.D. student)
- Songtao Guo (Ph.D. student)
- Ling Guo (Ph.D. student)
- Guodong Jiao (M.S. student)
Previous Graduate Students
- Chintan Sanghvi (M.S. student, graduated in Dec 2005)
- Amol Kedar (M.S. student, graduated in Dec 2004)
- Jing Jin (Ph.D. student)
Papers
- X. Wu, Y. Wang, S. Guo and Y. Zheng. "Privacy Preserving Database
Generation for Database Application Testing", Fundamenta Informaticae,
78(4):595-612, 2007.
- Y. Li, L. Qiu and X. Wu. "Privacy Preserving Association Rule Mining with
Bloom Filters", Journal of Intelligent Information Systems, 29(3):253-278,
2007.
- S. Guo, X. Wu and Y. Li. "Deriving Private Information from Perturbed Data
Using IQR based Approach", Second International Workshop on Privacy Data
Management (PDM06), in conjunction with the 22nd ICDE conference, Atlanta,
April 2006.
- S. Guo and X. Wu. "On the Use of Spectral Filtering for Privacy Preserving
Data Mining", Proceedings of the 21st ACM Symposium on Applied Computing
(SAC06), Dijon, France, April 23-27, 2006, pp. 622-626. (data mining track,
acceptance ratio: 20/59)
- X. Wu, S. Guo and Y. Li. "Towards Value Disclosure Analysis in Modeling
General Databases", Proceedings of the 21st ACM Symposium on Applied
Computing (SAC06), Dijon, France, April 23-27, 2006, pp. 617-621. (data
mining track, acceptance ratio: 20/59)
- Y. Wang and X. Wu. "Approximate Inverse Frequent Itemset Mining: Privacy,
Complexity, and Approximation", Proceedings of the 5th IEEE International
Conference on Data Mining (ICDM05), New Orleans, Nov 27-30, 2005. To appear
(acceptance ratio: 69/630) (extended version: pdf)
- X. Wu, C. Sanghvi, Y. Wang and Y. Zheng. "Privacy Aware Data Generation
for Testing Database Applications", Proceedings of the 9th International
Database Engineering and Application Symposium (IDEAS05), Montreal, Canada,
July 25-27, 2005, pp. 317-326. (acceptance ratio: 30/144)
- X. Wu, Y. Wang and Y. Zheng. "Statistical Database Modeling for Privacy
Preserving Database Generation", Proceedings of the 15th International
Symposium on Methodologies for Intelligent Systems (ISMIS05), Saratoga
Springs, New York, May 25-28, 2005, pp. 382-390.
- X. Wu, Y. Wu, Y. Wang and Y. Li. "Privacy Aware Market Basket Data Set
Generation: A Feasible Approach for Inverse Frequent Set Mining",
Proceedings of the 5th SIAM International Conference on Data Mining (SDM05),
Newport Beach, CA, April 21-23, 2005, pp. 103-114. (acceptance ratio:
40/218)
- Y. Wang, X. Wu and Y. Zheng. "Privacy Preserving Data Generation for
Database Application Performance Testing", 1st International Conference on
Trust and Privacy in Digital Business (TrustBus04), Zaragoza, Spain, Sept
2004.
- X. Wu, Y. Wang and Y. Zheng. "Privacy Preserving Database Application
Testing", Workshop on Privacy in the Electronic Society, in conjunction with
the 10th ACM CCS, Washington D.C., Oct 2003.
Software
The first version of our prototype system is available. Some snapshots are
provided; because they include animations, it is better to download, save,
and run them.
Related Links
Some research projects, papers and products can be found
here