Abstract

A new inference control, called random sample queries, is proposed for safeguarding confidential data in on-line statistical databases. The random sample queries control deals directly with the basic principle of compromise by making it impossible for a questioner to control precisely the formation of query sets. Queries for relative frequencies and averages are computed using random samples drawn from the query sets. The sampling strategy permits the release of accurate and timely statistics and can be implemented at very low cost. Analysis shows the relative error in the statistics decreases as the query set size increases; in contrast, the effort required to compromise increases with the query set size due to large absolute errors. Experiments performed on a simulated database support the analysis.
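The sampling idea described above can be illustrated with a minimal sketch. The abstract does not specify how the paper selects samples, so the code below assumes the simplest possible scheme: each record in the query set is kept independently with a fixed probability `p`. The function names, the `salary` field, and the value of `p` are hypothetical and chosen only for illustration; this is not the paper's mechanism.

```python
import random


def sample_query_set(query_set, p, rng):
    """Keep each record independently with probability p (assumed scheme)."""
    return [record for record in query_set if rng.random() < p]


def sampled_average(query_set, field, p, rng):
    """Average of `field` over a random sample; None if the sample is empty."""
    sample = sample_query_set(query_set, p, rng)
    if not sample:
        return None
    return sum(record[field] for record in sample) / len(sample)


def sampled_relative_frequency(query_set, db_size, p, rng):
    """Estimate |query set| / db_size from the sample, rescaled by 1/p (p > 0)."""
    sample = sample_query_set(query_set, p, rng)
    return len(sample) / (p * db_size)


def mean_relative_error(query_set, field, p, rng, trials=200):
    """Mean relative error of the sampled average over repeated queries."""
    true_avg = sum(r[field] for r in query_set) / len(query_set)
    errors = []
    for _ in range(trials):
        estimate = sampled_average(query_set, field, p, rng)
        if estimate is not None:
            errors.append(abs(estimate - true_avg) / abs(true_avg))
    return sum(errors) / len(errors) if errors else None


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    # Hypothetical simulated database of 5,000 salary records.
    database = [{"salary": rng.uniform(20_000, 120_000)} for _ in range(5_000)]
    for size in (10, 100, 1_000):
        query_set = database[:size]
        print(size, mean_relative_error(query_set, "salary", 0.9, rng))
```

Under these assumptions, the printed relative error would be expected to shrink as the query set grows, which is the behavior the abstract attributes to the control. The sketch says nothing about the effort required to compromise the database.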

Keywords

Computer science, Data mining, Set (abstract data type), Sample (material), Database, Online aggregation, Query optimization, Inference, Sample size determination, Contrast (vision), Information retrieval, Statistics, Web query classification, Web search query, Search engine, Artificial intelligence, Mathematics

Publication Info

Year: 1980
Type: article
Volume: 5
Issue: 3
Pages: 291-315
Citations: 202 (OpenAlex)
Access: Closed

Cite This

Dorothy E. Denning (1980). Secure statistical databases with random sample queries. ACM Transactions on Database Systems, 5(3), 291-315. https://doi.org/10.1145/320613.320616

Identifiers

DOI: 10.1145/320613.320616