Balancing Data Confidentiality and Data Quality: A two-day tutorial sponsored by DIMACS and DyDAn

November 8 - 9, 2007
DIMACS Center, CoRE Building, Rutgers University

Larry Cox, CDC, ljtcox at
Presented under the auspices of the Special Focus on Computational and Mathematical Epidemiology, the Special Focus on Communication Security and Information Privacy and the Center for Dynamic Data Analysis (DyDAn).

Workshop Program:

This is a preliminary program.

Thursday, November 8, 2007

 8:30 -  9:20  Registration and Breakfast

 9:20 -  9:30  Welcome and Opening Remarks
               Fred Roberts, DIMACS Director

 9:30 - 10:45  What is Statistical Disclosure?

               Qualitative Issues
                   Ethical, legal and statistical considerations
                   Balancing the right to privacy with the need to know
                   Administrative solutions
                   Disclosure checklists
                   Small geography and domain data
               Quantitative Issues
                   Defining statistical disclosure quantitatively
                   Illustrative example

10:45 - 11:00  Morning Break

11:00 - 12:30  Statistical Disclosure Limitation (SDL) for Frequency Count Data

               Examining and defining the problem
               Rounding and perturbation methods and their effects on data quality
               Swapping and switching methods and their effects on data quality

12:30 -  1:45  Lunch

 1:45 -  3:15  SDL for Aggregate Magnitude Data

               Quantifying disclosure: Statistical disclosure rules
               Cell bounds and disclosure audit
               Complementary cell suppression
                   Mathematical statement of the cell suppression problem
                   Why cell suppression is a very difficult problem
                   Using mathematical networks for complementary cell suppression
                   Quality effects of cell suppression
                   Releasing interval data

 3:15 -  3:30  Afternoon Break

 3:30 -  5:00  SDL for Aggregate Magnitude Data (continued)

               Controlled tabular adjustment (CTA)
                   The CTA method
                   Quality-preserving controlled tabular adjustment (QP-CTA)
                   Minimum discrimination information controlled tabular adjustment (MDI-CTA)
               Perturbing the underlying microdata

Friday, November 9, 2007

 8:00 -  8:30  Continental Breakfast

 8:30 - 10:00  SDL in Microdata

               Defining microdata disclosure
               Likelihood of disclosure and risk of disclosure
               Censoring, rounding and perturbation
               Microaggregation and its effects on data quality
               Blank and impute
               Synthetic microdata and its effects on data quality
               Contextual variables
               Research data centers, remote access and remote execution

10:00 - 10:15  Morning Break

10:15 - 11:45  SDL in Microdata (continued)

               Small domain data
               Effectiveness of SDL methods for microdata
               Disclosure risk analysis
               Defining disclosure and disclosure risk
               Secure multi-party regression

11:45 -  1:00  Lunch

 1:00 -  2:15  SDL in Statistical Data Bases

               Statistical data base query systems as multi-dimensional tables
               Estimating confidential and missing data
               Releasing marginal totals or log-linear models and effects on data quality
               Secure distributed statistical analysis

 2:15 -  2:30  Afternoon Break

 2:30 -  3:00  Wrap-Up and Discussion

               Brief discussion of the literature
               Questions, comments, discussion

 3:00          Adjourn

