Synthesizing Privacy-Preserving Time Use Traces
Synthesizing data that reflects people's behavioral traces is one of the most compelling approaches to large-scale privacy-preserving data analysis. However, generating behavioral traces that are both accurate and useful is challenging, particularly in complex domains such as mobility patterns or electronic health records. In this talk, I will present our work on synthesizing time use data derived from mobile phone logs. We propose several criteria for assessing the quality of the generated data, such as diversity, similarity to the original data, and statistical resemblance. We then compare Generative Adversarial Networks with other methods and discuss the implications of trace generation for solving troubling privacy problems.