Invited Talk: Flexible Privacy via Disguising and Revealing (Lily Tsai)

Time: 4-5 pm, Nov 5th, 2024
Location: CS 3310

Abstract: Many users today have tens to hundreds of accounts with web services that store sensitive data, from social media to tax preparation and e-commerce sites. While users have the right to delete their data (e.g., under the GDPR or CCPA), they want and deserve more nuanced controls over their data that do not exist today. For example, a user might wish to hide and protect the data in an e-commerce or dating app profile while inactive, but still want that data to be available should they return to the application. Today, however, services often provide only coarse-grained, blunt tools that result in all-or-nothing exposure of users' private information.

This thesis introduces the notion of disguised data, a reversible state of data in which sensitive data is selectively hidden. To demonstrate the feasibility of disguised data, this thesis also presents Edna—the first system for disguised data—which helps database-backed web applications allow users to remove their data without permanently losing their accounts, anonymize their old data, and selectively dissociate personal data from public profiles. Edna helps developers support these features while maintaining application functionality and referential integrity in the database via disguising and revealing transformations. Disguising selectively renders user data inaccessible via encryption, and revealing enables the user to restore their data to the application. Edna's techniques allow transformations to compose in any order, e.g., deleting a previously anonymized user's account, or restoring an account back to an anonymized state.
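
To make the disguise/reveal idea concrete, here is a minimal sketch in Python. It is an illustration only, not Edna's actual API or implementation: the in-memory `db` layout, the function names `disguise` and `reveal`, the `anon-` placeholder scheme, and the toy cipher are all assumptions made for this example. The toy cipher stands in for real encryption and is not secure.

```python
import hashlib
import json
import secrets


def toy_xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy SHA-256 counter-mode stream cipher; a stand-in for real encryption, NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # XOR is its own inverse, so the same call encrypts and decrypts.
    return bytes(d ^ s for d, s in zip(data, stream))


def disguise(db: dict, user_id: int) -> bytes:
    """Hide a user's account (hypothetical helper); returns the key needed to reveal it."""
    key = secrets.token_bytes(32)
    placeholder = f"anon-{secrets.token_hex(4)}"  # assumed placeholder-ID scheme
    post_ids = [pid for pid, post in db["posts"].items() if post["author"] == user_id]
    record = {
        "user": db["users"].pop(user_id),  # remove the real account row
        "placeholder": placeholder,
        "post_ids": post_ids,
    }
    # Only the encrypted record stays in the database; the user keeps the key.
    db["vault"][user_id] = toy_xor_cipher(key, json.dumps(record).encode())
    # Point the posts at a placeholder user so no post references a missing author.
    db["users"][placeholder] = {"name": "[anonymous]"}
    for pid in post_ids:
        db["posts"][pid]["author"] = placeholder
    return key


def reveal(db: dict, user_id: int, key: bytes) -> None:
    """Restore a disguised account using the key returned by disguise()."""
    record = json.loads(toy_xor_cipher(key, db["vault"][user_id]))
    db["users"][user_id] = record["user"]
    for pid in record["post_ids"]:
        db["posts"][pid]["author"] = user_id
    del db["users"][record["placeholder"]]
    del db["vault"][user_id]
```

In this sketch, the application keeps working while the account is disguised because every post still has a valid author row, and the original data can only be restored by whoever holds the returned key.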

With Edna, web applications can enable flexible privacy features with reasonable developer effort and moderate performance impact on application operation throughput. In the Lobsters social media application (a 160k LoC web application with >16k users), adding Edna and its features takes less than 1k LoC and decreases throughput by 1-7% in the common case and by up to 28% in the worst case (when the user owning 1% of all application data continuously disguises and reveals their account).

Bio: Lily Tsai currently works as part of SystemsResearch@Google (SRG), researching systematic ways to achieve better data privacy and security in data warehouses, ML frameworks, and more. She received her PhD from MIT in 2024 as a member of the PDOS group, where her research with Malte Schwarzkopf and Frans Kaashoek aimed to design systems for better data protections and security in web applications. Besides research, Lily loves to play violin, read sci-fi, hike, climb, and explore the world around her.