Sharing data while preserving privacy

I’ve been thinking about clever ways to aggregate data with minimal impact on privacy.

The simplest way is to just not share the raw data at all. Instead of sharing entire buckets, you might share only aggregate results, like how much you’ve spent on site X in a given week.
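To make that concrete, here is a rough Python sketch of collapsing raw records into per-site weekly totals before anything leaves your machine. The record layout (site, date, amount) and the function name `weekly_spend` are made up for illustration; they are not from any existing tool.

```python
from collections import defaultdict
from datetime import date


def weekly_spend(transactions):
    """Sum spend per (site, ISO year, ISO week).

    `transactions` is an iterable of (site, day, amount) tuples, where
    `day` is a datetime.date. This layout is an assumption for the sketch.
    """
    totals = defaultdict(float)
    for site, day, amount in transactions:
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        totals[(site, iso[0], iso[1])] += amount
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical raw records; only the totals would be shared.
    raw = [
        ("site-x", date(2024, 1, 1), 10.0),
        ("site-x", date(2024, 1, 3), 5.5),
        ("site-x", date(2024, 1, 8), 2.0),
    ]
    for key, total in sorted(weekly_spend(raw).items()):
        print(key, total)
```

Only the totals get shared; the individual purchases stay local. Note that a total over very few records can still reveal a lot about any one of them.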

There are also more difficult but more powerful approaches in sight on the research front. Fields like homomorphic encryption are really hot right now, but I’m awfully underinformed about them. I’m studying homomorphic encryption a bit in one of my courses right now, but the field seems young and I think there’s only a small chance that these approaches will be feasible anytime soon (if ever).

Regardless, here are a couple related Wikipedia articles:

1 Like

This sounds like a case for differential privacy:
https://en.wikipedia.org/wiki/Differential_privacy

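The idea is that you add random noise on top of the measured results, scaled so that any single person’s data has only a limited effect on what gets published. It works very well on aggregated data, because the noise stays small relative to a total over many records.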
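As a minimal sketch of what adding noise can look like, here is the Laplace mechanism applied to a sum, in Python. This is just one common way to do it, not the only one. The names `laplace_noise` and `private_sum` are made up for this example, and the sketch assumes each person contributes a single value that gets clipped to the range 0 to `max_value`, so that one person can change the sum by at most `max_value`.

```python
import random


def laplace_noise(scale, rng=random):
    """Draw one sample from a Laplace distribution centred at 0.

    The difference of two independent exponential samples with mean
    `scale` follows Laplace(0, scale).
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_sum(values, epsilon, max_value, rng=random):
    """Return a noisy sum of `values` (hypothetical helper for this sketch).

    Assumes one value per person. Values are clipped to [0, max_value],
    so adding or removing one person changes the sum by at most
    max_value (the sensitivity). Smaller epsilon means more noise.
    """
    if epsilon <= 0 or max_value <= 0:
        raise ValueError("epsilon and max_value must be positive")
    clipped = [min(max(v, 0.0), max_value) for v in values]
    return sum(clipped) + laplace_noise(max_value / epsilon, rng)


if __name__ == "__main__":
    weekly_amounts = [12.0, 40.0, 7.5, 99.0]  # made-up data
    print(private_sum(weekly_amounts, epsilon=1.0, max_value=100.0))
```

The noise scale depends only on `max_value / epsilon`, not on how many values go into the sum, which is why this is much more useful on large aggregates than on small ones. For anything serious you would want a vetted library rather than this sketch, since floating-point details and repeated queries both need care.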

2 Likes