You make explicit that all data that people enter, they enter for purposes of sharing. At the same time, you ban creating profiles with data that has not been explicitly shared. IMO:
- Making a telephone-book-style listing, or searching for "all metalheads < 25 near Chicago" where people entered that into their profiles -> OK
- Tracking users on your site -> OK
- Tracking users on third-party sites, and then aggregating this data, so you can see "people who searched for baby carriages" or "people who bought diapers with their credit card" -> not OK
- Having some kind of database where people could conceivably look up what user tqi purchased or searched for, or what their political affiliation is (when not made public) -> not OK (unless you have extreme auditability, the four-eyes principle, and so on)
"You make explicit that all data that people enter, they enter for purposes of sharing."
I think the data captured by CA was also entered for the purposes of sharing, though (often) limited to friends and friends of friends. I think the crux of all this is that, as a society, we haven't really established how those rights are transferred. If I share my email address with a friend, can they share it with their contact management app? I'm not sure how you create a consistent policy in a federated model.
The German Facebook clone back in the day was called StudiVZ, which means "students' directory". This was before social media, and it was more of a social network. Everything you put in there, you put there because you wanted it to be public, like your number in a telephone directory. It was almost a pure platform for self-presentation, like MySpace or LinkedIn.
I'm well aware of "more is different", aka the dialectical transformation of quantity into quality. Lots of data that is individually innocent can be problematic if somebody amasses it. But especially for this reason, I think it is not good to have these kinds of semi-public spaces where the data is public and the only protection is that it is cumbersome to collect. Public data should be clearly public, and private data should be clearly private, and the UX should be really clear so people know what is happening.
(By the way, I'm not even sure CA was a "scandal" or that it was bad for FB. I think the only effect was that FB used it to justify locking down their API more.)