Lecture 19
Contents
Privacy
Data is the new oil
- Very useful
- But very dangerous
Issues around data collection
- Data linking - correlating data sets
- (De)anonymisation
- Massive Scale
- Data breach
- Fishing expeditions -> using data beyond the purpose it was collected for
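A minimal sketch of how data linking can undo anonymisation. Everything here is hypothetical: the two toy data sets, the field names, and the `link` helper are made up for illustration. The idea is that a release with names removed can still be joined to a named data set on shared attributes (quasi-identifiers) such as postcode, birth year and sex.

```python
# Hypothetical toy data: all names and values are made up for illustration.
medical = [  # "anonymised" release: names removed, quasi-identifiers kept
    {"postcode": "2033", "birth_year": 1990, "sex": "F", "diagnosis": "asthma"},
    {"postcode": "2033", "birth_year": 1985, "sex": "M", "diagnosis": "diabetes"},
    {"postcode": "2000", "birth_year": 1990, "sex": "F", "diagnosis": "flu"},
]
voters = [  # second data set that still carries names
    {"name": "Alice", "postcode": "2033", "birth_year": 1990, "sex": "F"},
    {"name": "Bob", "postcode": "2033", "birth_year": 1985, "sex": "M"},
    {"name": "Carol", "postcode": "2000", "birth_year": 1990, "sex": "F"},
]

QUASI_IDS = ("postcode", "birth_year", "sex")


def link(anonymised, identified, keys=QUASI_IDS):
    """Re-identify rows whose quasi-identifiers match exactly one named person."""
    # Index the named data set by its quasi-identifier tuple.
    index = {}
    for person in identified:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])

    matches = {}
    for row in anonymised:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # a unique match means the row is de-anonymised
            matches[names[0]] = row["diagnosis"]
    return matches


reidentified = link(medical, voters)
```

In this toy case every row is re-identified, because each combination of quasi-identifiers is unique. Larger data sets make unique combinations more likely to exist somewhere, which is why scale and linking compound each other.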
Tools aren’t built for users
Tools are often built for government agencies to easily retrieve your data.
Who has your data?
Everyone tbh.
- Apple
- …
What data?
Location history, calls, emails, files, age, etc.
Who uses your data?
- Private Companies
- Government
- Intelligence Communities
Data Lakes
Pooling ‘streams’ of data into one big ‘lake’.
Think twice
- It’s end-to-end encrypted
  - Malicious public/private keys might be added
- Algorithms, not people
  - People
- Only does X under Y conditions
  - Bugs
- It’s locked down - only accessible by X
  - Rogue sysadmins (of data centers)
  - e.g. Snowden
- Thorough auditing
  - Oh.. do you now…
- Secure
- Anonymised
Anonymisation Techniques
- Redacting
- Encrypting / Hashing
- Pseudonyms
- Binning (generalising the coverage)
- Statistical noise
- Aggregation
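A rough sketch of a few of the techniques above, using only the Python standard library. The records, `SECRET_KEY`, and all function names are hypothetical and chosen for illustration. The pseudonym is a keyed hash (HMAC) rather than a plain hash, on the assumption that plain hashes of guessable values such as names can be reversed by brute force.

```python
import hashlib
import hmac
import random
from collections import Counter

SECRET_KEY = b"hypothetical-secret"  # assumption: kept private by the data holder


def pseudonymise(name, key=SECRET_KEY):
    """Pseudonym via keyed hash: the same name always maps to the same token."""
    return hmac.new(key, name.encode("utf-8"), hashlib.sha256).hexdigest()[:12]


def bin_age(age, width=10):
    """Binning: replace an exact age with a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def add_noise(value, scale=2.0, rng=random):
    """Statistical noise: perturb a number with zero-mean Gaussian noise."""
    return value + rng.gauss(0.0, scale)


def aggregate(records, field):
    """Aggregation: release counts per group instead of individual rows."""
    return Counter(r[field] for r in records)


# Hypothetical input records.
records = [
    {"name": "Alice", "age": 34},
    {"name": "Bob", "age": 37},
    {"name": "Carol", "age": 52},
]

# Redact the name, keep only a pseudonym and a binned age.
released = [
    {"id": pseudonymise(r["name"]), "age_band": bin_age(r["age"])}
    for r in records
]
counts = aggregate(released, "age_band")
noisy_mean_age = add_noise(sum(r["age"] for r in records) / len(records))
```

None of these is sufficient on its own: pseudonymised and binned rows can still be linked to other data sets if the remaining attributes are distinctive enough.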