Ask HN: Where can I get a dump of modern emails for ML testing?
I need a dataset of modern emails to test against.
Where can I get a huge sample of anonymized emails? I'm really struggling to find one anywhere.
Is the Enron email dataset modern enough?
I don't know, but you might try asking one of these subreddits:
https://www.reddit.com/r/DHExchange/
https://www.reddit.com/r/datasets
https://www.reddit.com/r/opendata
The closest thing I've found to what I'm looking for is https://untroubled.org/spam/
I need real HTML emails that an actual human would have in their inbox, not just spam.
It depends on how modern the emails need to be and whether you need many organizations to test against, but WikiLeaks has lots.