Description (en)
Email provides a rich history of an organization, yet it poses unique challenges to archivists. Sensitive contents and diverse topics and formats make it difficult to acquire and process, which inhibits access and research. We plan to apply predictive coding, as used by the legal community, to identify and prioritize sensitive content for review and redaction while generating descriptive metadata about themes and trends. This will help records creators, archivists, and researchers better understand, synthesize, protect, and preserve email collections. We share early findings and information on collaborative efforts.