The problem
GDPR requires companies to manage personal data transparently and to delete it once its retention period has expired. In large organizations, that obligation collides with massive, distributed data landscapes.
A company may need to search across hundreds of thousands of OneDrive accounts, globally distributed shared drives, SharePoint sites, and other sources. Manual auditing is not feasible at that scale.
The goal is a proof of concept that can reliably identify sensitive data, categorize it, attribute it to a responsible person, and support deletion when required.
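To make the four goals concrete, here is a minimal sketch of how a single finding could be represented and checked against a retention period. Everything in it is an assumption for illustration: the `Finding` record, the `scan` helper, the two categories, the regular expressions, the retention values, and the choice to measure age from the last-modified date are all hypothetical and not taken from the actual proof of concept.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods per category; real values depend on company policy.
RETENTION = {
    "email_address": timedelta(days=365),
    "iban": timedelta(days=3650),
}

# Deliberately simple patterns, for illustration only.
PATTERNS = {
    "email_address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


@dataclass
class Finding:
    path: str            # where the data was found
    category: str        # what kind of sensitive data it is
    owner: str           # the person responsible for it
    last_modified: date  # assumed starting point of the retention period

    def deletion_due(self, today: date) -> bool:
        """Return True once the retention period for this category has passed."""
        return today - self.last_modified > RETENTION[self.category]


def scan(path: str, text: str, owner: str, last_modified: date) -> list[Finding]:
    """Identify and categorize sensitive data in one document's text."""
    return [
        Finding(path, category, owner, last_modified)
        for category, pattern in PATTERNS.items()
        if pattern.search(text)
    ]
```

A call such as `scan("notes.txt", "Contact: jane@example.com", "jane", date(2020, 1, 1))` yields one finding in the `email_address` category, and its `deletion_due` method can then be evaluated for any given date. A real system would need far more robust detection than regular expressions and a reliable way to determine ownership.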