Every organization has at least one shared drive that's been accumulating files for years. Maybe it's a department file server that predates your current IT team. Maybe it's a Synology NAS that someone set up under a desk in 2017. Whatever it is, it holds years of documents that nobody has inventoried, that nobody fully owns, and that almost certainly contain personal data your organization is responsible for protecting.
Shared drives and NAS devices are where PII goes to hide. Not because anyone put it there maliciously, but because that's how shared storage works in practice. Someone exports a customer list to CSV for a one-time project and drops it on the S: drive. A manager saves a spreadsheet of employee SSNs to a folder called "temp" and forgets about it. An intern downloads a database extract, zips it, and stores it in a folder nested four levels deep. These files accumulate quietly, and without active scanning, they sit there indefinitely.
This guide walks through the practical challenges of scanning network-attached storage for PII and how to approach each one without disrupting your environment.
Before you scan anything, you need to know what you're scanning. This sounds obvious, but most organizations don't have a single, authoritative list of every shared drive and NAS on their network. Storage tends to sprawl. IT manages the primary file servers, but individual departments may have their own NAS appliances. Remote offices might have local storage that syncs intermittently. Legacy systems from acquisitions might still be accessible on the network even if nobody actively uses them.
Start by building an inventory. Check your DNS records and DHCP leases for hostnames and devices that look like file servers or NAS appliances. Look at Active Directory for published shares. Ask department heads whether they have any local storage; many will know about NAS devices that IT doesn't. If you have a network scanning tool, run a sweep for hosts listening on TCP ports 445 (SMB) and 2049 (NFS).
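If you don't have a dedicated network scanning tool handy, the port check can be approximated with a few lines of Python. This is a minimal sketch, not a replacement for a real scanner: the function names are made up for illustration, it only tests whether the TCP port accepts a connection, and you should only run it against networks you are authorized to scan.

```python
import socket

# Ports named in the text: 445 for SMB, 2049 for NFS.
FILE_SHARING_PORTS = {445: "SMB", 2049: "NFS"}


def open_file_sharing_ports(host, ports=FILE_SHARING_PORTS, timeout=0.5):
    """Return the service names whose TCP port accepts a connection on host."""
    found = []
    for port, service in ports.items():
        try:
            with socket.create_connection((host, port), timeout=timeout):
                found.append(service)
        except OSError:
            # Refused, timed out, or unreachable: treat the port as closed.
            pass
    return found


def sweep(hosts, ports=FILE_SHARING_PORTS, timeout=0.5):
    """Map each host that answered on at least one port to its services."""
    results = {}
    for host in hosts:
        services = open_file_sharing_ports(host, ports, timeout)
        if services:
            results[host] = services
    return results
```

Feed `sweep` the addresses from your DHCP leases or a subnet range, and treat every host it returns as a candidate for the inventory until someone confirms what it is.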
Once you have your inventory, document the mount paths. On Windows, this means UNC paths like \\server\share or mapped drive letters. On Linux and macOS, these are mount points defined in /etc/fstab or automounter configurations. For NAS devices, note whether they expose shares via SMB, NFS, AFP, or a combination. Your PII scanning tool will need to access these paths, so verify that each one is mountable from whatever machine you'll run scans on.
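A quick way to verify the documented paths is to try listing each one from the scan machine. The sketch below assumes the shares are already mounted (or reachable as UNC paths on Windows); `check_share` and `check_inventory` are hypothetical helper names, not part of any scanning tool.

```python
import os


def check_share(path):
    """Classify a path as 'ok', 'missing', 'not-a-directory', or 'unreadable'."""
    if not os.path.exists(path):
        return "missing"
    if not os.path.isdir(path):
        return "not-a-directory"
    try:
        # Listing one entry is enough to prove the directory can be read.
        with os.scandir(path) as entries:
            next(entries, None)
    except OSError:
        return "unreadable"
    return "ok"


def check_inventory(paths):
    """Return a {path: status} report for every path in the inventory."""
    return {path: check_share(path) for path in paths}
```

Anything that comes back as something other than "ok" is worth fixing before you schedule a scan, because a missing mount looks exactly like an empty share to most tools.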
Shared drives have years of accumulated permissions—some inherited, some explicitly set, many that no longer make sense. The account running your PII scanner needs read access to everything you want to scan. This is where things get complicated.
If you run the scan as a regular user account, you'll miss files in directories where that account doesn't have read permissions. If you run it as a domain admin, you'll see everything, but you may trigger security alerts or violate your own least-privilege policies.
The practical middle ground is a dedicated service account with read-only access to the shares you need to scan. Work with your Active Directory team to create this account and grant it appropriate permissions. Document why the account exists—you don't want a future admin to disable it during a permissions cleanup, and you don't want it mistaken for a stale account during an audit.
For NAS devices, permissions work differently depending on the vendor. Synology, QNAP, and TrueNAS all have their own permission models layered on top of the underlying filesystem. Some NAS appliances let you grant read access at the share level, while others require folder-by-folder configuration. Test access before scheduling your first scan. There's nothing worse than discovering your service account can't read half the shares after a scan has been running for six hours.
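One way to test access ahead of time is to walk each share as the service account and collect every directory it fails to list. This is a rough sketch under the assumption that the script runs as the same account the scanner will use; it checks directory listings only and does not open any files.

```python
import os


def unreadable_directories(root):
    """Walk root and return (path, reason) for each directory that could not be listed."""
    failures = []

    def record(error):
        # os.walk hands listing errors to this callback instead of raising them.
        failures.append((error.filename, error.strerror))

    for _dirpath, _dirnames, _filenames in os.walk(root, onerror=record):
        pass  # Listing each directory is the whole test; no file contents are read.
    return failures
```

An empty result means the account could list every directory under the root. A long result is your to-do list for the Active Directory or NAS administrator.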
A shared drive that's been in use for a decade can easily contain millions of files. Departmental file servers at mid-size companies routinely hold between 2 and 10 million files. Larger organizations with centralized storage may have tens of millions. The first challenge isn't scanning these files for PII—it's simply enumerating them.
Most PII scanning tools need to walk the directory tree to build a list of files before they can begin analyzing content. If the tool does this slowly, you're spending hours just building the file list before any actual scanning begins. This is where tool selection matters significantly. PII Crawler can enumerate 1 million files in 2-3 seconds by walking the filesystem recursively and recording paths directly to a SQLite database, handling deduplication and filtering as it goes.
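To make the enumeration step concrete, here is a minimal Python sketch of the general pattern: walk the tree and record each path in a SQLite table. It illustrates the idea only and is not PII Crawler's implementation; the table layout and the `enumerate_files` name are assumptions, and a Python walk over a network share will be far slower than a purpose-built tool.

```python
import os
import sqlite3


def enumerate_files(root, db_path):
    """Record every regular file under root in a SQLite table; return the count."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, size INTEGER, mtime REAL)"
    )
    count = 0
    stack = [root]
    while stack:
        current = stack.pop()
        rows = []
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        info = entry.stat(follow_symlinks=False)
                        rows.append((entry.path, info.st_size, info.st_mtime))
        except OSError:
            # Unreadable directory or entry: keep what was collected and move on.
            pass
        # The primary key on path means a repeated run updates rows
        # instead of creating duplicates.
        conn.executemany("INSERT OR REPLACE INTO files VALUES (?, ?, ?)", rows)
        count += len(rows)
    conn.commit()
    conn.close()
    return count
```

Storing size and modification time alongside the path costs almost nothing during the walk and gives later steps (size limits, incremental scans) something to filter on.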
Beyond enumeration speed, consider how the tool handles the file types typically found on shared drives. You'll encounter everything: Word documents, Excel spreadsheets, PDFs, plain text files, CSVs, scanned images, ZIP archives containing more of the same. Archive files are particularly common on shared drives because people zip up project folders before archiving them. A capable scanner should be able to look inside archives without extracting them to disk first.
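For ZIP files specifically, reading members without writing them to disk is straightforward with Python's standard `zipfile` module. The sketch below is one possible approach, not a description of how any particular scanner does it: the size cap and nesting limit are arbitrary assumptions, encrypted members are skipped, and other archive formats would need their own handling.

```python
import io
import zipfile


def iter_zip_members(zip_source, max_member_bytes=50 * 1024 * 1024,
                     depth=0, max_depth=3):
    """Yield (member_name, content_bytes) for files in a ZIP, entirely in memory."""
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            if info.flag_bits & 0x1:
                continue  # Encrypted member: cannot be read without a password.
            if info.file_size > max_member_bytes:
                continue  # Declared size is over the cap; skip rather than load it.
            data = archive.read(info)
            nested = io.BytesIO(data)
            if depth < max_depth and zipfile.is_zipfile(nested):
                # A ZIP inside a ZIP: recurse, prefixing names with the outer member.
                for name, inner in iter_zip_members(
                        nested, max_member_bytes, depth + 1, max_depth):
                    yield f"{info.filename}!/{name}", inner
            else:
                yield info.filename, data
```

Each yielded pair can be handed to whatever content analysis you use. The nesting limit matters because archives inside archives are common on shared drives, and an unbounded recursion is an easy way to run out of memory.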
Also consider file size distribution. Shared drives tend to have a long tail of very large files—database dumps, backup archives, video recordings. Your scanning tool should be able to skip files above a configurable size threshold. Scanning a 40GB SQL Server backup file is theoretically possible but rarely practical. Set sensible limits and document your exclusions so you can justify them during an audit.
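If your tool doesn't record exclusions for you, a small pre-filter can do it. This sketch assumes you already have a list of file paths; `partition_by_size` is a hypothetical helper that splits the list and keeps a reason for every file it leaves out.

```python
import os


def partition_by_size(paths, max_bytes):
    """Split paths into (to_scan, excluded); excluded holds (path, reason) pairs."""
    to_scan, excluded = [], []
    for path in paths:
        try:
            size = os.path.getsize(path)
        except OSError as exc:
            # The file vanished or cannot be stat'ed; record that too.
            excluded.append((path, f"stat failed: {exc.strerror}"))
            continue
        if size > max_bytes:
            excluded.append((path, f"size {size} exceeds limit {max_bytes}"))
        else:
            to_scan.append(path)
    return to_scan, excluded
```

Writing the `excluded` list to a file after each run gives you the documentation an auditor will ask for: what was skipped, and why.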
Shared drives are shared. People are using them during business hours. A PII scan that reads millions of files will generate significant I/O load, and on older NAS hardware or saturated network links, this can visibly slow things down for users.
Schedule your first full scan for off-hours. Evenings, weekends, or maintenance windows are all reasonable choices. If your scanner supports throttling (limiting how many files it reads per second or how much bandwidth it consumes), use that too. The goal is to complete the scan without anyone calling the help desk to complain that the file server is slow.
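If you are scripting your own file reads and need a throttle, a simple fixed-interval limiter is usually enough. The class below is a minimal sketch that caps files per second; it is an assumption about how you might structure this, not a feature of any specific scanner, and it does not limit bandwidth.

```python
import time


class Throttle:
    """Block callers so that wait() returns at most max_per_second times a second."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_per_second
        self.clock = clock    # Injectable so the limiter can be tested without waiting.
        self.sleep = sleep
        self.next_allowed = clock()

    def wait(self):
        now = self.clock()
        if now < self.next_allowed:
            # Too early: pause until the next slot opens.
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
```

Create one `Throttle(200)` for the whole scan and call `wait()` before opening each file. Start with a conservative rate and raise it while watching the file server's load.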
After the initial full scan, switch to incremental scanning where possible. There's no reason to re-scan a 500-page PDF that hasn't been modified since 2019. Tracking file modification timestamps and only scanning new or changed files reduces both scan time and I/O load dramatically.
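The timestamp comparison can be sketched in a few lines. This example assumes you persist a small state mapping from the previous run (for instance in a database or JSON file); `select_changed` is a hypothetical name. It compares size as well as modification time, because some copy tools preserve timestamps.

```python
import os


def select_changed(paths, previous):
    """Return (changed_paths, current_state).

    previous maps path -> (mtime_ns, size) as recorded by the last scan.
    """
    changed, current = [], {}
    for path in paths:
        try:
            info = os.stat(path)
        except OSError:
            continue  # Deleted or unreadable since it was enumerated.
        signature = (info.st_mtime_ns, info.st_size)
        current[path] = signature
        if previous.get(path) != signature:
            changed.append(path)  # New file, or modified since the last scan.
    return changed, current
```

Scan only the `changed` list, then save `current` as the state for the next run. It is still worth scheduling an occasional full rescan, since a metadata comparison cannot catch every kind of change.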
For NAS devices specifically, check whether the device itself is the bottleneck. Consumer-grade NAS appliances have limited CPU and memory. The scan isn't running on the NAS—it's running on the machine where your scanner is installed—but the NAS still needs to serve files over the network. A four-bay Synology under load from a scan while also serving files to 30 users will struggle. Plan accordingly.
One of the key architectural decisions in PII scanning is whether the tool processes data locally or sends it to a cloud service. For shared drives, this matters more than you might think.
Cloud-based scanners need to read your files and transmit content (or derivatives of it) over the network to a remote service. When you're dealing with millions of files on a shared drive, that's an enormous amount of data to move. It also means your sensitive data is leaving your network, which may violate compliance requirements—especially if the shared drive contains HIPAA-protected records, financial data, or anything covered by data residency rules.
A local scanner avoids both problems. You mount the network share on the machine where the scanner runs, and it reads files directly from the mounted volume. File contents are analyzed in memory and results are stored locally. Nothing leaves your network. For IT teams that manage sensitive environments, this is often a hard requirement rather than a preference.
PII Crawler takes this approach. It runs entirely on your machine, reads files from any mounted path—whether it's a local disk, a UNC path, or a mounted NAS share—and stores all results in a local SQLite database. There's no cloud component and no data transmission. You can scan a mounted network drive the same way you'd scan a local folder: just point it at the path.
Finding PII on your shared drives is only useful if you act on the results. For each finding, you need to determine: who owns this file, is there a business reason to keep it, and if so, does it need to be in a shared location?
Most organizations find that a significant percentage of PII on shared drives is in files that serve no current business purpose—old exports, archived projects, test data that someone forgot to clean up. These can often be deleted or moved to a more controlled location with appropriate access restrictions.
For files that need to remain accessible, consider whether the PII they contain can be redacted, masked, or moved to a system with stronger access controls. A customer list that needs to be on the shared drive for reference doesn't necessarily need to include full Social Security numbers.
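As an illustration of masking, the sketch below replaces all but the last four digits of Social Security numbers in a piece of text. It is deliberately narrow: the pattern only matches the hyphenated 3-2-4 format, so unformatted nine-digit numbers would slip through, and a real remediation would need broader patterns and a review step.

```python
import re

# Matches the hyphenated form only, e.g. 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")


def mask_ssns(text):
    """Replace the first five digits of each hyphenated SSN with X characters."""
    return SSN_PATTERN.sub(r"XXX-XX-\1", text)
```

For example, `mask_ssns("Jane Doe, 123-45-6789")` returns `"Jane Doe, XXX-XX-6789"`, which is often enough for a reference list where staff only need to confirm identity.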
Document your findings and your remediation actions. Compliance frameworks don't just want you to find PII—they want evidence that you found it and did something about it. Export your scan results, record what was deleted or moved, and keep that documentation accessible for your next audit.
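A remediation log doesn't need to be elaborate. The sketch below appends one row per action to a CSV file; the column names and the `record_remediation` helper are assumptions chosen for illustration, so adapt them to whatever your compliance framework expects.

```python
import csv
import os
from datetime import datetime, timezone

FIELDS = ["timestamp_utc", "path", "finding", "action", "actor"]


def record_remediation(log_path, path, finding, action, actor):
    """Append one remediation record to a CSV log, writing the header if needed."""
    is_new = not os.path.exists(log_path) or os.path.getsize(log_path) == 0
    with open(log_path, "a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "timestamp_utc": datetime.now(timezone.utc).isoformat(),
            "path": path,
            "finding": finding,
            "action": action,
            "actor": actor,
        })
```

Keep the log somewhere with tighter access than the shared drive itself, since a list of files that contained PII is sensitive in its own right.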
Shared drives are rarely the first place organizations think to look for exposed PII, but they're often where the worst exposures live. The files are old, the permissions are loose, and nobody is watching. A structured scanning program, run regularly with the right tooling, turns an unknown risk into a managed one.