Scan an SMB Network Share
PII Crawler can scan an SMB / CIFS file share (a Windows shared folder, a Samba server, or a NAS) over the network. Files are read over SMB2/3, scanned in memory, and discarded immediately afterward. The share is never mounted, so you don't need root or fstab edits.
This guide covers the TUI, the web UI, and the headless smb subcommand. SMB scanning isn't available through the piicrawler scan command yet; on the command line, use piicrawler smb instead (see Command line below).
What you'll need
SMB support is built into the PII Crawler binary. You don't need to install smbclient or Samba on Linux, macOS, or Windows, and PII Crawler behaves the same way on all three.
You'll need:
- The server hostname or IP (for example fileserver01 or 10.0.0.42).
- The share name (for example Public, Finance, or home$). PII Crawler doesn't enumerate shares on a server, so you'll need to know the name ahead of time or ask whoever administers the server.
- Credentials: a username and password, plus a Windows domain or workgroup if the share requires one. NTLM authentication is used; Kerberos isn't supported yet.
Save a credential (recommended)
Saving credentials once means you don't re-enter them every time you re-scan, and the password never goes into your shell history.
How credentials are protected
SMB credential passwords are encrypted with AES-256-GCM under a 32-byte data-encryption key (DEK). The DEK never sits on disk in plaintext. It's wrapped under a key-encryption key derived (via Argon2id) from a credential password you choose. Someone who steals just the SQLite database can't read any stored secrets without your credential password.
The credential password is separate from the web-login password. They protect different things. You can use the same string for both if you want, but nothing forces it.
There is no recovery. If you forget the credential password, every saved SMB credential becomes unreadable. Pick something you'll remember (or store it in a password manager) before you save your first credential.
The first time you open the SMB Credentials screen (TUI) or the Settings → Network Share Credentials section (web UI), you'll be prompted to set the credential password. On later app launches you'll be prompted to unlock with the same password the first time anything in the session touches stored credentials. The DEK then stays in memory until the app exits.
Credential scope
You can scope a credential two ways:
- Share-specific: set the share name when creating the credential. Used only when scanning that exact share.
- Server-level: leave the share name blank. Used as a fallback for any share on that server when no share-specific credential exists.
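The lookup order described above (share-specific first, server-level as a fallback) can be sketched in a few lines of Python. This is an illustration only, not PII Crawler's actual code: the SavedCredential type and pick_credential function are hypothetical names, and comparing server and share names case-insensitively is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SavedCredential:
    """Hypothetical stand-in for a stored credential row."""
    server: str
    share: str      # empty string means "server-level"
    username: str


def pick_credential(saved: Iterable[SavedCredential],
                    server: str, share: str) -> Optional[SavedCredential]:
    """Return the share-specific credential if one exists,
    otherwise the server-level fallback, otherwise None."""
    fallback = None
    for cred in saved:
        # Assumption: names are compared case-insensitively.
        if cred.server.lower() != server.lower():
            continue
        if cred.share and cred.share.lower() == share.lower():
            return cred                      # exact share match wins
        if not cred.share and fallback is None:
            fallback = cred                  # remember server-level entry
    return fallback
```

With one server-level credential and one scoped to Finance on the same server, a scan of Finance would pick the scoped credential and a scan of any other share would fall back to the server-level one.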
TUI
- From the scan list, press c to open SMB Credentials.
- If this is your first time, you'll be prompted to set a credential password. Type it twice and press Enter. On later launches you'll be prompted to unlock with the same password instead.
- Press n to add a new credential.
- Fill in Server, Share (blank for server-level), Domain (optional), Username, Password. Tab moves between fields.
- Press Enter to save. d deletes the selected credential.
Web UI
Open Settings → Network Share Credentials. Your first visit shows the Set credential password form. After that, you'll see the Unlock credentials form. Enter your password to access the credential add form and the saved-credentials table.
You can also save a credential inline from the new-scan form. If you haven't set the credential password yet, or the session is locked, the new-scan form pops up the same set-password or unlock dialog before saving, so you don't need to detour through the Settings page first.
Headless deployments (server, watch mode, CI)
Set PIICRAWLER_CRED_PASSWORD in the environment to unlock credentials automatically the first time they're used. If your pipeline already manages its own raw key, set PIICRAWLER_CRED_KEY_BASE64 instead. That injects a 32-byte base64 DEK directly and skips the password prompt entirely, which is handy for CI tests against an isolated database.
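If you go the raw-key route, the value needs to decode to exactly 32 bytes. A minimal sketch of generating one in Python follows; it assumes the variable expects standard (not URL-safe) base64 with padding, which the text above does not spell out, and generate_dek_base64 is a name made up for this example.

```python
import base64
import secrets


def generate_dek_base64() -> str:
    """Return 32 cryptographically random bytes, base64-encoded.

    Assumption: PIICRAWLER_CRED_KEY_BASE64 takes standard base64
    (alphabet A-Z a-z 0-9 + /, with '=' padding).
    """
    raw_key = secrets.token_bytes(32)          # 256-bit key material
    return base64.b64encode(raw_key).decode("ascii")


# Example: store the result in your CI secret manager, then export it
# as PIICRAWLER_CRED_KEY_BASE64 for the job that runs PII Crawler.
key_b64 = generate_dek_base64()
```

Treat the generated value like any other secret: anyone who holds it can decrypt the saved credentials in that database.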
See CLI Reference → Environment variables for the full list of variables PII Crawler reads.
The web UI also exposes the unlock flow over HTTP for orchestration:
- GET /api/cred/status returns {is_set, is_unlocked}.
- POST /api/cred/unlock {password} unlocks the running server.
- POST /api/cred/lock drops the cached DEK without restarting.
- POST /api/cred/password {password} (first-time set) or {current_password, password} (change).
These endpoints sit behind the regular web-login session. Once you're logged into the web UI you can call them.
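As a rough sketch of how an orchestration script might use these endpoints, the Python below checks the status and unlocks only when needed. Several details are assumptions rather than documented behavior: the base URL, the use of JSON request and response bodies, and passing the web-login session as a Cookie header. The cred_request and unlock_if_needed helpers are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # hypothetical; use your server's address


def cred_request(method, path, body=None, cookie=None):
    """Build a request for one of the /api/cred/* endpoints.

    Assumption: bodies are JSON and the login session travels in a cookie.
    """
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    if cookie:
        req.add_header("Cookie", cookie)
    return req


def unlock_if_needed(cookie, password, opener=urllib.request.urlopen):
    """Unlock stored credentials if a password is set but the DEK is not cached.

    Returns True when an unlock call was made, False otherwise.
    """
    with opener(cred_request("GET", "/api/cred/status", cookie=cookie)) as resp:
        status = json.load(resp)
    if status.get("is_set") and not status.get("is_unlocked"):
        unlock = cred_request("POST", "/api/cred/unlock",
                              {"password": password}, cookie)
        with opener(unlock) as resp:
            resp.read()
        return True
    return False
```

The opener parameter exists only so the flow can be exercised without a running server; in normal use the default urlopen is what sends the requests.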
Changing the credential password
Use the web UI (POST /api/cred/password with both current_password and password). The change re-wraps the DEK under your new password without re-encrypting any saved credentials, so it's instant no matter how many credentials you have stored.
Start a network scan
TUI
- From the scan list, press N (capital N) to open New Network Share Scan.
- Fill in:
  - Server: hostname or IP.
  - Share Name: the share to scan.
  - Subfolder (optional): restrict the scan to a path inside the share, for example Departments\Finance\2024. Use backslashes; forward slashes are accepted and converted.
  - Scan Name (optional): defaults to \\server\share\subfolder if left blank.
- Press Alt+C to cycle through saved credentials, or leave it on None (anonymous) for guest shares.
- Tune the throttle row if you need to (see Throttling below).
- Press Enter to start.
The status bar shows two phases: enumerating (walking the share recursively) and then scanning (downloading and analyzing each file). Press s to stop a running scan, and R to resume it later.
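The Subfolder and Scan Name rules from the form above (forward slashes converted to backslashes, scan name defaulting to the UNC path) can be illustrated with a short Python sketch. The function names are hypothetical, and trimming leading or trailing separators is an assumption, not documented behavior.

```python
def normalize_subfolder(subfolder: str) -> str:
    """Convert forward slashes to backslashes.

    Assumption: stray leading/trailing separators are trimmed.
    """
    return subfolder.replace("/", "\\").strip("\\")


def default_scan_name(server: str, share: str, subfolder: str = "") -> str:
    """Build the \\\\server\\share\\subfolder name used when Scan Name is blank."""
    name = "\\\\" + server + "\\" + share
    sub = normalize_subfolder(subfolder)
    return name + "\\" + sub if sub else name


# Both spellings of the subfolder yield the same UNC-style name.
name_a = default_scan_name("fileserver01", "Finance", "Departments/Finance/2024")
name_b = default_scan_name("fileserver01", "Finance", "Departments\\Finance\\2024")
```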
Web UI
- Click Create New Scan, then choose the Network share (SMB) scan type.
- Enter the server hostname/IP and the share name you want to scan. If you don't know what share names exist on the server, ask whoever administers it. PII Crawler does not enumerate them.
- Either pick a saved credential, or type a username, password, and (optionally) domain.
- Click Test to verify the server, share, and credentials all work together. A green banner confirms success; a red banner shows the SMB error verbatim.
- Optionally use the subfolder browser to drill into a specific path inside the share, set a scan name, and start the scan.
Command line
For servers, containers, and CI where no browser or interactive terminal is available, the smb subcommand runs the same scan headlessly:
piicrawler smb fileserver Finance -u alice --subfolder HR/2025 --quiet
The connection is tested up front (a bad host or credential fails fast with no scan record left behind), the results are saved to the local database like any other scan, and the new scan ID is printed to stdout for use with report and findings. The password comes from -p/--password or the PIICRAWLER_SMB_PASSWORD environment variable. Unlike the TUI and web UI, the CLI does not persist the credential to the database. See the smb command reference for the full flag list and throttle controls.
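For pipelines that drive the scan from a script, a wrapper along these lines is one option. It is a sketch only: it uses just the flags shown in the example command, passes the password through PIICRAWLER_SMB_PASSWORD so it stays out of the argument list, and assumes that with --quiet the scan ID is the only thing written to stdout. The function names are hypothetical.

```python
import os
import subprocess
from typing import List, Optional


def build_smb_command(server: str, share: str, username: str,
                      subfolder: Optional[str] = None) -> List[str]:
    """Assemble the argv for `piicrawler smb`, using only documented flags."""
    argv = ["piicrawler", "smb", server, share, "-u", username]
    if subfolder:
        argv += ["--subfolder", subfolder]
    argv.append("--quiet")
    return argv


def run_smb_scan(server: str, share: str, username: str, password: str,
                 subfolder: Optional[str] = None) -> str:
    """Run the scan and return whatever the command printed to stdout.

    Assumption: with --quiet, stdout contains just the new scan ID.
    A failed connection test makes the command exit non-zero, which
    surfaces here as subprocess.CalledProcessError.
    """
    env = dict(os.environ, PIICRAWLER_SMB_PASSWORD=password)
    result = subprocess.run(
        build_smb_command(server, share, username, subfolder),
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

A CI job would typically read the password from its secret store and call run_smb_scan("fileserver", "Finance", "alice", password, "HR/2025").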
Throttling network scans
Network scans have three throttle controls that local-disk scans don't need. The defaults are conservative. Raise them if your link and the file server can handle it.
| Setting | What it does |
|---|---|
| Max Concurrent | Number of files downloaded and scanned in parallel. Higher is faster, but opens more SMB sessions. |
| Delay (ms) | Pause between starting each file read. Useful on shared links or older NAS hardware. |
| BW Limit (MB/s) | Soft cap on aggregate download bandwidth. PII Crawler tracks bytes pulled and waits when you're over budget. |
A reasonable starting point for a LAN file server: 4 concurrent, 0 ms delay, no bandwidth cap. For a slow VPN to a remote office, try 2 concurrent, 100 ms delay, 5 MB/s.
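To make the BW Limit behavior concrete, here is a minimal Python sketch of a soft bandwidth cap that tracks bytes pulled and waits when the running total is ahead of budget. It is not PII Crawler's implementation; the BandwidthBudget class is hypothetical, and treating 1 MB as 1,000,000 bytes is an assumption.

```python
import time


class BandwidthBudget:
    """Soft cap on aggregate throughput, averaged since the scan started."""

    def __init__(self, limit_mb_per_s, clock=time.monotonic, sleep=time.sleep):
        # Assumption: 1 MB = 1,000,000 bytes.
        self.limit_bytes_per_s = limit_mb_per_s * 1_000_000
        self.clock = clock
        self.sleep = sleep
        self.start = clock()
        self.total_bytes = 0

    def record(self, nbytes):
        """Account for nbytes just downloaded; sleep if over budget.

        Returns the number of seconds waited (0.0 if under budget).
        """
        self.total_bytes += nbytes
        # Earliest moment at which this many bytes fits under the cap.
        earliest = self.start + self.total_bytes / self.limit_bytes_per_s
        wait = earliest - self.clock()
        if wait > 0:
            self.sleep(wait)
        return max(wait, 0.0)
```

Because the cap is applied after bytes have already arrived, short bursts above the limit are possible, which is why a cap of this kind is "soft".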
What gets scanned
- The walker recurses through every directory under the chosen subfolder.
- Files larger than the configured max file size are skipped, as are extensions that aren't in the allowed list and any path that matches an exclusion pattern.
- Each remaining file is read into memory over SMB, run through the same extraction and detection pipeline as a local scan, and dropped right after.
- Findings reference the file with its UNC path (\\server\share\path\to\file.pdf) so you can locate the source from any SMB-aware tool.
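The skip rules in the list above can be sketched as a single predicate. This Python example is illustrative only: the should_scan function is hypothetical, and both the glob-style exclusion syntax and the case-insensitive comparisons are assumptions, since the pattern format isn't specified here.

```python
from fnmatch import fnmatchcase
from pathlib import PureWindowsPath


def should_scan(path, size, max_size, allowed_exts, exclude_patterns):
    """Return True if a file passes the size, extension, and exclusion checks.

    path             -- UNC path of the file
    size, max_size   -- sizes in bytes
    allowed_exts     -- lowercase extensions with the dot, e.g. {".pdf"}
    exclude_patterns -- assumed to be glob-style patterns
    """
    if size > max_size:
        return False                              # too large
    ext = PureWindowsPath(path).suffix.lower()
    if ext not in allowed_exts:
        return False                              # extension not allowed
    lowered = path.lower()
    # Assumption: exclusions match case-insensitively against the full path.
    return not any(fnmatchcase(lowered, pat.lower()) for pat in exclude_patterns)
```

Only files for which the predicate returns True would go on to be read over SMB, so the cheap checks (size and extension come from the directory listing) run before any bytes are downloaded.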
Acting on the results
Once triage has confirmed which share files genuinely contain PII, collect them into an action list from the scan's Files view. Delete and quarantine run directly against the share with the same saved credential; quarantine pulls the files into a local holding folder and can restore them to the share later. See Files on network shares for the details.
Folders you can't access
A network scan walks the share with whatever permissions the scanning account has. When it reaches a folder that account can't read, it does not stop the scan. The folder is recorded as an entry with an error status and a "permission denied" reason, and the walk continues with the rest of the share.
This is what you want when you are auditing what a given account can actually reach. Run the scan as a standard user and the protected folders show up in the results as inaccessible, while everything that account can open is scanned normally. To see them, open View Files and filter by the error status, or read the error breakdown on the scan report, which tallies a "permission denied" count.
One exception: if the share itself can't be reached at all (server down, wrong share name, or the account has no access to the share root), the scan fails up front rather than reporting an empty result. Only folders inside a reachable share are recorded-and-skipped this way.
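The record-and-skip behavior, including the exception for an unreachable share root, can be sketched as follows. This is a simplified Python model, not the real walker: walk_share and the list_dir callback are hypothetical, and a PermissionError stands in for the SMB access-denied error.

```python
def walk_share(root, list_dir):
    """Walk a share recursively, recording unreadable folders instead of stopping.

    list_dir(path) must return (name, is_dir) pairs or raise PermissionError.
    Returns a list of (path, status, reason) entries.
    """
    entries = []
    pending = [root]
    while pending:
        folder = pending.pop()
        try:
            children = list_dir(folder)
        except PermissionError:
            if folder == root:
                raise                 # share root unreachable: fail up front
            entries.append((folder, "error", "permission denied"))
            continue                  # keep walking the rest of the share
        for name, is_dir in children:
            path = folder + "\\" + name
            if is_dir:
                pending.append(path)
            else:
                entries.append((path, "ok", ""))
    return entries
```

Filtering the returned entries by the "error" status gives the same kind of list that View Files shows for inaccessible folders.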
Troubleshooting
Connection failed with a logon error: wrong username, password, or domain. If the share belongs to an Active Directory domain, the Domain field has to match. For a workgroup share, leave it blank. Kerberos-only realms aren't supported yet, so the account needs to accept NTLM.
STATUS_BAD_NETWORK_NAME: the share name is wrong, or the user doesn't have access to it. Double-check the share name (case sensitivity varies by server) and verify the credential has read access to that share.
Scan is very slow: most of the time on a network scan is spent reading bytes. Increase Max Concurrent and remove any bandwidth cap if your link can handle it. If the file server itself is the bottleneck, lowering concurrency may actually help by avoiding queueing. To pinpoint specific files that are dragging the scan down, open the scan in the Web UI's View Files page and click the Duration column header. Files then sort by per-file scan time, slowest first.
Files keep failing with Download failed: transient SMB errors are retried automatically, but a persistent failure usually means a permission problem on a specific file. Check the Logs view (press l from the TUI scan list) for the underlying error.
Scan finishes with fewer results than expected: folders the scanning account can't read are skipped and recorded as error entries rather than scanned (see Folders you can't access). Filter View Files by the error status to see exactly which paths were inaccessible. If you expected to reach them, re-run with an account that has read access to those folders.