Usage

Getting Started

After you have downloaded and installed PII Crawler, click PII Crawler to start the web interface. This starts an HTTP server running at http://localhost:3001. You can navigate there in your browser or click open from the PII Crawler tray icon.

From here:

  • Click Create New Scan.
  • Choose a starting directory.
  • Click Create Scan.

This starts scanning all files contained within the starting directory. The scan may take a while to finish, but you can start reviewing results immediately as they come in.
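
If the page at http://localhost:3001 does not load, it can help to confirm that the local server is answering before troubleshooting the browser. The following is a minimal sketch using only the Python standard library. The function name web_interface_is_up and the timeout value are illustrative choices, not part of PII Crawler.

```python
import http.client
import urllib.error
import urllib.request


def web_interface_is_up(url="http://localhost:3001", timeout=2.0):
    """Return True if something answers HTTP requests at url.

    Hypothetical helper: it only checks that an HTTP server responds,
    not that the responder is actually PII Crawler.
    """
    # Bypass any system proxy settings, since the server is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (OSError, http.client.HTTPException):
        # Covers URLError, refused connections, timeouts, and bad replies.
        return False


if __name__ == "__main__":
    print("reachable" if web_interface_is_up() else "not reachable")
```

A result of "not reachable" usually means PII Crawler has not been started yet or is listening on a different port.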

Viewing the results

To view the results, click View Files on the scan.

From here you can filter and search through the results. You can also export results in CSV format.
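
Once results are exported, the CSV file can be summarized outside the web interface. The sketch below counts rows per value of one column. The column names of the export are not documented here, so the names "path" and "type" and the sample rows are made up for illustration; substitute whatever header your exported file actually contains.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column):
    """Count how many rows share each value of the given column.

    csv_file is any open text file (or file-like object) containing
    CSV data with a header row.
    """
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in header {reader.fieldnames}")
    return Counter(row[column] for row in reader)


if __name__ == "__main__":
    # Made-up sample standing in for an exported file; the real
    # export's columns may differ.
    sample = io.StringIO(
        "path,type\n"
        "/tmp/a.csv,ssn\n"
        "/tmp/b.pdf,email\n"
        "/tmp/c.xlsx,ssn\n"
    )
    for value, count in count_by_column(sample, "type").most_common():
        print(value, count)
```

For a real export, open the file with open(path, newline="", encoding="utf-8") and pass the file object in place of the sample.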

Exact Match

If you are responding to an incident and already know what you are searching for, you can provide groups of terms to match against. PII Crawler runs a supplemental exact match search when an exact-match.json file is provided in the same directory. The file is a JSON map in which each key is the id or name of a group of terms and each value is a list of strings holding the terms. If all terms in a group are found within a distance of one another (usually the same file), an exact_match is triggered.

Example:

{
    "Mrs. Hilda Schrader Whitcher": ["whitcher", "078-05-1120"],
    "Hilda Schrader Whitcher no dash": ["whitcher", "078051120"]
}

Note: This is no longer a real SSN, but there is an interesting story behind it.
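
When the list of terms comes from another system, the file can be generated rather than written by hand. The sketch below validates a mapping of group names to term lists and writes it as exact-match.json into a directory you choose. The function names and the validation rules (non-empty names, non-empty term lists) are assumptions made for this example. The all_terms_present function is only a toy illustration of the "all terms must be found" rule: it treats the whole text as the distance and ignores case, which may differ from how PII Crawler actually matches.

```python
import json
import tempfile
from pathlib import Path


def build_exact_match(groups):
    """Validate {group name: [terms]} and return it as JSON text."""
    if not isinstance(groups, dict) or not groups:
        raise ValueError("expected a non-empty mapping of group name to terms")
    for name, terms in groups.items():
        if not isinstance(name, str) or not name.strip():
            raise ValueError(f"group name must be a non-empty string: {name!r}")
        if not isinstance(terms, list) or not terms:
            raise ValueError(f"group {name!r} needs a non-empty list of terms")
        if not all(isinstance(t, str) and t for t in terms):
            raise ValueError(f"every term in group {name!r} must be a non-empty string")
    return json.dumps(groups, indent=4, ensure_ascii=False)


def write_exact_match(groups, directory):
    """Write exact-match.json into directory and return its path."""
    path = Path(directory) / "exact-match.json"
    path.write_text(build_exact_match(groups) + "\n", encoding="utf-8")
    return path


def all_terms_present(text, terms):
    """Toy version of the rule: every term occurs somewhere in text.

    Assumption: case-insensitive substring matching over the whole text.
    """
    lowered = text.lower()
    return all(term.lower() in lowered for term in terms)


if __name__ == "__main__":
    groups = {
        "Mrs. Hilda Schrader Whitcher": ["whitcher", "078-05-1120"],
        "Hilda Schrader Whitcher no dash": ["whitcher", "078051120"],
    }
    # A temporary directory keeps the demo from touching real files.
    with tempfile.TemporaryDirectory() as tmp:
        print(write_exact_match(groups, tmp).read_text(encoding="utf-8"))
```

Validating before writing catches the most likely mistake, a group whose value is a single string instead of a list of strings.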

Scan only specific file types

./piicrawler scan ~ --include-exts pdf,csv,xlsx