PII Scanner (Postgres)

We're thrilled to announce the release of our PII Scanner after three months of dedicated effort. This powerful tool scans your PostgreSQL database using three advanced methods: Meta Scan, Data Scan, and NLP Scan.

Our solution outperforms existing alternatives by at least 5x. Using spaCy, it can accurately detect PII entities such as first names and last names, which traditional regex methods often miss.

How to generate an HTML report?

Select Option 4 as shown below and enter the values for the prompted inputs.

NOTE: Watch the short demo video at https://youtu.be/HtP8N0Op-V4

Sample HTML Report

Metascan method

This method matches on column names only; the actual data is never inspected. For instance, a column named 'first_name' or 'SSN' is flagged as a match.
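The column-name matching described above can be sketched roughly as follows. This is a minimal illustration, not the scanner's actual rule set: the `COLUMN_NAME_PATTERNS` table and the `meta_scan` function are hypothetical names, and the input is assumed to be a list of (table, column) pairs such as those returned by a query against `information_schema.columns`.

```python
import re

# Hypothetical name patterns for illustration; the scanner's real rules are not shown here.
COLUMN_NAME_PATTERNS = {
    "Name": re.compile(r"(first|last|full)_?name", re.IGNORECASE),
    "SSN": re.compile(r"(^|_)ssn($|_)|social_?security", re.IGNORECASE),
    "Email Address": re.compile(r"e_?mail", re.IGNORECASE),
}


def meta_scan(columns):
    """Flag columns whose *name* looks like PII.

    columns: iterable of (table_name, column_name) pairs, e.g. rows from
    SELECT table_name, column_name FROM information_schema.columns.
    Returns a list of (table_name, column_name, entity) tuples.
    """
    findings = []
    for table, column in columns:
        for entity, pattern in COLUMN_NAME_PATTERNS.items():
            if pattern.search(column):
                findings.append((table, column, entity))
    return findings


if __name__ == "__main__":
    sample_columns = [("users", "first_name"), ("users", "SSN"), ("orders", "total")]
    for finding in meta_scan(sample_columns):
        print(finding)
```

Because only names are inspected, a column called 'notes' that happens to hold SSNs would pass this check unflagged; that gap is what the data-based methods cover.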

Datascan method (default method)

This method matches on the data itself. It samples up to 10,000 records per table to identify possible matches.
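A data-based scan of this kind might look like the sketch below. Everything named here is an assumption for illustration: the regexes in `DATA_PATTERNS` are simplified examples rather than the scanner's own, `sample_query` assumes the sample is taken with a plain `LIMIT` (the document does not say how rows are selected), and `data_scan` works on rows already fetched as dictionaries.

```python
import re
from itertools import islice

SAMPLE_LIMIT = 10_000  # the text states up to 10,000 records per table

# Simplified, illustrative regexes; not the scanner's actual patterns.
DATA_PATTERNS = {
    "Email Address": re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"),
    "SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
    "IP Address (IPv4)": re.compile(
        r"^(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)$"
    ),
}


def sample_query(table, limit=SAMPLE_LIMIT):
    """Build a sampling query (assumption: a simple LIMIT, not random sampling)."""
    return f'SELECT * FROM "{table}" LIMIT {limit}'


def data_scan(rows, limit=SAMPLE_LIMIT):
    """Count pattern matches per column over at most `limit` rows.

    rows: iterable of dicts mapping column name to value.
    Returns {column_name: {entity: match_count}}.
    """
    findings = {}
    for row in islice(rows, limit):
        for column, value in row.items():
            if value is None:
                continue
            text = str(value)
            for entity, pattern in DATA_PATTERNS.items():
                if pattern.match(text):
                    per_column = findings.setdefault(column, {})
                    per_column[entity] = per_column.get(entity, 0) + 1
    return findings


if __name__ == "__main__":
    rows = [
        {"contact": "a@example.com", "note": "hello"},
        {"contact": "b@example.org", "note": "123-45-6789"},
    ]
    print(sample_query("users"))
    print(data_scan(rows))
```

Counting matches per column, rather than stopping at the first hit, leaves room for a threshold so that a single stray value does not flag an entire column.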

spaCy method

This method uses spaCy to detect PII entities such as first names and last names, which traditional regex methods often miss.
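As a rough sketch of how name detection with spaCy can work, the helper below flags a column when enough of its sampled values contain a PERSON entity. The function names, the `min_ratio` threshold, and the choice of the `en_core_web_sm` model are assumptions made for this example; the scanner's actual model and decision rule are not described in this document.

```python
def load_pipeline(model="en_core_web_sm"):
    """Load a spaCy pipeline (assumes spaCy and the named model are installed)."""
    import spacy  # imported lazily so nlp_scan can also run with any compatible callable

    return spacy.load(model)


def nlp_scan(values, nlp, min_ratio=0.5):
    """Return True when at least `min_ratio` of non-empty values contain a PERSON entity.

    values: sampled values from one column.
    nlp: a spaCy pipeline, or any callable returning an object with an `ents`
         attribute whose items expose `label_`.
    min_ratio: hypothetical threshold chosen for illustration.
    """
    texts = [str(value) for value in values if value]
    if not texts:
        return False
    hits = sum(
        1
        for text in texts
        if any(ent.label_ == "PERSON" for ent in nlp(text).ents)
    )
    return hits / len(texts) >= min_ratio
```

With spaCy installed, a call such as `nlp_scan(sampled_values, load_pipeline())` would apply this to one column's sample. Short, context-free strings can be hard for a statistical model, so results are best treated as candidates for review rather than confirmed findings.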

How does our solution outperform other existing alternatives for this feature?

Our solution supports a wider range of entities than other available options. It also uses spaCy to match first names, last names, and similar entities. Additionally, it produces an HTML report, which makes issues easier and faster to identify.

Supported entities

  • Name

  • Email Address

  • Username

  • SSN

  • PO Box

  • IP Address (IPv4)

  • MAC Address

  • OAuth Token

  • Location

  • Nationality

  • Gender

  • Bank Account Number

  • PAN Number

  • Aadhaar Card Number

  • ITIN

  • Driving License Number

  • Passport Number

  • NHS Number

  • Password

  • Phone

  • Credit Card

  • Birth Date

  • Zip Code

References taken for some regexes

US driving license | regex sources => https://docs.trellix.com/bundle/data-loss-prevention-11.10.x-classification-definitions-reference-guide/page/GUID-CA4A41FB-B897-4910-809E-ED33DEF9CE77.html and https://success.skyhighsecurity.com/Skyhigh_Data_Loss_Prevention/Data_Identifiers/U.S._Driver's_License_Numbers

India driving license | regex source => https://www.geeksforgeeks.org/how-to-validate-indian-driving-license-number-using-regular-expression/

Some regexes are taken from pdscan, piicatcher, and other sources.
