18244797. IDENTIFYING PERSONALLY IDENTIFIABLE INFORMATION WITHIN AN UNSTRUCTURED DATA STORE simplified abstract (Snap Inc.)

From WikiPatents

IDENTIFYING PERSONALLY IDENTIFIABLE INFORMATION WITHIN AN UNSTRUCTURED DATA STORE

Organization Name

Snap Inc.

Inventor(s)

Vasyl Pihur of Santa Monica CA (US)

Subhash Sankuratripati of Playa Vista CA (US)

Dachuan Huang of Santa Monica CA (US)

Leah Fortier of Los Angeles CA (US)

IDENTIFYING PERSONALLY IDENTIFIABLE INFORMATION WITHIN AN UNSTRUCTURED DATA STORE - A simplified explanation of the abstract

This abstract first appeared for US patent application 18244797, titled 'IDENTIFYING PERSONALLY IDENTIFIABLE INFORMATION WITHIN AN UNSTRUCTURED DATA STORE'.

Simplified Explanation

This patent application discloses methods and systems for identifying personally identifiable information (PII). Frequency maps are first generated for fields known to store PII; each map counts the occurrences of unique bigrams in its field. A field of interest is then analyzed to generate a second frequency map, and correlations between the first frequency maps and the second frequency map are computed. If one of the correlations meets a certain criterion, the field of interest is determined to include, or not to include, PII. Access control for the field of interest is then based on that determination. The storage location of data in the field of interest may also be chosen based on whether the field includes PII.

  • Frequency maps of fields known to store PII are generated.
  • Each frequency map counts occurrences of unique bigrams in its PII field.
  • A field of interest is analyzed to generate a second frequency map.
  • Correlations between the first frequency maps and the second frequency map are generated.
  • If one of the correlations meets a certain criterion, the field of interest is determined to include, or not to include, PII.
  • Access control for the field of interest is determined based on whether it includes PII.
  • The storage location of data included in the field of interest may be determined based on whether it includes PII.
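
The detection steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method claimed in the application: the abstract does not say whether bigrams are character or word pairs (character bigrams are assumed here), does not name a correlation measure (Pearson correlation is assumed), and does not define the criterion (a simple threshold of 0.5 is assumed). The function names and the threshold value are hypothetical.

```python
from collections import Counter
from math import sqrt


def bigram_frequency_map(values):
    """Count occurrences of each character bigram across a field's values.

    Assumption: bigrams are adjacent character pairs, case-folded.
    """
    counts = Counter()
    for value in values:
        text = value.lower()
        counts.update(text[i:i + 2] for i in range(len(text) - 1))
    return counts


def correlation(map_a, map_b):
    """Pearson correlation of two frequency maps over the union of their bigrams.

    Assumption: the abstract does not name a correlation measure.
    """
    keys = sorted(set(map_a) | set(map_b))
    if len(keys) < 2:
        return 0.0
    xs = [map_a.get(k, 0) for k in keys]
    ys = [map_b.get(k, 0) for k in keys]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def looks_like_pii(known_pii_fields, field_values, threshold=0.5):
    """Return (decision, scores) for a field of interest.

    known_pii_fields maps a field name to sample values known to be PII.
    The threshold is a hypothetical criterion chosen for illustration.
    """
    candidate = bigram_frequency_map(field_values)
    scores = {
        name: correlation(bigram_frequency_map(values), candidate)
        for name, values in known_pii_fields.items()
    }
    return any(score >= threshold for score in scores.values()), scores


if __name__ == "__main__":
    known = {"email": ["alice@example.com", "bob@example.com"]}
    decision, scores = looks_like_pii(known, ["carol@example.com", "dave@example.com"])
    print(decision, scores)
```

In practice the threshold and the choice of correlation measure would need tuning against real data; the sketch only shows how the frequency maps and the comparison fit together.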

Potential Applications

  • Data privacy protection in various industries such as healthcare, finance, and e-commerce.
  • Compliance with data protection regulations, such as GDPR or HIPAA.
  • Enhancing security measures for sensitive information.

Problems Solved

  • Efficient identification of personally identifiable information within data fields.
  • Automating the process of determining whether a field contains PII.
  • Enabling access control and storage location decisions based on the presence of PII.

Benefits

  • Improved data privacy and protection against unauthorized access.
  • Streamlined compliance with data protection regulations.
  • Enhanced efficiency and accuracy in identifying and handling PII.


Original Abstract Submitted

Methods and systems for identifying personally identifiable information (PII) are disclosed. In some aspects, frequency maps of fields storing known PII information are generated. The frequency maps may count occurrences of unique bigrams in the PII fields. A field of interest may then be analyzed to generate a second frequency map. Correlations between the first frequency maps and the second frequency map may be generated. If one of the correlations meets certain criterion, the disclosed embodiments may determine that the field of interest does or does not include PII. Access control for the field of interest may then be based on whether the field includes PII. In some aspects, a storage location of data included in the field of interest may be based on whether the field includes PII.
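
The final two sentences of the abstract, on access control and storage location, could look roughly like the following once a field has been classified. This is a hypothetical sketch: the abstract does not describe any roles, policies, or storage systems, so every role name, store name, and function below is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldPolicy:
    """Who may read a field and where its data is stored."""
    allowed_roles: frozenset
    storage_location: str


# Hypothetical role and store names, chosen only for illustration.
PII_POLICY = FieldPolicy(frozenset({"privacy_admin"}), "restricted-store")
DEFAULT_POLICY = FieldPolicy(frozenset({"privacy_admin", "analyst"}), "general-store")


def policy_for_field(contains_pii):
    """Pick a policy from the PII determination made for the field."""
    return PII_POLICY if contains_pii else DEFAULT_POLICY


def can_read(role, contains_pii):
    """Access control: a role may read the field only if its policy allows it."""
    return role in policy_for_field(contains_pii).allowed_roles


if __name__ == "__main__":
    print(can_read("analyst", contains_pii=True))
    print(policy_for_field(True).storage_location)
```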