18740292. ADVERSARIAL LANGUAGE IMITATION WITH CONSTRAINED EXEMPLARS simplified abstract (Microsoft Technology Licensing, LLC)
ADVERSARIAL LANGUAGE IMITATION WITH CONSTRAINED EXEMPLARS
Organization Name
Microsoft Technology Licensing, LLC
Inventor(s)
Hamid Palangi of Bellevue WA (US)
Saadia Kai Gabriel of Seattle WA (US)
Thomas Hartvigsen of Cambridge MA (US)
Dipankar Ray of Redmond WA (US)
Semiha Ece Kamar Eden of Redmond WA (US)
ADVERSARIAL LANGUAGE IMITATION WITH CONSTRAINED EXEMPLARS - A simplified explanation of the abstract
This abstract first appeared for US patent application 18740292, titled 'ADVERSARIAL LANGUAGE IMITATION WITH CONSTRAINED EXEMPLARS'.
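The abstract discusses devices, systems, and methods for generating a phrase that is confusing to a language classifier. The described method includes:
- Determining, by the language classifier, a first classification score for a prompt, indicating whether the prompt belongs to a first class or a second class.
- Predicting, with a pre-trained language model and based on the prompt, likely next words and a probability for each.
- Determining, by the language classifier, a second classification score for each likely next word.
- Determining, by an adversarial classifier, a score for each likely next word based on the prompt's first classification score, the word's second classification score, and the word's probability.
- Selecting, by the adversarial classifier, the next word based on those scores.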
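The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method claimed in the application: the abstract does not give the scoring formula, so the combination used here (log-probability plus a weighted push away from the class the prompt currently leans toward) is hypothetical, as are the names `select_next_word`, `toy_classifier`, `toy_language_model`, and the `weight` parameter. The two toy functions merely stand in for a real language classifier and a real pre-trained language model.

```python
import math
from typing import Callable, Dict


def select_next_word(
    prompt: str,
    classify: Callable[[str], float],
    next_word_probs: Callable[[str], Dict[str, float]],
    weight: float = 1.0,
) -> str:
    """Pick the candidate next word with the highest adversarial score.

    The scoring rule is an assumption made for illustration only.
    """
    first_cs = classify(prompt)           # first classification score (prompt)
    candidates = next_word_probs(prompt)  # likely next words and probabilities

    # Assumed rule: steer away from the class the prompt currently leans toward.
    direction = -1.0 if first_cs >= 0.5 else 1.0

    scores = {}
    for word, prob in candidates.items():
        second_cs = classify(word)        # second classification score (candidate)
        # Assumes prob > 0 so that the logarithm is defined.
        scores[word] = math.log(prob) + weight * direction * second_cs

    return max(scores, key=scores.get)


def toy_classifier(text: str) -> float:
    """Hypothetical classifier: 0.9 if a flagged word appears, else 0.1."""
    flagged = {"awful", "terrible"}
    return 0.9 if any(w in flagged for w in text.lower().split()) else 0.1


def toy_language_model(prompt: str) -> Dict[str, float]:
    """Hypothetical language model: fixed candidates regardless of the prompt."""
    return {"terrible": 0.5, "long": 0.4, "purple": 0.1}


chosen = select_next_word("that movie was awful and", toy_classifier, toy_language_model)
print(chosen)
```

With these toy stand-ins the call selects "long" rather than the most probable candidate "terrible", because the assumed rule penalizes words the classifier scores the same way as the prompt. Setting `weight=0.0` reduces the rule to picking the most probable word.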
Key Features and Innovation
- Generation of confusing phrases for language classifiers.
- Utilization of pre-trained language models for prediction.
- Adversarial classification to create confusing phrases.
Potential Applications
- Enhancing privacy by confusing language classifiers.
- Improving security in communication.
- Creating challenging text for AI systems.
Problems Solved
- Preventing accurate classification by language models.
- Enhancing privacy and security in communication.
- Challenging AI systems with confusing phrases.
Benefits
- Improved privacy and security.
- Enhanced protection against language analysis.
- Increased difficulty for AI systems to accurately classify text.
Commercial Applications
Potential commercial applications include data security software, communication encryption tools, and AI system testing platforms.
Prior Art
Readers looking for prior art can search the fields of adversarial classification and language model manipulation.
Frequently Updated Research
Stay updated on advancements in language model manipulation and adversarial classification techniques.
Questions about Language Model Manipulation
1. How can language model manipulation impact data security?
- Language model manipulation can enhance data security by confusing classifiers and preventing accurate analysis of text.
2. What are the potential implications of using adversarial classification in communication encryption?
- Adversarial classification can improve the security of encrypted communication by creating challenging text for language classifiers.
Original Abstract Submitted
Generally discussed herein are devices, systems, and methods for generating a phrase that is confusing to a language classifier. A method can include determining, by the LC, a first classification score (CS) of a prompt indicating whether the prompt is a first class or a second class, predicting, based on the prompt and by a pre-trained language model (PLM), likely next words and a corresponding probability for each of the likely next words, determining, by the LC, a second CS for each of the likely next words, determining, by an adversarial classifier, respective scores for each of the likely next words, the respective scores determined based on the first CS of the prompt, the second CS of the likely next words, and the probabilities of the likely next words, and selecting, by an adversarial classifier, a next word of the likely next words based on the respective scores.