17850763. MACHINE LEARNING SYSTEM WITH TWO ENCODER TOWERS FOR SEMANTIC MATCHING simplified abstract (Microsoft Technology Licensing, LLC)
Organization Name
Microsoft Technology Licensing, LLC
Inventor(s)
Sudipto Mukherjee of Seattle WA (US)
Robin Abraham of Redmond WA (US)
MACHINE LEARNING SYSTEM WITH TWO ENCODER TOWERS FOR SEMANTIC MATCHING - A simplified explanation of the abstract
This abstract first appeared for US patent application 17850763, titled 'MACHINE LEARNING SYSTEM WITH TWO ENCODER TOWERS FOR SEMANTIC MATCHING'.
Simplified Explanation
This patent application describes a machine learning system built around a two-tower model trained with contrastive learning. Given a query chemical reaction, the model retrieves relevant chemical reaction procedures. It uses attention-based transformers and neural networks to convert tokenized representations of chemical reactions and of procedures into embeddings in a shared embedding space, where a reaction and its matching procedure can be compared directly.
- The two-tower model has two separate towers, each of which can include a transformer network, a pooling layer, a normalization layer, and a neural network.
- The model is trained on labeled data pairs, each consisting of a chemical reaction and the text of the corresponding chemical reaction procedure.
- The model can be used to find chemical reaction procedures for a specific chemical reaction and also for similar reactions.
- The architecture and training of the model enable semantic matching based on chemical structures.
- The abstract reports an average recall at K=5 of 95.9%. Recall at K=5 measures how often a correct procedure appears among the top five retrieved results.
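The tower layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation from the application: the transformer network is replaced by a random per-token vector table, the final neural network is a single linear layer, the normalization layer is assumed to be L2 normalization, and the token vocabularies, the `Tower` class, and the `top_k` helper are hypothetical names. Nothing here is trained, so the ranking it produces is meaningless; the sketch only shows how data flows through a tower and how embeddings from the two towers are compared.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


class Tower:
    """One encoder tower: token states -> pooling -> normalization -> dense layer."""

    def __init__(self, vocab, hidden_dim=8, out_dim=4, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # Assumption: a random vector per token stands in for the transformer.
        self.token_vectors = {
            tok: [rng.gauss(0.0, 1.0) for _ in range(hidden_dim)] for tok in vocab
        }
        # Assumption: one untrained linear layer stands in for the neural network.
        self.weights = [
            [rng.gauss(0.0, 1.0) for _ in range(hidden_dim)] for _ in range(out_dim)
        ]

    def encode(self, tokens):
        states = [self.token_vectors[t] for t in tokens if t in self.token_vectors]
        if not states:
            states = [[0.0] * self.hidden_dim]
        # Pooling layer: mean over token positions.
        pooled = [sum(col) / len(states) for col in zip(*states)]
        # Normalization layer (assumed here to be L2 normalization).
        normed = l2_normalize(pooled)
        # Dense layer, then unit length so a dot product equals cosine similarity.
        projected = [sum(w * x for w, x in zip(row, normed)) for row in self.weights]
        return l2_normalize(projected)


def top_k(query_vec, candidate_vecs, k):
    """Return indices of the k candidates with the highest dot-product score."""
    scored = [
        (sum(q * c for q, c in zip(query_vec, vec)), idx)
        for idx, vec in enumerate(candidate_vecs)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Hypothetical token vocabularies, for illustration only.
reaction_tower = Tower(vocab=["C", "O", "N", "=", ">>"], seed=1)
procedure_tower = Tower(vocab=["add", "stir", "heat", "filter"], seed=2)

query = reaction_tower.encode(["C", "C", "O", ">>", "C", "=", "O"])
procedures = [
    procedure_tower.encode(tokens)
    for tokens in (["add", "stir"], ["heat", "filter"], ["add", "heat", "stir"])
]
best = top_k(query, procedures, k=2)
```

Because both towers emit vectors of the same size, procedure embeddings can be computed once and stored, and only the query reaction needs to be encoded at search time.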
Potential Applications
- Chemical research and development
- Pharmaceutical industry
- Chemical manufacturing processes
- Education and training in chemistry
Problems Solved
- Difficulty in finding relevant chemical reaction procedures
- Lack of efficient semantic matching based on chemical structures
- Time-consuming manual search for appropriate procedures
Benefits
- Improved efficiency in finding chemical reaction procedures
- Enhanced accuracy in retrieving relevant procedures
- Time and cost savings in chemical research and development
- Facilitates knowledge sharing and collaboration in the field of chemistry
Original Abstract Submitted
This disclosure describes a machine learning system that includes a contrastive learning based two-tower model for retrieval of relevant chemical reaction procedures given a query chemical reaction. The two-tower model uses attention-based transformers and neural networks to convert tokenized representations of chemical reactions and chemical reaction procedures to embeddings in a shared embedding space. Each tower can include a transformer network, a pooling layer, a normalization layer, and a neural network. The model is trained with labeled data pairs that include a chemical reaction and the text of a chemical reaction procedure for that chemical reaction. New queries can locate chemical reaction procedures for performing a given chemical reaction as well as procedures for similar chemical reactions. The architecture and training of the model make it possible to perform semantic matching based on chemical structures. The model is highly accurate providing an average recall at K=5 of 95.9%.
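The abstract names contrastive learning and recall at K but does not spell out either computation. The Python sketch below shows one common form of each, as an assumption rather than the method claimed in the application: an in-batch softmax contrastive loss, where each reaction's matching procedure sits on the diagonal of a similarity matrix and the other procedures in the batch act as negatives, and a plain recall at K over ranked result lists. The function names and the `temperature` parameter are hypothetical.

```python
import math


def in_batch_contrastive_loss(similarity, temperature=0.1):
    """Average softmax cross-entropy over a batch of reaction/procedure pairs.

    similarity[i][j] is the score between reaction i and procedure j;
    the labeled (matching) pair for reaction i is assumed to be at j == i.
    """
    total = 0.0
    for i, row in enumerate(similarity):
        logits = [score / temperature for score in row]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_sum - logits[i]
    return total / len(similarity)


def recall_at_k(ranked_ids, correct_ids, k):
    """Fraction of queries whose correct item appears in the top k results."""
    hits = sum(
        1 for ranked, correct in zip(ranked_ids, correct_ids) if correct in ranked[:k]
    )
    return hits / len(correct_ids)


# Matching pairs score high on the diagonal, so the loss is small.
aligned = in_batch_contrastive_loss([[1.0, 0.0], [0.0, 1.0]])
# Matching pairs score low on the diagonal, so the loss is large.
misaligned = in_batch_contrastive_loss([[0.0, 1.0], [1.0, 0.0]])

# Two queries, each with its ranked result ids; the correct id is 1 for both.
recall = recall_at_k([[3, 1, 2], [0, 2, 1]], [1, 1], k=2)
```

Minimizing a loss of this kind pulls each reaction embedding toward its own procedure and away from the others in the batch, which is what allows nearest-neighbor search in the shared space to return matching procedures.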