17850763. MACHINE LEARNING SYSTEM WITH TWO ENCODER TOWERS FOR SEMANTIC MATCHING simplified abstract (Microsoft Technology Licensing, LLC)

From WikiPatents

MACHINE LEARNING SYSTEM WITH TWO ENCODER TOWERS FOR SEMANTIC MATCHING

Organization Name

Microsoft Technology Licensing, LLC

Inventor(s)

Sudipto Mukherjee of Seattle WA (US)

Liang Du of Redmond WA (US)

Ke Jiang of Bellevue WA (US)

Robin Abraham of Redmond WA (US)

MACHINE LEARNING SYSTEM WITH TWO ENCODER TOWERS FOR SEMANTIC MATCHING - A simplified explanation of the abstract

This abstract first appeared for US patent application 17850763, titled 'MACHINE LEARNING SYSTEM WITH TWO ENCODER TOWERS FOR SEMANTIC MATCHING'.

Simplified Explanation

This patent application describes a machine learning system that uses a two-tower model for retrieving relevant chemical reaction procedures based on a given query chemical reaction. The model utilizes attention-based transformers and neural networks to convert tokenized representations of chemical reactions and procedures into embeddings in a shared embedding space.

  • The model has two separate towers, each of which can include a transformer network, a pooling layer, a normalization layer, and a neural network.
  • The model is trained on labeled data pairs, each consisting of a chemical reaction and the text of the chemical reaction procedure for that reaction.
  • The model can be used to find chemical reaction procedures for a specific chemical reaction and also for similar reactions.
  • The architecture and training of the model enable semantic matching based on chemical structures.
  • The model is reported to achieve an average recall at K=5 of 95.9%.
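
The retrieval step and the recall-at-K metric mentioned above can be illustrated with a minimal sketch. It is not the patented implementation. The transformer and the final neural network of each tower are left out, the per-token vectors are made-up numbers standing in for transformer outputs, and mean pooling, L2 normalization, and dot-product scoring are assumptions, since the abstract names the layers but not their exact form. All function names are hypothetical.

```python
import math

def mean_pool(token_vectors):
    """Pooling layer stand-in: average per-token vectors into one vector."""
    count = len(token_vectors)
    return [sum(column) / count for column in zip(*token_vectors)]

def l2_normalize(vector):
    """Normalization layer stand-in: scale the vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

def embed(token_vectors):
    """One tower, minus the transformer and dense layers (omitted here)."""
    return l2_normalize(mean_pool(token_vectors))

def top_k(query_embedding, procedure_embeddings, k):
    """Indices of the k procedures with the highest dot-product score."""
    scores = [sum(q * p for q, p in zip(query_embedding, emb))
              for emb in procedure_embeddings]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

def recall_at_k(query_embeddings, procedure_embeddings, labels, k):
    """Fraction of queries whose labeled procedure is in the top k."""
    hits = sum(1 for q, label in zip(query_embeddings, labels)
               if label in top_k(q, procedure_embeddings, k))
    return hits / len(query_embeddings)

# Made-up per-token vectors (assumption): one list per reaction or procedure.
reaction_tokens = [
    [[1.0, 0.0], [0.8, 0.2]],              # reaction 0
    [[0.0, 1.0], [0.2, 0.8]],              # reaction 1
]
procedure_tokens = [
    [[0.9, 0.1], [1.0, 0.0], [0.8, 0.2]],  # procedure for reaction 0
    [[0.1, 0.9], [0.0, 1.0]],              # procedure for reaction 1
    [[0.5, 0.5]],                          # unrelated procedure
]
labels = [0, 1]  # labels[i] is the index of the procedure paired with reaction i

# Both towers map into the same space, so embeddings can be compared directly.
reaction_embeddings = [embed(tokens) for tokens in reaction_tokens]
procedure_embeddings = [embed(tokens) for tokens in procedure_tokens]

recall = recall_at_k(reaction_embeddings, procedure_embeddings, labels, k=1)
print(recall)
```

Because both towers write into one shared embedding space, procedure embeddings can be computed once and stored, and only the query reaction needs to be embedded at search time.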

Potential Applications

  • Chemical research and development
  • Pharmaceutical industry
  • Chemical manufacturing processes
  • Education and training in chemistry

Problems Solved

  • Difficulty in finding relevant chemical reaction procedures
  • Lack of efficient semantic matching based on chemical structures
  • Time-consuming manual search for appropriate procedures

Benefits

  • Improved efficiency in finding chemical reaction procedures
  • Enhanced accuracy in retrieving relevant procedures
  • Time and cost savings in chemical research and development
  • Facilitates knowledge sharing and collaboration in the field of chemistry


Original Abstract Submitted

This disclosure describes a machine learning system that includes a contrastive learning based two-tower model for retrieval of relevant chemical reaction procedures given a query chemical reaction. The two-tower model uses attention-based transformers and neural networks to convert tokenized representations of chemical reactions and chemical reaction procedures to embeddings in a shared embedding space. Each tower can include a transformer network, a pooling layer, a normalization layer, and a neural network. The model is trained with labeled data pairs that include a chemical reaction and the text of a chemical reaction procedure for that chemical reaction. New queries can locate chemical reaction procedures for performing a given chemical reaction as well as procedures for similar chemical reactions. The architecture and training of the model make it possible to perform semantic matching based on chemical structures. The model is highly accurate providing an average recall at K=5 of 95.9%.
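
The abstract says the model is contrastive learning based but does not state the loss function. As an illustration only, the sketch below uses a common choice, a softmax cross-entropy over in-batch negatives, where each reaction's paired procedure is the positive and the other procedures in the batch serve as negatives. The loss form, the temperature value, and the function name are assumptions, not details from the patent application.

```python
import math

def in_batch_contrastive_loss(reaction_embs, procedure_embs, temperature=0.1):
    """Mean cross-entropy where pair i is positive and other rows are negatives.

    Assumes reaction_embs[i] and procedure_embs[i] form a labeled pair and
    that all embeddings are already unit length.
    """
    losses = []
    for i, reaction in enumerate(reaction_embs):
        # Similarity of this reaction to every procedure in the batch.
        logits = [sum(r * p for r, p in zip(reaction, procedure)) / temperature
                  for procedure in procedure_embs]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

# Toy unit-length embeddings (made-up values).
embeddings = [[1.0, 0.0], [0.0, 1.0]]

# Pairs aligned: each reaction is closest to its own procedure.
loss_matched = in_batch_contrastive_loss(embeddings, embeddings)
# Pairs swapped: each reaction is closest to the wrong procedure.
loss_swapped = in_batch_contrastive_loss(embeddings, embeddings[::-1])

print(loss_matched < loss_swapped)
```

Minimizing a loss of this kind pulls a reaction and its procedure together in the shared space and pushes mismatched pairs apart, which is what allows nearest-neighbor search to return relevant procedures.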