US Patent Application 18345834. INFERRING INFORMATION ABOUT A WEBPAGE BASED UPON A UNIFORM RESOURCE LOCATOR OF THE WEBPAGE simplified abstract

From WikiPatents
Revision as of 04:12, 1 November 2023 by Wikipatents (talk | contribs) (Creating a new page)

INFERRING INFORMATION ABOUT A WEBPAGE BASED UPON A UNIFORM RESOURCE LOCATOR OF THE WEBPAGE

Organization Name

Microsoft Technology Licensing, LLC


Inventor(s)

Siarhei Alonichau of Seattle WA (US)


Aliaksei Bondarionok of Redmond WA (US)


Junaid Ahmed of Bellevue WA (US)


INFERRING INFORMATION ABOUT A WEBPAGE BASED UPON A UNIFORM RESOURCE LOCATOR OF THE WEBPAGE - A simplified explanation of the abstract

  • This abstract appeared for US patent application number 18345834, titled 'INFERRING INFORMATION ABOUT A WEBPAGE BASED UPON A UNIFORM RESOURCE LOCATOR OF THE WEBPAGE'.

Simplified Explanation

This abstract describes technologies that can infer information about a webpage based on the semantics of its URL. The URL is broken down into individual tokens, and an embedding is created based on these tokens, which represents the meaning or context of the URL. Using this embedding, information about the webpage linked to by the URL is inferred. The webpage is then retrieved, and information is extracted from it based on the inferred information about the webpage.
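
As a rough illustration of the first two steps (tokenizing the URL and turning the tokens into an embedding), here is a minimal Python sketch. It is not the method claimed in the application: the abstract does not say how tokens are produced or which model generates the embedding. The function names `tokenize_url` and `embed_tokens`, the 64-dimension width, and the hashed bag-of-tokens vector are all assumptions chosen for illustration, standing in for whatever learned model a real system would use.

```python
import hashlib
import math
import re
from urllib.parse import urlsplit

DIM = 64  # embedding width; an arbitrary choice for this sketch


def tokenize_url(url: str) -> list[str]:
    """Split a URL into lowercase alphanumeric tokens (host, path, query)."""
    parts = urlsplit(url)
    text = " ".join([parts.netloc, parts.path, parts.query])
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def embed_tokens(tokens: list[str], dim: int = DIM) -> list[float]:
    """Hashed bag-of-tokens vector, L2-normalised.

    Hypothetical stand-in for a learned embedding model.
    """
    vec = [0.0] * dim
    for tok in tokens:
        # Hash each token to a stable bucket index and count it there.
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        idx = int.from_bytes(digest[:4], "big") % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


tokens = tokenize_url("https://example.com/news/2023/url-embeddings?ref=home")
embedding = embed_tokens(tokens)
print(tokens)
# ['example', 'com', 'news', '2023', 'url', 'embeddings', 'ref', 'home']
```

A hashed count vector only captures which tokens occur, not their order or meaning, so it is a much weaker representation than the semantic embedding the abstract describes. It is used here only because it runs without a trained model.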


Original Abstract Submitted

Described herein are technologies related to inferring information about a webpage based upon semantics of a uniform resource locator (URL) of the webpage. The URL is tokenized to create a sequence of tokens. An embedding for the URL is generated based upon the sequence of tokens, wherein the embedding is representative of semantics of the URL. Based upon the embedding for the URL, information about the webpage pointed to by the URL is inferred, the webpage is retrieved, and information is extracted from the webpage based upon the information inferred about the webpage.
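
The last part of the abstract, where the inferred information guides what is extracted from the page, can be sketched as follows. This is a hedged illustration under stated assumptions, not the application's implementation: the page types (`news`, `product`), the labelled example URLs, the field lists, and the nearest-neighbour comparison over token-count vectors are all hypothetical. Retrieving the page itself is omitted; the sketch stops at deciding which fields an extractor would look for.

```python
import math
import re
from collections import Counter


def embed(url: str) -> Counter:
    """Sparse token-count vector; a simple stand-in for a learned URL embedding."""
    return Counter(re.findall(r"[a-z0-9]+", url.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical labelled URLs; a real system would learn from far more data.
LABELLED_URLS = {
    "news": [
        "https://example.com/news/2023/11/city-council-vote",
        "https://example.org/news/world/election-results",
    ],
    "product": [
        "https://shop.example.com/product/12345/running-shoes",
        "https://example.net/shop/product/blue-kettle",
    ],
}

# Hypothetical mapping from inferred page type to the fields worth extracting.
FIELDS_BY_TYPE = {
    "news": ["headline", "byline", "published_date"],
    "product": ["name", "price", "availability"],
}


def infer_page_type(url: str) -> str:
    """Pick the label whose closest example URL is most similar to `url`."""
    query = embed(url)
    scores = {
        label: max(cosine(query, embed(example)) for example in examples)
        for label, examples in LABELLED_URLS.items()
    }
    return max(scores, key=scores.get)


def plan_extraction(url: str) -> list[str]:
    """Decide which fields to extract, before the page is ever fetched."""
    return FIELDS_BY_TYPE[infer_page_type(url)]


print(plan_extraction("https://shop.example.com/product/987/trail-shoes"))
# ['name', 'price', 'availability']
```

The point of the design is that the decision is made from the URL alone, so the extractor can be chosen before the page is downloaded and parsed.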