US Patent Application 17719740. DEEP REINFORCEMENT LEARNING FOR SKILL RECOMMENDATION simplified abstract

From WikiPatents
Revision as of 05:50, 26 October 2023 by Wikipatents (talk | contribs) (Creating a new page)

DEEP REINFORCEMENT LEARNING FOR SKILL RECOMMENDATION

Organization Name

Microsoft Technology Licensing, LLC


Inventor(s)

Chujie Zheng of Foster City CA (US)


Sufeng Niu of Fremont CA (US)


Xiao Yan of Sunnyvale CA (US)


Qidu He of Sunnyvale CA (US)


Jaewon Yang of Sunnyvale CA (US)


Yanen Li of Foster City CA (US)


Yiming Wang of Sunnyvale CA (US)


DEEP REINFORCEMENT LEARNING FOR SKILL RECOMMENDATION - A simplified explanation of the abstract

  • This abstract appeared for US patent application number 17719740, titled 'DEEP REINFORCEMENT LEARNING FOR SKILL RECOMMENDATION'.

Simplified Explanation

The abstract describes a technique for training a recommendation model for an online service using deep reinforcement learning. Training is framed as a Markov decision process with three parts: a state space containing state embeddings of a set of reference users, an action space containing action embeddings of those same users, and a reward function. The reward function may issue two rewards: a first reward based on current impression interaction data, and a second reward based on a measurement of the reference user's engagement with the online service.
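
To make the reward structure described above more concrete, here is a minimal Python sketch of a reward built from two signals. It is an illustration only, not the method claimed in the application: the abstract does not say how the two rewards are computed or combined, so the `Transition` fields, the weighted sum, the weights `w_interaction` and `w_engagement`, the discount factor `gamma`, and all sample values are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Transition:
    """One step of a hypothetical recommendation episode (all fields are illustrative)."""
    state_embedding: Sequence[float]   # state: embedding of a reference user
    action_embedding: Sequence[float]  # action: embedding of the action taken
    impression_accepted: bool          # stand-in for current impression interaction data
    engagement_score: float            # stand-in for an engagement measurement, assumed in [0, 1]


def combined_reward(t: Transition,
                    w_interaction: float = 1.0,
                    w_engagement: float = 0.5) -> float:
    """Weighted sum of the two reward signals (the weighting scheme is an assumption)."""
    first_reward = 1.0 if t.impression_accepted else 0.0  # based on the impression interaction
    second_reward = t.engagement_score                    # based on measured engagement
    return w_interaction * first_reward + w_engagement * second_reward


def discounted_return(rewards: Sequence[float], gamma: float = 0.9) -> float:
    """Standard discounted return: r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Two made-up steps: the first impression is accepted, the second is not.
episode = [
    Transition([0.1, 0.3], [0.7, 0.2], True, 0.4),
    Transition([0.2, 0.3], [0.1, 0.9], False, 0.8),
]
rewards = [combined_reward(t) for t in episode]
print(rewards, discounted_return(rewards))
```

In a deep reinforcement learning setup, a return of this kind is the quantity the recommendation model would be trained to maximize. The second term lets engagement with the service influence training even when the immediate interaction yields no reward.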


Original Abstract Submitted

Techniques for using deep reinforcement learning for training a recommendation model for an online service are disclosed herein. In some embodiments, a computer-implemented method comprises training a recommendation model using deep reinforcement learning and a Markov decision process, where the Markov decision process has a state space including state embeddings of a plurality of reference users, an action space including action embeddings of the plurality of reference users, and a reward function. The reward function may be configured to issue a first reward based on current impression interaction data and a second reward based on a measurement of engagement of the reference user with the online service.