
18627702. ONLINE SYSTEM WITH BANDIT FEATURE AND AUTO-REGRESSIVE TEMPORAL STRUCTURE (Massachusetts Institute of Technology)

From WikiPatents

ONLINE SYSTEM WITH BANDIT FEATURE AND AUTO-REGRESSIVE TEMPORAL STRUCTURE

Organization Name

Massachusetts Institute of Technology

Inventor(s)

Qinyi Chen of Cambridge MA (US)

Negin Golrezaei of Cambridge MA (US)

Djallel Bouneffouf of Poughkeepsie NY (US)


This abstract first appeared for US patent application 18627702, titled 'ONLINE SYSTEM WITH BANDIT FEATURE AND AUTO-REGRESSIVE TEMPORAL STRUCTURE'.

Original Abstract Submitted

A multi-armed bandit (MAB) problem is obtained and a per-round regret lower bound is determined, wherein a corresponding regret is measured against a benchmark. The multi-armed bandit problem is provided to an algorithm that has a per-round regret that is close to the determined per-round regret lower bound, wherein the algorithm dynamically adapts to changes and discards irrelevant past information by alternating between recently pulled arms and unpulled arms having potential, wherein the alternating comprises updating an estimate of an expected reward of each arm within each epoch and an estimate for an error bound that captures an amount of error contained in the estimate of the expected reward for each arm within each epoch based on the auto-regressive temporal structure with trend components, and restarting the algorithm.
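
The abstract names the ingredients (a per-arm reward estimate, a per-arm error bound derived from the auto-regressive structure, epochs, and restarts) but not their formulas. The following is only a minimal sketch of how those pieces could fit together, and it should not be read as the claimed method. It assumes each arm's reward follows an AR(1) process r(t+1) = alpha * r(t) + c + noise, with a known coefficient alpha, a known noise level sigma, and a constant per-arm term c standing in for the trend component. The alternation between recently pulled arms and unpulled arms with potential is approximated here by a simple optimistic rule (pick the largest estimate plus error bound). Every name in it (ARBanditSketch, intercepts, epoch_len, z, simulate) is hypothetical.

```python
import math
import random


class ARBanditSketch:
    """Optimistic arm selection under an assumed AR(1) reward model with restarts."""

    def __init__(self, intercepts, alpha, sigma, epoch_len, z=2.0):
        self.intercepts = list(intercepts)  # hypothetical per-arm trend terms c
        self.alpha = alpha                  # assumed known AR(1) coefficient, 0 < alpha < 1
        self.sigma = sigma                  # assumed known noise standard deviation
        self.epoch_len = epoch_len          # rounds between restarts
        self.z = z                          # width multiplier for the error bound
        self.restart()

    def restart(self):
        """Discard all past observations and begin a new epoch."""
        n = len(self.intercepts)
        self.last_obs = [None] * n  # most recent reward seen this epoch, per arm
        self.age = [0] * n          # how many steps ahead the next forecast is
        self.rounds_in_epoch = 0

    def estimate(self, arm):
        """Return (expected-reward estimate, error bound) for the coming round."""
        a, c = self.alpha, self.intercepts[arm]
        if self.last_obs[arm] is None:
            # Unpulled this epoch: fall back to the stationary mean and spread.
            return c / (1 - a), self.z * self.sigma / math.sqrt(1 - a * a)
        k = self.age[arm]
        # k-step-ahead AR(1) forecast from the last observed reward.
        mean = a ** k * self.last_obs[arm] + c * (1 - a ** k) / (1 - a)
        # Accumulated noise over k steps; grows as the observation gets stale.
        width = self.z * self.sigma * math.sqrt((1 - a ** (2 * k)) / (1 - a * a))
        return mean, width

    def select_arm(self):
        """Restart if the epoch is over, then pick the most optimistic arm."""
        if self.rounds_in_epoch >= self.epoch_len:
            self.restart()
        scores = [sum(self.estimate(i)) for i in range(len(self.intercepts))]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        """Record the observed reward and age every other observation."""
        for i, obs in enumerate(self.last_obs):
            if obs is not None:
                self.age[i] += 1
        self.last_obs[arm] = reward
        self.age[arm] = 1
        self.rounds_in_epoch += 1


def simulate(agent, rounds, seed=0):
    """Run the agent against a simulated AR(1) environment; return total reward."""
    rng = random.Random(seed)
    a = agent.alpha
    state = [c / (1 - a) for c in agent.intercepts]  # start at the stationary means
    total = 0.0
    for _ in range(rounds):
        arm = agent.select_arm()
        total += state[arm]
        agent.update(arm, state[arm])
        # Every arm's reward evolves each round, whether or not it was pulled.
        state = [a * r + c + rng.gauss(0.0, agent.sigma)
                 for r, c in zip(state, agent.intercepts)]
    return total


if __name__ == "__main__":
    agent = ARBanditSketch(intercepts=[0.1, 0.2, 0.3], alpha=0.8,
                           sigma=0.5, epoch_len=50)
    print(simulate(agent, rounds=500))
```

In this sketch a freshly pulled arm has a tight error bound, while an arm left alone sees its bound widen toward the stationary spread, so the optimistic score periodically steers the agent back to stale arms. The restart simply clears every stored observation, which is one plain way to discard irrelevant past information.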
