

We investigate the fixed-budget best-arm identification (BAI) problem for
linear bandits in a potentially non-stationary environment. Given a finite arm
set $\mathcal{X}\subset\mathbb{R}^d$, a fixed budget $T$, and an unpredictable
sequence of parameters $\left\lbrace\theta_t\right\rbrace_{t=1}^{T}$, an
algorithm will aim to correctly identify the best arm $x^* :=
\arg\max_{x\in\mathcal{X}}x^\top\sum_{t=1}^{T}\theta_t$ with probability as
high as possible. Prior work has addressed the stationary setting where
$\theta_t = \theta_1$ for all $t$ and demonstrated that the error probability
decreases as $\exp(-T /\rho^*)$ for a problem-dependent constant $\rho^*$. But
in many real-world $A/B/n$ multivariate testing scenarios that motivate our
work, the environment is non-stationary and an algorithm expecting a stationary
setting can easily fail. For robust identification, it is well known that if
arms are chosen randomly and non-adaptively from a G-optimal design over
$\mathcal{X}$ at each time step, then the error probability decreases as
$\exp(-T\Delta^2_{(1)}/d)$, where $\Delta_{(1)} = \min_{x \neq x^*} (x^* -
x)^\top \frac{1}{T}\sum_{t=1}^T \theta_t$. As there exist environments where
$\Delta_{(1)}^2/ d \ll 1/ \rho^*$, we are motivated to propose a novel
algorithm $\mathsf{P1}$-$\mathsf{RAGE}$ that aims to obtain the best of both
worlds: robustness to non-stationarity and fast rates of identification in
benign settings. We characterize the error probability of
$\mathsf{P1}$-$\mathsf{RAGE}$ and demonstrate empirically that the algorithm
indeed never performs worse than the G-optimal design, yet compares favorably
to the best algorithms in the stationary setting.
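
To make the quantities in the abstract concrete, here is a minimal Python sketch of the best arm $x^*$, the gap $\Delta_{(1)}$, and the non-adaptive G-optimal baseline. It is not $\mathsf{P1}$-$\mathsf{RAGE}$, which the abstract does not specify. The function names `best_arm_and_gap`, `g_optimal_design`, and `sample_arms` are hypothetical. The design is computed with a Frank-Wolfe iteration on the D-optimal objective, which is an assumption of this sketch; it relies on the Kiefer-Wolfowitz equivalence between D-optimal and G-optimal designs. The arms are assumed to span $\mathbb{R}^d$ and there are assumed to be at least two of them.

```python
import numpy as np


def best_arm_and_gap(X, thetas):
    """Return the index of x^* and the minimum gap Delta_(1).

    X has shape (K, d), one arm per row; thetas has shape (T, d), one
    parameter vector per round. Assumes K >= 2.
    """
    theta_bar = thetas.mean(axis=0)        # (1/T) * sum_t theta_t
    values = X @ theta_bar                 # x^T theta_bar for every arm
    best = int(np.argmax(values))          # same argmax as with the plain sum
    gap = float(np.min(values[best] - np.delete(values, best)))
    return best, gap


def g_optimal_design(X, iters=2000, tol=1e-9):
    """Approximate a G-optimal design over the rows of X (hypothetical helper).

    Frank-Wolfe with exact line search on the D-optimal objective; assumes
    the rows of X span R^d so the design matrix stays invertible.
    """
    K, d = X.shape
    lam = np.full(K, 1.0 / K)              # start from the uniform design
    for _ in range(iters):
        A = X.T @ (lam[:, None] * X)       # sum_i lam_i x_i x_i^T
        lev = np.einsum("ij,jk,ik->i", X, np.linalg.inv(A), X)
        j = int(np.argmax(lev))            # arm with the largest x^T A^{-1} x
        if lev[j] <= d + tol:              # optimum: max leverage equals d
            break
        step = (lev[j] / d - 1.0) / (lev[j] - 1.0)
        lam *= 1.0 - step                  # shrink all weights ...
        lam[j] += step                     # ... and move mass to arm j
    return lam / lam.sum()


def sample_arms(lam, T, seed=0):
    """Draw T arm indices i.i.d. from the design, ignoring all feedback."""
    rng = np.random.default_rng(seed)
    return rng.choice(len(lam), size=T, p=lam)
```

Because `sample_arms` never looks at observed rewards, the pulls are unaffected by how $\theta_t$ changes over time, which is the source of the robustness behind the $\exp(-T\Delta^2_{(1)}/d)$ rate quoted above.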

Latest Change: July 31, 2023, 7:30 a.m.