
We study off-policy evaluation (OPE) of contextual bandit policies for large
discrete action spaces where conventional importance-weighting approaches
suffer from excessive variance. To circumvent this variance issue, we propose a
new estimator, called OffCEM, that is based on the conjunct effect model (CEM),
a novel decomposition of the causal effect into a cluster effect and a residual
effect. OffCEM applies importance weighting only to action clusters and
addresses the residual causal effect through model-based reward estimation. We
show that the proposed estimator is unbiased under a new condition, called
local correctness, which only requires that the residual-effect model preserves
the relative expected reward differences of the actions within each cluster. To
best leverage the CEM and local correctness, we also propose a new two-step
procedure for performing model-based estimation that minimizes bias in the
first step and variance in the second step. We find that the resulting OffCEM
estimator substantially reduces both bias and variance compared to a range of
conventional estimators. Experiments demonstrate that OffCEM provides
substantial improvements in OPE, especially when the number of actions is large.
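
The abstract describes the estimator only in words, so the following is a minimal sketch of one way to read it, not the authors' implementation. It assumes a doubly-robust-style form: the model's expected reward under the target policy, plus a correction term in which the logged residual is weighted by the ratio of cluster probabilities rather than action probabilities. The function name `offcem_estimate`, its arguments, and the context-independent `cluster_of` mapping are all hypothetical choices made for illustration.

```python
from typing import Sequence


def offcem_estimate(
    rewards: Sequence[float],
    actions: Sequence[int],
    pi_e: Sequence[Sequence[float]],
    pi_0: Sequence[Sequence[float]],
    f_hat: Sequence[Sequence[float]],
    cluster_of: Sequence[int],
) -> float:
    """Sketch of a cluster-weighted estimate of a target policy's value.

    rewards[i], actions[i]: logged reward and action for sample i.
    pi_e[i][a], pi_0[i][a]: target / logging policy probabilities of
        action a in the context of sample i.
    f_hat[i][a]: reward-model prediction for action a in that context.
    cluster_of[a]: cluster index of action a (assumed here not to
        depend on the context, purely to keep the sketch short).
    """
    n = len(rewards)
    total = 0.0
    for i in range(n):
        c = cluster_of[actions[i]]
        # Probability each policy assigns to the logged action's cluster.
        pe_c = sum(p for a, p in enumerate(pi_e[i]) if cluster_of[a] == c)
        p0_c = sum(p for a, p in enumerate(pi_0[i]) if cluster_of[a] == c)
        # Importance weight over clusters, not over individual actions.
        w = pe_c / p0_c
        # Residual between the logged reward and the model's prediction.
        residual = rewards[i] - f_hat[i][actions[i]]
        # Model-based expected reward under the target policy.
        baseline = sum(p * f for p, f in zip(pi_e[i], f_hat[i]))
        total += w * residual + baseline
    return total / n
```

Under this reading, a reward model that is off by a constant within each cluster still preserves the within-cluster reward differences, which is what local correctness asks for, and the cluster-level weight corrects the remaining per-cluster offset. Because the weights are ratios of cluster probabilities, they tend to stay smaller than per-action weights when many actions share a cluster.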

Latest Change: May 16, 2023, 7:31 a.m.