Motivated by time series forecasting, we study Online Linear Optimization
(OLO) under the coupling of two problem structures: the domain is unbounded,
and the performance of an algorithm is measured by its dynamic regret. Handling
either of them requires the regret bound to depend on a certain complexity
measure of the comparator sequence -- specifically, the comparator norm in
unconstrained OLO, and the path length in dynamic regret. In contrast to a
recent work (Jacobsen & Cutkosky, 2022) that adapts to the combination of these
two complexity measures, we propose an alternative complexity measure by
recasting the problem into sparse…
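
For reference, the two complexity measures named above have standard definitions in the online learning literature; a minimal statement in common notation (background, not quoted from the paper):

    % Dynamic regret of iterates x_t against a comparator sequence u_1, ..., u_T,
    % with loss vectors g_t:
    \mathrm{Regret}_T(u_{1:T}) = \sum_{t=1}^{T} \langle g_t, x_t - u_t \rangle
    % Unconstrained OLO bounds scale with the comparator norms \|u_t\|;
    % dynamic-regret bounds scale with the path length
    P_T = \sum_{t=2}^{T} \| u_t - u_{t-1} \|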

Generative models can exhibit distinct modes of failure, such as mode dropping and
low-quality samples, which cannot be captured by a single scalar metric. To
address this, recent works propose evaluating generative models using precision
and recall, where precision measures the quality of samples and recall measures the
coverage of the target distribution. Although a variety of discrepancy measures
between the target and estimated distribution are used to train generative
models, it is unclear what precision-recall trade-offs are achieved by various
choices of the discrepancy measures. In this paper, we show that achieving a
specified precision-recal…
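
As a concrete illustration of how precision and recall are often instantiated for generative models, here is a minimal sketch of the k-nearest-neighbor support estimate (one common choice in the literature, not necessarily the discrepancy analyzed in this paper):

    import numpy as np
    from scipy.spatial.distance import cdist
    from sklearn.neighbors import NearestNeighbors

    def knn_precision_recall(real, fake, k=3):
        """Precision: fraction of fake samples on the estimated real manifold.
        Recall: fraction of real samples on the estimated fake manifold."""
        def coverage(support, query):
            # Each support point gets a ball whose radius is the distance
            # to its k-th nearest neighbor among the support points.
            nn = NearestNeighbors(n_neighbors=k + 1).fit(support)
            radii = nn.kneighbors(support)[0][:, -1]
            # A query point is covered if it falls inside any such ball.
            return float(np.mean((cdist(query, support) <= radii).any(axis=1)))
        return coverage(real, fake), coverage(fake, real)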

The increasing frequency of natural disasters and the losses they inflict pose
a severe challenge to the traditional catastrophe insurance market. This paper
aims to develop an innovative framework for pricing catastrophe bonds triggered
by multiple events with an extreme dependence structure. Given the low
contingency of the bond's cash flows and its high return, a multiple-event CAT
bond may successfully transfer catastrophe risk to the broader financial
markets, meeting the diversification needs of capital allocation for most
potential investors. The designed hybrid trigger mechanism helps reduce moral hazard and
improve bond attractiveness wi…

Extended cure survival models make it possible to separate covariates that affect
the probability of an event (or `long-term' survival) from those only affecting
the event timing (or `short-term' survival). We propose to generalize the bounded
cumulative hazard model to handle additive terms for time-varying (exogenous)
covariates jointly impacting long- and short-term survival. The selection of
the penalty parameters is a challenge in that framework. A fast algorithm based
on Laplace approximations in Bayesian P-spline models is proposed. The
methodology is motivated by fertility studies where women's characteristics
such as the emplo…
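
For background, the bounded cumulative hazard (promotion time) cure model that this work generalizes has a standard form; sketched in common notation (an assumed baseline, not a formula quoted from the paper):

    % Population survival with cumulative hazard bounded by theta:
    S_{\mathrm{pop}}(t) = \exp\{-\theta F(t)\}, \qquad
    \lim_{t \to \infty} S_{\mathrm{pop}}(t) = e^{-\theta},
    % where F is a proper distribution function. Long-term covariates act on
    % theta (and hence on the cure fraction e^{-theta}), while short-term
    % covariates act on F, the event timing.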

The field of causality aims to find systematic methods for uncovering
cause-effect relationships. Such methods can find applications in many research
fields, justifying the great interest in this domain. Machine Learning models
have shown success in a large variety of tasks by extracting correlation
patterns from high-dimensional data, but they still struggle when generalizing
outside their initial distribution. As causal engines aim to learn mechanisms
that are independent of the data distribution, combining Machine Learning with
Causality has the potential to benefit both fields. In our work, we motivate
this assumption and provide appli…

Two-sample testing asks whether the distributions generating two samples are
identical. We pose the two-sample testing problem in a new scenario where the
sample measurements (or sample features) are inexpensive to access, but their
group memberships (or labels) are costly. We devise the first \emph{active
sequential two-sample testing framework} that not only sequentially but also
\emph{actively queries} sample labels to address the problem. Our test
statistic is a likelihood ratio where one likelihood is found by maximization
over all class priors, and the other is given by a classification model. The
classification model is adaptively u…
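
To make the statistic concrete: under the null the labels are independent of the features, so the null likelihood involves only a class prior, which is maximized in closed form; under the alternative a classifier supplies the label likelihood. A minimal sketch of this construction (my reading of the description above; the paper's exact statistic and querying rule may differ):

    import numpy as np

    def lr_statistic(y, p1_given_x):
        """Log-likelihood ratio for labels y in {0,1}, given classifier
        probabilities p1_given_x = P(y=1 | x)."""
        y = np.asarray(y, dtype=float)
        p = np.clip(np.asarray(p1_given_x), 1e-12, 1 - 1e-12)
        # Alternative: labels depend on the features through the classifier.
        ll_alt = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
        # Null: the likelihood prod pi^y (1-pi)^(1-y), maximized over all
        # class priors pi, peaks at the empirical label frequency.
        pi = np.clip(y.mean(), 1e-12, 1 - 1e-12)
        ll_null = np.sum(y * np.log(pi) + (1 - y) * np.log(1 - pi))
        return ll_alt - ll_null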

Assortment optimization has been actively explored in the past few decades due
to its practical importance. Despite the extensive literature on optimization
algorithms and latent score estimation, uncertainty quantification for the
optimal assortment remains largely unexplored, even though it is of great
practical significance. Instead of estimating and recovering the complete
optimal offer set, decision-makers may only be interested in testing whether a
given property holds for the optimal assortment, such as whether they
should include several products of interest in the optimal set, or how many
categories of products the optimal…

We introduce prediction-powered inference, a framework for
performing valid statistical inference when an experimental data set is
supplemented with predictions from a machine-learning system. Our framework
yields provably valid conclusions without making any assumptions on the
machine-learning algorithm that supplies the predictions. Higher accuracy of
the predictions translates to smaller confidence intervals, permitting more
powerful inference. Prediction-powered inference yields simple algorithms for
computing valid confidence intervals for statistical objects such as means,
quantiles, and linear and logistic regression…
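
As one concrete instance, the prediction-powered estimate of a mean averages the model's predictions on the unlabeled data and corrects this with a "rectifier" estimated on the labeled data. A minimal sketch with a normal-approximation interval (standard construction; variable names are mine):

    import numpy as np
    from scipy.stats import norm

    def ppi_mean_ci(y_lab, yhat_lab, yhat_unlab, alpha=0.05):
        """Prediction-powered confidence interval for E[Y]."""
        y, f, fu = map(np.asarray, (y_lab, yhat_lab, yhat_unlab))
        rect = y - f                             # rectifier: prediction error
        theta = fu.mean() + rect.mean()          # point estimate
        se = np.sqrt(fu.var(ddof=1) / fu.size + rect.var(ddof=1) / rect.size)
        z = norm.ppf(1 - alpha / 2)
        return theta - z * se, theta + z * se

More accurate predictions shrink the rectifier variance, and hence the interval, which is the sense in which higher accuracy yields more powerful inference.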

Off-policy evaluation (OPE) is a method for estimating the return of a target
policy using some pre-collected observational data generated by a potentially
different behavior policy. In some cases, there may be unmeasured variables
that can confound the action-reward or action-next-state relationships,
rendering many existing OPE approaches ineffective. This paper develops an
instrumental variable (IV)-based method for consistent OPE in confounded Markov
decision processes (MDPs). As in single-stage decision making, we show
that IV enables us to correctly identify the target policy's value in
infinite-horizon settings as well. Fur…

Economists frequently estimate average treatment effects (ATEs) for
transformations of the outcome that are well-defined at zero but behave like
$\log(y)$ when $y$ is large (e.g., $\log(1+y)$, $\mathrm{arcsinh}(y)$). We show
that these ATEs depend arbitrarily on the units of the outcome, and thus cannot
be interpreted as percentage effects. Moreover, we prove that when the outcome
can equal zero, there is no parameter of the form $E_P[g(Y(1),Y(0))]$ that is
point-identified and unit-invariant. We discuss sensible alternative target
parameters for settings with zero-valued outcomes that relax at least one of
these requirements.
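
To see the unit-dependence concretely: rescaling the outcome (say, cents instead of dollars) changes the ATE of $\log(1+y)$ whenever the outcome can be zero. A small numeric illustration (my own example, not one from the paper):

    import numpy as np

    # Two-point outcome distributions: zero half the time under both arms,
    # otherwise 15 (treated) or 10 (control).
    y1, y0, p = np.array([0.0, 15.0]), np.array([0.0, 10.0]), np.array([0.5, 0.5])

    def ate_log1p(scale):
        return p @ np.log1p(scale * y1) - p @ np.log1p(scale * y0)

    print(ate_log1p(1.0))    # dollars: about 0.187
    print(ate_log1p(100.0))  # cents:   about 0.203 (a different "effect")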

Many industrial applications rely on real-time optimization to improve key
performance indicators. In the case of unknown process characteristics,
real-time optimization becomes challenging, particularly for the satisfaction
of safety constraints. In this paper, we demonstrate the application of an
adaptive and explorative real-time optimization framework to an industrial
refrigeration process, where we learn the process characteristics through
changes in process control targets and through exploration to satisfy safety
constraints. We quantify the uncertainty in unknown compressor characteristics
of the refrigeration plant by using Gaussia…

Performance of machine learning models may differ between training and
deployment for many reasons. For instance, model performance can change between
environments due to changes in data quality, observing a different population
than the one in training, or changes in the relationship between labels and
features. These changes result in distribution shifts across environments.
Attributing model performance changes to specific shifts is critical for
identifying sources of model failures, and for taking mitigating actions that
ensure robust models. In this work, we introduce the problem of attributing
performance differences between environme…

This paper studies a semiparametric quantile regression model with endogenous
variables and random right censoring. The endogeneity issue is solved using
instrumental variables. It is assumed that the structural quantile of the
logarithm of the outcome variable is linear in the covariates and that censoring
is independent. The regressors and instruments can be either continuous or
discrete. The specification generates a continuum of equations of which the
quantile regression coefficients are a solution. Identification is obtained
when this system of equations has a unique solution. Our estimation procedure
solves an empirical analogue of the sys…

We expect the generalization error to improve with more samples from a
similar task, and to deteriorate with more samples from an out-of-distribution
(OOD) task. In this work, we show a counter-intuitive phenomenon: the
generalization error of a task can be a non-monotonic function of the number of
OOD samples. As the number of OOD samples increases, the generalization error
on the target task improves before deteriorating beyond a threshold. In other
words, there is value in training on small amounts of OOD data. We use Fisher's
Linear Discriminant on synthetic datasets and deep networks on computer vision
benchmarks such as MNIST, CI…

This paper proposes a data-driven approximate Bayesian computation framework
for parameter estimation and uncertainty quantification of epidemic models,
which incorporates two novelties: (i) the identification of the initial
conditions by using plausible dynamic states that are compatible with
observational data; (ii) learning of an informative prior distribution for the
model parameters via the cross-entropy method. The new methodology's
effectiveness is illustrated with actual data from the COVID-19 epidemic in the
city of Rio de Janeiro, Brazil, employing an ordinary differential
equation-based model with a generalized SEIR me…
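
For readers unfamiliar with the base method: approximate Bayesian computation replaces likelihood evaluation with forward simulation and a tolerance on a data discrepancy. A minimal rejection-ABC sketch (generic; the simulator and distance below are hypothetical placeholders, and the paper's scheme additionally learns an informative prior via cross-entropy):

    import numpy as np

    rng = np.random.default_rng(0)

    def simulate(theta):
        # Hypothetical stand-in for an ODE epidemic simulator that returns
        # a summary statistic of the trajectory under parameters theta.
        return theta[0] + 0.1 * rng.normal()

    def abc_rejection(obs_summary, prior_sampler, n_draws=10_000, tol=0.05):
        """Keep draws whose simulated summary is within tol of the data."""
        kept = []
        for _ in range(n_draws):
            theta = prior_sampler()
            if abs(simulate(theta) - obs_summary) < tol:
                kept.append(theta)
        return np.array(kept)

    posterior = abc_rejection(0.7, lambda: rng.uniform(0.0, 2.0, size=1))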

Substandard and falsified pharmaceuticals, prevalent in low- and
middle-income countries, substantially increase levels of morbidity, mortality
and drug resistance. Regulatory agencies combat this problem using post-market
surveillance by collecting and testing samples where consumers purchase
products. Existing analysis tools for post-market surveillance data focus
attention on the locations of positive samples. This paper seeks to expand such
analysis using underutilized supply-chain information to provide inference on
sources of substandard and falsified products. We first establish the presence
of unidentifiability issues when integra…

Maximum likelihood estimation in logistic regression with mixed effects is
known to often result in estimates on the boundary of the parameter space. Such
estimates, which include infinite values for fixed effects and singular or
infinite variance components, can wreak havoc on numerical estimation
procedures and inference. We introduce an appropriately scaled additive penalty
to the log-likelihood function, or an approximation thereof, which penalizes
the fixed effects by Jeffreys' invariant prior for the model with no random
effects and the variance components by a composition of negative Huber loss
functions. The resulting maxim…
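
A sketch of the overall shape of such a penalized log-likelihood, as I read the description (the precise scaling $c$ and the composition of Huber losses are specified in the paper, not here): for fixed effects $\beta$ and variance components $\psi$,

    \ell_p(\beta, \psi) = \ell(\beta, \psi)
        + c \log \det\{ I(\beta) \}^{1/2}
        - \sum_j \rho(\psi_j)
    % where det{I(beta)}^{1/2} is Jeffreys' invariant prior for the logistic
    % model without random effects, and rho is a Huber-type penalty keeping
    % the variance components away from singular or infinite values.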

Concept Bottleneck Models (CBMs) map the inputs onto a set of interpretable
concepts (``the bottleneck'') and use the concepts to make predictions. A
concept bottleneck enhances interpretability since it can be investigated to
understand what concepts the model ``sees'' in an input and which of these
concepts are deemed important. However, CBMs are restrictive in practice as
they require dense concept annotations in the training data to learn the
bottleneck. Moreover, CBMs often do not match the accuracy of an unrestricted
neural network, reducing the incentive to deploy them in practice. In this
work, we address these lim…
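
A minimal sketch of the vanilla CBM architecture described above (a generic PyTorch rendering, not the authors' code; layer sizes are placeholders):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ConceptBottleneckModel(nn.Module):
        def __init__(self, in_dim, n_concepts, n_classes):
            super().__init__()
            self.encoder = nn.Sequential(          # input -> concept logits
                nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, n_concepts))
            self.head = nn.Linear(n_concepts, n_classes)  # concepts -> label

        def forward(self, x):
            c = torch.sigmoid(self.encoder(x))     # interpretable scores
            return self.head(c), c

    model = ConceptBottleneckModel(in_dim=32, n_concepts=10, n_classes=2)
    logits, concepts = model(torch.randn(4, 32))
    # Joint training supervises both outputs, which is why dense concept
    # annotations (c_true below) are needed:
    y_true, c_true = torch.randint(0, 2, (4,)), torch.randint(0, 2, (4, 10)).float()
    loss = F.cross_entropy(logits, y_true) + F.binary_cross_entropy(concepts, c_true)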

Optimal designs minimize the number of experimental runs (samples) needed to
accurately estimate model parameters, resulting in algorithms that, for
instance, efficiently minimize parameter estimate variance. Governed by
knowledge of past observations, adaptive approaches adjust sampling constraints
online as model parameter estimates are refined, continually maximizing
expected information gained or variance reduced. We apply adaptive Bayesian
inference to estimate transition rates of Markov chains, a common class of
models for stochastic processes in nature. Unlike most previous studies, our
sequential Bayesian optimal design is updated w…
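
For intuition about the estimation target: with exponential holding times, the exit rate of each state of a continuous-time Markov chain admits a conjugate Gamma posterior, so sequential updating is cheap. A minimal sketch of that update (standard conjugacy, not the paper's full adaptive design):

    import numpy as np

    def update_rate_posterior(a, b, holding_times):
        """Gamma(a, b) prior on a rate q with data t_i ~ Exp(q) yields the
        posterior Gamma(a + n, b + sum of t_i)."""
        t = np.asarray(holding_times)
        return a + t.size, b + t.sum()

    a, b = 1.0, 1.0                                   # weak prior
    a, b = update_rate_posterior(a, b, [0.3, 1.2, 0.7])
    posterior_mean = a / b                            # E[q | data]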

We study the dynamics and implicit bias of gradient flow (GF) on univariate
ReLU neural networks with a single hidden layer in a binary classification
setting. We show that when the labels are determined by the sign of a target
network with $r$ neurons, with high probability over the initialization of the
network and the sampling of the dataset, GF converges in direction (suitably
defined) to a network achieving perfect training accuracy and having at most
$\mathcal{O}(r)$ linear regions, implying a generalization bound. Unlike many
other results in the literature, under an additional assumption on the
distribution of the data, our result h…

Using gradient descent (GD) with a fixed or decaying step size is standard
practice in unconstrained optimization problems. However, when the loss
function is only locally convex, such a step-size schedule artificially slows
GD down as it cannot explore the flat curvature of the loss function. To
overcome this issue, we propose to exponentially increase the step size of the
GD algorithm. Under homogeneity assumptions on the loss function, we
demonstrate that the iterates of the proposed \emph{exponential step size
gradient descent} (EGD) algorithm converge linearly to the optimal solution.
Leveraging that optimization insight, we then consi…
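
A minimal sketch of gradient descent with an exponentially increasing step size, as described above (generic implementation; the growth factor and its tuning are my placeholders, not the paper's prescription):

    import numpy as np

    def egd(grad, x0, eta0=1e-3, growth=1.05, n_steps=200):
        """Gradient descent with geometrically growing step size
        eta_t = eta0 * growth**t."""
        x = np.asarray(x0, dtype=float)
        for t in range(n_steps):
            x = x - eta0 * growth**t * grad(x)
        return x

    # Example: f(x) = ||x||^4 is flat near its minimizer, which stalls
    # fixed-step GD; the growing step size compensates.
    x_star = egd(lambda x: 4.0 * x * np.dot(x, x), x0=np.ones(2))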

In a round-robin tournament, a team may lack the incentive to win if its
final rank does not depend on the outcome of the matches still to be played.
This paper introduces a classification scheme to determine these weakly (where
one team is indifferent) or strongly (where both teams are indifferent)
stakeless matches in a double round-robin contest with four teams. The
probability that such matches arise can serve as a novel fairness criterion to
compare and evaluate match schedules. Our approach is illustrated by the UEFA
Champions League group stage. A simulation model is built to compare the 12
valid schedules for the group matches. Some…

A fundamental question in designing lossy data compression schemes is how
well one can do in comparison with the rate-distortion function, which
describes the known theoretical limits of lossy compression. Motivated by the
empirical success of deep neural network (DNN) compressors on large, real-world
data, we investigate methods to estimate the rate-distortion function on such
data, which would allow DNN compressors to be compared against the theoretical
optimum. While
one could use the empirical distribution of the data and apply the
Blahut-Arimoto algorithm, this approach presents several computational
challenges and inaccuracies when the datasets are la…
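
For reference, the classical Blahut-Arimoto iteration mentioned above, for a discrete source distribution p, distortion matrix d, and trade-off parameter beta; a minimal sketch returning one point on the rate-distortion curve (the textbook algorithm, not the estimator proposed in the paper):

    import numpy as np

    def blahut_arimoto(p, d, beta, n_iters=500):
        """p: source pmf (n,); d: distortion matrix (n, m); beta > 0."""
        q = np.full(d.shape[1], 1.0 / d.shape[1])   # output marginal q(x_hat)
        for _ in range(n_iters):
            # Optimal channel given q: Q(x_hat|x) proportional to
            # q(x_hat) * exp(-beta * d(x, x_hat)), normalized over x_hat.
            Q = q[None, :] * np.exp(-beta * d)
            Q /= Q.sum(axis=1, keepdims=True)
            q = p @ Q                               # re-optimize the marginal
        D = np.sum(p[:, None] * Q * d)              # expected distortion
        R = np.sum(p[:, None] * Q * np.log(Q / q[None, :]))  # rate in nats
        return R, D

    # Example: fair binary source with Hamming distortion.
    R, D = blahut_arimoto(np.array([0.5, 0.5]), 1.0 - np.eye(2), beta=2.0)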

Many studies on summary measures of the predictive strength of
categorical response models consider the likelihood ratio index (LRI), also
known as McFadden's $R^2$, a better option than many other measures. We
propose a simple modification of the LRI that adjusts for the effect of the
number of response categories on the measure and that also rescales its values,
mimicking an underlying latent measure. The modified measure is applicable to
both binary and ordinal response models fitted by maximum likelihood. Results
from simulation studies and a real data example on the olfactory perception of
boar taint show that the proposed measur…
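
For reference, the unmodified index being adjusted is (standard definition):

    \mathrm{LRI} = 1 - \frac{\log \hat{L}_{\mathrm{full}}}{\log \hat{L}_{\mathrm{null}}}
    % where L_full is the maximized likelihood of the fitted model and
    % L_null that of the intercept-only model.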

Many models for point process data are defined through a thinning procedure
where locations of a base process (often Poisson) are either kept (observed) or
discarded (thinned). In this paper, we go back to the fundamentals of the
distribution theory for point processes and provide a colouring theorem that
characterizes the joint density of thinned and observed locations in any such
model. In practice, the marginal model of observed points is often intractable,
but thinned locations can be instantiated from their conditional
intractable, but thinned locations can be instantiated from their conditional
distribution and typical data augmentation schemes can be employed to
circumvent this problem. Such approaches hav…
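
A minimal sketch of the thinning construction itself on $[0, T]$, with a Poisson base process and a location-dependent retention probability (a generic illustration; the retention function is a made-up example):

    import numpy as np

    rng = np.random.default_rng(1)

    def thinned_poisson(rate, keep_prob, T=1.0):
        """Simulate a Poisson(rate) base process on [0, T] and colour each
        location as observed (kept) or thinned (discarded)."""
        n = rng.poisson(rate * T)
        locs = rng.uniform(0.0, T, size=n)
        kept = rng.uniform(size=n) < keep_prob(locs)
        return locs[kept], locs[~kept]              # observed, thinned

    # Example: retention probability increasing across the window.
    observed, thinned = thinned_poisson(rate=100.0, keep_prob=lambda x: x)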