A Unified Study on Sequentiality in Universal Classification with Empirically Observed Statistics

Author:

Ching-Fang Li, I-Hsiang Wang

Keyword:

Computer Science, Information Theory, Information Theory (cs.IT)

journal:

--

date:

2024-01-29 00:00:00

Abstract

In hypothesis testing problems, taking samples sequentially and stopping opportunistically to make the inference greatly enhances the reliability. The design of the stopping and inference policy, however, critically relies on the knowledge of the underlying distribution of each hypothesis. When the knowledge of distributions, say, $P_0$ and $P_1$ in the binary-hypothesis case, is replaced by empirically observed statistics from the respective distributions, the gain of sequentiality is less understood when subject to universality constraints. In this work, the gap is mended by a unified study on sequentiality in the universal binary classification problem. We propose a unified framework where the universality constraints are set on the expected stopping time as well as the type-I error exponent. The type-I error exponent is required to achieve a pre-set distribution-dependent constraint $\lambda(P_0,P_1)$ for all $P_0,P_1$. The framework is employed to investigate a semi-sequential and a fully-sequential setup, so that fair comparison can be made with the fixed-length setup. The optimal type-II error exponents in different setups are characterized when the function $\lambda$ satisfies mild continuity conditions. The benefit of sequentiality is shown by comparing the semi-sequential, the fully-sequential, and the fixed-length cases in representative examples of $\lambda$. Conditions under which sequentiality eradicates the trade-off between error exponents are also derived.