Reducing Dueling Bandits to Cardinal Bandits

Ailon, Nir; Joachims, Thorsten; Karnin, Zohar

arXiv:1405.3396 (cs)

[Submitted on 14 May 2014]

Abstract: We present algorithms for reducing the Dueling Bandits problem to the conventional (stochastic) Multi-Armed Bandits problem. The Dueling Bandits problem is an online model of learning with ordinal feedback of the form "A is preferred to B" (as opposed to cardinal feedback like "A has value 2.5"), giving it wide applicability in learning from implicit user feedback and from revealed and stated preferences. In contrast to existing algorithms for the Dueling Bandits problem, our reductions (named Doubler, MultiSbm and DoubleSbm) provide a generic schema for translating the extensive body of known results about conventional Multi-Armed Bandit algorithms to the Dueling Bandits setting. For Doubler and MultiSbm we prove regret upper bounds in both finite and infinite settings, and we state a conjecture about the performance of DoubleSbm, which empirically outperforms the other two as well as previous algorithms in our experiments. In addition, we provide the first almost optimal regret bound in terms of second order terms, such as the differences between the values of the arms.
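The distinction the abstract draws between cardinal and ordinal feedback can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's algorithms: the function names, the Bernoulli reward model, and the linear preference rule P(a preferred to b) = (1 + mean_a - mean_b) / 2 are all chosen here for illustration.

```python
import random


def cardinal_feedback(means, arm, rng):
    """Conventional bandit: pulling one arm returns a numeric reward.

    Assumption: rewards are Bernoulli with the given mean.
    """
    return 1.0 if rng.random() < means[arm] else 0.0


def ordinal_feedback(means, a, b, rng):
    """Dueling bandit: comparing two arms reveals only which was preferred.

    Assumption (illustrative): P(a preferred to b) = (1 + means[a] - means[b]) / 2.
    """
    p_a_wins = (1.0 + means[a] - means[b]) / 2.0
    return a if rng.random() < p_a_wins else b


def estimate_preference(means, a, b, trials, seed=0):
    """Fraction of duels between a and b that arm a wins."""
    rng = random.Random(seed)
    wins = sum(ordinal_feedback(means, a, b, rng) == a for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    means = [0.2, 0.8]  # hypothetical arm values
    print(estimate_preference(means, 1, 0, trials=20000))
```

In the ordinal setting the learner never observes a value such as 0.8 directly; it sees only the winner of each duel, which is why algorithms built for numeric rewards need a reduction before they can be applied.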

Subjects: Machine Learning (cs.LG)
Cite as: arXiv:1405.3396 [cs.LG] (or arXiv:1405.3396v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.1405.3396

Submission history

From: Nir Ailon
[v1] Wed, 14 May 2014 08:03:08 UTC (132 KB)
