DOI:10.1145/3610978.3640604
short-paper
Open access

Discovering Undiscovered States in Human Robot Verbal Interaction

Published: 11 March 2024

Abstract

Although automatic speech recognition systems such as CMU Sphinx, Google Speech-to-Text API, and Amazon Transcribe can recognize a variety of voices, they often struggle to process complete information accurately. To overcome this limitation, we propose a novel approach based on Markov Decision Processes. Our research involves an intelligent agent that evaluates human speech (n=1) and identifies new states through learning, enabling it to process more comprehensive information than traditional systems. The paper illustrates two scenarios: one in which the intelligent agent explores by detecting undiscovered states and ultimately reaches the goal state, and another in which it revisits previous states while discovering new ones.
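The abstract does not specify the agent's algorithm, so the following is only a minimal sketch of the general idea of a state set that grows during learning. Everything in it is an assumption made for illustration: the class name `StateDiscoveringAgent`, the choice to treat each spoken word as a state, the reward of 1 on reaching a goal word, and the TD(0) value update. It is not the authors' method.

```python
from collections import defaultdict


class StateDiscoveringAgent:
    """Hypothetical tabular agent whose state set grows as unseen words arrive."""

    def __init__(self, goal, alpha=0.5, gamma=0.9):
        self.goal = goal                  # assumed goal state (a word)
        self.alpha = alpha                # learning rate
        self.gamma = gamma                # discount factor
        self.states = []                  # discovered states, in discovery order
        self.value = defaultdict(float)   # state-value estimates V(s)
        self.revisits = 0                 # visits to already-known states

    def observe(self, state):
        """Register a state and return True only if it was newly discovered."""
        if state in self.states:
            self.revisits += 1
            return False
        self.states.append(state)
        return True

    def run_episode(self, utterance):
        """Walk the words of one utterance, applying TD(0) updates.

        Returns (newly discovered states, whether the goal was reached).
        """
        tokens = utterance.lower().split()
        if not tokens:
            return [], False
        discovered = []
        if self.observe(tokens[0]):
            discovered.append(tokens[0])
        if tokens[0] == self.goal:
            return discovered, True
        for current, nxt in zip(tokens, tokens[1:]):
            if self.observe(nxt):
                discovered.append(nxt)
            reached = nxt == self.goal
            reward = 1.0 if reached else 0.0
            # The goal is terminal, so nothing is bootstrapped from it.
            bootstrap = 0.0 if reached else self.gamma * self.value[nxt]
            target = reward + bootstrap
            self.value[current] += self.alpha * (target - self.value[current])
            if reached:
                return discovered, True
        return discovered, False


agent = StateDiscoveringAgent(goal="cup")
# Scenario 1: every state is new and the goal is reached.
first = agent.run_episode("pick up the cup")
# Scenario 2: one new state is discovered while known states are revisited.
second = agent.run_episode("pick up the red cup")
```

In this sketch the first utterance corresponds to the pure-exploration scenario, and the second to the scenario in which a new state ("red") is found while earlier states are revisited.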

Supplemental Material

MP4 File
Supplemental video

Published In

HRI '24: Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction
March 2024
1408 pages
ISBN:9798400703232
DOI:10.1145/3610978
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. algorithm
  2. speech recognition
  3. state discovery

Qualifiers

  • Short-paper

Funding Sources

  • Purdue Northwest Catalyst Grant

Conference

HRI '24

Acceptance Rates

Overall Acceptance Rate 268 of 1,124 submissions, 24%
