DOI:10.1145/3652583.3658047
research-article

Dynamic Segmentation for Efficient Retrieval of Podcasts: The Repping Algorithm

Published: 07 June 2024

Abstract

In this article, we present a method for finding specific segments of podcasts in a large collection using a query (keywords or a question). What differentiates our method is that it does not segment the documents in advance; instead, segmentation is performed dynamically according to the query entered. The core of the method is a position-based index that is built for each term over each individual document. These indices are laid over the individual documents like small threads of information, and together the threads map the inner semantic structure of each document in the collection. At runtime, the response segments are determined individually for each query using this index. Our initial tests have shown that this method significantly outperforms all current podcast-retrieval methods.
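
To make the idea of a per-term position index and query-time segmentation more concrete, the following is a minimal Python sketch. It is not the Repping algorithm itself, whose details the abstract does not give. The function names `build_position_index` and `dynamic_segments`, the `max_gap` threshold, and the rule of merging query-term hits that lie close together are all assumptions made purely for illustration.

```python
from collections import defaultdict


def build_position_index(tokens):
    """Map each term to the ascending list of token positions where it
    occurs in a single document (one "thread" per term)."""
    index = defaultdict(list)
    for pos, term in enumerate(tokens):
        index[term.lower()].append(pos)
    return dict(index)


def dynamic_segments(index, query_terms, max_gap=5):
    """Derive segments at query time (illustrative assumption, not the
    paper's rule): collect the positions of all query terms and start a
    new segment whenever two neighbouring hits are more than max_gap
    tokens apart."""
    hits = sorted(
        pos for term in query_terms for pos in index.get(term.lower(), [])
    )
    segments = []
    for pos in hits:
        if segments and pos - segments[-1][1] <= max_gap:
            segments[-1][1] = pos          # extend the current segment
        else:
            segments.append([pos, pos])    # open a new segment
    return [(start, end) for start, end in segments]


# Toy transcript; no segmentation is done before the query arrives.
transcript = (
    "the host talks about coffee and coffee beans then "
    "sports news and later coffee roasting"
).split()
index = build_position_index(transcript)
segments = dynamic_segments(index, ["coffee", "beans"], max_gap=3)
print(segments)  # [(4, 7), (13, 13)]
```

In this sketch the same index yields different segment boundaries for different queries, which is the property the abstract emphasises; a real system would additionally need ranking across documents and a mapping from token positions back to audio timestamps.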

Published In

ICMR '24: Proceedings of the 2024 International Conference on Multimedia Retrieval
May 2024
1379 pages
ISBN:9798400706196
DOI:10.1145/3652583
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. podcast information retrieval
  2. question answering
  3. segment retrieval
  4. speech browsing

Qualifiers

  • Research-article

Conference

ICMR '24
Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 41
  • Downloads (Last 12 months): 41
  • Downloads (Last 6 weeks): 3

Reflects downloads up to 17 Oct 2024
