research-article
Open access

Infrastructure Ombudsman: Mining Future Failure Concerns from Structural Disaster Response

Published: 13 May 2024

Abstract

Current research on social media and structural failures concentrates on discussions that can improve disaster response strategies. However, detecting social web posts that voice concerns about anticipated future failures remains under-explored. If such concerns are channeled to the appropriate authorities, they can aid in the prevention and mitigation of potential infrastructural failures. In this paper, we develop an infrastructure ombudsman that automatically detects specific infrastructure concerns. Our work considers several recent structural failures in the US. We present a first-of-its-kind dataset of 2,662 social web instances for this novel task, mined from Reddit and YouTube.

Published In

WWW '24: Proceedings of the ACM Web Conference 2024
May 2024
4826 pages
ISBN:9798400701719
DOI:10.1145/3589334
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. infrastructure ombudsman
  2. social web mining
  3. structural failure concerns

Qualifiers

  • Research-article

Conference

WWW '24
WWW '24: The ACM Web Conference 2024
May 13 - 17, 2024
Singapore, Singapore

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
