DOI:10.1145/3578741.3578778
research-article
Open access

Fusing Attribute Type Features for Attribute Value Extraction from Product via Question Answering

Published: 06 March 2023

Abstract

Extracting attribute values from product titles is a crucial e-commerce task. Previous attribute extraction methods were limited because existing datasets lacked attribute type information. To overcome this obstacle and promote research on product attribute value extraction, we first improve the AliExpress dataset by appending entity type information. We then propose Attribute Type Attention via Question Answering (ATAQA), an approach that fully utilizes attribute type information for attribute value extraction. ATAQA captures the semantic interaction of the attribute and the attribute type with a sentence, yielding vectors that contain attribute, attribute type, and sentence features. We combine the attribute type vector and the attribute vector into a processed vector, use a self-attention layer to connect the attribute vector with the processed vector, and finally pass the resulting sentence-attribute-comprehension representation to a CRF for final tagging. Experimental results show that the proposed approach outperforms the previous best methods by a significant margin. Our data is available at https://github.com/wandugu/AE-improved.
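
The data flow described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names (`attend`, `fuse`), the use of element-wise addition as the "combine" step, the way self-attention joins the two views, and the toy 2-dimensional vectors are all assumptions made for illustration. The encoder that would produce the token vectors and the CRF tagging layer are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(queries, keys):
    """Scaled dot-product attention; the keys also serve as the values."""
    dim = len(keys[0])
    scale = math.sqrt(dim)
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        outputs.append([sum(w * k[d] for w, k in zip(weights, keys))
                        for d in range(dim)])
    return outputs


def fuse(sentence, attribute, attr_type):
    """Return one fused feature vector per sentence token (hypothetical helper)."""
    # Each sentence token attends to the attribute tokens and to the type tokens.
    attr_aware = attend(sentence, attribute)
    type_aware = attend(sentence, attr_type)
    # ASSUMPTION: the "combine" step is modelled as element-wise addition.
    processed = [[a + t for a, t in zip(av, tv)]
                 for av, tv in zip(attr_aware, type_aware)]
    # ASSUMPTION: self-attention runs over both views placed in one sequence,
    # and the two outputs for each token are then averaged.
    joint = attr_aware + processed
    mixed = attend(joint, joint)
    n = len(sentence)
    return [[(a + b) / 2 for a, b in zip(mixed[i], mixed[n + i])]
            for i in range(n)]


# Toy inputs: 3 title tokens, 1 attribute token, 2 attribute-type tokens.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attribute = [[1.0, 0.0]]
attr_type = [[0.0, 1.0], [0.5, 0.5]]
features = fuse(sentence, attribute, attr_type)
```

In a full model, `features` would be projected to per-token tag scores and decoded by a CRF; the sketch stops at the fused representation because the abstract does not specify the tag scheme.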

      Published In

      MLNLP '22: Proceedings of the 2022 5th International Conference on Machine Learning and Natural Language Processing
      December 2022
      406 pages
      ISBN:9781450399067
      DOI:10.1145/3578741
      This work is licensed under a Creative Commons Attribution-NonCommercial International 4.0 License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. attribute value extraction
      2. open tag extraction
      3. question answering

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      MLNLP 2022
