Hu Zhang


2024

FRVA: Fact-Retrieval and Verification Augmented Entailment Tree Generation for Explainable Question Answering
Yue Fan | Hu Zhang | Ru Li | YuJie Wang | Hongye Tan | Jiye Liang
Findings of the Association for Computational Linguistics ACL 2024

A structured entailment tree exposes the reasoning chain from knowledge facts to the predicted answer, which is important for constructing an explainable question answering system. Existing work either generates the entire tree directly or generates the proof steps one at a time. Stepwise methods can exploit combinatoriality and generalize to longer steps, but they suffer from large fact search spaces and error accumulation, which result in invalid steps. In this paper, inspired by Dual Process Theory in cognitive science, we propose FRVA, a Fact-Retrieval and Verification Augmented bidirectional entailment tree generation method that contains two systems. Specifically, System 1 makes intuitive judgments through a fact retrieval module and filters out irrelevant facts to reduce the search space. System 2 is a deductive-abductive bidirectional reasoning module, for which we construct cross-verification and multi-view contrastive learning to bring the generated proof steps closer to the target hypothesis. This enhances the reliability of the stepwise proofs and mitigates error propagation. Experimental results on EntailmentBank show that FRVA outperforms previous models and achieves state-of-the-art performance in fact selection and structural correctness.

Hyperspherical Multi-Prototype with Optimal Transport for Event Argument Extraction
Guangjun Zhang | Hu Zhang | YuJie Wang | Ru Li | Hongye Tan | Jiye Liang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Event Argument Extraction (EAE) aims to extract arguments for specified events from a text. Previous research has mainly focused on addressing long-distance dependencies of arguments and modeling co-occurrence relationships between roles and events, but has overlooked two potential inductive biases: (i) semantic differences among arguments of the same type and (ii) large margin separation between arguments of different types. Inspired by prototype networks, we introduce a new model named HMPEAE, which takes the two inductive biases above as targets to locate prototypes and guide the model to learn argument representations based on these prototypes. Specifically, we set multiple prototypes to represent each role to capture intra-class differences. Simultaneously, we use a hypersphere as the output space for prototypes, defining large margin separation between prototypes to encourage the model to learn significant differences between different types of arguments effectively. We solve the “argument-prototype” assignment as an optimal transport problem to optimize the argument representation and minimize the absolute distance between arguments and prototypes to achieve compactness within sub-clusters. Experimental results on the RAMS and WikiEvents datasets show that HMPEAE achieves state-of-the-art performance.

2023

Dynamic Heterogeneous-Graph Reasoning with Language Models and Knowledge Representation Learning for Commonsense Question Answering
Yujie Wang | Hu Zhang | Jiye Liang | Ru Li
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recently, knowledge graphs (KGs) have achieved noteworthy success in commonsense question answering. Existing methods retrieve relevant subgraphs in the KGs through key entities and reason about the answer with language models (LMs) and graph neural networks. However, they ignore (i) optimizing the knowledge representation and structure of subgraphs and (ii) deeply fusing heterogeneous QA context with subgraphs. In this paper, we propose a dynamic heterogeneous-graph reasoning method with LMs and knowledge representation learning (DHLK), which constructs a heterogeneous knowledge graph (HKG) based on multiple knowledge sources and optimizes the structure and knowledge representation of the HKG using a two-stage pruning strategy and knowledge representation learning (KRL). It then performs joint reasoning with LMs and Relation Mask Self-Attention (RMSA). Specifically, DHLK filters key entities based on the dictionary vocabulary to achieve the first-stage pruning while incorporating the paraphrases in the dictionary into the subgraph to construct the HKG. Then, DHLK encodes and fuses the QA context and HKG using an LM, and dynamically removes irrelevant KG entities based on the attention weights of the LM for the second-stage pruning. Finally, DHLK introduces KRL to optimize the knowledge representation and performs answer reasoning on the HKG with RMSA. We evaluate DHLK on CommonsenseQA and OpenBookQA, and show its improvement over existing LM and LM+KG methods.

2021

A Knowledge-Guided Framework for Frame Identification
Xuefeng Su | Ru Li | Xiaoli Li | Jeff Z. Pan | Hu Zhang | Qinghua Chai | Xiaoqi Han
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Frame Identification (FI) is a fundamental and challenging task in frame semantic parsing. The task aims to find the exact frame evoked by a target word in a given sentence. Existing work generally regards it as a classification task, where frames are treated as discrete labels or represented using one-hot embeddings. However, the valuable knowledge about frames is neglected. In this paper, we propose a Knowledge-Guided Frame Identification framework (KGFI) that integrates three types of frame knowledge, namely frame definitions, frame elements and frame-to-frame relations, to learn better frame representations. These representations guide KGFI to jointly map target words and frames into the same embedding space and then identify the best frame by calculating the dot-product similarity scores between the target word embedding and all of the frame embeddings. Extensive experimental results demonstrate that KGFI significantly outperforms the state-of-the-art methods on two benchmark datasets.
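
The final selection step described in this abstract, scoring a target word embedding against every frame embedding by dot product and taking the highest score, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `dot` and `identify_frame` are hypothetical, and the three-dimensional vectors are hand-picked toy values rather than learned representations.

```python
# Minimal sketch of dot-product frame selection (illustrative only).
# Assumption: embeddings are plain lists of floats in a shared space;
# in KGFI they would come from learned encoders, which are not shown here.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def identify_frame(target_embedding, frame_embeddings):
    """Return the best-scoring frame name and the full score table."""
    scores = {
        name: dot(target_embedding, emb)
        for name, emb in frame_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy frame embeddings (made-up values, chosen only for the example).
frames = {
    "Commerce_buy": [0.9, 0.1, 0.0],
    "Getting": [0.4, 0.6, 0.2],
    "Theft": [0.1, 0.2, 0.9],
}
# Toy embedding for a target word such as "bought" in context.
target = [0.8, 0.3, 0.1]

best_frame, scores = identify_frame(target, frames)
print(best_frame)
```

With these toy values the highest dot product belongs to `Commerce_buy`, so that frame is selected; with learned embeddings the same argmax step would run over all candidate frames.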

Integrating Semantic Scenario and Word Relations for Abstractive Sentence Summarization
Yong Guan | Shaoru Guo | Ru Li | Xiaoli Li | Hu Zhang
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Recently, graph-based methods have been adopted for Abstractive Text Summarization. However, existing graph-based methods consider either word relations or structure information alone, neglecting the correlation between them. To simultaneously capture the word relations and structure information from sentences, we propose a novel Dual Graph network for Abstractive Sentence Summarization. Specifically, we first construct a semantic scenario graph and a semantic word relation graph based on FrameNet, then learn their representations and design a graph fusion method to enhance their correlation and obtain a better semantic representation for summary generation. Experimental results show that our model outperforms existing state-of-the-art methods on two popular benchmark datasets, i.e., Gigaword and DUC 2004.

2020

基于语料库的武侠与仙侠网络小说文体、词汇及主题对比分析(A Corpus-based Contrastive Analysis of Style, Vocabulary and Theme of Wuxia and Xianxia Internet Novels)
Sanle Zhang (张三乐) | Pengyuan Liu (刘鹏远) | Hu Zhang (张虎)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

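Online literature has developed rapidly in China, and its volume and influence are rising year by year. However, there is currently no publicly available large-scale corpus of online literary works, and corpus-based quantitative studies of specific categories of online literature are rare. This paper builds a preliminary online literature corpus that includes wuxia and xianxia internet novels, and uses text measurement, word frequency statistics and topic mining to compare the two genres in terms of style, specific vocabulary use and theme. The comparison shows that the two genres are broadly similar in style, while their vocabulary use and themes share common ground and also have distinctive features. Moving from the micro to the macro level and from surface to content, we combine quantitative statistics with qualitative analysis to compare wuxia and xianxia internet novels from multiple angles and at multiple levels.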

2008

A Study on Consistency Checking Method of Part-Of-Speech Tagging for Chinese Corpora
Hu Zhang | Jiaheng Zheng
International Journal of Computational Linguistics & Chinese Language Processing, Volume 13, Number 2, June 2008

A Chinese Word Segmentation System Based on Cascade Model
Jianfeng Zhang | Jiaheng Zheng | Hu Zhang | Hongye Tan
Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing

2005

A Classification-based Algorithm for Consistency Check of Part-of-Speech Tagging for Chinese Corpora
Hu Zhang | Jia-heng Zheng | Ying Zhao
Companion Volume to the Proceedings of Conference including Posters/Demos and tutorial abstracts