
Showing 51–89 of 89 results for author: Shou, L

  1. arXiv:2105.13239  [pdf, other]

    cs.CL cs.SE

    CoSQA: 20,000+ Web Queries for Code Search and Question Answering

    Authors: Junjie Huang, Duyu Tang, Linjun Shou, Ming Gong, Ke Xu, Daxin Jiang, Ming Zhou, Nan Duan

    Abstract: Finding codes given natural language query is beneficial to the productivity of software developers. Future progress towards better semantic matching between query and code requires richer supervised training resources. To remedy this, we introduce the CoSQA dataset. It includes 20,604 labels for pairs of natural language queries and codes, each annotated by at least 3 human annotators. We further…

    Submitted 27 May, 2021; originally announced May 2021.

    Comments: ACL 2021 main conference. The CoSQA data and leaderboard are available at https://github.com/microsoft/CodeXGLUE/tree/main/Text-Code/NL-code-search-WebQuery. The code is available at https://github.com/Jun-jie-Huang/CoCLR

  2. arXiv:2105.11174  [pdf, other]

    cs.CL cs.AI

    Retrieval Enhanced Model for Commonsense Generation

    Authors: Han Wang, Yang Liu, Chenguang Zhu, Linjun Shou, Ming Gong, Yichong Xu, Michael Zeng

    Abstract: Commonsense generation is a challenging task of generating a plausible sentence describing an everyday scenario using provided concepts. Its requirement of reasoning over commonsense knowledge and compositional generalization ability even puzzles strong pre-trained language generation models. We propose a novel framework using retrieval methods to enhance both the pre-training and fine-tuning for…

    Submitted 24 May, 2021; originally announced May 2021.

    Comments: Findings of ACL-IJCNLP 2021

  3. Towards Crowd-aware Indoor Path Planning (Extended Version)

    Authors: Tiantian Liu, Huan Li, Hua Lu, Muhammad Aamir Cheema, Lidan Shou

    Abstract: Indoor venues accommodate many people who collectively form crowds. Such crowds in turn influence people's routing choices, e.g., people may prefer to avoid crowded rooms when walking from A to B. This paper studies two types of crowd-aware indoor path planning queries. The Indoor Crowd-Aware Fastest Path Query (FPQ) finds a path with the shortest travel time in the presence of crowds, whereas the…

    Submitted 29 April, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

    Comments: The extension of a VLDB'21 paper "Towards Crowd-aware Indoor Path Planning"

  4. arXiv:2104.01767  [pdf, other]

    cs.CL

    WhiteningBERT: An Easy Unsupervised Sentence Embedding Approach

    Authors: Junjie Huang, Duyu Tang, Wanjun Zhong, Shuai Lu, Linjun Shou, Ming Gong, Daxin Jiang, Nan Duan

    Abstract: Producing the embedding of a sentence in an unsupervised way is valuable to natural language matching and retrieval problems in practice. In this work, we conduct a thorough examination of pretrained model based unsupervised sentence embeddings. We study on four pretrained models and conduct massive experiments on seven datasets regarding sentence semantics. We have three main findings. First, ave…

    Submitted 8 April, 2021; v1 submitted 5 April, 2021; originally announced April 2021.

  5. Global weak solutions for compressible Navier-Stokes-Vlasov-Fokker-Planck system

    Authors: Hai-Liang Li, Ling-Yun Shou

    Abstract: The one-dimensional compressible Navier-Stokes-Vlasov-Fokker-Planck system with density-dependent viscosity and drag force coefficients is investigated in the present paper. The existence, uniqueness, and regularity of global weak solution to the initial value problem for general initial data are established in spatial periodic domain. Moreover, the long time behavior of the weak solution is analy…

    Submitted 6 April, 2023; v1 submitted 30 March, 2021; originally announced March 2021.

    Comments: 42 pages

    MSC Class: 35Q30; 35Q84; 82C40

    Journal ref: Communications in Mathematical Research 39 (1), (2023), 136-172

  6. arXiv:2102.11114  [pdf, other]

    cs.CL cs.SD eess.AS

    Generating Human Readable Transcript for Automatic Speech Recognition with Pre-trained Language Model

    Authors: Junwei Liao, Yu Shi, Ming Gong, Linjun Shou, Sefik Eskimez, Liyang Lu, Hong Qu, Michael Zeng

    Abstract: Modern Automatic Speech Recognition (ASR) systems can achieve high performance in terms of recognition accuracy. However, a perfectly accurate transcript still can be challenging to read due to disfluency, filter words, and other errata common in spoken communication. Many downstream tasks and human readers rely on the output of the ASR system; therefore, errors introduced by the speaker and ASR s…

    Submitted 22 February, 2021; originally announced February 2021.

    Comments: Accepted in 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2021)

  7. arXiv:2102.06578  [pdf, other]

    cs.CL

    Improving Zero-shot Neural Machine Translation on Language-specific Encoders-Decoders

    Authors: Junwei Liao, Yu Shi, Ming Gong, Linjun Shou, Hong Qu, Michael Zeng

    Abstract: Recently, universal neural machine translation (NMT) with shared encoder-decoder gained good performance on zero-shot translation. Unlike universal NMT, jointly trained language-specific encoders-decoders aim to achieve universal representation across non-shared modules, each of which is for a language or language family. The non-shared architecture has the advantage of mitigating internal languag…

    Submitted 12 February, 2021; originally announced February 2021.

  8. arXiv:2102.04664  [pdf, other]

    cs.SE cs.CL

    CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

    Authors: Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin Clement, Dawn Drain, Daxin Jiang, Duyu Tang, Ge Li, Lidong Zhou, Linjun Shou, Long Zhou, Michele Tufano, Ming Gong, Ming Zhou, Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, Shujie Liu

    Abstract: Benchmark datasets have a significant impact on accelerating research in programming language tasks. In this paper, we introduce CodeXGLUE, a benchmark dataset to foster machine learning research for program understanding and generation. CodeXGLUE includes a collection of 10 tasks across 14 datasets and a platform for model evaluation and comparison. CodeXGLUE also features three baseline systems,…

    Submitted 16 March, 2021; v1 submitted 9 February, 2021; originally announced February 2021.

    Comments: 14 pages; Revise CodeBLEU scores for all models on text-to-code task

  9. arXiv:2012.14116  [pdf, other]

    cs.CL

    Syntax-Enhanced Pre-trained Model

    Authors: Zenan Xu, Daya Guo, Duyu Tang, Qinliang Su, Linjun Shou, Ming Gong, Wanjun Zhong, Xiaojun Quan, Nan Duan, Daxin Jiang

    Abstract: We study the problem of leveraging the syntactic structure of text to enhance pre-trained models such as BERT and RoBERTa. Existing methods utilize syntax of text either in the pre-training stage or in the fine-tuning stage, so that they suffer from discrepancy between the two stages. Such a problem would lead to the necessity of having human-annotated syntactic information, which limits the appli…

    Submitted 29 May, 2021; v1 submitted 28 December, 2020; originally announced December 2020.

    Comments: Accepted by ACL-IJCNLP 2021: The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing

  10. arXiv:2012.06048  [pdf, other]

    cs.CL cs.LG

    Reinforced Multi-Teacher Selection for Knowledge Distillation

    Authors: Fei Yuan, Linjun Shou, Jian Pei, Wutao Lin, Ming Gong, Yan Fu, Daxin Jiang

    Abstract: In natural language processing (NLP) tasks, slow inference speed and huge footprints in GPU usage remain the bottleneck of applying pre-trained deep models in production. As a popular method for model compression, knowledge distillation transfers knowledge from one or multiple large (teacher) models to a small (student) model. When multiple teacher models are available in distillation, the state-o…

    Submitted 13 December, 2020; v1 submitted 11 December, 2020; originally announced December 2020.

    Comments: AAAI 2021

  11. arXiv:2012.05048  [pdf, ps, other]

    math.AP

    Global well-posedness of one-dimensional compressible Navier-Stokes-Vlasov system

    Authors: Hai-Liang Li, Ling-Yun Shou

    Abstract: A fluid-particle model is investigated in the present paper, which consists of the compressible Navier-Stokes equations coupled with the Vlasov equation through a nonlinear drag force. We consider the initial value problem for the one-dimensional compressible Navier-Stokes-Vlasov system and establish the global existence and uniqueness of the weak solution for general initial data in either spatial…

    Submitted 15 September, 2021; v1 submitted 9 December, 2020; originally announced December 2020.

    Comments: 51 pages

  12. arXiv:2011.11928  [pdf, other]

    cs.CL

    GLGE: A New General Language Generation Evaluation Benchmark

    Authors: Dayiheng Liu, Yu Yan, Yeyun Gong, Weizhen Qi, Hang Zhang, Jian Jiao, Weizhu Chen, Jie Fu, Linjun Shou, Ming Gong, Pengcheng Wang, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Ruofei Zhang, Winnie Wu, Ming Zhou, Nan Duan

    Abstract: Multi-task benchmarks such as GLUE and SuperGLUE have driven great progress of pretraining and transfer learning in Natural Language Processing (NLP). These benchmarks mostly focus on a range of Natural Language Understanding (NLU) tasks, without considering the Natural Language Generation (NLG) models. In this paper, we present the General Language Generation Evaluation (GLGE), a new multi-task b…

    Submitted 1 June, 2021; v1 submitted 24 November, 2020; originally announced November 2020.

    Comments: Findings of Association for Computational Linguistics. ACL 2021

  13. arXiv:2011.11160  [pdf, other]

    cs.LG cs.DC

    LINDT: Tackling Negative Federated Learning with Local Adaptation

    Authors: Hong Lin, Lidan Shou, Ke Chen, Gang Chen, Sai Wu

    Abstract: Federated Learning (FL) is a promising distributed learning paradigm, which allows a number of data owners (also called clients) to collaboratively learn a shared model without disclosing each client's data. However, FL may fail to proceed properly, amid a state that we call negative federated learning (NFL). This paper addresses the problem of negative federated learning. We formulate a rigorous…

    Submitted 22 November, 2020; originally announced November 2020.

  14. arXiv:2011.05723  [pdf, other]

    cs.CL cs.LG

    CalibreNet: Calibration Networks for Multilingual Sequence Labeling

    Authors: Shining Liang, Linjun Shou, Jian Pei, Ming Gong, Wanli Zuo, Daxin Jiang

    Abstract: Lack of training data in low-resource languages presents huge challenges to sequence labeling tasks such as named entity recognition (NER) and machine reading comprehension (MRC). One major obstacle is the errors on the boundary of predicted answers. To tackle this problem, we propose CalibreNet, which predicts answers in two steps. In the first step, any existing sequence labeling method can be a…

    Submitted 11 November, 2020; originally announced November 2020.

    Comments: Long paper in WSDM 2021

  15. arXiv:2010.14271  [pdf, other]

    cs.CL cs.AI

    Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation

    Authors: Junhao Liu, Linjun Shou, Jian Pei, Ming Gong, Min Yang, Daxin Jiang

    Abstract: Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale annotated datasets in low-source languages, such as Arabic, Hindi, and Vietnamese. Many previous approaches use translation data by translating from a rich-source language, such as English, to low-source languages as auxiliary supervision. However, how to effectively leverage translatio…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: Accepted as long paper in COLING 2020

  16. arXiv:2010.07606   

    cs.CL

    Learning Better Representation for Tables by Self-Supervised Tasks

    Authors: Liang Li, Can Ma, Yinliang Yue, Linjun Shou, Dayong Hu

    Abstract: Table-to-text generation aims at automatically generating natural text to help people to conveniently obtain the important information in tables. Although neural models for table-to-text have achieved remarkable progress, some problems are still overlooked. The first is that the values recorded in many tables are mostly numbers in practice. The existing approaches do not do special treatment for these…

    Submitted 30 March, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

    Comments: This article is messily written, and some of the experiments are inadequate, which may mislead the reader about our work

  17. arXiv:2010.06801  [pdf, other]

    cs.CL cs.AI

    A Graph Representation of Semi-structured Data for Web Question Answering

    Authors: Xingyao Zhang, Linjun Shou, Jian Pei, Ming Gong, Lijie Wen, Daxin Jiang

    Abstract: The abundant semi-structured data on the Web, such as HTML-based tables and lists, provide commercial search engines a rich information source for question answering (QA). Different from plain text passages in Web documents, Web tables and lists have inherent structures, which carry semantic correlations among various elements in tables and lists. Many existing studies treat tables and lists as fl…

    Submitted 14 October, 2020; originally announced October 2020.

    Comments: Accepted as long paper in COLING 2020

  18. arXiv:2010.03910  [pdf, other]

    cs.DB cs.DS

    An Experimental Analysis of Indoor Spatial Queries: Modeling, Indexing, and Processing

    Authors: Tiantian Liu, Huan Li, Hua Lu, Muhammad Aamir Cheema, Lidan Shou

    Abstract: Indoor location-based services (LBS), such as POI search and routing, are often built on top of typical indoor spatial queries. To support such queries and indoor LBS, multiple techniques including model/indexes and search algorithms have been proposed. In this work, we conduct an extensive experimental study on existing proposals for indoor spatial queries. We survey five model/indexes, compare t…

    Submitted 8 October, 2020; originally announced October 2020.

    Comments: An Experiment and Analysis Paper

  19. arXiv:2009.14348  [pdf, other]

    cs.CL

    MaP: A Matrix-based Prediction Approach to Improve Span Extraction in Machine Reading Comprehension

    Authors: Huaishao Luo, Yu Shi, Ming Gong, Linjun Shou, Tianrui Li

    Abstract: Span extraction is an essential problem in machine reading comprehension. Most of the existing algorithms predict the start and end positions of an answer span in the given corresponding context by generating two probability vectors. In this paper, we propose a novel approach that extends the probability vector to a probability matrix. Such a matrix can cover more start-end position pairs. Precise…

    Submitted 29 September, 2020; originally announced September 2020.

    Comments: to appear at AACL-IJCNLP 2020

  20. arXiv:2009.12056  [pdf, other]

    cs.CL cs.LG

    No Answer is Better Than Wrong Answer: A Reflection Model for Document Level Machine Reading Comprehension

    Authors: Xuguang Wang, Linjun Shou, Ming Gong, Nan Duan, Daxin Jiang

    Abstract: The Natural Questions (NQ) benchmark set brings new challenges to Machine Reading Comprehension: the answers are not only at different levels of granularity (long and short), but also of richer types (including no-answer, yes/no, single-span and multi-span). In this paper, we target at this challenge and handle all answer types systematically. In particular, we propose a novel approach called Refl…

    Submitted 29 September, 2020; v1 submitted 25 September, 2020; originally announced September 2020.

    Comments: Accepted by Findings of EMNLP 2020

  21. arXiv:2009.07406  [pdf, other]

    cs.CL cs.AI

    Tag and Correct: Question aware Open Information Extraction with Two-stage Decoding

    Authors: Martin Kuo, Yaobo Liang, Lei Ji, Nan Duan, Linjun Shou, Ming Gong, Peng Chen

    Abstract: Question Aware Open Information Extraction (Question aware Open IE) takes question and passage as inputs, outputting an answer tuple which contains a subject, a predicate, and one or more arguments. Each field of answer is a natural language word sequence and is extracted from the passage. The semi-structured answer has two advantages which are more readable and falsifiable compared to span answer…

    Submitted 15 September, 2020; originally announced September 2020.

    Comments: 11 pages, 1 figure, 4 tables

    MSC Class: 68T50; 68T01

  22. Mining Implicit Relevance Feedback from User Behavior for Web Question Answering

    Authors: Linjun Shou, Shining Bo, Feixiang Cheng, Ming Gong, Jian Pei, Daxin Jiang

    Abstract: Training and refreshing a web-scale Question Answering (QA) system for a multi-lingual commercial search engine often requires a huge amount of training examples. One principled idea is to mine implicit relevance feedback from user behavior recorded in search engine logs. All previous works on mining implicit relevance feedback target at relevance of web documents rather than passages. Due to seve…

    Submitted 15 June, 2020; v1 submitted 13 June, 2020; originally announced June 2020.

    Comments: Accepted by KDD 2020

  23. arXiv:2004.14069  [pdf, other]

    cs.CL cs.AI

    Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension

    Authors: Fei Yuan, Linjun Shou, Xuanyu Bai, Ming Gong, Yaobo Liang, Nan Duan, Yan Fu, Daxin Jiang

    Abstract: Multilingual pre-trained models could leverage the training data from a rich source language (such as English) to improve performance on low resource languages. However, the transfer quality for multilingual Machine Reading Comprehension (MRC) is significantly worse than sentence classification tasks mainly due to the requirement of MRC to detect the word level answer boundary. In this paper, we p…

    Submitted 8 May, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

    Comments: Accepted to ACL 2020

  24. arXiv:2004.13659  [pdf, other]

    cs.CL

    LogicalFactChecker: Leveraging Logical Operations for Fact Checking with Graph Module Network

    Authors: Wanjun Zhong, Duyu Tang, Zhangyin Feng, Nan Duan, Ming Zhou, Ming Gong, Linjun Shou, Daxin Jiang, Jiahai Wang, Jian Yin

    Abstract: Verifying the correctness of a textual statement requires not only semantic reasoning about the meaning of words, but also symbolic reasoning about logical operations like count, superlative, aggregation, etc. In this work, we propose LogicalFactChecker, a neural network approach capable of leveraging logical operations for fact checking. It achieves the state-of-the-art performance on TABFACT, a…

    Submitted 28 April, 2020; originally announced April 2020.

    Comments: 13 pages; 7 figures; Accepted by ACL2020 as a long paper

  25. arXiv:2004.05568  [pdf, other]

    cs.CL

    Pre-training Text Representations as Meta Learning

    Authors: Shangwen Lv, Yuechen Wang, Daya Guo, Duyu Tang, Nan Duan, Fuqing Zhu, Ming Gong, Linjun Shou, Ryan Ma, Daxin Jiang, Guihong Cao, Ming Zhou, Songlin Hu

    Abstract: Pre-training text representations has recently been shown to significantly improve the state-of-the-art in many natural language processing tasks. The central goal of pre-training is to learn text representations that are useful for subsequent tasks. However, existing approaches are optimized by minimizing a proxy objective, such as the negative log likelihood of language modeling. In this work, w…

    Submitted 12 April, 2020; originally announced April 2020.

    Comments: 2 figures, 3 tables

  26. arXiv:2004.04438  [pdf, other]

    cs.CL

    Improving Readability for Automatic Speech Recognition Transcription

    Authors: Junwei Liao, Sefik Emre Eskimez, Liyang Lu, Yu Shi, Ming Gong, Linjun Shou, Hong Qu, Michael Zeng

    Abstract: Modern Automatic Speech Recognition (ASR) systems can achieve high performance in terms of recognition accuracy. However, a perfectly accurate transcript still can be challenging to read due to grammatical errors, disfluency, and other errata common in spoken communication. Many downstream tasks and human readers rely on the output of the ASR system; therefore, errors introduced by the speaker and…

    Submitted 9 April, 2020; originally announced April 2020.

  27. arXiv:2004.03070  [pdf, other]

    cs.CL cs.AI

    Inferential Text Generation with Multiple Knowledge Sources and Meta-Learning

    Authors: Daya Guo, Akari Asai, Duyu Tang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Jian Yin, Ming Zhou

    Abstract: We study the problem of generating inferential texts of events for a variety of commonsense like if-else relations. Existing approaches typically use limited evidence from training examples and learn for each relation individually. In this work, we use multiple knowledge sources as fuels for the model. Existing commonsense knowledge bases like ConceptNet are dominated by taxonomic knowled…

    Submitted 15 April, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

  28. arXiv:2004.01401  [pdf, ps, other]

    cs.CL

    XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation

    Authors: Yaobo Liang, Nan Duan, Yeyun Gong, Ning Wu, Fenfei Guo, Weizhen Qi, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Xiaodong Fan, Ruofei Zhang, Rahul Agrawal, Edward Cui, Sining Wei, Taroon Bharti, Ying Qiao, Jiun-Hung Chen, Winnie Wu, Shuguang Liu, Fan Yang, Daniel Campos, Rangan Majumder, Ming Zhou

    Abstract: In this paper, we introduce XGLUE, a new benchmark dataset that can be used to train large-scale cross-lingual pre-trained models using multilingual and bilingual corpora and evaluate their performance across a diverse set of cross-lingual tasks. Comparing to GLUE (Wang et al., 2019), which is labeled in English for natural language understanding tasks only, XGLUE has two main advantages: (1) it pr…

    Submitted 22 May, 2020; v1 submitted 3 April, 2020; originally announced April 2020.

  29. arXiv:2002.08155  [pdf, other]

    cs.CL cs.PL

    CodeBERT: A Pre-Trained Model for Programming and Natural Languages

    Authors: Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou

    Abstract: We present CodeBERT, a bimodal pre-trained model for programming language (PL) and natural language (NL). CodeBERT learns general-purpose representations that support downstream NL-PL applications such as natural language code search, code documentation generation, etc. We develop CodeBERT with Transformer-based neural architecture, and train it with a hybrid objective function that incorporates t…

    Submitted 18 September, 2020; v1 submitted 19 February, 2020; originally announced February 2020.

    Comments: Accepted to Findings of EMNLP 2020. 12 pages

  30. Optimization of CNOT circuits on limited connectivity architecture

    Authors: Bujiao Wu, Xiaoyu He, Shuai Yang, Lifu Shou, Guojing Tian, Jialin Zhang, Xiaoming Sun

    Abstract: A CNOT circuit is the key gadget for entangling qubits in quantum computing systems. However, the qubit connectivity of noisy intermediate-scale quantum (NISQ) devices is constrained by their limited connectivity architecture. To improve the performance of CNOT circuits on NISQ devices, we investigate the optimization of the size/depth of CNOT circuits under the limited connectivity architecture…

    Submitted 2 February, 2023; v1 submitted 31 October, 2019; originally announced October 2019.

    Comments: 24 pages, 13 figures

    Journal ref: Physical Review Research, 5 (2023) 013065

  31. arXiv:1910.08381  [pdf, other]

    cs.CL

    Model Compression with Two-stage Multi-teacher Knowledge Distillation for Web Question Answering System

    Authors: Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang

    Abstract: Deep pre-training and fine-tuning models (such as BERT and OpenAI GPT) have demonstrated excellent results in question answering areas. However, due to the sheer amount of model parameters, the inference speed of these models is very slow. How to apply these complex models to real business scenarios becomes a challenging but practical problem. Previous model compression methods usually suffer from…

    Submitted 18 October, 2019; originally announced October 2019.

    Comments: Accepted by WSDM 2020

  32. arXiv:1909.05311  [pdf, other]

    cs.CL

    Graph-Based Reasoning over Heterogeneous External Knowledge for Commonsense Question Answering

    Authors: Shangwen Lv, Daya Guo, Jingjing Xu, Duyu Tang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Songlin Hu

    Abstract: Commonsense question answering aims to answer questions which require background knowledge that is not explicitly expressed in the question. The key challenge is how to obtain evidence from external knowledge and make predictions based on the evidence. Recent works either learn to generate evidence from human-annotated evidence which is expensive to collect, or extract evidence from either structu…

    Submitted 8 June, 2020; v1 submitted 9 September, 2019; originally announced September 2019.

    Comments: 8 pages, 7 figures, AAAI 2020

  33. arXiv:1909.00964  [pdf, other]

    cs.CL

    Unicoder: A Universal Language Encoder by Pre-training with Multiple Cross-lingual Tasks

    Authors: Haoyang Huang, Yaobo Liang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Ming Zhou

    Abstract: We present Unicoder, a universal language encoder that is insensitive to different languages. Given an arbitrary NLP task, a model can be trained with Unicoder using training data in one language and directly applied to inputs of the same task in other languages. Comparing to similar efforts such as Multilingual BERT and XLM, three new cross-lingual pre-training tasks are proposed, including cross…

    Submitted 4 September, 2019; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: Accepted to EMNLP2019; 10 pages, 2 figures

  34. arXiv:1904.09636  [pdf, other]

    cs.CL

    Model Compression with Multi-Task Knowledge Distillation for Web-scale Question Answering System

    Authors: Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang

    Abstract: Deep pre-training and fine-tuning models (like BERT, OpenAI GPT) have demonstrated excellent results in question answering areas. However, due to the sheer amount of model parameters, the inference speed of these models is very slow. How to apply these complex models to real business scenarios becomes a challenging but practical problem. Previous works often leverage model compression approaches t…

    Submitted 21 April, 2019; originally announced April 2019.

    Comments: 9 pages, 2 figures

  35. arXiv:1904.09535  [pdf, other]

    cs.CL

    NeuronBlocks: Building Your NLP DNN Models Like Playing Lego

    Authors: Ming Gong, Linjun Shou, Wutao Lin, Zhijie Sang, Quanjia Yan, Ze Yang, Feixiang Cheng, Daxin Jiang

    Abstract: Deep Neural Networks (DNN) have been widely employed in industry to address various Natural Language Processing (NLP) tasks. However, many engineers find it a big overhead when they have to choose from multiple frameworks, compare different types of models, and understand various optimization mechanisms. An NLP toolkit for DNN models with both generality and flexibility can greatly improve the pro…

    Submitted 18 October, 2019; v1 submitted 20 April, 2019; originally announced April 2019.

    Comments: 6 pages, 3 figures

    Journal ref: EMNLP 2019

  36. arXiv:1904.03898  [pdf, other]

    cs.CL cs.AI cs.LG

    Semi-Supervised Few-Shot Learning for Dual Question-Answer Extraction

    Authors: Jue Wang, Ke Chen, Lidan Shou, Sai Wu, Sharad Mehrotra

    Abstract: This paper addresses the problem of key phrase extraction from sentences. Existing state-of-the-art supervised methods require large amounts of annotated data to achieve good performance and generalization. Collecting labeled data is, however, often expensive. In this paper, we redefine the problem as question-answer extraction, and present SAMIE: Self-Asking Model for Information Extraction, a se…

    Submitted 8 April, 2019; originally announced April 2019.

    Comments: 7 pages, 5 figures, submission to IJCAI19

  37. arXiv:1807.03596  [pdf, other]

    astro-ph.CO gr-qc

    Constraints on the generalized natural inflation after Planck 2018

    Authors: Nan Zhang, Ya-Bo Wu, Jun-Wang Lu, Chu-Wen Sun, Li-Jie Shou, Hai-Zhou Xu

    Abstract: Based on the dynamics of single scalar field slow-roll inflation and the theory of reheating, we investigate the generalized natural inflationary (GNI) model. Concretely, we give constraints on the scalar spectral index $n_{s}$ and tensor-to-scalar ratio $r$ for $Λ$CDM $+r$ model according to the latest data from Planck 2018 TT,TE,EE+lowE+lensing (P18) and BICEP2/Keck 2015 season (BK15), i.e.,…

    Submitted 1 July, 2020; v1 submitted 10 July, 2018; originally announced July 2018.

    Comments: 15 pages, 15 figures, accepted for publication in CPC

  38. arXiv:1510.08408  [pdf, ps, other]

    math-ph

    Trace formulas for Schrödinger operators on star graphs

    Authors: Semra Demirel-Frank, Laura Shou

    Abstract: We derive trace formulas of the Buslaev-Faddeev type for quantum star graphs. One of the new ingredients is high energy asymptotics of the perturbation determinant.

    Submitted 28 October, 2015; originally announced October 2015.

  39. arXiv:1509.05279  [pdf, other]

    math-ph

    Subcritical behavior for quasi-periodic Schrödinger cocycles with trigonometric potentials

    Authors: C. A. Marx, L. H. Shou, J. L. Wellens

    Abstract: We give a criterion implying subcritical behavior for quasi-periodic Schrödinger operators where the potential sampling function is given by a trigonometric polynomial. Subcritical behavior, in the sense of Avila's global theory, is known to imply purely absolutely continuous spectrum for all irrational frequencies and all phases.

    Submitted 31 October, 2015; v1 submitted 17 September, 2015; originally announced September 2015.

    Comments: to appear in the Journal of Spectral Theory