Showing 1–50 of 89 results for author: Yao, D

Searching in archive cs.
  1. arXiv:2410.06074  [pdf, other]

    cs.LG

    Scalable Mechanistic Neural Networks

    Authors: Jiale Chen, Dingling Yao, Adeel Pervez, Dan Alistarh, Francesco Locatello

    Abstract: We propose Scalable Mechanistic Neural Network (S-MNN), an enhanced neural network framework designed for scientific machine learning applications involving long temporal sequences. By reformulating the original Mechanistic Neural Network (MNN) (Pervez et al., 2024), we reduce the computational time and space complexities from cubic and quadratic with respect to the sequence length, respectively,…

    Submitted 8 October, 2024; originally announced October 2024.

  2. arXiv:2410.04203  [pdf, other]

    cs.AI

    RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization

    Authors: Hanyang Zhao, Genta Indra Winata, Anirban Das, Shi-Xiong Zhang, David D. Yao, Wenpin Tang, Sambit Sahu

    Abstract: Recently, numerous preference optimization algorithms have been introduced as extensions to the Direct Preference Optimization (DPO) family. While these methods have successfully aligned models with human preferences, there is a lack of understanding regarding the contributions of their additional components. Moreover, fair and consistent comparisons are scarce, making it difficult to discern whic…

    Submitted 5 October, 2024; originally announced October 2024.

  3. arXiv:2409.17874  [pdf, other]

    cs.AI

    DarkSAM: Fooling Segment Anything Model to Segment Nothing

    Authors: Ziqi Zhou, Yufei Song, Minghui Li, Shengshan Hu, Xianlong Wang, Leo Yu Zhang, Dezhong Yao, Hai Jin

    Abstract: Segment Anything Model (SAM) has recently gained much attention for its outstanding generalization to unseen data and tasks. Despite its promising prospect, the vulnerabilities of SAM, especially to universal adversarial perturbation (UAP) have not been thoroughly investigated yet. In this paper, we propose DarkSAM, the first prompt-free universal attack framework against SAM, including a semantic…

    Submitted 26 September, 2024; originally announced September 2024.

    Comments: This paper has been accepted by the 38th Annual Conference on Neural Information Processing Systems (NeurIPS'24)

  4. arXiv:2409.15101  [pdf, other]

    cs.SD eess.AS

    GALD-SE: Guided Anisotropic Lightweight Diffusion for Efficient Speech Enhancement

    Authors: Chengzhong Wang, Jianjun Gu, Dingding Yao, Junfeng Li, Yonghong Yan

    Abstract: Speech enhancement is designed to enhance the intelligibility and quality of speech across diverse noise conditions. Recently, diffusion model has gained lots of attention in speech enhancement area, achieving competitive results. Current diffusion-based methods blur the signal with isotropic Gaussian noise and recover clean speech from the prior. However, these methods often suffer from a substan…

    Submitted 9 October, 2024; v1 submitted 23 September, 2024; originally announced September 2024.

    Comments: We make reassessment and update the author list. All authors have approved this version of the manuscript

  5. arXiv:2409.11564  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG eess.AS

    Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey

    Authors: Genta Indra Winata, Hanyang Zhao, Anirban Das, Wenpin Tang, David D. Yao, Shi-Xiong Zhang, Sambit Sahu

    Abstract: Preference tuning is a crucial process for aligning deep generative models with human preferences. This survey offers a thorough overview of recent advancements in preference tuning and the integration of human feedback. The paper is organized into three main sections: 1) introduction and preliminaries: an introduction to reinforcement learning frameworks, preference tuning tasks, models, and data…

    Submitted 17 September, 2024; originally announced September 2024.

    Comments: Survey paper

  6. arXiv:2409.08858  [pdf, other]

    cs.DC

    Exploring System-Heterogeneous Federated Learning with Dynamic Model Selection

    Authors: Dixi Yao

    Abstract: Federated learning is a distributed learning paradigm in which multiple mobile clients train a global model while keeping data local. These mobile clients can have various available memory and network bandwidth. However, to achieve the best global model performance, how we can utilize available memory and network bandwidth to the maximum remains an open challenge. In this paper, we propose to assi…

    Submitted 13 September, 2024; originally announced September 2024.

  7. arXiv:2409.08503  [pdf, other]

    cs.LG cs.CR

    Enhancing Privacy in ControlNet and Stable Diffusion via Split Learning

    Authors: Dixi Yao

    Abstract: With the emerging trend of large generative models, ControlNet is introduced to enable users to fine-tune pre-trained models with their own data for various use cases. A natural question arises: how can we train ControlNet models while ensuring users' data privacy across distributed devices? Exploring different distributed training schemes, we find conventional federated learning and split learnin…

    Submitted 12 September, 2024; originally announced September 2024.

  8. arXiv:2409.08482  [pdf, other]

    cs.LG cs.CR cs.CV

    Risks When Sharing LoRA Fine-Tuned Diffusion Model Weights

    Authors: Dixi Yao

    Abstract: With the emerging trend in generative models and convenient public access to diffusion models pre-trained on large datasets, users can fine-tune these models to generate images of personal faces or items in new contexts described by natural language. Parameter efficient fine-tuning (PEFT) such as Low Rank Adaptation (LoRA) has become the most common way to save memory and computation usage on the…

    Submitted 12 September, 2024; originally announced September 2024.

  9. arXiv:2409.08400  [pdf, ps, other]

    cs.LG cs.AI

    Scores as Actions: a framework of fine-tuning diffusion models by continuous-time reinforcement learning

    Authors: Hanyang Zhao, Haoxian Chen, Ji Zhang, David D. Yao, Wenpin Tang

    Abstract: Reinforcement Learning from human feedback (RLHF) has been shown a promising direction for aligning generative models with human intent and has also been explored in recent works for alignment of diffusion generative models. In this work, we provide a rigorous treatment by formulating the task of fine-tuning diffusion models, with reward functions learned from human feedback, as an exploratory con…

    Submitted 12 September, 2024; originally announced September 2024.

  10. arXiv:2409.03976  [pdf, other]

    cs.HC

    DECAN: A Denoising Encoder via Contrastive Alignment Network for Dry Electrode EEG Emotion Recognition

    Authors: Meihong Zhang, Shaokai Zhao, Shuai Wang, Zhiguo Luo, Liang Xie, Tiejun Liu, Dezhong Yao, Ye Yan, Erwei Yin

    Abstract: EEG signal is important for brain-computer interfaces (BCI). Nevertheless, existing dry and wet electrodes are difficult to balance between high signal-to-noise ratio and portability in EEG recording, which limits the practical use of BCI. In this study, we propose a Denoising Encoder via Contrastive Alignment Network (DECAN) for dry electrode EEG, under the assumption of the EEG representation co…

    Submitted 5 September, 2024; originally announced September 2024.

  11. arXiv:2409.02772  [pdf, other]

    cs.LG stat.ML

    Unifying Causal Representation Learning with the Invariance Principle

    Authors: Dingling Yao, Dario Rancati, Riccardo Cadei, Marco Fumero, Francesco Locatello

    Abstract: Causal representation learning aims at recovering latent causal variables from high-dimensional observations to solve causal downstream tasks, such as predicting the effect of new interventions or more robust classification. A plethora of methods have been developed, each tackling carefully crafted problem settings that lead to different types of identifiability. The folklore is that these differe…

    Submitted 4 September, 2024; originally announced September 2024.

    Comments: 36 pages

  12. arXiv:2408.13522  [pdf, other]

    cs.SD eess.AS

    StreamAAD: Decoding Spatial Auditory Attention with a Streaming Architecture

    Authors: Zelin Qiu, Dingding Yao, Junfeng Li

    Abstract: In this paper, we present our approach for the Track 1 of the Chinese Auditory Attention Decoding (Chinese AAD) Challenge at ISCSLP 2024. Most existing spatial auditory attention decoding (Sp-AAD) methods employ an isolated window architecture, focusing solely on global invariant features without considering relationships between different decision windows, which can lead to suboptimal performance…

    Submitted 24 August, 2024; originally announced August 2024.

  13. arXiv:2408.06300  [pdf]

    cond-mat.mtrl-sci cs.LG

    Inverse designing metamaterials with programmable nonlinear functional responses in graph space

    Authors: Marco Maurizi, Derek Xu, Yu-Tong Wang, Desheng Yao, David Hahn, Mourad Oudich, Anish Satpati, Mathieu Bauchy, Wei Wang, Yizhou Sun, Yun Jing, Xiaoyu Rayne Zheng

    Abstract: Material responses to static and dynamic stimuli, represented as nonlinear curves, are design targets for engineering functionalities like structural support, impact protection, and acoustic and photonic bandgaps. Three-dimensional metamaterials offer significant tunability due to their internal structure, yet existing methods struggle to capture their complex behavior-to-structure relationships.…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: 19 pages, 5 figures

  14. arXiv:2408.04310  [pdf, other]

    cs.LG cs.CR

    Constructing Adversarial Examples for Vertical Federated Learning: Optimal Client Corruption through Multi-Armed Bandit

    Authors: Duanyi Yao, Songze Li, Ye Xue, Jin Liu

    Abstract: Vertical federated learning (VFL), where each participating client holds a subset of data features, has found numerous applications in finance, healthcare, and IoT systems. However, adversarial attacks, particularly through the injection of adversarial examples (AEs), pose serious challenges to the security of VFL models. In this paper, we investigate such vulnerabilities through developing a nove…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: Published at ICLR 2024

  15. arXiv:2407.06498  [pdf, other]

    cs.HC

    Enhancing spatial auditory attention decoding with neuroscience-inspired prototype training

    Authors: Zelin Qiu, Jianjun Gu, Dingding Yao, Junfeng Li

    Abstract: The spatial auditory attention decoding (Sp-AAD) technology aims to determine the direction of auditory attention in multi-talker scenarios via neural recordings. Despite the success of recent Sp-AAD algorithms, their performance is hindered by trial-specific features in EEG data. This study aims to improve decoding performance against these features. Studies in neuroscience indicate that spatial…

    Submitted 8 July, 2024; originally announced July 2024.

  16. arXiv:2407.05869  [pdf, other]

    cs.AI

    PORCA: Root Cause Analysis with Partially Observed Data

    Authors: Chang Gong, Di Yao, Jin Wang, Wenbin Li, Lanting Fang, Yongtao Xie, Kaiyu Feng, Peng Han, Jingping Bi

    Abstract: Root Cause Analysis (RCA) aims at identifying the underlying causes of system faults by uncovering and analyzing the causal structure from complex systems. It has been widely used in many application domains. Reliable diagnostic conclusions are of great importance in mitigating system failures and financial losses. However, previous studies implicitly assume a full observation of the system, which…

    Submitted 11 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  17. arXiv:2407.00541  [pdf]

    cs.CL cs.AI cs.IR

    Answering real-world clinical questions using large language model based systems

    Authors: Yen Sia Low, Michael L. Jackson, Rebecca J. Hyde, Robert E. Brown, Neil M. Sanghavi, Julian D. Baldwin, C. William Pike, Jananee Muralidharan, Gavin Hui, Natasha Alexander, Hadeel Hassan, Rahul V. Nene, Morgan Pike, Courtney J. Pokrzywa, Shivam Vedak, Adam Paul Yan, Dong-han Yao, Amy R. Zipursky, Christina Dinh, Philip Ballentine, Dan C. Derieg, Vladimir Polony, Rehan N. Chawdry, Jordan Davies, Brigham B. Hyde, et al. (2 additional authors not shown)

    Abstract: Evidence to guide healthcare decisions is often limited by a lack of relevant and trustworthy literature as well as difficulty in contextualizing existing research for a specific patient. Large language models (LLMs) could potentially address both challenges by either summarizing published literature or generating new studies based on real-world data (RWD). We evaluated the ability of five LLM-bas…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 28 pages (2 figures, 3 tables) inclusive of 8 pages of supplemental materials (4 supplemental figures and 4 supplemental tables)

  18. arXiv:2407.00014  [pdf, other]

    cs.RO eess.SY

    Kinetic and Kinematic Sensors-free Approach for Estimation of Continuous Force and Gesture in sEMG Prosthetic Hands

    Authors: Gang Liu, Zhenxiang Wang, Chuanmei Xi, Ziyang He, Shanshan Guo, Rui Zhang, Dezhong Yao

    Abstract: Regression-based sEMG prosthetic hands are widely used for their ability to provide continuous kinetic and kinematic parameters. However, establishing these models requires complex sensors systems to collect corresponding kinetic and kinematic data in synchronization with sEMG, which is cumbersome and user-unfriendly. This paper proposes a kinetic and kinematic sensors-free approach for controllin…

    Submitted 16 September, 2024; v1 submitted 1 May, 2024; originally announced July 2024.

    Comments: 17 pages

  19. arXiv:2406.19065  [pdf, other]

    cs.CL

    STBench: Assessing the Ability of Large Language Models in Spatio-Temporal Analysis

    Authors: Wenbin Li, Di Yao, Ruibo Zhao, Wenjie Chen, Zijie Xu, Chengxue Luo, Chang Gong, Quanliang Jing, Haining Tan, Jingping Bi

    Abstract: The rapid evolution of large language models (LLMs) holds promise for reforming the methodology of spatio-temporal data mining. However, current works for evaluating the spatio-temporal understanding capability of LLMs are somewhat limited and biased. These works either fail to incorporate the latest language models or only focus on assessing the memorized spatio-temporal knowledge. To address thi…

    Submitted 27 June, 2024; originally announced June 2024.

  20. CausalMMM: Learning Causal Structure for Marketing Mix Modeling

    Authors: Chang Gong, Di Yao, Lei Zhang, Sheng Chen, Wenbin Li, Yueyang Su, Jingping Bi

    Abstract: In online advertising, marketing mix modeling (MMM) is employed to predict the gross merchandise volume (GMV) of brand shops and help decision-makers to adjust the budget allocation of various advertising channels. Traditional MMM methods leveraging regression techniques can fail in handling the complexity of marketing. Although some efforts try to encode the causal structures for better predictio…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: WSDM 2024, full version

  21. HiFGL: A Hierarchical Framework for Cross-silo Cross-device Federated Graph Learning

    Authors: Zhuoning Guo, Duanyi Yao, Qiang Yang, Hao Liu

    Abstract: Federated Graph Learning (FGL) has emerged as a promising way to learn high-quality representations from distributed graph data with privacy preservation. Despite considerable efforts have been made for FGL under either cross-device or cross-silo paradigm, how to effectively capture graph knowledge in a more complicated cross-silo cross-device environment remains an under-explored problem. However…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: Accepted by SIGKDD 2024

  22. Reliable Object Tracking by Multimodal Hybrid Feature Extraction and Transformer-Based Fusion

    Authors: Hongze Sun, Rui Liu, Wuque Cai, Jun Wang, Yue Wang, Huajin Tang, Yan Cui, Dezhong Yao, Daqing Guo

    Abstract: Visual object tracking, which is primarily based on visible light image sequences, encounters numerous challenges in complicated scenarios, such as low light conditions, high dynamic ranges, and background clutter. To address these challenges, incorporating the advantages of multiple visual modalities is a promising solution for achieving reliable object tracking. However, the existing approaches…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 16 pages, 7 figures, 9 tables; This work has been submitted for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  23. arXiv:2405.16848  [pdf, other]

    cs.CV

    A re-calibration method for object detection with multi-modal alignment bias in autonomous driving

    Authors: Zhihang Song, Lihui Peng, Jianming Hu, Danya Yao, Yi Zhang

    Abstract: Multi-modal object detection in autonomous driving has achieved great breakthroughs due to the usage of fusing complementary information from different sensors. The calibration in fusion between sensors such as LiDAR and camera is always supposed to be precise in previous work. However, in reality, calibration matrices are fixed when the vehicles leave the factory, but vibration, bumps, and data l…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 10 pages, 6 figures

  24. arXiv:2405.14953  [pdf, other]

    cs.LG cs.AI stat.ML

    MallowsPO: Fine-Tune Your LLM with Preference Dispersions

    Authors: Haoxian Chen, Hanyang Zhao, Henry Lam, David Yao, Wenpin Tang

    Abstract: Direct Preference Optimization (DPO) has recently emerged as a popular approach to improve reinforcement learning with human feedback (RLHF), leading to better techniques to fine-tune large language models (LLM). A weakness of DPO, however, lies in its lack of capability to characterize the diversity of human preferences. Inspired by Mallows' theory of preference ranking, we develop in this paper…

    Submitted 2 October, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  25. arXiv:2405.14291  [pdf, other]

    cs.LG cs.AI cs.DC

    Variational Bayes for Federated Continual Learning

    Authors: Dezhong Yao, Sanmu Li, Yutong Dai, Zhiqiang Xu, Shengshan Hu, Peilin Zhao, Lichao Sun

    Abstract: Federated continual learning (FCL) has received increasing attention due to its potential in handling real-world streaming data, characterized by evolving data distributions and varying client classes over time. The constraints of storage limitations and privacy concerns confine local models to exclusively access the present data within each learning cycle. Consequently, this restriction induces p…

    Submitted 23 May, 2024; originally announced May 2024.

  26. arXiv:2405.13888  [pdf, other]

    cs.LG stat.ML

    Marrying Causal Representation Learning with Dynamical Systems for Science

    Authors: Dingling Yao, Caroline Muller, Francesco Locatello

    Abstract: Causal representation learning promises to extend causal models to hidden causal variables from raw entangled measurements. However, most progress has focused on proving identifiability results in different settings, and we are not aware of any successful real-world application. At the same time, the field of dynamical systems benefited from deep learning and scaled to countless applications but d…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 21 pages, 8 figures, 6 tables

  27. arXiv:2405.07626  [pdf, other]

    cs.LG cs.AI

    AnomalyLLM: Few-shot Anomaly Edge Detection for Dynamic Graphs using Large Language Models

    Authors: Shuo Liu, Di Yao, Lanting Fang, Zhetao Li, Wenbin Li, Kaiyu Feng, XiaoWen Ji, Jingping Bi

    Abstract: Detecting anomaly edges for dynamic graphs aims to identify edges significantly deviating from the normal pattern and can be applied in various domains, such as cybersecurity, financial transactions and AIOps. With the evolving of time, the types of anomaly edges are emerging and the labeled anomaly samples are few for each type. Current methods are either designed to detect randomly inserted edge…

    Submitted 28 August, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: 13 pages

  28. arXiv:2405.00696  [pdf, other]

    cs.RO

    Life-long Learning and Testing for Automated Vehicles via Adaptive Scenario Sampling as A Continuous Optimization Process

    Authors: Jingwei Ge, Pengbo Wang, Cheng Chang, Yi Zhang, Danya Yao, Li Li

    Abstract: Sampling critical testing scenarios is an essential step in intelligence testing for Automated Vehicles (AVs). However, due to the lack of prior knowledge on the distribution of critical scenarios in sampling space, we can hardly efficiently find the critical scenarios or accurately evaluate the intelligence of AVs. To solve this problem, we formulate the testing as a continuous optimization proce…

    Submitted 28 March, 2024; originally announced May 2024.

  29. arXiv:2404.19582  [pdf, other]

    cs.LG cs.CR

    Leveraging Label Information for Stealthy Data Stealing in Vertical Federated Learning

    Authors: Duanyi Yao, Songze Li, Xueluan Gong, Sizai Hou, Gaoning Pan

    Abstract: We develop DMAVFL, a novel attack strategy that evades current detection mechanisms. The key idea is to integrate a discriminator with auxiliary classifier that takes a full advantage of the label information (which was completely ignored in previous attacks): on one hand, label information helps to better characterize embeddings of samples from distinct classes, yielding an improved reconstructio…

    Submitted 30 April, 2024; originally announced April 2024.

  30. arXiv:2403.10801  [pdf, other]

    cs.CV

    Securely Fine-tuning Pre-trained Encoders Against Adversarial Examples

    Authors: Ziqi Zhou, Minghui Li, Wei Liu, Shengshan Hu, Yechao Zhang, Wei Wan, Lulu Xue, Leo Yu Zhang, Dezhong Yao, Hai Jin

    Abstract: With the evolution of self-supervised learning, the pre-training paradigm has emerged as a predominant solution within the deep learning landscape. Model providers furnish pre-trained encoders designed to function as versatile feature extractors, enabling downstream users to harness the benefits of expansive models with minimal effort through fine-tuning. Nevertheless, recent works have exposed a…

    Submitted 18 March, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

  31. arXiv:2403.08335  [pdf, other]

    cs.LG cs.AI stat.ML

    A Sparsity Principle for Partially Observable Causal Representation Learning

    Authors: Danru Xu, Dingling Yao, Sébastien Lachapelle, Perouz Taslakian, Julius von Kügelgen, Francesco Locatello, Sara Magliacane

    Abstract: Causal representation learning aims at identifying high-level causal variables from perceptual data. Most methods assume that all latent causal variables are captured in the high-dimensional observations. We instead consider a partially observed setting, in which each measurement only provides information about a subset of the underlying causal state. Prior work has studied this setting with multi…

    Submitted 15 June, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: 45 pages, 32 figures, 16 tables

  32. arXiv:2403.02975   

    cs.CL cs.AI

    A General and Flexible Multi-concept Parsing Framework for Multilingual Semantic Matching

    Authors: Dong Yao

    Abstract: Sentence semantic matching is a research hotspot in natural language processing, which is considerably significant in various key scenarios, such as community question answering, searching, chatbot, and recommendation. Since most of the advanced models directly model the semantic relevance among words between two sentences while neglecting the \textit{keywords} and \textit{intents} concepts of the…

    Submitted 3 April, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: arXiv admin comment: This version has been removed by arXiv administrators as the submitter did not have the rights to agree to the license at the time of submission

  33. arXiv:2402.01348  [pdf, other]

    cs.LG cs.AI

    CORE: Mitigating Catastrophic Forgetting in Continual Learning through Cognitive Replay

    Authors: Jianshu Zhang, Yankai Fu, Ziheng Peng, Dongyu Yao, Kun He

    Abstract: This paper introduces a novel perspective to significantly mitigate catastrophic forgetting in continuous learning (CL), which emphasizes models' capacity to preserve existing knowledge and assimilate new information. Current replay-based methods treat every task and data sample equally and thus can not fully exploit the potential of the replay buffer. In response, we propose COgnitive REplay (COR…

    Submitted 9 April, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: Accepted by CogSci24 as oral presentation

  34. arXiv:2401.16687  [pdf, other]

    cs.CR cs.LG

    Revisiting Gradient Pruning: A Dual Realization for Defending against Gradient Attacks

    Authors: Lulu Xue, Shengshan Hu, Ruizhi Zhao, Leo Yu Zhang, Shengqing Hu, Lichao Sun, Dezhong Yao

    Abstract: Collaborative learning (CL) is a distributed learning framework that aims to protect user privacy by allowing users to jointly train a model by sharing their gradient updates only. However, gradient inversion attacks (GIAs), which recover users' training data from shared gradients, impose severe privacy threats to CL. Existing defense methods adopt different techniques, e.g., differential privacy,…

    Submitted 29 January, 2024; originally announced January 2024.

  35. arXiv:2401.15668  [pdf, other]

    cs.CV

    Lips Are Lying: Spotting the Temporal Inconsistency between Audio and Visual in Lip-Syncing DeepFakes

    Authors: Weifeng Liu, Tianyi She, Jiawei Liu, Run Wang, Dongyu Yao, Ziyou Liang

    Abstract: In recent years, DeepFake technology has achieved unprecedented success in high-quality video synthesis, whereas these methods also pose potential and severe security threats to humanity. DeepFake can be bifurcated into entertainment applications like face swapping and illicit uses such as lip-syncing fraud. However, lip-forgery videos, which neither change identity nor have discernible visual art…

    Submitted 28 January, 2024; originally announced January 2024.

    Comments: The first two authors contributed equally to this work

  36. arXiv:2401.11089  [pdf, other]

    cs.CR cs.AI cs.DC cs.IR

    FedRKG: A Privacy-preserving Federated Recommendation Framework via Knowledge Graph Enhancement

    Authors: Dezhong Yao, Tongtong Liu, Qi Cao, Hai Jin

    Abstract: Federated Learning (FL) has emerged as a promising approach for preserving data privacy in recommendation systems by training models locally. Recently, Graph Neural Networks (GNN) have gained popularity in recommendation tasks due to their ability to capture high-order interactions between users and items. However, privacy concerns prevent the global sharing of the entire user-item graph. To addre…

    Submitted 19 January, 2024; originally announced January 2024.

  37. arXiv:2312.06197  [pdf, other]

    cs.SD cs.MM eess.AS

    MART: Learning Hierarchical Music Audio Representations with Part-Whole Transformer

    Authors: Dong Yao, Jieming Zhu, Jiahao Xun, Shengyu Zhang, Zhou Zhao, Liqun Deng, Wenqiao Zhang, Zhenhua Dong, Xin Jiang

    Abstract: Recent research in self-supervised contrastive learning of music representations has demonstrated remarkable results across diverse downstream tasks. However, a prevailing trend in existing methods involves representing equally-sized music clips in either waveform or spectrogram formats, often overlooking the intrinsic part-whole hierarchies within music. In our quest to comprehend the bottom-up s…

    Submitted 19 April, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: Short paper accepted by WWW 2024. This is revised and condensed based on the previous version titled "Music-PAW: Learning Music Representations via Hierarchical Part-whole Interaction and Contrast". For more experimental details and discussions, please refer to the original long paper at arXiv:2312.06197v1

  38. arXiv:2311.04056  [pdf, other]

    cs.LG cs.AI

    Multi-View Causal Representation Learning with Partial Observability

    Authors: Dingling Yao, Danru Xu, Sébastien Lachapelle, Sara Magliacane, Perouz Taslakian, Georg Martius, Julius von Kügelgen, Francesco Locatello

    Abstract: We present a unified framework for studying the identifiability of representations learned from simultaneously observed views, such as different data modalities. We allow a partially observed setting in which each view constitutes a nonlinear mixture of a subset of underlying latent variables, which can be causally related. We prove that the information shared across all subsets of any number of v…

    Submitted 8 March, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

    Comments: 28 pages, 10 figures, 11 tables

  39. arXiv:2311.04044  [pdf, other]

    cs.CL cs.CR

    PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models

    Authors: Haoran Li, Dadi Guo, Donghao Li, Wei Fan, Qi Hu, Xin Liu, Chunkit Chan, Duanyi Yao, Yuan Yao, Yangqiu Song

    Abstract: The rapid development of language models (LMs) brings unprecedented accessibility and usage for both models and users. On the one hand, powerful LMs achieve state-of-the-art performance over numerous downstream NLP tasks. On the other hand, more and more attention is paid to unrestricted model accesses that may bring malicious privacy risks of data leakage. To address these issues, many recent wor…

    Submitted 1 June, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

    Comments: To appear at ACL 2024

  40. arXiv:2310.15930  [pdf, other]

    cs.SD eess.AS

    CDSD: Chinese Dysarthria Speech Database

    Authors: Mengyi Sun, Ming Gao, Xinchen Kang, Shiru Wang, Jun Du, Dengfeng Yao, Su-Jing Wang

    Abstract: We present the Chinese Dysarthria Speech Database (CDSD) as a valuable resource for dysarthria research. This database comprises speech data from 24 participants with dysarthria. Among these participants, one recorded an additional 10 hours of speech data, while each recorded one hour, resulting in 34 hours of speech material. To accommodate participants with varying cognitive levels, our text poo…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 9 pages, 3 figures

  41. arXiv:2310.11994  [pdf]

    cs.HC eess.SP q-bio.NC

    Spectral homogeneity cross frequencies can be a quality metric for the large-scale resting EEG preprocessing

    Authors: Shiang Hu, Jie Ruan, Nicolas Langer, Jorge Bosch-Bayard, Zhao Lv, Dezhong Yao, Pedro Antonio Valdes-Sosa

    Abstract: The brain projects require the collection of massive electrophysiological data, aiming to the longitudinal, sectional, or populational neuroscience studies. Quality metrics automatically label the data after centralized preprocessing. However, although the waveforms-based metrics are partially useful, they may be unreliable by neglecting the spectral profiles. Here, we detected the phenomenon of p…

    Submitted 4 December, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

  42. arXiv:2310.06119  [pdf, other]

    cs.LG cs.AI

    Exploring Progress in Multivariate Time Series Forecasting: Comprehensive Benchmarking and Heterogeneity Analysis

    Authors: Zezhi Shao, Fei Wang, Yongjun Xu, Wei Wei, Chengqing Yu, Zhao Zhang, Di Yao, Tao Sun, Guangyin Jin, Xin Cao, Gao Cong, Christian S. Jensen, Xueqi Cheng

    Abstract: Multivariate Time Series (MTS) analysis is crucial to understanding and managing complex systems, such as traffic and energy systems, and a variety of approaches to MTS forecasting have been proposed recently. However, we often observe inconsistent or seemingly contradictory performance findings across different studies. This hinders our understanding of the merits of different approaches and slow…

    Submitted 16 October, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted by TKDE (IEEE Transactions on Knowledge and Data Engineering); The codebase is accessible at: https://github.com/GestaltCogTeam/BasicTS

  43. FuzzLLM: A Novel and Universal Fuzzing Framework for Proactively Discovering Jailbreak Vulnerabilities in Large Language Models

    Authors: Dongyu Yao, Jianshu Zhang, Ian G. Harris, Marcel Carlsson

    Abstract: Jailbreak vulnerabilities in Large Language Models (LLMs), which exploit meticulously crafted prompts to elicit content that violates service guidelines, have captured the attention of research communities. While model owners can defend against individual jailbreak prompts through safety training strategies, this relatively passive approach struggles to handle the broader category of similar jailb…

    Submitted 14 April, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

    Comments: Published by ICASSP 2024 on 3/18/2024; Extended arXiv version

  44. arXiv:2308.13881  [pdf, other]

    cs.GT q-fin.GN

    Transaction fee mechanism for Proof-of-Stake protocol

    Authors: Wenpin Tang, David D. Yao

    Abstract: We study a mechanism design problem in the blockchain proof-of-stake (PoS) protocol. Our main objective is to extend the transaction fee mechanism (TFM) recently proposed in Chung and Shi (SODA, p.3856-3899, 2023), so as to incorporate a long-run utility model for the miner into the burning second-price auction mechanism $\texttt{BSP}(γ)$ proposed in Chung and Shi (where $γ$ is a key parameter in…

    Submitted 29 August, 2023; v1 submitted 26 August, 2023; originally announced August 2023.

    Comments: 18 pages, 3 figures

  45. arXiv:2307.07972  [pdf, other]

    cs.CV

    Dual-level Interaction for Domain Adaptive Semantic Segmentation

    Authors: Dongyu Yao, Boheng Li

    Abstract: The self-training approach has recently secured its position in domain adaptive semantic segmentation, where a model is trained with target domain pseudo-labels. Current advances have mitigated noisy pseudo-labels resulting from the domain gap. However, they still struggle with erroneous pseudo-labels near the boundaries of the semantic classifier. In this paper, we tackle this issue by proposing a dual-l…

    Submitted 10 August, 2023; v1 submitted 16 July, 2023; originally announced July 2023.

    Comments: Accepted to ICCVW on Uncertainty Quantification for Computer Vision (UnCV), 2023

  46. arXiv:2305.18901  [pdf, other]

    cs.LG math.OC

    Policy Optimization for Continuous Reinforcement Learning

    Authors: Hanyang Zhao, Wenpin Tang, David D. Yao

    Abstract: We study reinforcement learning (RL) in the setting of continuous time and space, for an infinite horizon with a discounted objective and the underlying dynamics driven by a stochastic differential equation. Built upon recent advances in the continuous approach to RL, we develop a notion of occupation time (specifically for a discounted objective), and show how it can be effectively used to derive…

    Submitted 18 October, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

  47. arXiv:2305.01915  [pdf, other]

    cs.IR cs.MM

    Denoising Multi-modal Sequential Recommenders with Contrastive Learning

    Authors: Dong Yao, Shengyu Zhang, Zhou Zhao, Jieming Zhu, Wenqiao Zhang, Rui Zhang, Xiaofei He, Fei Wu

    Abstract: There is a rapidly-growing research interest in engaging users with multi-modal data for accurate user modeling on recommender systems. Existing multimedia recommenders have achieved substantial improvements by incorporating various modalities and devising delicate modules. However, when users decide to interact with items, most of them do not fully read the content of all modalities. We refer to…

    Submitted 3 May, 2023; originally announced May 2023.

  48. arXiv:2304.13407  [pdf, ps, other]

    cs.LG cs.CR cs.IT

    FedVS: Straggler-Resilient and Privacy-Preserving Vertical Federated Learning for Split Models

    Authors: Songze Li, Duanyi Yao, Jin Liu

    Abstract: In a vertical federated learning (VFL) system consisting of a central server and many distributed clients, the training data are vertically partitioned such that different features are privately stored on different clients. The problem of split VFL is to train a model split between the server and the clients. This paper aims to address two major challenges in split VFL: 1) performance degradation…

    Submitted 6 July, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

    Comments: Accepted to ICML 2023

  49. Synthetic Datasets for Autonomous Driving: A Survey

    Authors: Zhihang Song, Zimin He, Xingyu Li, Qiming Ma, Ruibo Ming, Zhiqi Mao, Huaxin Pei, Lihui Peng, Jianming Hu, Danya Yao, Yi Zhang

    Abstract: Autonomous driving techniques have been flourishing in recent years while thirsting for huge amounts of high-quality data. However, it is difficult for real-world datasets to keep up with the pace of changing requirements due to their expensive and time-consuming experimental and labeling costs. Therefore, more and more researchers are turning to synthetic datasets to easily generate rich and chan…

    Submitted 27 February, 2024; v1 submitted 24 April, 2023; originally announced April 2023.

    Comments: 19 pages, 5 figures

    Journal ref: in IEEE Transactions on Intelligent Vehicles, vol. 9, no. 1, pp. 1847-1864, Jan. 2024

  50. arXiv:2304.07735  [pdf, other]

    cs.CR

    Permutation Equivariance of Transformers and Its Applications

    Authors: Hengyuan Xu, Liyao Xiang, Hangyu Ye, Dixi Yao, Pengzhi Chu, Baochun Li

    Abstract: Revolutionizing the field of deep learning, Transformer-based models have achieved remarkable performance in many tasks. Recent research has recognized these models are robust to shuffling but are limited to inter-token permutation in the forward propagation. In this work, we propose our definition of permutation equivariance, a broader concept covering both inter- and intra- token permutation in…

    Submitted 31 March, 2024; v1 submitted 16 April, 2023; originally announced April 2023.

    Comments: Accepted by CVPR 2024