Showing 1–50 of 88 results for author: Du, Y

Searching in archive eess.
  1. arXiv:2410.13720  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV

    Movie Gen: A Cast of Media Foundation Models

    Authors: Adam Polyak, Amit Zohar, Andrew Brown, Andros Tjandra, Animesh Sinha, Ann Lee, Apoorv Vyas, Bowen Shi, Chih-Yao Ma, Ching-Yao Chuang, David Yan, Dhruv Choudhary, Dingkang Wang, Geet Sethi, Guan Pang, Haoyu Ma, Ishan Misra, Ji Hou, Jialiang Wang, Kiran Jagadeesh, Kunpeng Li, Luxin Zhang, Mannat Singh, Mary Williamson, Matt Le, et al. (63 additional authors not shown)

    Abstract: We present Movie Gen, a cast of foundation models that generates high-quality, 1080p HD videos with different aspect ratios and synchronized audio. We also show additional capabilities such as precise instruction-based video editing and generation of personalized videos based on a user's image. Our models set a new state-of-the-art on multiple tasks: text-to-video synthesis, video personalization,…

    Submitted 17 October, 2024; originally announced October 2024.

  2. arXiv:2410.08135  [pdf, other]

    eess.SY

    State Feedback System Level Synthesis in Continuous Time

    Authors: Yaozhi Du, Jing Shuang Li

    Abstract: System level synthesis (SLS) is a controller parameterization technique that facilitates distributed structured control via convex techniques. Results on SLS are primarily in the discrete-time setting; this paper extends SLS to the continuous-time setting. We translate the parametrization and associated constraints to continuous time, and propose a controller design procedure consisting of two ste…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: 8 pages, 6 figures, conference

  3. arXiv:2410.04261  [pdf, other]

    cs.RO cs.LG eess.SY math.OC

    Compositional Diffusion Models for Powered Descent Trajectory Generation with Flexible Constraints

    Authors: Julia Briden, Yilun Du, Enrico M. Zucchelli, Richard Linares

    Abstract: This work introduces TrajDiffuser, a compositional diffusion-based flexible and concurrent trajectory generator for 6 degrees of freedom powered descent guidance. TrajDiffuser is a statistical model that learns the multi-modal distributions of a dataset of simulated optimal trajectories, each subject to only one or few constraints that may vary for different trajectories. During inference, the tra…

    Submitted 5 October, 2024; originally announced October 2024.

    Comments: Full manuscript submitted to IEEE Aerospace 2025 on 4-Oct-2024

  4. arXiv:2409.15911  [pdf, other]

    cs.CL cs.SD eess.AS

    A Modular-based Strategy for Mitigating Gradient Conflicts in Simultaneous Speech Translation

    Authors: Xiaoqian Liu, Yangfan Du, Jianjin Wang, Yuan Ge, Chen Xu, Tong Xiao, Guocheng Chen, Jingbo Zhu

    Abstract: Simultaneous Speech Translation (SimulST) involves generating target language text while continuously processing streaming speech input, presenting significant real-time challenges. Multi-task learning is often employed to enhance SimulST performance but introduces optimization conflicts between primary and auxiliary tasks, potentially compromising overall efficiency. The existing model-level conf…

    Submitted 24 September, 2024; originally announced September 2024.

  5. arXiv:2409.14739  [pdf, other]

    cs.ET eess.SY

    AmpAgent: An LLM-based Multi-Agent System for Multi-stage Amplifier Schematic Design from Literature for Process and Performance Porting

    Authors: Chengjie Liu, Weiyu Chen, Anlan Peng, Yuan Du, Li Du, Jun Yang

    Abstract: Multi-stage amplifiers are widely applied in analog circuits. However, their large number of components, complex transfer functions, and intricate pole-zero distributions necessitate extensive manpower for derivation and param sizing to ensure their stability. In order to achieve efficient derivation of the transfer function and simplify the difficulty of circuit design, we propose AmpAgent: a mul…

    Submitted 23 September, 2024; originally announced September 2024.

  6. arXiv:2409.13863  [pdf, other]

    eess.IV

    Unsupervised Learning of Multi-modal Affine Registration for PET/CT

    Authors: Junyu Chen, Yihao Liu, Shuwen Wei, Aaron Carass, Yong Du

    Abstract: Affine registration plays a crucial role in PET/CT imaging, where aligning PET with CT images is challenging due to their respective functional and anatomical representations. Despite the significant promise shown by recent deep learning (DL)-based methods in various medical imaging applications, their application to multi-modal PET/CT affine registration remains relatively unexplored. This study…

    Submitted 20 September, 2024; originally announced September 2024.

    Comments: Accepted by IEEE NSS/MIC/RTSD'24 ((c) IEEE). Code available at https://github.com/junyuchen245/Correlation_Ratio

  7. Reliability-Based Planning of Cable Layout for Offshore Wind Farm Electrical Collector System Considering Post-Fault Network Reconfiguration

    Authors: Xiaochi Ding, Yunfei Du, Xinwei Shen, Qiuwei Wu, Xuan Zhang, Nikos D. Hatziargyriou

    Abstract: The electrical collector system (ECS) plays a crucial role in determining the performance of offshore wind farms (OWFs). Existing research has predominantly restricted ECS cable layouts to conventional radial or ring structures and employed graph theory heuristics for solutions. However, both economic efficiency and reliability of the OWFs heavily depend on their ECS structure, and the optimal ECS…

    Submitted 19 September, 2024; originally announced September 2024.

    Comments: 13 pages

  8. arXiv:2408.06185  [pdf, other]

    eess.SY cs.CY cs.GT cs.NI

    Hi-SAM: A high-scalable authentication model for satellite-ground Zero-Trust system using mean field game

    Authors: Xuesong Wu, Tianshuai Zheng, Runfang Wu, Jie Ren, Junyan Guo, Ye Du

    Abstract: As more and more Internet of Things (IoT) devices are connected to satellite networks, the Zero-Trust Architecture brings dynamic security to the satellite-ground system, while frequent authentication creates challenges for system availability. To make the system accommodate more IoT devices, this paper proposes a high-scalable authentication model (Hi-SAM). Hi-SAM introduces the Proof-of-Work id…

    Submitted 12 August, 2024; originally announced August 2024.

  9. arXiv:2407.15335  [pdf, other]

    eess.SP

    Addressing Out-of-Distribution Challenges in Image Semantic Communication Systems with Multi-modal Large Language Models

    Authors: Feifan Zhang, Yuyang Du, Kexin Chen, Yulin Shao, Soung Chang Liew

    Abstract: Semantic communication is a promising technology for next-generation wireless networks. However, the out-of-distribution (OOD) problem, where a pre-trained machine learning (ML) model is applied to unseen tasks that are outside the distribution of its training data, may compromise the integrity of semantic compression. This paper explores the use of multi-modal large language models (MLLMs) to add…

    Submitted 21 July, 2024; originally announced July 2024.

  10. arXiv:2407.11333  [pdf, other]

    cs.RO cs.SD eess.AS

    Disentangled Acoustic Fields For Multimodal Physical Scene Understanding

    Authors: Jie Yin, Andrew Luo, Yilun Du, Anoop Cherian, Tim K. Marks, Jonathan Le Roux, Chuang Gan

    Abstract: We study the problem of multimodal physical scene understanding, where an embodied agent needs to find fallen objects by inferring object properties, direction, and distance of an impact sound source. Previous works adopt feed-forward neural networks to directly regress the variables from sound, leading to poor generalization and domain adaptation issues. In this paper, we illustrate that learning…

    Submitted 15 July, 2024; originally announced July 2024.

  11. arXiv:2407.04675  [pdf, other]

    eess.AS cs.SD

    Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

    Authors: Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, Chuang Ding, Linhao Dong, Qianqian Dong, Yujiao Du, Kepan Gao, Lu Gao, Yi Guo, Minglun Han, Ting Han, Wenchao Hu, Xinying Hu, Yuxiang Hu, Deyu Hua, Lu Huang, Mingkun Huang, Youjia Huang, Jishuo Jin, Fanliu Kong, Zongwei Lan, Tianyu Li , et al. (30 additional authors not shown)

    Abstract: Modern automatic speech recognition (ASR) models are required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc.) given the specific contextual information in various application scenarios. Classic end-to-end models fused with extra language models perform well, but mainly in data matching scenarios and are gradually approaching a bottleneck. In this wor…

    Submitted 10 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

  12. arXiv:2407.02913  [pdf, other]

    cs.LG cs.AI eess.IV eess.SP math.NA

    SFC: Achieve Accurate Fast Convolution under Low-precision Arithmetic

    Authors: Liulu He, Yufei Zhao, Rui Gao, Yuan Du, Li Du

    Abstract: Fast convolution algorithms, including Winograd and FFT, can efficiently accelerate convolution operations in deep models. However, these algorithms depend on high-precision arithmetic to maintain inference accuracy, which conflicts with model quantization. To resolve this conflict and further improve the efficiency of quantized convolution, we propose SFC, a new algebra transform for fast co…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: ICML 2024

  13. arXiv:2407.00008  [pdf, other]

    q-bio.NC eess.IV

    Spectral Brain Graph Neural Network for Prediction of Anxiety in Children with Autism Spectrum Disorder

    Authors: Peiyu Duan, Nicha C. Dvornek, Jiyao Wang, Jeffrey Eilbott, Yuexi Du, Denis G. Sukhodolsky, James S. Duncan

    Abstract: Children with Autism Spectrum Disorder (ASD) frequently exhibit comorbid anxiety, which contributes to impairment and requires treatment. Therefore, it is critical to investigate co-occurring autism and anxiety with functional imaging tools to understand the brain mechanisms of this comorbidity. Multidimensional Anxiety Scale for Children, 2nd edition (MASC-2) score is a common tool to evaluate th…

    Submitted 23 April, 2024; originally announced July 2024.

    Comments: ISBI 2024 Oral

  14. arXiv:2406.16754  [pdf, other]

    cs.LG cs.CV eess.IV

    The MRI Scanner as a Diagnostic: Image-less Active Sampling

    Authors: Yuning Du, Rohan Dharmakumar, Sotirios A. Tsaftaris

    Abstract: Despite the high diagnostic accuracy of Magnetic Resonance Imaging (MRI), using MRI as a Point-of-Care (POC) disease identification tool poses significant accessibility challenges due to the use of high magnetic field strength and lengthy acquisition times. We ask a simple question: Can we dynamically optimise acquired samples, at the patient level, according to an (automated) downstream decision…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted in MICCAI 2024

  15. arXiv:2406.11546  [pdf, other]

    eess.AS cs.CL cs.SD

    GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement

    Authors: Yifan Yang, Zheshu Song, Jianheng Zhuo, Mingyu Cui, Jinpeng Li, Bo Yang, Yexing Du, Ziyang Ma, Xunying Liu, Ziyuan Wang, Ke Li, Shuai Fan, Kai Yu, Wei-Qiang Zhang, Guoguo Chen, Xie Chen

    Abstract: The evolution of speech technology has been spurred by the rapid increase in dataset sizes. Traditional speech models generally depend on a large amount of labeled training data, which is scarce for low-resource languages. This paper presents GigaSpeech 2, a large-scale, multi-domain, multilingual speech recognition corpus. It is designed for low-resource languages and does not rely on paired spee…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Under review

  16. arXiv:2406.05954  [pdf, other]

    cs.AI cs.LG eess.SY

    Aligning Large Language Models with Representation Editing: A Control Perspective

    Authors: Lingkai Kong, Haorui Wang, Wenhao Mu, Yuanqi Du, Yuchen Zhuang, Yifei Zhou, Yue Song, Rongzhi Zhang, Kai Wang, Chao Zhang

    Abstract: Aligning large language models (LLMs) with human objectives is crucial for real-world applications. However, fine-tuning LLMs for alignment often suffers from unstable training and requires substantial computing resources. Test-time alignment techniques, such as prompting and guided decoding, do not modify the underlying model, and their performance remains dependent on the original model's capabi…

    Submitted 11 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

    Comments: fix typos

  17. arXiv:2406.00497  [pdf, ps, other]

    cs.SD cs.AI cs.CL eess.AS

    Recent Advances in End-to-End Simultaneous Speech Translation

    Authors: Xiaoqian Liu, Guoqiang Hu, Yangfan Du, Erfeng He, Yingfeng Luo, Chen Xu, Tong Xiao, Jingbo Zhu

    Abstract: Simultaneous speech translation (SimulST) is a demanding task that involves generating translations in real-time while continuously processing speech input. This paper offers a comprehensive overview of the recent developments in SimulST research, focusing on four major challenges. Firstly, the complexities associated with processing lengthy and continuous speech streams pose significant hurdles.…

    Submitted 20 August, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

    Comments: Accepted by IJCAI 2024

  18. Cepstral Analysis Based Artifact Detection, Recognition and Removal for Prefrontal EEG

    Authors: Siqi Han, Chao Zhang, Jiaxin Lei, Qingquan Han, Yuhui Du, Anhe Wang, Shuo Bai, Milin Zhang

    Abstract: This paper proposes to use cepstrum for artifact detection, recognition and removal in prefrontal EEG. This work focuses on the artifact caused by eye movement. A database containing artifact-free EEG and eye movement contaminated EEG from different subjects is established. A cepstral analysis-based feature extraction with support vector machine (SVM) based classifier is designed to identify the a…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 5 pages, 4 figures, published by TCAS-II

    Journal ref: IEEE Transactions on Circuits and Systems II: Express Briefs, 2023

  19. arXiv:2404.01082  [pdf, other]

    eess.IV

    The state-of-the-art in Cardiac MRI Reconstruction: Results of the CMRxRecon Challenge in MICCAI 2023

    Authors: Jun Lyu, Chen Qin, Shuo Wang, Fanwen Wang, Yan Li, Zi Wang, Kunyuan Guo, Cheng Ouyang, Michael Tänzer, Meng Liu, Longyu Sun, Mengting Sun, Qin Li, Zhang Shi, Sha Hua, Hao Li, Zhensen Chen, Zhenlin Zhang, Bingyu Xin, Dimitris N. Metaxas, George Yiasemis, Jonas Teuwen, Liping Zhang, Weitian Chen, Yidong Zhao, et al. (25 additional authors not shown)

    Abstract: Cardiac MRI, crucial for evaluating heart structure and function, faces limitations like slow imaging and motion artifacts. Undersampling reconstruction, especially data-driven algorithms, has emerged as a promising solution to accelerate scans and enhance imaging performance using highly under-sampled data. Nevertheless, the scarcity of publicly available cardiac k-space datasets and evaluation p…

    Submitted 16 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: 25 pages, 17 figures

  20. arXiv:2403.13148  [pdf, other]

    eess.IV cs.CV cs.LG

    SIFT-DBT: Self-supervised Initialization and Fine-Tuning for Imbalanced Digital Breast Tomosynthesis Image Classification

    Authors: Yuexi Du, Regina J. Hooley, John Lewin, Nicha C. Dvornek

    Abstract: Digital Breast Tomosynthesis (DBT) is a widely used medical imaging modality for breast cancer screening and diagnosis, offering higher spatial resolution and greater detail through its 3D-like breast volume imaging capability. However, the increased data volume also introduces pronounced data imbalance challenges, where only a small fraction of the volume contains suspicious tissue. This further…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted by IEEE ISBI 2024

  21. arXiv:2403.05111  [pdf, other]

    eess.IV cs.CV

    From Registration Uncertainty to Segmentation Uncertainty

    Authors: Junyu Chen, Yihao Liu, Shuwen Wei, Zhangxing Bian, Aaron Carass, Yong Du

    Abstract: Understanding the uncertainty inherent in deep learning-based image registration models has been an ongoing area of research. Existing methods have been developed to quantify both transformation and appearance uncertainties related to the registration process, elucidating areas where the model may exhibit ambiguity regarding the generated deformation. However, our study reveals that neither uncert…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted by IEEE ISBI'24 ((c) IEEE). Code available at https://bit.ly/42VOZER

  22. arXiv:2403.02565  [pdf, other]

    eess.SP

    Deep Cooperation in ISAC System: Resource, Node and Infrastructure Perspectives

    Authors: Zhiqing Wei, Haotian Liu, Zhiyong Feng, Huici Wu, Fan Liu, Qixun Zhang, Yucong Du

    Abstract: With the emerging Integrated Sensing and Communication (ISAC) technique, exploiting the mobile communication system with multi-domain resources, multiple network elements, and large-scale infrastructures to realize cooperative sensing is a crucial approach satisfying the requirements of high-accuracy and large-scale sensing in IoE. In this article, the deep cooperation in ISAC system including thr…

    Submitted 2 September, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: 8 pages and 6 figures, Accepted by IEEE Internet of Things Magazine

  23. arXiv:2402.05390  [pdf, other]

    cs.NI eess.SP

    Integrated Sensing and Communication Driven Digital Twin for Intelligent Machine Network

    Authors: Zhiqing Wei, Yucong Du, Qixun Zhang, Wangjun Jiang, Yanpeng Cui, Zeyang Meng, Huici Wu, Zhiyong Feng

    Abstract: Intelligent machines (IMs), including industrial machines, unmanned aerial vehicles (UAVs), and unmanned vehicles, can cooperate effectively in complex environments when they form an IM network. Efficient environment sensing and communication are crucial for an IM network, enabling real-time and stable control of IMs. With the emergence of integrated sensing and communication (ISA…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 9 pages, 5 figures, 1 Table

    ACM Class: C.2.1

  24. arXiv:2402.02694  [pdf, other]

    eess.AS cs.LG cs.SD

    Description on IEEE ICME 2024 Grand Challenge: Semi-supervised Acoustic Scene Classification under Domain Shift

    Authors: Jisheng Bai, Mou Wang, Haohe Liu, Han Yin, Yafei Jia, Siwei Huang, Yutong Du, Dongzhe Zhang, Dongyuan Shi, Woon-Seng Gan, Mark D. Plumbley, Susanto Rahardja, Bin Xiang, Jianfeng Chen

    Abstract: Acoustic scene classification (ASC) is a crucial research problem in computational auditory scene analysis, and it aims to recognize the unique acoustic characteristics of an environment. One of the challenges of the ASC task is the domain shift between training and testing data. Since 2018, ASC challenges have focused on the generalization of ASC models across different recording devices. Althoug…

    Submitted 28 February, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  25. arXiv:2401.10070  [pdf, other]

    cs.CL cs.SD eess.AS

    Communication-Efficient Personalized Federated Learning for Speech-to-Text Tasks

    Authors: Yichao Du, Zhirui Zhang, Linan Yue, Xu Huang, Yuqing Zhang, Tong Xu, Linli Xu, Enhong Chen

    Abstract: To protect privacy and meet legal regulations, federated learning (FL) has gained significant attention for training speech-to-text (S2T) systems, including automatic speech recognition (ASR) and speech translation (ST). However, the commonly used FL approach (i.e., FedAvg) in S2T tasks typically suffers from extensive communication overhead due to multi-round interactions based on the wh…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: ICASSP 2024

  26. Exploiting Multipath Information for Integrated Localization and Sensing via PHD Filtering

    Authors: Yinuo Du, Hanying Zhao, Yang Liu, Xinlei Yu, Yuan Shen

    Abstract: Accurate localization and perception are pivotal for enhancing the safety and reliability of vehicles. However, current localization methods suffer from reduced accuracy when the line-of-sight (LOS) path is obstructed, or a combination of reflections and scatterings is present. In this paper, we present an integrated localization and sensing method that delivers superior performance in complex env…

    Submitted 15 August, 2024; v1 submitted 24 December, 2023; originally announced December 2023.

    Comments: 6 pages, 6 figures. This work has been accepted and published by the IEEE Transactions on Vehicular Technology (2024)

  27. arXiv:2311.15069  [pdf, ps, other]

    cs.IT eess.SP

    Multiuser Beamforming for Partially-Connected Millimeter Wave Massive MIMO

    Authors: Chenhao Qi, Jinlin Hu, Yang Du, Arumugam Nallanathan

    Abstract: Multiuser beamforming is considered for partially-connected millimeter wave massive MIMO systems. Based on perfect channel state information (CSI), a low-complexity hybrid beamforming scheme that decouples the analog beamformer and the digital beamformer is proposed to maximize the sum-rate. The analog beamformer design is modeled as a phase alignment problem to harvest the array gain. Given the a…

    Submitted 25 November, 2023; originally announced November 2023.

  28. arXiv:2311.13785  [pdf, other]

    eess.SY

    Federated Learning Assisted Distributed Energy Optimization

    Authors: Yuhan Du, Nuno Mendes, Simin Rasouli, Javad Mohammadi, Pedro Moura

    Abstract: The increased penetration of distributed energy resources and the adoption of sensing and control technologies are driving the transition from our current centralized electric grid to a distributed system controlled by multiple entities (agents). The Transactive Energy Community (TEC) serves as an established example of this transition. Distributed energy management approaches can effectively addr…

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: 14 pages, 14 figures, submitted for journal IET Renewable Power Generation

  29. arXiv:2311.12190  [pdf, other]

    eess.SY

    Equitable Coordination in Multi-agent Power Systems: Impacts of Computation Granularity

    Authors: Yuhan Du, Javad Mohammadi

    Abstract: The growing integration of distributed energy resources drives the centralized power system towards a decentralized multi-agent network. Operating multi-agent networks significantly relies on inter-agent communications. Computation granularity in this context refers to the number of nodes overseen by an agent. The impact of granularity on equitable power coordination, particularly among marginaliz…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: 5 pages, 10 figures, submitted for 2024 IEEE Power & Energy Society General Meeting

  30. arXiv:2311.08585  [pdf, other]

    cond-mat.mtrl-sci cs.CV cs.LG eess.IV

    Unsupervised segmentation of irradiation-induced order-disorder phase transitions in electron microscopy

    Authors: Arman H Ter-Petrosyan, Jenna A Bilbrey, Christina M Doty, Bethany E Matthews, Le Wang, Yingge Du, Eric Lang, Khalid Hattar, Steven R Spurgeon

    Abstract: We present a method for the unsupervised segmentation of electron microscopy images, which are powerful descriptors of materials and chemical systems. Images are oversegmented into overlapping chips, and similarity graphs are generated from embeddings extracted from a domain-pretrained convolutional neural network (CNN). The Louvain method for community detection is then applied to…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: 7 pages, 3 figures. Accepted to Machine Learning and the Physical Sciences Workshop, NeurIPS 2023

  31. arXiv:2310.00593  [pdf, other]

    eess.SP

    Nonlinear Multi-Carrier System with Signal Clipping: Measurement, Analysis, and Optimization

    Authors: Yuyang Du, Liang Hao, Yiming Lei, Qun Yang, Shiqi Xu

    Abstract: Signal clipping is a classic technique for reducing peak-to-average power ratio (PAPR) in orthogonal frequency division multiplexing (OFDM) systems. It has been widely applied in consumer electronic devices owing to its low complexity and high efficiency. Although clipping reduces the nonlinear distortion caused by power amplifiers (PAs), it induces additional clipping distortion. Optimizing the j…

    Submitted 16 February, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

  32. arXiv:2309.14392  [pdf, other]

    eess.IV cs.CV cs.LG

    Unveiling Fairness Biases in Deep Learning-Based Brain MRI Reconstruction

    Authors: Yuning Du, Yuyang Xue, Rohan Dharmakumar, Sotirios A. Tsaftaris

    Abstract: Deep learning (DL) reconstruction, particularly of MRI, has led to improvements in image fidelity and reduction of acquisition time. In neuroimaging, DL methods can reconstruct high-quality images from undersampled data. However, it is essential to consider fairness in DL algorithms, particularly in terms of demographic characteristics. This study presents the first fairness analysis in a DL-based b…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: Accepted for publication at FAIMI 2023 (Fairness of AI in Medical Imaging) at MICCAI

  33. Cine cardiac MRI reconstruction using a convolutional recurrent network with refinement

    Authors: Yuyang Xue, Yuning Du, Gianluca Carloni, Eva Pachetti, Connor Jordan, Sotirios A. Tsaftaris

    Abstract: Cine Magnetic Resonance Imaging (MRI) allows for understanding of the heart's function and condition in a non-invasive manner. Undersampling of the $k$-space is employed to reduce the scan duration, thus increasing patient comfort and reducing the risk of motion artefacts, at the cost of reduced image quality. In this challenge paper, we investigate the use of a convolutional recurrent neural netw…

    Submitted 23 September, 2023; originally announced September 2023.

    Comments: MICCAI STACOM workshop 2023

  34. arXiv:2309.03641  [pdf, other]

    cs.SD cs.CV eess.AS

    Spiking Structured State Space Model for Monaural Speech Enhancement

    Authors: Yu Du, Xu Liu, Yansong Chua

    Abstract: Speech enhancement seeks to extract clean speech from noisy signals. Traditional deep learning methods face two challenges: efficiently using information in long speech sequences and high computational costs. To address these, we introduce the Spiking Structured State Space Model (Spiking-S4). This approach merges the energy efficiency of Spiking Neural Networks (SNN) with the long-range sequence…

    Submitted 20 April, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

  35. arXiv:2308.15742  [pdf, other]

    cs.SD cs.AI cs.SE eess.AS

    ASTER: Automatic Speech Recognition System Accessibility Testing for Stutterers

    Authors: Yi Liu, Yuekang Li, Gelei Deng, Felix Juefei-Xu, Yao Du, Cen Zhang, Chengwei Liu, Yeting Li, Lei Ma, Yang Liu

    Abstract: The popularity of automatic speech recognition (ASR) systems nowadays leads to an increasing need for improving their accessibility. Handling stuttering speech is an important feature for accessible ASR systems. To improve the accessibility of ASR systems for stutterers, we need to expose and analyze the failures of ASR systems on stuttering speech. The speech datasets recorded from stutterers are…

    Submitted 29 August, 2023; originally announced August 2023.

  36. A Robust Planning Model for Offshore Microgrid Considering Tidal Power and Desalination

    Authors: Zhimeng Wang, Ang Xuan, Xinwei Shen, Yunfei Du, Hongbin Sun

    Abstract: Increasing attention has been paid to resources on islands, so investment in island microgrids is needed. Different from onshore microgrids, offshore microgrids (OM) are usually abundant in ocean renewable energy (ORE), such as offshore wind, tidal power generation (TPG), etc. Moreover, some special loads such as seawater desalination units (SDU) should be included. In this sense, this paper prop…

    Submitted 11 August, 2023; originally announced August 2023.

  37. arXiv:2307.16518  [pdf, other]

    cs.IT eess.SP

    Continuous-Time Channel Prediction Based on Tensor Neural Ordinary Differential Equation

    Authors: Mingyao Cui, Hao Jiang, Yuhao Chen, Yang Du, Linglong Dai

    Abstract: Channel prediction is critical to address the channel aging issue in mobile scenarios. Existing channel prediction techniques are mainly designed for discrete channel prediction, which can only predict the future channel in a fixed time slot per frame, while the other intra-frame channels are usually recovered by interpolation. However, these approaches suffer from a serious interpolation loss, es…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: A tensor neural ODE based method is proposed to predict continuous-time wireless channels

  38. arXiv:2307.15615  [pdf, other]

    eess.IV cs.CV

    A survey on deep learning in medical image registration: new technologies, uncertainty, evaluation metrics, and beyond

    Authors: Junyu Chen, Yihao Liu, Shuwen Wei, Zhangxing Bian, Shalini Subramanian, Aaron Carass, Jerry L. Prince, Yong Du

    Abstract: Deep learning technologies have dramatically reshaped the field of medical image registration over the past decade. The initial developments, such as regression-based and U-Net-based networks, established the foundation for deep learning in image registration. Subsequent progress has been made in various aspects of deep learning-based registration, including similarity measures, deformation regula…

    Submitted 30 April, 2024; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: A list of open-sourced code from the papers reviewed has been organized and is available at https://bit.ly/3QgFJ9z

  39. arXiv:2307.07319  [pdf, other]

    eess.SP

    The Power of Large Language Models for Wireless Communication System Development: A Case Study on FPGA Platforms

    Authors: Yuyang Du, Hongyu Deng, Soung Chang Liew, Kexin Chen, Yulin Shao, He Chen

    Abstract: Large language models (LLMs) have garnered significant attention across various research disciplines, including the wireless communication community. There have been several heated discussions on the intersection of LLMs and wireless technologies. While recent studies have demonstrated the ability of LLMs to generate hardware description language (HDL) code for simple computation tasks, developing…

    Submitted 14 July, 2024; v1 submitted 14 July, 2023; originally announced July 2023.

  40. arXiv:2305.14374  [pdf, other]

    cs.LG eess.SY

    Inferring Attracting Basins of Power System with Machine Learning

    Authors: Yao Du, Qing Li, Huawei Fan, Meng Zhan, Jinghua Xiao, Xingang Wang

    Abstract: Power systems dominated by renewable energy frequently encounter large, random disturbances, and a critical challenge faced in power-system management is how to anticipate accurately whether the perturbed systems will return to the functional state after the transient or collapse. Whereas model-based studies show that the key to addressing the challenge lies in the attracting basins of the functio…

    Submitted 20 May, 2023; originally announced May 2023.

    Comments: 13 pages, 7 figures

  41. arXiv:2304.08490  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Conditional Generation of Audio from Video via Foley Analogies

    Authors: Yuexi Du, Ziyang Chen, Justin Salamon, Bryan Russell, Andrew Owens

    Abstract: The sound effects that designers add to videos are designed to convey a particular artistic effect and, thus, may be quite different from a scene's true sound. Inspired by the challenges of creating a soundtrack for a video that differs from its true sound, but that nonetheless matches the actions occurring on screen, we propose the problem of conditional Foley. We present the following contributi…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  42. arXiv:2303.06179  [pdf, other]

    eess.IV cs.CV

    Deformable Cross-Attention Transformer for Medical Image Registration

    Authors: Junyu Chen, Yihao Liu, Yufan He, Yong Du

    Abstract: Transformers have recently shown promise for medical image applications, leading to an increasing interest in developing such models for medical image registration. Recent advancements in designing registration Transformers have focused on using cross-attention (CA) to enable a more precise understanding of spatial correspondences between moving and fixed images. Here, we propose a novel CA mechan…

    Submitted 10 March, 2023; originally announced March 2023.

  43. arXiv:2303.06168  [pdf, other]

    eess.IV cs.CV

    Spatially-varying Regularization with Conditional Transformer for Unsupervised Image Registration

    Authors: Junyu Chen, Yihao Liu, Yufan He, Yong Du

    Abstract: In the past, optimization-based registration models have used spatially-varying regularization to account for deformation variations in different image regions. However, deep learning-based registration models have mostly relied on spatially-invariant regularization. Here, we introduce an end-to-end framework that uses neural networks to learn a spatially-varying deformation regularizer directly f…

    Submitted 10 March, 2023; originally announced March 2023.

  44. arXiv:2303.04015  [pdf, ps, other]

    eess.SY

    Simultaneous Recursive Identification of Parameters and Switching Manifolds Identification of Discrete-Time Switched Linear Systems

    Authors: Zengjie Zhang, Yingwei Du, Tong Liu, Fangzhou Liu, Martin Buss

    Abstract: A novel procedure for the online identification of a class of discrete-time switched linear systems, which simultaneously estimates the parameters and switching manifolds of the systems, is proposed in this paper. Firstly, to estimate the parameters of the subsystems, a discrete-time concurrent learning-based recursive parameter estimator is designed to guarantee the exponential convergence of the…

    Submitted 7 March, 2023; originally announced March 2023.

  45. Integrated Sensing and Communication Signals Toward 5G-A and 6G: A Survey

    Authors: Zhiqing Wei, Hanyang Qu, Yuan Wang, Xin Yuan, Huici Wu, Ying Du, Kaifeng Han, Ning Zhang, Zhiyong Feng

    Abstract: Integrated sensing and communication (ISAC) has the advantages of efficient spectrum utilization and low hardware cost. It is promising to be implemented in the fifth-generation-advanced (5G-A) and sixth-generation (6G) mobile communication systems, having the potential to be applied in intelligent applications requiring both communication and high-accurate sensing capabilities. As the fundamental…

    Submitted 15 December, 2023; v1 submitted 10 January, 2023; originally announced January 2023.

    Comments: 25 pages, 13 figures, 8 tables. IEEE Internet of Things Journal, 2023

    MSC Class: 94-02 ACM Class: A.1

  46. arXiv:2212.12134  [pdf, other]

    eess.SP

    AMDET: Attention based Multiple Dimensions EEG Transformer for Emotion Recognition

    Authors: Yongling Xu, Yang Du, Jing Zou, Tianying Zhou, Lushan Xiao, Li Liu, Pengcheng

    Abstract: Affective computing is an important branch of artificial intelligence, and with the rapid development of brain computer interface technology, emotion recognition based on EEG signals has received broad attention. It is still a great challenge to effectively explore the multi-dimensional information in the EEG data in spite of a large number of deep learning methods. In this paper, we propose a dee…

    Submitted 22 December, 2022; originally announced December 2022.

  47. arXiv:2212.02715  [pdf, other]

    eess.SY cs.AI cs.LG math.OC

    Efficient Learning of Voltage Control Strategies via Model-based Deep Reinforcement Learning

    Authors: Ramij R. Hossain, Tianzhixi Yin, Yan Du, Renke Huang, Jie Tan, Wenhao Yu, Yuan Liu, Qiuhua Huang

    Abstract: This article proposes a model-based deep reinforcement learning (DRL) method to design emergency control strategies for short-term voltage stability problems in power systems. Recent advances show promising results in model-free DRL-based methods for power systems, but model-free methods suffer from poor sample efficiency and training time, both critical for making state-of-the-art DRL algorithms…

    Submitted 5 December, 2022; originally announced December 2022.

  48. arXiv:2210.14644  [pdf, ps, other]

    cs.SD eess.AS

    Speaker Diarization Based on Multi-channel Microphone Array in Small-scale Meeting

    Authors: Yuxuan Du, Ruohua Zhou

    Abstract: In the task of speaker diarization, the number of small-scale meetings accounts for a large proportion. When microphone arrays are employed as a recording device, its spatial information is usually ignored by most researchers. In this paper, inspired by the clustering method combining d-vector and microphone array spatial vector, we proposed a diarization method which using multi-channel microphon…

    Submitted 26 October, 2022; originally announced October 2022.

  49. arXiv:2209.09635  [pdf]

    cs.SD eess.AS

    The BUCEA Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2022

    Authors: Ruohua Zhou, Yuxuan Du, Chenlei Hu

    Abstract: This paper describes the BUCEA speaker diarization system for the 2022 VoxCeleb Speaker Recognition Challenge. Voxsrc-22 provides the development set and test set of VoxConverse, and we mainly use the test set of VoxConverse for parameter adjustment. Our system consists of several modules, including speech activity detection (VAD), speaker embedding extractor, clustering methods, overlapping speec…

    Submitted 20 September, 2022; originally announced September 2022.

  50. arXiv:2208.08654  [pdf, other]

    cs.IT eess.SP

    Rethinking the Performance of ISAC System: From Efficiency and Utility Perspectives

    Authors: Jiamo Jiang, Mingfeng Xu, Zhongyuan Zhao, Kaifeng Han, Yang Li, Ying Du, Zhiqin Wang

    Abstract: Integrated sensing and communications (ISAC) is an essential technology for the 6G communication system, which enables the conventional wireless communication network capable of sensing targets around. The shared use of pilots is a promising strategy to achieve ISAC. It brings a trade-off between communication and sensing, which is still unclear under the imperfect channel estimation condition. To…

    Submitted 18 August, 2022; originally announced August 2022.