
Showing 1–50 of 15,272 results for author: Wang, X

  1. arXiv:2410.13830  [pdf, other]

    cs.CV

    DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control

    Authors: Yujie Wei, Shiwei Zhang, Hangjie Yuan, Xiang Wang, Haonan Qiu, Rui Zhao, Yutong Feng, Feng Liu, Zhizhong Huang, Jiaxin Ye, Yingya Zhang, Hongming Shan

    Abstract: Recent advances in customized video generation have enabled users to create videos tailored to both specific subjects and motion trajectories. However, existing methods often require complicated test-time fine-tuning and struggle with balancing subject learning and motion control, limiting their real-world applications. In this paper, we present DreamVideo-2, a zero-shot video customization framew…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: Project page: https://dreamvideo2.github.io/

  2. arXiv:2410.13782  [pdf, other]

    cs.LG q-bio.QM

    DPLM-2: A Multimodal Diffusion Protein Language Model

    Authors: Xinyou Wang, Zaixiang Zheng, Fei Ye, Dongyu Xue, Shujian Huang, Quanquan Gu

    Abstract: Proteins are essential macromolecules defined by their amino acid sequences, which determine their three-dimensional structures and, consequently, their functions in all living organisms. Therefore, generative protein modeling necessitates a multimodal approach to simultaneously model, understand, and generate both sequences and structures. However, existing methods typically use separate models f…

    Submitted 17 October, 2024; originally announced October 2024.

  3. arXiv:2410.13748  [pdf, other]

    hep-ex

    Test of lepton flavour universality with $B_s^0 \rightarrow φ\ell^+\ell^-$ decays

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1124 additional authors not shown)

    Abstract: Lepton flavour universality in rare $b\rightarrow s$ transitions is tested for the first time using $B_s^0$ meson decays. The measurements are performed using $pp$ collision data collected by the LHCb experiment between 2011 and 2018, corresponding to a total integrated luminosity of 9$\,{\rm fb}^{-1}$. Branching fraction ratios between the $B_s^0 \rightarrow φe^+e^-$ and…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/3513/ (LHCb public pages)

    Report number: LHCb-PAPER-2024-032, CERN-EP-2024-255

  4. arXiv:2410.13610  [pdf, other]

    cs.AI cs.CL

    MeNTi: Bridging Medical Calculator and LLM Agent with Nested Tool Calling

    Authors: Yakun Zhu, Shaohang Wei, Xu Wang, Kui Xue, Xiaofan Zhang, Shaoting Zhang

    Abstract: Integrating tools into Large Language Models (LLMs) has facilitated the widespread application. Despite this, in specialized downstream task contexts, reliance solely on tools is insufficient to fully address the complexities of the real world. This particularly restricts the effective deployment of LLMs in fields such as medicine. In this paper, we focus on the downstream tasks of medical calcula…

    Submitted 17 October, 2024; originally announced October 2024.

  5. arXiv:2410.13571  [pdf, other]

    cs.CV

    DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation

    Authors: Guosheng Zhao, Chaojun Ni, Xiaofeng Wang, Zheng Zhu, Guan Huang, Xinze Chen, Boyuan Wang, Youyi Zhang, Wenjun Mei, Xingang Wang

    Abstract: Closed-loop simulation is essential for advancing end-to-end autonomous driving systems. Contemporary sensor simulation methods, such as NeRF and 3DGS, rely predominantly on conditions closely aligned with training data distributions, which are largely confined to forward-driving scenarios. Consequently, these methods face limitations when rendering complex maneuvers (e.g., lane change, accelerati…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: https://drivedreamer4d.github.io

  6. arXiv:2410.13515  [pdf, other]

    hep-ex hep-lat hep-ph nucl-ex

    Observation of a rare beta decay of the charmed baryon with a Graph Neural Network

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (637 additional authors not shown)

    Abstract: The study of beta decay of the charmed baryon provides unique insights into the fundamental mechanism of the strong and electro-weak interactions. The $Λ_c^+$, being the lightest charmed baryon, undergoes disintegration solely through the charm quark weak decay. Its beta decay provides an ideal laboratory for investigating non-perturbative effects in quantum chromodynamics and for constraining the…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: 28 pages, 6 figures

  7. arXiv:2410.13490  [pdf, other]

    cs.RO cs.LG

    Novelty-based Sample Reuse for Continuous Robotics Control

    Authors: Ke Duan, Kai Yang, Houde Liu, Xueqian Wang

    Abstract: In reinforcement learning, agents collect state information and rewards through environmental interactions, essential for policy refinement. This process is notably time-consuming, especially in complex robotic simulations and real-world applications. Traditional algorithms usually re-engage with the environment after processing a single batch of samples, thereby failing to fully capitalize on his…

    Submitted 17 October, 2024; originally announced October 2024.

  8. arXiv:2410.13478  [pdf, other]

    hep-ex

    Observation of $χ_{c0}\toΣ^{+}\barΣ^{-}η$ and evidence for $χ_{c1,2}\toΣ^{+}\barΣ^{-}η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (634 additional authors not shown)

    Abstract: Using $(27.12\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector, the decay $χ_{c0}\toΣ^{+}\barΣ^{-}η$ is observed for the first time with a statistical significance of $7.0σ$, and evidence for $χ_{c1}\toΣ^{+}\barΣ^{-}η$ and $χ_{c2}\toΣ^{+}\barΣ^{-}η$ is found with statistical significances of $4.3σ$ and $4.6σ$, respectively. The branching fractions are determined to be…

    Submitted 17 October, 2024; originally announced October 2024.

  9. arXiv:2410.13428  [pdf, other]

    cs.IR

    Generate and Instantiate What You Prefer: Text-Guided Diffusion for Sequential Recommendation

    Authors: Guoqing Hu, Zhangyi Yang, Zhibo Cai, An Zhang, Xiang Wang

    Abstract: Recent advancements in generative recommendation systems, particularly in the realm of sequential recommendation tasks, have shown promise in enhancing generalization to new items. Among these approaches, diffusion-based generative recommendation has emerged as an effective tool, leveraging its ability to capture data distributions and generate high-quality samples. Despite effectiveness, two prim…

    Submitted 17 October, 2024; originally announced October 2024.

  10. arXiv:2410.13409  [pdf, other]

    cs.CL cs.AI

    Attr-Int: A Simple and Effective Entity Alignment Framework for Heterogeneous Knowledge Graphs

    Authors: Linyan Yang, Jingwei Cheng, Chuanhao Xu, Xihao Wang, Jiayi Li, Fu Zhang

    Abstract: Entity alignment (EA) refers to the task of linking entities in different knowledge graphs (KGs). Existing EA methods rely heavily on structural isomorphism. However, in real-world KGs, aligned entities usually have non-isomorphic neighborhood structures, which paralyses the application of these structure-dependent methods. In this paper, we investigate and tackle the problem of entity alignment b…

    Submitted 17 October, 2024; originally announced October 2024.

  11. arXiv:2410.13373  [pdf, other]

    cs.LG

    Addressing Heterogeneity and Heterophily in Graphs: A Heterogeneous Heterophilic Spectral Graph Neural Network

    Authors: Kangkang Lu, Yanhua Yu, Zhiyong Huang, Jia Li, Yuling Wang, Meiyu Liang, Xiting Qin, Yimeng Ren, Tat-Seng Chua, Xidian Wang

    Abstract: Graph Neural Networks (GNNs) have garnered significant scholarly attention for their powerful capabilities in modeling graph structures. Despite this, two primary challenges persist: heterogeneity and heterophily. Existing studies often address heterogeneous and heterophilic graphs separately, leaving a research gap in the understanding of heterogeneous heterophilic graphs-those that feature diver…

    Submitted 17 October, 2024; originally announced October 2024.

  12. arXiv:2410.13368  [pdf, other]

    hep-ex hep-ph

    Observation of the Singly Cabibbo-Suppressed Decay $Λ_c^{+}\to pπ^0$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (638 additional authors not shown)

    Abstract: Utilizing 4.5${~\rm{fb}}^{-1}$ of $e^+e^-$ annihilation data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 4.600 and 4.699 GeV, the first observation of the singly Cabibbo-suppressed decay $Λ_c^{+}\to pπ^0$ is presented, with a statistical significance of $5.4σ$. The ratio of the branching fractions of $Λ_c^{+}\to pπ^0$ and $Λ_c^{+}\to pη$ is measured…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: 9 pages, 4 figures

  13. arXiv:2410.13355  [pdf, ps, other]

    cs.CV

    Self-Supervised Scene Flow Estimation with Point-Voxel Fusion and Surface Representation

    Authors: Xuezhi Xiang, Xi Wang, Lei Zhang, Denis Ombati, Himaloy Himu, Xiantong Zhen

    Abstract: Scene flow estimation aims to generate the 3D motion field of points between two consecutive frames of point clouds, which has wide applications in various fields. Existing point-based methods ignore the irregularity of point clouds and have difficulty capturing long-range dependencies due to the inefficiency of point-level computation. Voxel-based methods suffer from the loss of detail informatio…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: The paper is under consideration at 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025)

  14. arXiv:2410.13333  [pdf, other]

    cs.DC

    Malleus: Straggler-Resilient Hybrid Parallel Training of Large-scale Models via Malleable Data and Model Parallelization

    Authors: Haoyang Li, Fangcheng Fu, Hao Ge, Sheng Lin, Xuanyu Wang, Jiawen Niu, Yujie Wang, Hailin Zhang, Xiaonan Nie, Bin Cui

    Abstract: As the scale of models and training data continues to grow, there is an expanding reliance on more GPUs to train large-scale models, which inevitably increases the likelihood of encountering dynamic stragglers that some devices lag behind in performance occasionally. However, hybrid parallel training, one of the de facto paradigms to train large models, is typically sensitive to the stragglers.…

    Submitted 17 October, 2024; originally announced October 2024.

  15. arXiv:2410.13294  [pdf, other]

    cs.CV

    LESS: Label-Efficient and Single-Stage Referring 3D Segmentation

    Authors: Xuexun Liu, Xiaoxu Xu, Jinlong Li, Qiudan Zhang, Xu Wang, Nicu Sebe, Lin Ma

    Abstract: Referring 3D Segmentation is a visual-language task that segments all points of the specified object from a 3D point cloud described by a sentence of query. Previous works perform a two-stage paradigm, first conducting language-agnostic instance segmentation then matching with given text query. However, the semantic concepts from text query and visual cues are separately interacted during the trai…

    Submitted 17 October, 2024; originally announced October 2024.

  16. arXiv:2410.13268  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Roadmap towards Superhuman Speech Understanding using Large Language Models

    Authors: Fan Bu, Yuhao Zhang, Xidong Wang, Benyou Wang, Qun Liu, Haizhou Li

    Abstract: The success of large language models (LLMs) has prompted efforts to integrate speech and audio data, aiming to create general foundation models capable of processing both textual and non-textual inputs. Recent advances, such as GPT-4o, highlight the potential for end-to-end speech LLMs, which preserves non-semantic information and world knowledge for deeper speech understanding. To guide the devel…

    Submitted 17 October, 2024; originally announced October 2024.

  17. arXiv:2410.13184  [pdf, other]

    cs.CL

    Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers

    Authors: Shwai He, Tao Ge, Guoheng Sun, Bowei Tian, Xiaoyang Wang, Ang Li, Dong Yu

    Abstract: Traditional transformer models often allocate a fixed amount of computational resources to every input token, leading to inefficient and unnecessary computation. To address this, the Mixture of Depths (MoD) was introduced to dynamically adjust the computational depth by skipping less important layers. Despite its promise, current MoD approaches remain under-explored and face two main challenges: (…

    Submitted 16 October, 2024; originally announced October 2024.

  18. arXiv:2410.13115  [pdf, other]

    stat.ME

    Online conformal inference for multi-step time series forecasting

    Authors: Xiaoqian Wang, Rob J Hyndman

    Abstract: We consider the problem of constructing distribution-free prediction intervals for multi-step time series forecasting, with a focus on the temporal dependencies inherent in multi-step forecast errors. We establish that the optimal $h$-step-ahead forecast errors exhibit serial correlation up to lag $(h-1)$ under a general non-stationary autoregressive data generating process. To leverage these prop…

    Submitted 16 October, 2024; originally announced October 2024.

  19. arXiv:2410.13105  [pdf, other]

    cs.SI cs.CE

    AgileRate: Bringing Adaptivity and Robustness to DeFi Lending Markets

    Authors: Mahsa Bastankhah, Viraj Nadkarni, Xuechao Wang, Pramod Viswanath

    Abstract: Decentralized Finance (DeFi) has revolutionized lending by replacing intermediaries with algorithm-driven liquidity pools. However, existing platforms like Aave and Compound rely on static interest rate curves and collateral requirements that struggle to adapt to rapid market changes, leading to inefficiencies in utilization and increased risks of liquidations. In this work, we propose a dynamic m…

    Submitted 16 October, 2024; originally announced October 2024.

  20. arXiv:2410.13083  [pdf, other]

    cs.LG cs.AI cs.CR

    FedCAP: Robust Federated Learning via Customized Aggregation and Personalization

    Authors: Youpeng Li, Xinda Wang, Fuxun Yu, Lichao Sun, Wenbin Zhang, Xuyu Wang

    Abstract: Federated learning (FL), an emerging distributed machine learning paradigm, has been applied to various privacy-preserving scenarios. However, due to its distributed nature, FL faces two key issues: the non-independent and identical distribution (non-IID) of user data and vulnerability to Byzantine threats. To address these challenges, in this paper, we propose FedCAP, a robust FL framework agains…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 14 pages, 12 figures, 5 tables, accepted by 2024 Annual Computer Security Applications Conference (ACSAC 2024)

  21. arXiv:2410.13068  [pdf, other]

    hep-ph hep-ex hep-lat nucl-ex nucl-th

    Sketching pion and proton mass distributions

    Authors: Xiaobin Wang, Zanbin Xing, Lei Chang, Minghui Ding, Khépani Raya, Craig D. Roberts

    Abstract: A light-front holographic model is used to illustrate an algebraic scheme for constructing a representation of a hadron's zero-skewness generalised parton distribution (GPD) from its valence-quark distribution function (DF) and electromagnetic form factor, $F_H$, without reference to deeply virtual Compton scattering data. The hadron's mass distribution gravitational form factor, $A_H$, calculated…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 7 pages, 6 figures

    Report number: NJU-INP 093/24

  22. arXiv:2410.12836  [pdf, other]

    cs.GR cs.AI cs.CV cs.HC

    EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing

    Authors: Kaizhi Zheng, Xiaotong Chen, Xuehai He, Jing Gu, Linjie Li, Zhengyuan Yang, Kevin Lin, Jianfeng Wang, Lijuan Wang, Xin Eric Wang

    Abstract: Given the steep learning curve of professional 3D software and the time-consuming process of managing large 3D assets, language-guided 3D scene editing has significant potential in fields such as virtual reality, augmented reality, and gaming. However, recent approaches to language-guided 3D scene editing either require manual interventions or focus only on appearance modifications without support…

    Submitted 3 October, 2024; originally announced October 2024.

  23. arXiv:2410.12700  [pdf, other]

    cs.CV cs.AI cs.CY cs.LG cs.MM

    Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization

    Authors: Xingqi Wang, Xiaoyuan Yi, Xing Xie, Jia Jia

    Abstract: Recent advancements in diffusion models trained on large-scale data have enabled the generation of indistinguishable human-level images, yet they often produce harmful content misaligned with human values, e.g., social bias, and offensive content. Despite extensive research on Large Language Models (LLMs), the challenge of Text-to-Image (T2I) model alignment remains largely unexplored. Addressing…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: Accepted by ACM Multimedia 2024. The dataset and code can be found at https://github.com/achernarwang/LiVO

  24. arXiv:2410.12690  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    Local transfer learning Gaussian process modeling, with applications to surrogate modeling of expensive computer simulators

    Authors: Xinming Wang, Simon Mak, John Miller, Jianguo Wu

    Abstract: A critical bottleneck for scientific progress is the costly nature of computer simulations for complex systems. Surrogate models provide an appealing solution: such models are trained on simulator evaluations, then used to emulate and quantify uncertainty on the expensive simulator at unexplored inputs. In many applications, one often has available data on related systems. For example, in designin…

    Submitted 16 October, 2024; v1 submitted 16 October, 2024; originally announced October 2024.

  25. arXiv:2410.12624  [pdf]

    cond-mat.supr-con

    Field-free superconducting diode effect and magnetochiral anisotropy in FeTe0.7Se0.3 junctions with the inherent asymmetric barrier

    Authors: Shengyao Li, Ya Deng, Dianyi Hu, Chao Zhu, Zherui Yang, Wanghao Tian, Xueyan Wang, Ming Yue, Qiong Wu, Zheng Liu, Xiao Renshaw Wang

    Abstract: Nonreciprocal electrical transport, characterized by an asymmetric relationship between current and voltage, plays a crucial role in modern electronic industries. Recent studies have extended this phenomenon to superconductors, introducing the concept of the superconducting diode effect (SDE). The SDE is characterized by unequal critical supercurrents along opposite directions. Due to the requirem…

    Submitted 16 October, 2024; originally announced October 2024.

  26. arXiv:2410.12620  [pdf, other]

    hep-ex

    Search for $e^{+}e^{-} \to φχ_{c0}$ and $φη_{c2}(1D)$ at center-of-mass energies from 4.47 to 4.95 GeV

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (644 additional authors not shown)

    Abstract: Utilizing a data set of $6.7$ fb$^{-1}$ from electron-positron collisions recorded by the BESIII detector at the BEPCII storage ring, a search is conducted for the processes $e^{+}e^{-} \to φχ_{c0}$ and $φη_{c2}(1D)$ across center-of-mass energies from 4.47 to 4.95 GeV. In the absence of any significant signals, upper limits are set. These include limits on the Born cross sections for…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 14 pages, 6 figures

  27. arXiv:2410.12519  [pdf, other]

    cs.IR

    RosePO: Aligning LLM-based Recommenders with Human Values

    Authors: Jiayi Liao, Xiangnan He, Ruobing Xie, Jiancan Wu, Yancheng Yuan, Xingwu Sun, Zhanhui Kang, Xiang Wang

    Abstract: Recently, there has been a growing interest in leveraging Large Language Models (LLMs) for recommendation systems, which usually adapt a pre-trained LLM to the recommendation scenario through supervised fine-tuning (SFT). However, both the pre-training and SFT stages fail to explicitly model the comparative relationships of a user's preferences on different items. To construct a "helpful and harml…

    Submitted 16 October, 2024; originally announced October 2024.

  28. arXiv:2410.12457  [pdf, other]

    cs.LG cs.AI

    Sharpness-Aware Black-Box Optimization

    Authors: Feiyang Ye, Yueming Lyu, Xuehao Wang, Masashi Sugiyama, Yu Zhang, Ivor Tsang

    Abstract: Black-box optimization algorithms have been widely used in various machine learning problems, including reinforcement learning and prompt fine-tuning. However, directly optimizing the training loss value, as commonly done in existing black-box optimization methods, could lead to suboptimal model quality and generalization performance. To address those problems in black-box optimization, we propose…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 27 pages, 5 figures

  29. arXiv:2410.12352  [pdf, other]

    cs.CE

    Private Order Flows and Builder Bidding Dynamics: The Road to Monopoly in Ethereum's Block Building Market

    Authors: Shuzheng Wang, Yue Huang, Wenqin Zhang, Yuming Huang, Xuechao Wang, Jing Tang

    Abstract: Ethereum, as a representative of Web3, adopts a novel framework called Proposer Builder Separation (PBS) to prevent the centralization of block profits in the hands of institutional Ethereum stakers. Introducing builders to generate blocks based on public transactions, PBS aims to ensure that block profits are distributed among all stakers. Through the auction among builders, only one will win the…

    Submitted 16 October, 2024; originally announced October 2024.

  30. arXiv:2410.12298  [pdf, other]

    cs.CL cs.AI

    Pyramid-Driven Alignment: Pyramid Principle Guided Integration of Large Language Models and Knowledge Graphs

    Authors: Lei Sun, Xinchen Wang, Youdi Li

    Abstract: Large Language Models (LLMs) possess impressive reasoning abilities but are prone to generating incorrect information, often referred to as hallucinations. While incorporating external Knowledge Graphs (KGs) can partially mitigate this issue, existing methods primarily treat KGs as static knowledge repositories, overlooking the critical disparity between KG and LLM knowledge, and failing to fully…

    Submitted 17 October, 2024; v1 submitted 16 October, 2024; originally announced October 2024.

  31. arXiv:2410.12119  [pdf, other]

    cs.LG cs.CL

    Scaling laws for post-training quantized large language models

    Authors: Zifei Xu, Alexander Lan, Wanzin Yazar, Tristan Webb, Sayeh Sharify, Xin Wang

    Abstract: Generalization abilities of well-trained large language models (LLMs) are known to scale predictably as a function of model size. In contrast to the existence of practical scaling laws governing pre-training, the quality of LLMs after post-training compression remains highly unpredictable, often requiring case-by-case validation in practice. In this work, we attempted to close this gap for post-tr…

    Submitted 15 October, 2024; originally announced October 2024.

  32. arXiv:2410.11878  [pdf, other]

    cs.CV cs.AI cs.LG

    Neural Metamorphosis

    Authors: Xingyi Yang, Xinchao Wang

    Abstract: This paper introduces a new learning paradigm termed Neural Metamorphosis (NeuMeta), which aims to build self-morphable neural networks. Contrary to crafting separate models for different architectures or sizes, NeuMeta directly learns the continuous weight manifold of neural networks. Once trained, we can sample weights for any-sized network directly from the manifold, even for previously unseen…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: in ECCV2024, https://adamdad.github.io/neumeta/

  33. Analysis and Benchmarking of Extending Blind Face Image Restoration to Videos

    Authors: Zhouxia Wang, Jiawei Zhang, Xintao Wang, Tianshui Chen, Ying Shan, Wenping Wang, Ping Luo

    Abstract: Recent progress in blind face restoration has resulted in producing high-quality restored results for static images. However, efforts to extend these advancements to video scenarios have been minimal, partly because of the absence of benchmarks that allow for a comprehensive and fair comparison. In this work, we first present a fair evaluation benchmark, in which we first introduce a Real-world Lo…

    Submitted 15 October, 2024; originally announced October 2024.

    Comments: Accepted by TIP'2024; Project page: https://wzhouxiff.github.io/projects/FIR2FVR/FIR2FVR

    Journal ref: IEEE Trans Image Process. 2024;33:5676-5687. Epub 2024 Oct 9. PMID: 39316481

  34. arXiv:2410.11824  [pdf, other]

    cs.CV

    KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities

    Authors: Hsin-Ping Huang, Xinyi Wang, Yonatan Bitton, Hagai Taitelbaum, Gaurav Singh Tomar, Ming-Wei Chang, Xuhui Jia, Kelvin C. K. Chan, Hexiang Hu, Yu-Chuan Su, Ming-Hsuan Yang

    Abstract: Recent advancements in text-to-image generation have significantly enhanced the quality of synthesized images. Despite this progress, evaluations predominantly focus on aesthetic appeal or alignment with text prompts. Consequently, there is limited understanding of whether these models can accurately represent a wide variety of realistic visual entities - a task requiring real-world knowledge. To…

    Submitted 15 October, 2024; originally announced October 2024.

    Comments: Project page: https://kitten-project.github.io/

  35. arXiv:2410.11772  [pdf, other]

    cs.CL cs.LG

    Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models

    Authors: Kai Yao, Penlei Gao, Lichun Li, Yuan Zhao, Xiaofeng Wang, Wei Wang, Jianke Zhu

    Abstract: Parameter-Efficient Fine-Tuning (PEFT) methods have gained significant popularity for adapting pre-trained Large Language Models (LLMs) to downstream tasks, primarily due to their potential to significantly reduce memory and computational overheads. However, a common limitation in most PEFT approaches is their application of a uniform architectural design across all layers. This uniformity involve…

    Submitted 15 October, 2024; originally announced October 2024.

    Comments: EMNLP 2024

  36. arXiv:2410.11713  [pdf, other]

    stat.ME

    Enhancing Statistical Validity and Power in Hybrid Controlled Trials: A Randomization Inference Approach with Conformal Selective Borrowing

    Authors: Ke Zhu, Shu Yang, Xiaofei Wang

    Abstract: Randomized controlled trials (RCTs) are the gold standard for causal inference on treatment effects. However, they can be underpowered due to small population sizes in rare diseases and limited number of patients willing to participate due to questions regarding equipoise among treatment groups in common diseases. Hybrid controlled trials use external controls (ECs) from historical studies or larg…

    Submitted 15 October, 2024; originally announced October 2024.

  37. arXiv:2410.11664  [pdf, ps, other]

    math-ph

    Mathematical Foundation of the U$^N(1)$ Quantum Geometric Tensor

    Authors: Xin Wang, Xu-Yang Hou, Jia-Chen Tang, Hao Guo

    Abstract: In this paper, we systematically establish the mathematical foundation for the $\text{U}^N(1)$ quantum geometric tensor (QGT) of mixed states. Explicitly, we present a description based on the $\text{U}^N(1)$ principal bundle and derive a Pythagorean-like distance decomposition equation. Additionally, we offer a comprehensive comparison of its properties with those of the U(1) principal bundle desc…

    Submitted 15 October, 2024; originally announced October 2024.

  38. arXiv:2410.11607  [pdf, other]

    hep-ex

    Observation of $χ_{cJ}\to p \bar p K^0_S K^- π^+ + c.c.$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (648 additional authors not shown)

    Abstract: By analyzing $(27.12\pm0.14)\times10^8$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, the decays of $χ_{cJ} \to p \bar{p} K^0_S K^- π^+ +c.c.(J=0, 1, 2)$ are observed for the first time with statistical significances greater than $10σ$. The branching fractions of these decays are determined to be…

    Submitted 15 October, 2024; originally announced October 2024.

    Comments: 12 pages, 5 figures

  39. arXiv:2410.11402  [pdf, other]

    cs.RO

    M2Diffuser: Diffusion-based Trajectory Optimization for Mobile Manipulation in 3D Scenes

    Authors: Sixu Yan, Zeyu Zhang, Muzhi Han, Zaijin Wang, Qi Xie, Zhitian Li, Zhehan Li, Hangxin Liu, Xinggang Wang, Song-Chun Zhu

    Abstract: Recent advances in diffusion models have opened new avenues for research into embodied AI agents and robotics. Despite significant achievements in complex robotic locomotion and skills, mobile manipulation-a capability that requires the coordination of navigation and manipulation-remains a challenge for generative AI techniques. This is primarily due to the high-dimensional action space, extended…

    Submitted 15 October, 2024; originally announced October 2024.

  40. arXiv:2410.11345  [pdf, other]

    cs.RO

    Visual Manipulation with Legs

    Authors: Xialin He, Chengjing Yuan, Wenxuan Zhou, Ruihan Yang, David Held, Xiaolong Wang

    Abstract: Animals use limbs for both locomotion and manipulation. We aim to equip quadruped robots with similar versatility. This work introduces a system that enables quadruped robots to interact with objects using their legs, inspired by non-prehensile manipulation. The system has two main components: a visual manipulation policy module and a loco-manipulator module. The visual manipulation policy, traine…

    Submitted 16 October, 2024; v1 submitted 15 October, 2024; originally announced October 2024.

    Comments: More details can be found on our project page: https://legged-manipulation.github.io/

  41. arXiv:2410.11282  [pdf, other]

    eess.SY

    Multi-Objective-Optimization Multi-AUV Assisted Data Collection Framework for IoUT Based on Offline Reinforcement Learning

    Authors: Yimian Ding, Xinqi Wang, Jingzehua Xu, Guanwen Xie, Weiyi Liu, Yi Li

    Abstract: The Internet of Underwater Things (IoUT) offers significant potential for ocean exploration but encounters challenges due to dynamic underwater environments and severe signal attenuation. Current methods relying on Autonomous Underwater Vehicles (AUVs) based on online reinforcement learning (RL) lead to high computational costs and low data utilization. To address these issues and the constraints…

    Submitted 15 October, 2024; originally announced October 2024.

  42. arXiv:2410.11267  [pdf, other]

    cs.LG cs.AI

    FedCCRL: Federated Domain Generalization with Cross-Client Representation Learning

    Authors: Xinpeng Wang, Xiaoying Tang

    Abstract: Domain Generalization (DG) aims to train models that can effectively generalize to unseen domains. However, in the context of Federated Learning (FL), where clients collaboratively train a model without directly sharing their data, most existing DG algorithms are not directly applicable to the FL setting due to privacy constraints, as well as the limited data quantity and domain diversity at each…

    Submitted 16 October, 2024; v1 submitted 15 October, 2024; originally announced October 2024.

  43. arXiv:2410.10967  [pdf, other]

    astro-ph.GA

    Seven wonders of Cosmic Dawn: JWST confirms a high abundance of galaxies and AGNs at z $\simeq$ 9-11 in the GLASS field

    Authors: L. Napolitano, M. Castellano, L. Pentericci, P. Arrabal Haro, A. Fontana, T. Treu, P. Bergamini, A. Calabro, S. Mascia, T. Morishita, G. Roberts-Borsani, P. Santini, E. Vanzella, B. Vulcani, D. Zakharova, T. Bakx, M. Dickinson, C. Grillo, N. Leethochawalit, M. Llerena, E. Merlin, D. Paris, S. Rojas-Ruiz, P. Rosati, X. Wang, et al. (2 additional authors not shown)

    Abstract: We present JWST/NIRSpec PRISM follow-up of candidate galaxies at z=9-11 selected from deep JWST/NIRCam photometry in GLASS-JWST Early Release Science data. We spectroscopically confirm six sources with secure redshifts at z = 9.52-10.43, each showing multiple emission lines. An additional object is likely at z = 10.66, based on its Lya-break and a single emission feature, while one source is a low…

    Submitted 14 October, 2024; originally announced October 2024.

    Comments: Submitted to A&A

  44. arXiv:2410.10658  [pdf, other]

    cs.SI cs.CY

    A Personalized MOOC Learning Group and Course Recommendation Method Based on Graph Neural Network and Social Network Analysis

    Authors: Zijin Luo, Xu Wang, Yiquan Wang, Haotian Zhang, Zhuangzhuang Li

    Abstract: In order to enhance students' initiative and participation in MOOC learning, this study constructed a multi-level network model based on Social Network Analysis (SNA). The model makes use of data pertaining to nearly 40,000 users and tens of thousands of courses from various higher education MOOC platforms. Furthermore, an AI-based assistant has been developed which utilises the collected data to…

    Submitted 14 October, 2024; originally announced October 2024.

  45. arXiv:2410.10626  [pdf, other]

    cs.CL

    Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts

    Authors: Guorui Zheng, Xidong Wang, Juhao Liang, Nuo Chen, Yuping Zheng, Benyou Wang

    Abstract: Adapting medical Large Language Models to local languages can reduce barriers to accessing healthcare services, but data scarcity remains a significant challenge, particularly for low-resource languages. To address this, we first construct a high-quality medical dataset and conduct analysis to ensure its quality. In order to leverage the generalization capability of multilingual LLMs to efficientl…

    Submitted 14 October, 2024; originally announced October 2024.

  46. arXiv:2410.10604  [pdf, other]

    cs.CV cs.AI cs.LG

    BrainMVP: Multi-modal Vision Pre-training for Brain Image Analysis using Multi-parametric MRI

    Authors: Shaohao Rui, Lingzhi Chen, Zhenyu Tang, Lilong Wang, Mianxin Liu, Shaoting Zhang, Xiaosong Wang

    Abstract: Accurate diagnosis of brain abnormalities is greatly enhanced by the inclusion of complementary multi-parametric MRI imaging data. There is significant potential to develop a universal pre-training model that can be quickly adapted for image modalities and various clinical scenarios. However, current models often rely on uni-modal image data, neglecting the cross-modal correlations among different…

    Submitted 14 October, 2024; originally announced October 2024.

  47. arXiv:2410.10560  [pdf, other]

    cond-mat.mtrl-sci

    Sulfur Vacancies Limit the Open-circuit Voltage of Sb2S3 Solar Cells

    Authors: Xinwei Wang, Seán R. Kavanagh, Aron Walsh

    Abstract: Antimony sulfide (Sb2S3) is a promising candidate as an absorber layer for single-junction solar cells and the top subcell in tandem solar cells. However, the power conversion efficiency of Sb2S3-based solar cells has remained stagnant over the past decade, largely due to trap-assisted non-radiative recombination. Here we assess the trap-limited conversion efficiency of Sb2S3 by investigating non-…

    Submitted 14 October, 2024; originally announced October 2024.

  48. arXiv:2410.10524  [pdf, other]

    cs.LG cs.AI

    Get Rid of Task Isolation: A Continuous Multi-task Spatio-Temporal Learning Framework

    Authors: Zhongchao Yi, Zhengyang Zhou, Qihe Huang, Yanjiang Chen, Liheng Yu, Xu Wang, Yang Wang

    Abstract: Spatiotemporal learning has become a pivotal technique to enable urban intelligence. Traditional spatiotemporal models mostly focus on a specific task by assuming the same distribution between training and testing sets. However, given that urban systems are usually dynamic, multi-sourced with imbalanced data distributions, current task-specific models fail to generalize to new urban conditi…

    Submitted 14 October, 2024; originally announced October 2024.

    Comments: Accepted by NeurIPS 2024

  49. arXiv:2410.10450  [pdf, other]

    cs.AI cs.CL

    KBLaM: Knowledge Base augmented Language Model

    Authors: Xi Wang, Liana Mikaelyan, Taketomo Isazawa, James Hensman

    Abstract: In this paper, we propose Knowledge Base augmented Language Model (KBLaM), a new method for augmenting Large Language Models (LLMs) with external knowledge. KBLaM works with a knowledge base (KB) constructed from a corpus of documents, transforming each piece of knowledge in the KB into continuous key-value vector pairs via pre-trained sentence encoders with linear adapters and integrating them in…

    Submitted 14 October, 2024; originally announced October 2024.

  50. arXiv:2410.10389  [pdf]

    cs.CV

    Reverse Refinement Network for Narrow Rural Road Detection in High-Resolution Satellite Imagery

    Authors: Ningjing Wang, Xinyu Wang, Yang Pan, Wanqiang Yao, Yanfei Zhong

    Abstract: The automated extraction of rural roads is pivotal for rural development and transportation planning, serving as a cornerstone for socio-economic progress. Current research primarily focuses on road extraction in urban areas. However, rural roads present unique challenges due to their narrow and irregular nature, posing significant difficulties for road extraction. In this article, a reverse refin…

    Submitted 14 October, 2024; originally announced October 2024.