Search | arXiv e-print repository

arXiv:2410.11064 [pdf, other]

Parsing altered brain connectivity in neurodevelopmental disorders by integrating graph-based normative modeling and deep generative networks

Authors: Rui Sherry Shen, Yusuf Osmanlıoğlu, Drew Parker, Darien Aunapu, Benjamin E. Yerys, Birkan Tunç, Ragini Verma

Abstract: Many neurodevelopmental disorders can be understood as divergent patterns of neural interactions during brain development. Advances in neuroimaging have illuminated these patterns by modeling the brain as a network structure using diffusion MRI tractography. However, characterizing and quantifying individual heterogeneity in neurodevelopmental disorders within these highly complex brain networks remains a significant challenge. In this paper, we present, for the first time, a framework that integrates deep generative models with graph-based normative modeling to characterize brain network development in the neurotypical population, which can then be used to quantify the individual-level neurodivergence associated with disorders. Our deep generative model incorporates bio-inspired wiring constraints to effectively capture the developmental trajectories of neurotypical brain networks. Neurodivergence is quantified by comparing individuals to this neurotypical trajectory, enabling the creation of region-wise divergence maps that reveal latent developmental differences at each brain region, along with overall neurodivergence scores based on predicted brain age gaps. We demonstrate the clinical utility of this framework by applying it to a large sample of children with autism spectrum disorders, showing that the individualized region-wise maps help parse the heterogeneity in autism, and the neurodivergence scores correlate with clinical assessments. Together, we provide powerful tools for quantifying neurodevelopmental divergence in brain networks, paving the way for developing imaging markers that will support disorder stratification, monitor progression, and evaluate therapeutic effectiveness.

Submitted 14 October, 2024; originally announced October 2024.

arXiv:2410.09207 [pdf, other]

P-FOLIO: Evaluating and Improving Logical Reasoning with Abundant Human-Written Reasoning Chains

Authors: Simeng Han, Aaron Yu, Rui Shen, Zhenting Qi, Martin Riddell, Wenfei Zhou, Yujie Qiao, Yilun Zhao, Semih Yavuz, Ye Liu, Shafiq Joty, Yingbo Zhou, Caiming Xiong, Dragomir Radev, Rex Ying, Arman Cohan

Abstract: Existing methods for understanding the capabilities of LLMs in logical reasoning rely on binary entailment classification or synthetically derived rationales, which are not sufficient for proper investigation of a model's capabilities. We present P-FOLIO, a human-annotated dataset consisting of diverse and complex reasoning chains for a set of realistic logical reasoning stories also written by humans. P-FOLIO is collected with an annotation protocol that helps humans annotate well-structured natural language proofs for first-order logic reasoning problems in a step-by-step manner. The number of reasoning steps in P-FOLIO spans from 0 to 20. We further use P-FOLIO to evaluate and improve large-language-model (LLM) reasoning capabilities. We evaluate LLM reasoning capabilities at a fine granularity via single-step inference rule classification, with more diverse inference rules of more diverse and higher levels of complexity than previous works. Given that a single model-generated reasoning chain could take a completely different path than the human-annotated one, we sample multiple reasoning chains from a model and use pass@k metrics for evaluating the quality of model-generated reasoning chains. We show that human-written reasoning chains significantly boost the logical reasoning capabilities of LLMs via many-shot prompting and fine-tuning. Furthermore, fine-tuning Llama3-7B on P-FOLIO improves the model performance by 10% or more on three other out-of-domain logical reasoning datasets. We also conduct detailed analysis to show where the most powerful LLMs fall short in reasoning. We will release the dataset and code publicly.
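
The abstract above does not say how pass@k is computed. As a rough illustration only, the sketch below uses the unbiased estimator that is common in code-generation evaluation: with n sampled chains of which c are judged correct, pass@k = 1 - C(n-c, k) / C(n, k). The function name `pass_at_k` and the choice of estimator are assumptions, not details taken from the P-FOLIO paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k chains drawn
    (without replacement) from n sampled chains is correct, given that
    c of the n chains are correct.

    Assumption: this is the widely used unbiased estimator; the paper
    may define or compute pass@k differently.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k incorrect chains: every draw of k contains a correct one.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 10 sampled chains of which 3 are correct, `pass_at_k(10, 3, 1)` is the plain success rate of a single sample, and larger k can only raise the estimate.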

Submitted 11 October, 2024; originally announced October 2024.

arXiv:2410.07572 [pdf]

Edge-guided inverse design of digital metamaterials for ultra-high-capacity on-chip multi-dimensional interconnect

Authors: Aolong Sun, Sizhe Xing, Xuyu Deng, Ruoyu Shen, An Yan, Fangchen Hu, Yuqin Yuan, Boyu Dong, Junhao Zhao, Ouhan Huang, Ziwei Li, Jianyang Shi, Yingjun Zhou, Chao Shen, Yiheng Zhao, Bingzhou Hong, Wei Chu, Junwen Zhang, Haiwen Cai, Nan Chi

Abstract: The escalating demands of compute-intensive applications, including artificial intelligence, urgently necessitate the adoption of sophisticated optical on-chip interconnect technologies to overcome critical bottlenecks in scaling future computing systems. This transition requires leveraging the inherent parallelism of wavelength and mode dimensions of light, complemented by high-order modulation formats, to significantly enhance data throughput. Here we experimentally demonstrate a novel synergy of these three dimensions, achieving multi-tens-of-terabits-per-second on-chip interconnects using ultra-broadband, multi-mode digital metamaterials. Employing a highly efficient edge-guided analog-and-digital optimization method, we inversely design foundry-compatible, robust, and multi-port digital metamaterials with an 8× higher computational efficiency. Using a packaged five-mode multiplexing chip, we demonstrate a single-wavelength interconnect capacity of 1.62 Tbit s$^{-1}$ and a record-setting multi-dimensional interconnect capacity of 38.2 Tbit s$^{-1}$ across 5 modes and 88 wavelength channels. A theoretical analysis suggests that further system optimization can enable on-chip interconnects to reach sub-petabit-per-second data transmission rates. This study highlights the transformative potential of optical interconnect technologies to surmount the constraints of electronic links, thus setting the stage for next-generation datacenter and optical compute interconnects.

Submitted 9 October, 2024; originally announced October 2024.

arXiv:2409.19665 [pdf, other]

Gravitational Wave Astronomy With TianQin

Authors: En-Kun Li, Shuai Liu, Alejandro Torres-Orjuela, Xian Chen, Kohei Inayoshi, Long Wang, Yi-Ming Hu, Pau Amaro-Seoane, Abbas Askar, Cosimo Bambi, Pedro R. Capelo, Hong-Yu Chen, Alvin J. K. Chua, Enrique Condés-Breña, Lixin Dai, Debtroy Das, Andrea Derdzinski, Hui-Min Fan, Michiko Fujii, Jie Gao, Mudit Garg, Hongwei Ge, Mirek Giersz, Shun-Jia Huang, Arkadiusz Hypki, et al. (27 additional authors not shown)

Abstract: The opening of the gravitational wave window has significantly enhanced our capacity to explore the universe's most extreme and dynamic sector. In the mHz frequency range, a diverse range of compact objects, from the most massive black holes at the farthest reaches of the Universe to the lightest white dwarfs in our cosmic backyard, generate a complex and dynamic symphony of gravitational wave signals. Once recorded by gravitational wave detectors, these unique fingerprints have the potential to decipher the birth and growth of cosmic structures over a wide range of scales, from stellar binaries and stellar clusters to galaxies and large-scale structures. The TianQin space-borne gravitational wave mission is scheduled for launch in the 2030s, with an operational lifespan of five years. It will facilitate pivotal insights into the history of our universe. This document presents a concise overview of the detectable sources of TianQin, outlining their characteristics, the challenges they present, and the expected impact of the TianQin observatory on our understanding of them.

Submitted 29 September, 2024; originally announced September 2024.

Comments: TianQin Gravitational Wave Whitepaper, 72 pages, 30 figures

arXiv:2409.01563 [pdf, other]

Blockchain-based Federated Recommendation with Incentive Mechanism

Authors: Jianhai Chen, Yanlin Wu, Dazhong Rong, Guoyao Yu, Lingqi Jiang, Zhenguang Liu, Peng Zhou, Rui Shen

Abstract: Nowadays, federated recommendation technology is rapidly evolving to help multiple organisations share data and train models while meeting user privacy, data security and government regulatory requirements. However, federated recommendation increases customer system costs such as power, computational and communication resources. Besides, federated recommendation systems are also susceptible to model attacks and data poisoning by participating malicious clients. Therefore, most customers are unwilling to participate in federated recommendation without any incentive. To address these problems, we propose a blockchain-based federated recommendation system with an incentive mechanism to promote a more trustworthy, secure, and efficient federated recommendation service. First, we construct a federated recommendation system based on NeuMF and FedAvg. Then we introduce a reverse auction mechanism to select optimal clients that can maximize the social surplus. Finally, we employ blockchain for on-chain evidence storage of models to ensure the safety of the federated recommendation system. The experimental results show that our proposed incentive mechanism can attract clients with superior training data to engage in federated recommendation at a lower cost, which can increase the economic benefit of federated recommendation by 54.9% while improving the recommendation performance. Thus our work provides theoretical and technological support for the construction of a harmonious and healthy ecological environment for the application of federated recommendation.

Submitted 2 September, 2024; originally announced September 2024.

Comments: This paper has been accepted at the 2024 Blockchain and Web3 Technology Innovation and Application Exchange Conference (BWTAC 2024)

arXiv:2409.00472 [pdf, other]

doi 10.3847/2041-8213/ad7737

Repeated partial disruptions in a WD-NS or WD-BH merger modulate the prompt emission of long-duration merger-type GRBs

Authors: Junping Chen, Rong-Feng Shen, Wen-Jun Tan, Chen-Wei Wang, Shao-Lin Xiong, Run-Chao Chen, Bin-Bin Zhang

Abstract: The progenitors of gamma-ray bursts (GRBs) have long been an unresolved issue. GRB 230307A stands out as an exceptionally bright event, belonging to the long-duration GRBs but also exhibiting a late emission component reminiscent of a kilonova. Together with the similar events GRBs 060614 and 211211A, they make up a new sub-group of GRBs with intriguing progenitors. If such long-duration merger-type GRBs originated from the coalescence of a white dwarf (WD) with a neutron star (NS) or a black hole (BH), as proposed in the recent literature, then the larger tidal disruption radius of the WD, together with a non-negligible residual orbital eccentricity, would make repeated partial tidal disruptions inevitable. This may modulate the mass accretion and jet launching process at the NS or BH, resulting in a quasi-periodic modulation (QPM) in the light curve of the GRB, on the orbital period. The detection of potential QPMs during the early episode of prompt emission of these three GRBs supports this scenario, and the relatively slow QPM ($> 1$ s) suggests that the lighter object cannot be a NS. We propose that the progenitor systems of GRBs 230307A, 060614, and 211211A consist of a WD of mass 1.3 $M_\odot$, 0.9 $M_\odot$ and 1.4 $M_\odot$, respectively, and a NS (or BH). After several cycles of modulations, the WD is completely destroyed, and the accretion of the remaining debris dominates the extended emission episode.

Submitted 4 September, 2024; v1 submitted 31 August, 2024; originally announced September 2024.

Comments: 13 pages, 1 appendix, accepted for publication in ApJL

Journal ref: 2024, ApJL, 973, L33

arXiv:2408.08023 [pdf, other]

Causal Discovery from Time-Series Data with Short-Term Invariance-Based Convolutional Neural Networks

Authors: Rujia Shen, Boran Wang, Chao Zhao, Yi Guan, Jingchi Jiang

Abstract: Causal discovery from time-series data aims to capture both intra-slice (contemporaneous) and inter-slice (time-lagged) causality between variables within the temporal chain, which is crucial for various scientific disciplines. Compared to causal discovery from non-time-series data, causal discovery from time-series data necessitates more serialized samples with a larger number of observed time steps. To address these challenges, we propose a novel gradient-based causal discovery approach, STIC, which focuses on Short-Term Invariance using Convolutional neural networks to uncover the causal relationships from time-series data. Specifically, STIC leverages both the short-term time and mechanism invariance of causality within each window observation, which possesses the property of independence, to enhance sample efficiency. Furthermore, we construct two causal convolution kernels, which correspond to the short-term time and mechanism invariance respectively, to estimate the window causal graph. To demonstrate the necessity of convolutional neural networks for causal discovery from time-series data, we theoretically derive the equivalence between convolution and the underlying generative principle of time-series data under the assumption that the additive noise model is identifiable. Experimental evaluations conducted on both synthetic and fMRI benchmark datasets demonstrate that STIC outperforms baselines significantly and achieves state-of-the-art performance, particularly when the datasets contain a limited number of observed time steps. Code is available at https://github.com/HITshenrj/STIC.

Submitted 15 August, 2024; originally announced August 2024.

arXiv:2408.07306 [pdf]

Deep-sub-cycle attosecond optical pulses

Authors: Hongliang Dang, Jiaxin Gao, Hao Wu, Xin Guo, Y. R. Shen, Limin Tong

Abstract: Sub-cycle optical pulses are of great importance for ultrafast science and technology. While a narrower pulse can offer a higher temporal resolution, so far the pulse width has not broken the limit of half an optical cycle. Here we propose to break the half-cycle limit via inverse Compton scattering in a nano-slit optical mode, in which a deep-subwavelength-confined optical field can be converted into a deep-sub-cycle attosecond pulse by high-energy free electrons flying through. Our calculations show that, with experimentally reachable conditions, a measurable deep-sub-cycle attosecond pulse with a width narrower than half an optical cycle can be generated. Quantitatively, using a deep-subwavelength-confined 4.4-$\mu$m-wavelength 100-fs pulsed optical driving field and a 2-MeV 50-as 5-pC electron bunch, we can obtain an attosecond pulse with a peak frequency of 2.55 PHz, a pulse width of 99 as ($\sim$0.25 optical cycle), and a single-pulse photon number larger than 200 around the material damage threshold. Such pulses may open opportunities for studying light-matter interaction on the deep-sub-cycle level, and pave the way to unprecedented optical technology ranging from temporal super-resolution optical microscopy and spectroscopy to unconventional atom/molecule polarization and manipulation.
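
The quoted figure of about 0.25 optical cycle can be checked with simple arithmetic: one optical cycle at frequency f lasts 1/f, so a pulse of width τ spans τ·f cycles. The short sketch below does this with the numbers from the abstract; the helper name `cycles_spanned` is hypothetical, and it assumes the cycle is defined at the stated peak frequency of the generated pulse.

```python
def cycles_spanned(pulse_width_s: float, peak_frequency_hz: float) -> float:
    """Return the pulse width expressed in optical cycles.

    One cycle lasts 1 / peak_frequency_hz seconds, so the number of
    cycles covered by the pulse is width * frequency.
    """
    return pulse_width_s * peak_frequency_hz


# Values quoted in the abstract: 99 as pulse width, 2.55 PHz peak frequency.
PULSE_WIDTH_S = 99e-18
PEAK_FREQUENCY_HZ = 2.55e15

fraction_of_cycle = cycles_spanned(PULSE_WIDTH_S, PEAK_FREQUENCY_HZ)
print(f"{fraction_of_cycle:.3f} optical cycles")
```

The result is close to one quarter of a cycle, below the half-cycle limit the abstract refers to.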

Submitted 17 August, 2024; v1 submitted 14 August, 2024; originally announced August 2024.

arXiv:2407.21359 [pdf, other]

ProSpec RL: Plan Ahead, then Execute

Authors: Liangliang Liu, Yi Guan, BoRan Wang, Rujia Shen, Yi Lin, Chaoran Kong, Lian Yan, Jingchi Jiang

Abstract: Imagining potential outcomes of actions before execution helps agents make more informed decisions, a prospective thinking ability fundamental to human cognition. However, mainstream model-free Reinforcement Learning (RL) methods lack the ability to proactively envision future scenarios, plan, and guide strategies. These methods typically rely on trial and error to adjust policy functions, aiming to maximize cumulative rewards or long-term value, even if such high-reward decisions place the environment in extremely dangerous states. To address this, we propose the Prospective (ProSpec) RL method, which makes higher-value, lower-risk optimal decisions by imagining future n-stream trajectories. Specifically, ProSpec employs a dynamic model to predict future states (termed "imagined states") based on the current state and a series of sampled actions. Furthermore, we integrate the concept of Model Predictive Control and introduce a cycle consistency constraint that allows the agent to evaluate and select the optimal actions from these trajectories. Moreover, ProSpec employs cycle consistency to mitigate two fundamental issues in RL: augmenting state reversibility to avoid irreversible events (low risk) and augmenting actions to generate numerous virtual trajectories, thereby improving data efficiency. We validated the effectiveness of our method on the DMControl benchmarks, where our approach achieved significant performance improvements. Code will be open-sourced upon acceptance.

Submitted 31 July, 2024; originally announced July 2024.

arXiv:2407.21275 [pdf, other]

Fi$^2$VTS: Time Series Forecasting Via Capturing Intra- and Inter-Variable Variations in the Frequency Domain

Authors: Rujia Shen, Yang Yang, Yaoxion Lin, Liangliang Liu, Boran Wang, Yi Guan, Jingchi Jiang

Abstract: Time series forecasting (TSF) plays a crucial role in various applications, including medical monitoring and crop growth. Despite the advancements in deep learning methods for TSF, their capacity to predict long-term series remains constrained. This limitation arises from the failure to account for intra- and inter-variable variations simultaneously. To mitigate this challenge, we introduce the Fi$^2$VBlock, which leverages a Frequency domain perspective to capture intra- and inter-variable Variations. After transforming into the frequency domain via the Frequency Transform Module, the Frequency Cross Attention between the real and imaginary parts is designed to obtain enhanced frequency representations and capture intra-variable variations. Furthermore, Inception blocks are employed to integrate information, thus capturing correlations across different variables. Our backbone network, Fi$^2$VTS, employs a residual architecture by concatenating multiple Fi$^2$VBlocks, thereby preventing degradation issues. Theoretically, we demonstrate that Fi$^2$VTS achieves a substantial reduction in both time and memory complexity, decreasing from $\mathcal{O}(L^2)$ to $\mathcal{O}(L)$ per Fi$^2$VBlock computation. Empirical evaluations reveal that Fi$^2$VTS outperforms other baselines on two benchmark datasets. The implementation code is accessible at https://github.com/HITshenrj/Fi2VTS.

Submitted 2 October, 2024; v1 submitted 30 July, 2024; originally announced July 2024.

arXiv:2407.20563 [pdf, other]

Pyramid Coder: Hierarchical Code Generator for Compositional Visual Question Answering

Authors: Ruoyue Shen, Nakamasa Inoue, Koichi Shinoda

Abstract: Visual question answering (VQA) is the task of providing accurate answers to natural language questions based on visual input. Programmatic VQA (PVQA) models have been gaining attention recently. These use large language models (LLMs) to formulate executable programs that address questions requiring complex visual reasoning. However, there are challenges in enabling LLMs to comprehend the usage of image processing modules and generate relevant code. To overcome these challenges, this paper introduces PyramidCoder, a novel prompting framework for PVQA models. PyramidCoder consists of three hierarchical levels, each serving a distinct purpose: query rephrasing, code generation, and answer aggregation. Notably, PyramidCoder utilizes a single frozen LLM and pre-defined prompts at each level, eliminating the need for additional training and ensuring flexibility across various LLM architectures. Compared to the state-of-the-art PVQA model, our approach improves accuracy by at least 0.5% on the GQA dataset, 1.4% on the VQAv2 dataset, and 2.9% on the NLVR2 dataset.

Submitted 30 July, 2024; originally announced July 2024.

Comments: Accepted to the IEEE International Conference on Image Processing (IEEE ICIP) 2024

arXiv:2407.19449 [pdf, other]

A High-Throughput FPGA Accelerator for Lightweight CNNs With Balanced Dataflow

Authors: Zhiyuan Zhao, Yihao Chen, Pengcheng Feng, Jixing Li, Gang Chen, Rongxuan Shen, Huaxiang Lu

Abstract: FPGA accelerators for lightweight convolutional neural networks (LWCNNs) have recently attracted significant attention. Most existing LWCNN accelerators focus on single-Computing-Engine (CE) architecture with local optimization. However, these designs typically suffer from high on-chip/off-chip memory overhead and low computational efficiency due to their layer-by-layer dataflow and unified resource mapping mechanisms. To tackle these issues, a novel multi-CE-based accelerator with balanced dataflow is proposed to efficiently accelerate LWCNNs through memory-oriented and computing-oriented optimizations. Firstly, a streaming architecture with hybrid CEs is designed to minimize off-chip memory access while maintaining a low cost of on-chip buffer size. Secondly, a balanced dataflow strategy is introduced for streaming architectures to enhance computational efficiency by improving resource mapping and mitigating data congestion. Furthermore, a resource-aware memory and parallelism allocation methodology is proposed, based on a performance model, to achieve better performance and scalability. The proposed accelerator is evaluated on the Xilinx ZC706 platform using MobileNetV2 and ShuffleNetV2. Implementation results demonstrate that the proposed accelerator can save up to 68.3% of on-chip memory size with reduced off-chip memory access compared to the reference design. It achieves a performance of up to 2092.4 FPS and a state-of-the-art MAC efficiency of up to 94.58%, while maintaining a high DSP utilization of 95%, thus significantly outperforming current LWCNN accelerators.

Submitted 28 September, 2024; v1 submitted 28 July, 2024; originally announced July 2024.

Comments: 14 pages, 17 figures, and 5 tables

arXiv:2407.19094 [pdf, other]

Solving Robotics Problems in Zero-Shot with Vision-Language Models

Authors: Zidan Wang, Rui Shen, Bradly Stadie

Abstract: We introduce Wonderful Team, a multi-agent Vision Large Language Model (VLLM) framework designed to solve robotics problems in a zero-shot regime. In our context, zero-shot means that for a novel environment, we provide a VLLM with an image of the robot's surroundings and a task description, and the VLLM outputs the sequence of actions necessary for the robot to complete the task. Unlike prior work that requires fine-tuning parts of the pipeline -- such as adjusting an LLM on robot-specific data or training separate vision encoders -- our approach demonstrates that with careful engineering, a single off-the-shelf VLLM can autonomously handle all aspects of a robotics task, from high-level planning to low-level location extraction and action execution. Crucially, compared to using GPT-4o alone, Wonderful Team is self-corrective and capable of iteratively fixing its own mistakes, enabling it to solve challenging long-horizon tasks. We validate our framework through extensive experiments, both in simulated environments using VIMABench and in real-world settings. Our system showcases the ability to handle diverse tasks such as manipulation, goal-reaching, and visual reasoning -- all in a zero-shot manner. These results underscore a key point: vision-language models have progressed rapidly in the past year and should be strongly considered as a backbone for many robotics problems moving forward.

Submitted 11 October, 2024; v1 submitted 26 July, 2024; originally announced July 2024.

Comments: aka Wonderful Team

arXiv:2407.04575 [pdf, other]

FA-GAN: Artifacts-free and Phase-aware High-fidelity GAN-based Vocoder

Authors: Rubing Shen, Yanzhen Ren, Zongkun Sun

Abstract: Generative adversarial network (GAN) based vocoders have attracted significant attention in speech synthesis owing to their high quality and fast inference speed. However, many noticeable spectral artifacts remain, which degrade the quality of synthesized speech. In this work, we adopt a novel GAN-based vocoder designed for few artifacts and high fidelity, called FA-GAN. To suppress the aliasing artifacts caused by non-ideal upsampling layers in high-frequency components, we introduce the anti-aliased twin deconvolution module in the generator. To alleviate blurring artifacts and enrich the reconstruction of spectral details, we propose a novel fine-grained multi-resolution real and imaginary loss to assist in the modeling of phase information. Experimental results reveal that FA-GAN outperforms the compared approaches in improving audio quality and alleviating spectral artifacts, and exhibits superior performance when applied to unseen speaker scenarios.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2406.15437 [pdf, ps, other]

On small Sylow numbers of finite groups

Authors: Xiaofang Gao, Igor Lima, Rulin Shen

Abstract: Let $G$ be a finite group and $n_p(G)$ the number of Sylow $p$-subgroups of $G$. In this paper, we prove that if $n_p(G)<p^2$ then almost all numbers $n_p(G)$ are a power of a prime.
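
For context, the constraints that Sylow's theorems place on $n_p(G)$ can be written out as below. The statement and the three small examples are standard textbook facts recalled here for illustration; they are not taken from the paper, and the paper's precise meaning of "almost all" is not reproduced.

```latex
% Sylow's third theorem: write |G| = p^a m with p not dividing m. Then
\[
  n_p(G) \equiv 1 \pmod{p}
  \qquad\text{and}\qquad
  n_p(G) \mid m .
\]
% Small cases with n_p(G) < p^2 (standard examples, not from the paper):
\[
  n_2(S_3) = 3 < 2^2, \qquad
  n_3(A_4) = 4 = 2^2 < 3^2, \qquad
  n_5(A_5) = 6 < 5^2 .
\]
% The first two values are prime powers; the third, 6, is not,
% so exceptions to the prime-power pattern do occur.
```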

Submitted 29 May, 2024; originally announced June 2024.

arXiv:2406.08012 [pdf, other]

Interaction of an outflow with surrounding gaseous clouds as the origin of the late-time radio flares in TDEs

Authors: Jialun Zhuang, Rong-Feng Shen, Guobin Mou, Wenbin Lu

Abstract: A close encounter between a star and a supermassive black hole (SMBH) results in the tidal disruption of the star, known as a tidal disruption event (TDE). Recently, a few TDEs, e.g., ASASSN-15oi and AT2018hyz, have shown late-time (hundreds of days after their UV/optical peaks) radio flares with radio luminosities of $10^{38\sim39}$ erg/s. The super-Eddington fallback or accretion in a TDE may generate a mass outflow. Here we investigate a scenario in which the late-time radio flares come from the interaction of the outflow with the circumnuclear gaseous clouds, in addition to the slow-evolving emission component due to the outflow-diffuse medium interaction. We calculate the associated radio temporal and spectral signatures and find that they reproduce the observations well. The outflows have an inferred velocity of $0.2\sim0.8\,c$, a total mass of $10^{-3}\sim10^{-1}$ $\mathrm{M_{\odot}}$ and an ejection duration of a month to a year. The distances of the clouds to the SMBH are $0.1\sim1$ pc. This scenario has advantages in explaining the long delay, the sharpness of the rise and the multiplicity of the late radio flares. Future observations may build up a much larger sample of late-time radio flares and enable their use as a probe of the TDE physics and the host circumnuclear environment.

Submitted 26 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

Comments: 13 pages, 13 figures. Submitted to ApJ. A new version with some modifications. Comments are welcome

arXiv:2406.06388 [pdf, ps, other]

Simple smooth modules over the Ramond algebra and applications to vertex operator superalgebras

Authors: Yulu Chen, Ran Shen, Yufeng Yao, Kaiming Zhao

Abstract: Simple smooth modules over the Virasoro algebra and over one of the super-Virasoro algebras, the Neveu-Schwarz algebra, have been classified. This problem remained unsolved for the other super-Virasoro algebra, the Ramond algebra. In this paper, all simple smooth modules over the Ramond algebra are classified. More precisely, a simple smooth module over the Ramond algebra is either a simple highest weight module or isomorphic to an induced module from a simple module over a finite dimensional solvable Lie superalgebra. As an application we obtain all simple weak $ψ$-twisted modules over some vertex operator superalgebras.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 18 pages

arXiv:2406.01065 [pdf, other]

Causal prompting model-based offline reinforcement learning

Authors: Xuehui Yu, Yi Guan, Rujia Shen, Xin Li, Chen Tang, Jingchi Jiang

Abstract: Model-based offline Reinforcement Learning (RL) allows agents to fully utilise pre-collected datasets without requiring additional or unethical explorations. However, applying model-based offline RL to online systems presents challenges, primarily due to the highly suboptimal (noise-filled) and diverse nature of datasets generated by online systems. To tackle these issues, we introduce the Causal Prompting Reinforcement Learning (CPRL) framework, designed for highly suboptimal and resource-constrained online scenarios. The initial phase of CPRL involves the introduction of the Hidden-Parameter Block Causal Prompting Dynamic (Hip-BCPD) to model environmental dynamics. This approach utilises invariant causal prompts and aligns hidden parameters to generalise to new and diverse online users. In the subsequent phase, a single policy is trained to address multiple tasks through the amalgamation of reusable skills, circumventing the need for training from scratch. Experiments conducted across datasets with varying levels of noise, including simulation-based and real-world offline datasets from the Dnurse APP, demonstrate that our proposed method can make robust decisions in out-of-distribution and noisy environments, outperforming contemporary algorithms. Additionally, we separately verify the contributions of Hip-BCPDs and the skill-reuse strategy to the robustness of performance. We further analyse the visualised structure of Hip-BCPD and the interpretability of sub-skills. We released our source code and the first ever real-world medical dataset for precise medical decision-making tasks.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2405.19766 [pdf, other]

Hierarchical Structure and Self-gravity in the Maddalena Giant Molecular Cloud

Authors: Renjie Shen, Yuehui Ma, Hongchi Wang, Suziye He, Miaomiao Zhang

Abstract: In this work, we present the data from the Milky Way Imaging Scroll Painting (MWISP) project for the Maddalena giant molecular cloud (GMC). We decompose the 13CO emission datacube of the observed region into hierarchical substructures using a modified Dendrogram algorithm. We investigate the statistical properties of these substructures and examine the role that self-gravity plays on various spatial scales. The statistics of the mass (M), radius (R), velocity dispersion (σv), virial parameter (αvir), and sonic Mach number of the substructures are presented. The radius and mass distributions and the σv-R scaling relationship of the substructures resemble those reported in previous studies that use non-hierarchical algorithms to identify the entities. We find that for the hierarchical substructures αvir decreases as the radius or mass of the substructures increases. The majority of the substructures in the quiescent region of Maddalena GMC are not gravitationally bound (αvir > 2), while most of the substructures in the star-forming regions are gravitationally bound (αvir < 2). Furthermore, we find that self-gravity plays an important role on scales of 0.8-4 pc in the IRAS 06453 star-forming region, while it is not an important factor on scales below 5 pc in the non-star-forming region.

Submitted 31 May, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.17458 [pdf, other]

Blood Glucose Control Via Pre-trained Counterfactual Invertible Neural Networks

Authors: Jingchi Jiang, Rujia Shen, Boran Wang, Yi Guan

Abstract: Type 1 diabetes mellitus (T1D) is characterized by insulin deficiency and blood glucose (BG) control issues. The state-of-the-art solution for continuous BG control is reinforcement learning (RL), where an agent can dynamically adjust exogenous insulin doses in time to maintain BG levels within the target range. However, due to the lack of action guidance, the agent often needs to learn from randomized trials to understand misleading correlations between exogenous insulin doses and BG levels, which can lead to instability and unsafety. To address these challenges, we propose an introspective RL based on Counterfactual Invertible Neural Networks (CINN). We use the pre-trained CINN as a frozen introspective block of the RL agent, which integrates forward prediction and counterfactual inference to guide the policy updates, promoting more stable and safer BG control. Constructed based on interpretable causal order, CINN employs bidirectional encoders with affine coupling layers to ensure invertibility while using orthogonal weight normalization to enhance the trainability, thereby ensuring the bidirectional differentiability of network parameters. We experimentally validate the accuracy and generalization ability of the pre-trained CINN in BG prediction and counterfactual inference for action. Furthermore, our experimental results highlight the effectiveness of pre-trained CINN in guiding RL policy updates for more accurate and safer BG control.

Submitted 18 July, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

arXiv:2404.16425 [pdf, other]

Soft X-ray prompt emission from a high-redshift gamma-ray burst EP240315a

Authors: Y. Liu, H. Sun, D. Xu, D. S. Svinkin, J. Delaunay, N. R. Tanvir, H. Gao, C. Zhang, Y. Chen, X. -F. Wu, B. Zhang, W. Yuan, J. An, G. Bruni, D. D. Frederiks, G. Ghirlanda, J. -W. Hu, A. Li, C. -K. Li, J. -D. Li, D. B. Malesani, L. Piro, G. Raman, R. Ricci, E. Troja, et al. (170 additional authors not shown)

Abstract: Long gamma-ray bursts (GRBs) are believed to originate from core collapse of massive stars. High-redshift GRBs can probe the star formation and reionization history of the early universe, but their detection remains rare. Here we report the detection of a GRB triggered in the 0.5--4 keV band by the Wide-field X-ray Telescope (WXT) on board the Einstein Probe (EP) mission, designated as EP240315a, whose bright peak was also detected by the Swift Burst Alert Telescope and Konus-Wind through off-line analyses. At a redshift of $z=4.859$, EP240315a showed a much longer and more complicated light curve in the soft X-ray band than in gamma-rays. Benefiting from a large field-of-view ($\sim$3600 deg$^2$) and a high sensitivity, EP-WXT captured the earlier engine activation and extended late engine activity through a continuous detection. With a peak X-ray flux at the faint end of previously known high-$z$ GRBs, the detection of EP240315a demonstrates the great potential for EP to study the early universe via GRBs.

Submitted 25 April, 2024; originally announced April 2024.

Comments: 41 pages, 8 figures, 7 tables

arXiv:2403.14249 [pdf, other]

Direct Probe of Topology and Geometry of Quantum States on IBM Q

Authors: Tianqi Chen, Hai-Tao Ding, Ruizhe Shen, Shi-Liang Zhu, Jiangbin Gong

Abstract: The concepts of topology and geometry are of critical importance in exploring exotic phases of quantum matter. Though they have been investigated on various experimental platforms, to date a direct probe of topological and geometric properties on a universal quantum computer, even for a minimal model, is still lacking. In this work, we first show that a density matrix form of the quantum geometric tensor (QGT) can be explicitly reconstructed from Pauli operator measurements on a quantum circuit. We then propose two algorithms, suitable for IBM quantum computers, to directly probe QGT. The first algorithm is a variational quantum algorithm particularly suitable for Noisy Intermediate-Scale Quantum (NISQ)-era devices, whereas the second one is a pure quantum algorithm based on quantum imaginary time evolution. Explicit results obtained from IBM Q simulating a Chern insulator model are presented and analysed. Our results indicate that transmon qubit-based universal quantum computers have the potential to directly simulate and investigate topological and geometric properties of a quantum system.

Submitted 6 June, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

Comments: 14 pages, 8 figures (updated main text and references)

arXiv:2403.09954 [pdf]

Search-based Ordered Password Generation of Autoregressive Neural Networks

Authors: Min Jin, Junbin Ye, Rongxuan Shen, Huaxing Lu

Abstract: Passwords are the most widely used method of authentication, and password guessing is an essential part of password cracking and password security research. The progress of deep learning technology provides a promising way to improve the efficiency of password guessing. However, current research on neural network password guessing methods mostly focuses on model structure and has overlooked the generation method. Due to the randomness of sampling, not only do the generated passwords contain a large number of duplicates, but the order in which passwords are generated is also random, leading to inefficient password attacks. In this paper, we propose SOPG, a search-based ordered password generation method, which enables a password guessing model based on an autoregressive neural network to generate passwords in approximately descending order of probability. Experiments comparing SOPG with random sampling show that passwords generated by SOPG do not repeat, and that to reach the same cover rate, SOPG requires fewer inferences and far fewer generated passwords than random sampling, which brings a great efficiency improvement to subsequent password attacks. We build SOPGesGPT, a password guessing model based on GPT, using SOPG to generate passwords. Compared with the most influential models OMEN, FLA, PassGAN, VAEPass and the latest model PassGPT in a one-site test, experiments show that SOPGesGPT is far ahead in terms of both effective rate and cover rate. In terms of the widely recognized cover rate, SOPGesGPT reaches 35.06%, which is 254%, 298%, 421%, 380%, and 81% higher than OMEN, FLA, PassGAN, VAEPass, and PassGPT respectively.

Submitted 14 March, 2024; originally announced March 2024.

Comments: This paper is in Peer Review

arXiv:2403.09444 [pdf, other]

doi 10.3847/1538-4357/ad5a85

Expansion and Spectral Softening of the Dust Scattering Rings of GRB 221009A

Authors: Guoying Zhao, Rong-Feng Shen

Abstract: Expanding X-ray halos or rings appear when short pulses of X-ray radiation from a background source are scattered by clouds of dust in the Milky Way. We study the X-ray rings of the brightest gamma-ray burst (GRB) 221009A, detected by the Swift X-Ray Telescope. The rings center on the GRB position and their angular radii increase with time. We identify five major expanding rings, and our modeling of their expansion history suggests that they are scattered off, respectively, from five dusty clouds at distances of 0.4-13 kpc from the observer. Given an assumed prompt X-ray fluence of this GRB, the fluxes of those rings suggest that these clouds have dust grain column densities of $10^{7\sim8}~\mathrm{cm^{-2}}$. More interestingly, our time-dependent spectral analysis of these rings shows that they all experience spectral softening, i.e., getting softer as they expand, with spectral indices ranging from 2.2 to 5, consistent with what the dust scattering model predicts.

Submitted 20 June, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

Comments: 9 pages, 6 figures, 2 tables, Accepted for publication in ApJ

Journal ref: 2024, ApJ, 970, 124

arXiv:2403.04931 [pdf, other]

A Survey on Human-AI Teaming with Large Pre-Trained Models

Authors: Vanshika Vats, Marzia Binta Nizam, Minghao Liu, Ziyuan Wang, Richard Ho, Mohnish Sai Prasad, Vincent Titterton, Sai Venkat Malreddy, Riya Aggarwal, Yanwen Xu, Lei Ding, Jay Mehta, Nathan Grinnell, Li Liu, Sijia Zhong, Devanathan Nallur Gandamani, Xinyi Tang, Rohan Ghosalkar, Celeste Shen, Rachel Shen, Nafisa Hussain, Kesav Ravichandran, James Davis

Abstract: In the rapidly evolving landscape of artificial intelligence (AI), the collaboration between human intelligence and AI systems, known as Human-AI (HAI) Teaming, has emerged as a cornerstone for advancing problem-solving and decision-making processes. The advent of Large Pre-trained Models (LPtM) has significantly transformed this landscape, offering unprecedented capabilities by leveraging vast amounts of data to understand and predict complex patterns. This paper surveys the pivotal integration of LPtMs with HAI, emphasizing how these models enhance collaborative intelligence beyond traditional approaches. It examines the potential of LPtMs in augmenting human capabilities, discussing this collaboration for AI model improvements, effective teaming, ethical considerations, and their broad applied implications in various sectors. Through this exploration, the study sheds light on the transformative impact of LPtM-enhanced HAI Teaming, providing insights for future research, policy development, and strategic implementations aimed at harnessing the full potential of this collaboration for research and societal benefit.

Submitted 26 June, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

arXiv:2403.02395 [pdf, other]

Enhanced many-body quantum scars from the non-Hermitian Fock skin effect

Authors: Ruizhe Shen, Fang Qin, Jean-Yves Desaules, Zlatko Papić, Ching Hua Lee

Abstract: In contrast with extended Bloch waves, a single particle can become spatially localized due to the so-called skin effect originating from non-Hermitian pumping. Here we show that in kinetically-constrained many-body systems, the skin effect can instead manifest as dynamical amplification within the Fock space, beyond the intuitively expected and previously studied particle localization and clustering. We exemplify this non-Hermitian Fock skin effect in an asymmetric version of the PXP model and show that it gives rise to ergodicity-breaking eigenstates, the non-Hermitian analogs of quantum many-body scars. A distinguishing feature of these non-Hermitian scars is their enhanced robustness against external disorders. We propose an experimental realization of the non-Hermitian scar enhancement in a tilted Bose-Hubbard optical lattice with laser-induced loss. Additionally, we implement digital simulations of such scar enhancement on the IBM quantum processor. Our results show that the Fock skin effect provides a powerful tool for creating robust non-ergodic states in generic open quantum systems.

Submitted 2 September, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

Comments: 20 pages and 16 figures, including new quantum simulation results

arXiv:2402.00077 [pdf, other]

Unlocking the Power of Multi-institutional Data: Integrating and Harmonizing Genomic Data Across Institutions

Authors: Yuan Chen, Ronglai Shen, Xiwen Feng, Katherine Panageas

Abstract: Cancer is a complex disease driven by genomic alterations, and tumor sequencing is becoming a mainstay of clinical care for cancer patients. The emergence of multi-institution sequencing data presents a powerful resource for learning real-world evidence to enhance precision oncology. GENIE BPC, led by the American Association for Cancer Research, establishes a unique database linking genomic data with clinical information for patients treated at multiple cancer centers. However, leveraging such multi-institutional sequencing data presents significant challenges. Variations in gene panels result in loss of information when the analysis is conducted on common gene sets. Additionally, differences in sequencing techniques and patient heterogeneity across institutions add complexity. High data dimensionality, sparse gene mutation patterns, and weak signals at the individual gene level further complicate matters. Motivated by these real-world challenges, we introduce the Bridge model. It uses a quantile-matched latent variable approach to derive integrated features to preserve information beyond common genes and maximize the utilization of all available data while leveraging information sharing to enhance both learning efficiency and the model's capacity to generalize. By extracting harmonized and noise-reduced lower-dimensional latent variables, the true mutation pattern unique to each individual is captured. We assess the model's performance and parameter estimation through extensive simulation studies. The extracted latent features from the Bridge model consistently excel in predicting patient survival across six cancer types in GENIE BPC data.

Submitted 30 January, 2024; originally announced February 2024.

arXiv:2401.12494 [pdf, ps, other]

doi 10.3847/1538-4357/ad65d8

Apparently ultra-long period radio sources from self-lensed pulsar-black hole binaries

Authors: Xinxu Xiao, Rong-feng Shen

Abstract: Pulsar-black hole (BH) close binary systems, which have not been found yet, are unique laboratories for testing theories of gravity and understanding the formation channels of gravitational-wave sources. We study the self-gravitational lensing effect in a pulsar-BH system on the pulsar's emission. Because this effect occurs once per orbital period for almost edge-on binaries, we find that it could generate apparently ultra-long period (minutes to hours) radio signals when the intrinsic pulsar signal is too weak to detect. Each such lensed signal, or 'pulse', is composed of a number of amplified intrinsic pulsar pulses. We estimate that a radio telescope with a sensitivity of $10\,\rm mJy$ could detect $\sim$ a few systems that emit such signals in our galaxy. The model is applied to three recently found puzzling long-period radio sources: GLEAM-X J1627, PSR J0901-4046, and GPM J1839-10. To explain their observed signal durations and periods, the masses of their lensing components would be $\sim10^4\,\rm M_{\odot}$, $\sim4\,\rm M_{\odot}$ and $10^{3-6}\,\rm M_{\odot}$, respectively, with their binary coalescence times ranging from a few tens to thousands of years. However, the implied merger rates (as high as $\sim 10^{3-4}\,\rm Myr^{-1}$ per galaxy) and the large period decay rates ($>10^{-8}\,\rm s\,s^{-1}$) tend to disfavour this self-lensing scenario for these three sources. Finally, for a binary containing a millisecond pulsar and a stellar-mass BH, the Shapiro delay effect would cause a $\geq10\%$ variation of the profile width for the sub-pulses in such lensed signals.

Submitted 21 July, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

Comments: 13 pages, 6 figures, 1 table, Accepted for publication in ApJ

Journal ref: 2024, ApJ, 972, 60

arXiv:2401.01718 [pdf]

RHDLPP: A multigroup radiation hydrodynamics code for laser-produced plasmas

Authors: Qi Min, Ziyang Xu, Siqi He, Haidong Lu, Xingbang Liu, Ruizi Shen, Yanhong Wu, Qikun Pan, Chongxiao Zhao, Fei Chen, Maogen Su, Chenzhong Dong

Abstract: We introduce the RHDLPP, a flux-limited multigroup radiation hydrodynamics numerical code designed for simulating laser-produced plasmas in diverse environments. The code bifurcates into two packages: RHDLPP-LTP for low-temperature plasmas generated by moderate-intensity nanosecond lasers, and RHDLPP-HTP for high-temperature, high-density plasmas formed by high-intensity laser pulses. The core radiation hydrodynamic equations are resolved in the Eulerian frame, employing an operator-split method. This method decomposes the solution into two substeps: first, the explicit resolution of the hyperbolic subsystems integrating radiation and fluid dynamics, and second, the implicit treatment of the parabolic part comprising stiff radiation diffusion, heat conduction, and energy exchange. Laser propagation and energy deposition are modeled through a hybrid approach, combining geometrical optics ray-tracing in sub-critical plasma regions with a one-dimensional solution of the Helmholtz wave equation in super-critical areas. The thermodynamic states are ascertained using an equation of state, based on either the real gas approximation or the quotidian equation of state (QEOS). Additionally, RHDLPP includes RHDLPP-SpeIma3D, a three-dimensional spectral simulation post-processing module, for generating both temporally-spatially resolved and time-integrated spectra and imaging, facilitating direct comparisons with experimental data. The paper showcases a series of verification tests to establish the code's accuracy and efficiency, followed by application cases, including simulations of laser-produced aluminum (Al) plasmas, pre-pulse-induced target deformation of tin (Sn) microdroplets relevant to extreme ultraviolet lithography light sources, and varied imaging and spectroscopic simulations.

Submitted 3 January, 2024; originally announced January 2024.

arXiv:2401.01613 [pdf, ps, other]

Synthetically enhanced sensitivity using higher-order exceptional point and coherent perfect absorption

Authors: Yao-Dong Hu, Yi-Pu Wang, Rui-Chang Shen, Zi-Qi Wang, Wei-Jiang Wu, J. Q. You

Abstract: Sensors play a crucial role in advanced apparatuses, and improving their sensitivity is a persistent pursuit. Recently, the singularity of a non-Hermitian system, known as the exceptional point (EP), has drawn much attention for this goal. The response of the eigenfrequency shift to a perturbation $ε$ follows an $ε^{1/n}$-dependence at an $n$th-order EP, leading to significantly enhanced sensitivity via a high-order EP. However, because increasingly complicated systems are required, raising the EP order to enhance the sensitivity becomes very difficult. Here we report that by utilizing the spectral anomaly of the coherent perfect absorption (CPA), the sensitivity at a third-order EP can be further enhanced owing to the cooperative effects of both CPA and EP. We realize this synthetically enhanced sensor using a pseudo-Hermitian cavity magnonic system composed of two yttrium iron garnet spheres and a microwave cavity. The detectable minimum change of the magnetic field reaches $4.2\times10^{-21}$ T. It opens a new avenue to design novel sensors using hybrid non-Hermitian quantum systems.

Submitted 3 January, 2024; originally announced January 2024.

Comments: 18 pages, 5 figures

arXiv:2312.15352 [pdf]

A Bayesian Basket Trial Design Using Local Power Prior

Authors: Haiming Zhou, Rex Shen, Sutan Wu, Philip He

Abstract: In recent years, basket trials, which enable the evaluation of an experimental therapy across multiple tumor types within a single protocol, have gained prominence in early-phase oncology development. Unlike traditional trials, where each tumor type is evaluated separately with limited sample size, basket trials offer the advantage of borrowing information across various tumor types. However, a key challenge in designing basket trials lies in dynamically determining the extent of information borrowing across tumor types to enhance statistical power while maintaining an acceptable type I error rate. In this paper, we propose a local power prior framework that includes a 3-component borrowing mechanism with explicit model interpretation. Unlike many existing Bayesian methods that require Markov Chain Monte Carlo (MCMC) sampling, the proposed framework offers a closed-form solution, eliminating the time-consuming nature of MCMC in large-scale simulations for evaluating operating characteristics. Extensive simulations demonstrate good performance of the proposed method, comparable to that of other, more complex methods. The significantly shortened computation time further underscores the practical utility in the context of basket trials.

Submitted 19 April, 2024; v1 submitted 23 December, 2023; originally announced December 2023.

arXiv:2312.06771 [pdf, other]

Evidence for Stream Collision and Disk Formation in Tidal Disruption Events

Authors: Hengxiao Guo, Jingbo Sun, Shuang-Liang Li, Yan-Fei Jiang, Tinggui Wang, Defu Bu, Ning Jiang, Yanan Wang, Yuhan Yao, Rongfeng Shen, Minfeng Gu, Mouyuan Sun

Abstract: When a star passes through the tidal disruption radius of a massive black hole (BH), it can be torn apart by the tidal force of the BH, an event known as a tidal disruption event (TDE). Since the UV/optical emitting region inferred from the blackbody radius is significantly larger than the circularization radius predicted by the classical TDE theory, two competing models, stream collision and envelope reprocessing, were proposed to explain the unexpectedly large UV/optical emitting size. Here, we investigate the variability behaviors (cross-correlation and time delay) of three representative TDEs with continuum reverberation mapping. Our results demonstrate that TDE behavior is clearly inconsistent with the envelope reprocessing scenario. In contrast, the picture of the stream collision, together with the late-time formed accretion disk, can explain heterogeneous observations. This provides compelling evidence that the UV/optical emission originates from stream collisions during the early stage of TDE evolution and gradually transitions to being dominated by an accretion disk with detectable X-ray emission in a late stage. After fading back to a quiescent state, recurrent flares may be observed on some occasions, such as partial TDEs.

Submitted 11 December, 2023; originally announced December 2023.

Comments: 56 pages, 21 figures, 5 tables. Submitted, comments welcome!

arXiv:2311.14737 [pdf, other]

Positional Description Matters for Transformers Arithmetic

Authors: Ruoqi Shen, Sébastien Bubeck, Ronen Eldan, Yin Tat Lee, Yuanzhi Li, Yi Zhang

Abstract: Transformers, central to the successes in modern Natural Language Processing, often falter on arithmetic tasks despite their vast capabilities, which paradoxically include remarkable coding abilities. We observe that a crucial challenge is their naive reliance on positional information to solve arithmetic problems with a small number of digits, leading to poor performance on larger numbers. Herein, we delve deeper into the role of positional encoding, and propose several ways to fix the issue, either by modifying the positional encoding directly, or by modifying the representation of the arithmetic task to leverage standard positional encoding differently. We investigate the value of these modifications for three tasks: (i) classical multiplication, (ii) length extrapolation in addition, and (iii) addition in natural language context. For (i) we train a small model on a small dataset (100M parameters and 300k samples) with remarkable aptitude in (direct, no scratchpad) 15-digit multiplication and essentially perfect up to 12 digits, while usual training in this context would give a model failing at 4-digit multiplication. In the experiments on addition, we use a mere 120k samples to demonstrate: for (ii) extrapolation from 10 digits to testing on 12-digit numbers while usual training would have no extrapolation, and for (iii) almost perfect accuracy up to 5 digits while usual training would be correct only up to 3 digits (which is essentially memorization with a training set of 120k samples).

Submitted 21 November, 2023; originally announced November 2023.

Comments: 18 pages

arXiv:2311.10143 [pdf, other]

Observation of the non-Hermitian skin effect and Fermi skin on a digital quantum computer

Authors: Ruizhe Shen, Tianqi Chen, Bo Yang, Ching Hua Lee

Abstract: Non-Hermitian physics has attracted considerable attention in recent years, in particular the non-Hermitian skin effect (NHSE) for its extreme sensitivity and non-locality. While the NHSE has been physically observed in various classical metamaterials and even ultracold atomic arrays, its highly nontrivial implications in many-body dynamics have never been experimentally investigated. In this work, we report the first observation of the NHSE on a universal quantum processor, as well as its characteristic but elusive Fermi skin from many-fermion statistics. To implement NHSE dynamics on a quantum computer, the effective time-evolution circuit not only needs to be non-reciprocal and non-unitary, but must also be scaled up to a sufficient number of lattice qubits to achieve spatial non-locality. We show how such a non-unitary operation can be systematically realized by post-selecting multiple ancilla qubits, as demonstrated through two paradigmatic non-reciprocal models on a noisy IBM quantum processor, with clear signatures of asymmetric spatial propagation and many-body Fermi skin accumulation. To minimize errors from inevitable device noise, time evolution is performed using a trainable optimized quantum circuit produced with variational quantum algorithms. Our study represents a critical milestone in the quantum simulation of non-Hermitian lattice phenomena on present-day quantum computers, and can be readily generalized to more sophisticated many-body models with the remarkable programmability of quantum computers.

Submitted 17 December, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: 24 pages and 13 figures

arXiv:2311.03968 [pdf, ps, other]

Channel-localized Strichartz estimates of radial wave equations

Authors: Liang Li, Shenghao Luo, Ruipeng Shen

Abstract: In this work we give a few new Strichartz estimates of radial solutions to the wave equation. These Strichartz estimates still use $L^p L^q$ type norms in each channel-like region $\{(x,t): |t|+2^k < |x| < |t|+2^{k+1}\}$, with weaker restrictions on $p, q$ than the classic ones, but combine these localized norms together in the way of an $l^2$ space. We also give an application of these Strichartz estimates on the well-posedness theory of non-linear wave equations.

Submitted 7 November, 2023; originally announced November 2023.

Comments: 17 pages, 1 figure

MSC Class: 35L71; 35L05

arXiv:2311.00948 [pdf, ps, other]

The generation and detection of the spin-valley-polarization in semi-Dirac materials

Authors: Yupeng Huang, R. Shen

Abstract: We investigated the transport properties in a normal metal/ferromagnet/normal metal/superconductor junction based on semi-Dirac materials with inverted energy gap. With a scattering matrix approach, we show that the electron transport in the junction is spin-valley-polarized due to the ferromagnetic exchange energy. It is also shown that the Andreev reflection is strongly suppressed, which is a clear experimental signal for the spin-valley-polarization in semi-Dirac materials.

Submitted 1 November, 2023; originally announced November 2023.

arXiv:2310.19538 [pdf, other]

Quantum Lego and XP Stabilizer Codes

Authors: Ruohan Shen, Yixu Wang, ChunJun Cao

Abstract: We apply the recent graphical framework of "quantum lego" to XP stabilizer codes where the stabilizer group is generally non-abelian. We show that the idea of operator matching continues to hold for such codes and is sufficient for generating all their XP symmetries provided the resulting code is XP. We provide an efficient classical algorithm for tracking these symmetries under tensor contraction or conjoining. This constitutes a partial extension of the algorithm implied by the Gottesman-Knill theorem beyond Pauli stabilizer states and Clifford operations. Because conjoining transformations generate quantum operations that are universal, the XP symmetries obtained from these algorithms do not uniquely identify the resulting tensors in general. Using this extended framework, we provide a novel XP stabilizer code with higher distance and a $[[8,1,2]]$ code with fault-tolerant $T$ gate. For XP regular codes, we also construct a tensor-network-based maximum likelihood decoder for any i.i.d. single qubit error channel.

Submitted 30 October, 2023; originally announced October 2023.

Comments: 18 pages, 6 figures

arXiv:2310.17878 [pdf, other]

A Sublinear-Time Spectral Clustering Oracle with Improved Preprocessing Time

Authors: Ranran Shen, Pan Peng

Abstract: We address the problem of designing a sublinear-time spectral clustering oracle for graphs that exhibit strong clusterability. Such graphs contain $k$ latent clusters, each characterized by a large inner conductance (at least $\varphi$) and a small outer conductance (at most $\varepsilon$). Our aim is to preprocess the graph to enable clustering membership queries, with the key requirement that both preprocessing and query answering should be performed in sublinear time, and the resulting partition should be consistent with a $k$-partition that is close to the ground-truth clustering. Previous oracles have relied on either a $\textrm{poly}(k)\log n$ gap between inner and outer conductances or exponential (in $k/\varepsilon$) preprocessing time. Our algorithm relaxes these assumptions, albeit at the cost of a slightly higher misclassification ratio. We also show that our clustering oracle is robust against a few random edge deletions. To validate our theoretical bounds, we conducted experiments on synthetic networks.

Submitted 29 December, 2023; v1 submitted 26 October, 2023; originally announced October 2023.

Comments: To appear at NeurIPS'23

arXiv:2310.17848 [pdf, other]

Boosting Data Analytics With Synthetic Volume Expansion

Authors: Xiaotong Shen, Yifei Liu, Rex Shen

Abstract: Synthetic data generation, a cornerstone of Generative Artificial Intelligence, promotes a paradigm shift in data science by addressing data scarcity and privacy while enabling unprecedented performance. As synthetic data becomes more prevalent, concerns emerge regarding the accuracy of statistical methods when applied to synthetic data in contrast to raw data. This article explores the effectiveness of statistical methods on synthetic data and the privacy risks of synthetic data. Regarding effectiveness, we present the Synthetic Data Generation for Analytics framework. This framework applies statistical approaches to high-quality synthetic data produced by generative models like tabular diffusion models, which, initially trained on raw data, benefit from insights from pertinent studies through transfer learning. A key finding within this framework is the generational effect, which reveals that the error rate of statistical methods on synthetic data decreases with the addition of more synthetic data but may eventually rise or stabilize. This phenomenon, stemming from the challenge of accurately mirroring raw data distributions, highlights a "reflection point", an ideal volume of synthetic data defined by specific error metrics. Through three case studies, sentiment analysis, predictive modeling of structured data, and inference in tabular data, we validate the superior performance of this framework compared to conventional approaches. On privacy, synthetic data imposes lower risks while supporting the differential privacy standard. These studies underscore synthetic data's untapped potential in redefining data science's landscape.

Submitted 10 March, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

arXiv:2310.15843 [pdf, ps, other]

Decay Estimates of High Dimensional Adjoint Radon Transforms

Authors: Ruipeng Shen

Abstract: In this paper we prove an optimal $L^2-L^{2d}$ decay estimate of the adjoint Radon transform of compactly supported data in $d$-dimensional space via a geometric method. A similar problem in dimension $3$ has been considered in the author's previous work. This work deals with all higher-dimensional cases $d\geq 4$. As an application we give the decay of Strichartz norms of $5$-dimensional non-radiative free waves. The general idea is similar to the lower-dimensional case but we introduce a new method to prove the corresponding geometric inequality because the old method becomes too complicated in higher dimensions.

Submitted 24 October, 2023; originally announced October 2023.

Comments: 21 pages, 1 figure

MSC Class: 35L05; 44A12

arXiv:2310.13914 [pdf, other]

Cold Diffusion on the Replay Buffer: Learning to Plan from Known Good States

Authors: Zidan Wang, Takeru Oba, Takuma Yoneda, Rui Shen, Matthew Walter, Bradly C. Stadie

Abstract: Learning from demonstrations (LfD) has successfully trained robots to exhibit remarkable generalization capabilities. However, many powerful imitation techniques do not prioritize the feasibility of the robot behaviors they generate. In this work, we explore the feasibility of plans produced by LfD. As in prior work, we employ a temporal diffusion model with fixed start and goal states to facilitate imitation through in-painting. Unlike previous studies, we apply cold diffusion to ensure the optimization process is directed through the agent's replay buffer of previously visited states. This routing approach increases the likelihood that the final trajectories will predominantly occupy the feasible region of the robot's state space. We test this method in simulated robotic environments with obstacles and observe a significant improvement in the agent's ability to avoid these obstacles during planning.

Submitted 21 October, 2023; originally announced October 2023.

arXiv:2310.09930 [pdf, other]

FiLM: Fill-in Language Models for Any-Order Generation

Authors: Tianxiao Shen, Hao Peng, Ruoqi Shen, Yao Fu, Zaid Harchaoui, Yejin Choi

Abstract: Language models have become the backbone of today's AI systems. However, their predominant left-to-right generation limits the use of bidirectional context, which is essential for tasks that involve filling text in the middle. We propose the Fill-in Language Model (FiLM), a new language modeling approach that allows for flexible generation at any position without adhering to a specific generation order. Its training extends the masked language modeling objective by adopting varying mask probabilities sampled from the Beta distribution to enhance the generative capabilities of FiLM. During inference, FiLM can seamlessly insert missing phrases, sentences, or paragraphs, ensuring that the outputs are fluent and are coherent with the surrounding context. In both automatic and human evaluations, FiLM outperforms existing infilling methods that rely on left-to-right language models trained on rearranged text segments. FiLM is easy to implement and can be either trained from scratch or fine-tuned from a left-to-right language model. Notably, as the model size grows, FiLM's perplexity approaches that of strong left-to-right language models of similar sizes, indicating FiLM's scalability and potential as a large language model.

Submitted 15 October, 2023; originally announced October 2023.

arXiv:2310.03236 [pdf, other]

doi 10.3847/2041-8213/acffc5

Dissonance in harmony: The UV/optical periodic outbursts of ASASSN-14ko exhibit repeated bumps and rebrightenings

Authors: Shifeng Huang, Ning Jiang, Rong-Feng Shen, Tinggui Wang, Zhenfeng Sheng

Abstract: ASASSN-14ko was identified as an abnormal periodic nuclear transient with a potentially decreasing period. Its outbursts in the optical and UV bands have displayed a consistent and smooth "fast-rise and slow-decay" pattern since its discovery, but this pattern changed unexpectedly in the last two epochs, as revealed by our proposed high-cadence Swift observations. The new light curve profiles show a bump during the rising stages and a rebrightening during the declining stages, making them much broader and more symmetrical than the previous ones. In the last two epochs, there is no significant difference in the X-ray spectral slope compared to the previous one, and its overall luminosity is lower than those of the previous epochs. The energy released in the early bump and rebrightening phases ($\sim10^{50}$ erg) could be due to collision of the stripped stream from partial tidal disruption events (pTDEs) with an expanded accretion disk. We also discuss other potential explanations, such as disk instability and star-disk collisions. Further high-cadence multi-wavelength observations of subsequent cycles are encouraged to comprehend the unique periodic source with its new intriguing features.

Submitted 15 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: Accepted for publication in ApJL, 10 pages, 6 figures

arXiv:2309.17446 [pdf, other]

L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models

Authors: Ansong Ni, Pengcheng Yin, Yilun Zhao, Martin Riddell, Troy Feng, Rui Shen, Stephen Yin, Ye Liu, Semih Yavuz, Caiming Xiong, Shafiq Joty, Yingbo Zhou, Dragomir Radev, Arman Cohan

Abstract: Recently, large language models (LLMs), especially those that are pretrained on code, have demonstrated strong capabilities in generating programs from natural language inputs in a few-shot or even zero-shot manner. Despite promising results, there is a notable lack of a comprehensive evaluation of these models' language-to-code generation capabilities. Existing studies often focus on specific tasks, model architectures, or learning paradigms, leading to a fragmented understanding of the overall landscape. In this work, we present L2CEval, a systematic evaluation of the language-to-code generation capabilities of LLMs on 7 tasks across the domain spectrum of semantic parsing, math reasoning and Python programming, analyzing the factors that potentially affect their performance, such as model size, pretraining data, instruction tuning, and different prompting methods. In addition to assessing model performance, we measure confidence calibration for the models and conduct human evaluations of the output programs. This enables us to identify and analyze the typical failure modes across various tasks and models. L2CEval offers a comprehensive understanding of the capabilities and limitations of LLMs in language-to-code generation. We also release the evaluation framework and all model outputs, hoping to lay the groundwork for further future research in this domain.

Submitted 2 October, 2023; v1 submitted 29 September, 2023; originally announced September 2023.

Comments: Project Website: https://l2c-eval.github.io/

arXiv:2309.11560 [pdf, other]

A Robust Large-Period Discrete Time Crystal and its Signature in a Digital Quantum Computer

Authors: Tianqi Chen, Ruizhe Shen, Ching Hua Lee, Bo Yang, Raditya Weda Bomantara

Abstract: Discrete time crystals (DTCs) are novel out-of-equilibrium quantum states of matter which break time translational symmetry. So far, only the simplest form of DTCs that exhibit period-doubling dynamics has been unambiguously realized in experiments. We develop an intuitive interacting spin-$1/2$ system that supports the more non-trivial period-quadrupling DTCs ($4T$-DTCs) and demonstrate its digital simulation on a noisy quantum processor. Remarkably, we found a strong signature of the predicted $4T$-DTC that is robust against and, in some cases, amplified by different types of disorders. Our findings thus shed light on the interplay between disorder and quantum interactions on the formation of time crystallinity beyond period-doubling, as well as demonstrate the potential of existing noisy intermediate-scale quantum devices for simulating exotic non-equilibrium quantum states of matter.

Submitted 13 August, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

Comments: 16 pages, 5 figures (main text) + 6 figures (appendix); Updated results, figures and references

arXiv:2309.00420 [pdf, ps, other]

doi 10.1016/j.physletb.2024.138594

Well-separated soliton-antisoliton pairs with an adjoint Higgs field in 4D space

Authors: Liang Gong, Rui Shen

Abstract: We present single soliton states and soliton-antisoliton states with an adjoint Higgs field in 4D flat space. The action of a single soliton state diverges, while the action of soliton-antisoliton states converges. This means such solitons can exist in soliton-antisoliton states, although they cannot exist individually. The interaction in a soliton-antisoliton state takes a logarithmic dependence on separation. Such soliton-antisoliton states exhibit stability under a scaling transformation.

Submitted 21 March, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

arXiv:2308.07183 [pdf, ps, other]

On Thompson Problem

Authors: Rulin Shen, Wujie Shi, Feng Tang

Abstract: In 1987, the second author of this paper reported to Prof. J. G. Thompson his conjecture that all finite simple groups $S$ can be characterized uniformly using the order of $S$ and the set of element orders in $S$. In their communications, Thompson posed his problem about the judgment of solvability of finite groups $G$. In this paper we give a positive answer to Thompson's problem if the prime graph of $G$ is not connected.

Submitted 17 September, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

Comments: 8 pages

MSC Class: 20D06; 20D60

arXiv:2308.06151 [pdf, ps, other]

doi 10.3847/1538-4357/acefbf

Quasi-periodic oscillation in short gamma-ray bursts from black hole-neutron star mergers

Authors: Yan Li, Rong-Feng Shen, Bin-Bin Zhang

Abstract: Short-duration gamma-ray bursts (sGRBs) are commonly attributed to the mergers of double neutron stars (NSs) or the mergers of a neutron star with a black hole (BH). While the former scenario was confirmed by the event GW170817, the latter remains elusive. Here, we consider the latter scenario, in which a NS is tidally disrupted by a fast-spinning low-mass BH and the accretion onto the BH launches a relativistic jet and hence produces a sGRB. The merging binary's orbit is likely misaligned with the BH's spin. Hence, the Lense-Thirring precession around the BH may cause a hyper-accreting thick disk to precess in a solid-body manner. We propose that a jet, initially aligned with the BH spin, is deflected and collimated by the wind from the disk, therefore being forced to precess along with the disk. This would result in a quasi-periodic oscillation or modulation in the gamma-ray light curve of the sGRB, with a quasi-period of $\sim 0.01-0.1$ s. The appearance of the modulation may be delayed with respect to the triggering of the light curve. This feature, unique to the BH-NS merger, may have already revealed itself in a few observed sGRBs (such as GRB 130310A), and it carries the spin-orbit orientation information of the merging system. Identification of this feature would be a new approach to reveal spin-orbit-misaligned merging BH-NS systems, which are likely missed by the current gravitational-wave searching strategy principally targeting aligned systems.

Submitted 11 August, 2023; originally announced August 2023.

Comments: 13 pages, 8 figures, Accepted for publication in ApJ

Journal ref: 2023, ApJ, 955, 98

arXiv:2307.11328 [pdf, other]

Polaromechanics: photons, magnons and phonons in the triple strong-coupling regime

Authors: Rui-Chang Shen, Jie Li, Yi-Ming Sun, Wei-Jiang Wu, Xuan Zuo, Yi-Pu Wang, Shi-Yao Zhu, J. Q. You

Abstract: Building hybrid quantum systems is a crucial step for realizing multifunctional quantum technologies, quantum information processing, and hybrid quantum networks. A functional hybrid quantum system requires strong coupling among its components. However, couplings between distinct physical systems are typically very weak. Experimental realization of strong coupling in a hybrid system remains a long-standing challenge, especially when it has multiple components and the components are of different nature. Here we demonstrate the realization of triple strong coupling in a novel polaromechanical hybrid system, where polaritons, formed by strongly coupled ferromagnetic magnons and microwave photons, are further strongly coupled to phonons. The corresponding polaromechanical normal-mode splitting is observed. A high polaromechanical cooperativity of $9.4\times10^3$ is achieved by significantly reducing the polariton decay rate via exploiting coherent perfect absorption. A quantum cooperativity much greater than unity is achievable by placing the system at low bath temperatures, which would enable various quantum applications. Our results pave the way towards coherent quantum control of photons, magnons and phonons, and are a crucial step for building functional hybrid quantum systems based on magnons.

Submitted 27 September, 2024; v1 submitted 20 July, 2023; originally announced July 2023.

arXiv:2306.13139 [pdf, other]

doi 10.1103/PhysRevA.109.053311

Kinked linear response from non-Hermitian cold-atom pumping

Authors: Fang Qin, Ruizhe Shen, Linhu Li, Ching Hua Lee

Abstract: It is well known that non-Hermitian, non-reciprocal systems may harbor exponentially localized skin modes. However, in this work, we find that, generically, non-Hermiticity gives rise to abrupt and prominent kinks in the semi-classical wave packet trajectories of quantum gases, despite the absence of sudden physical impulses. This physically stems from a hitherto underappreciated intrinsic non-locality from non-Hermitian pumping, even if all physical couplings are local, thereby resulting in enigmatic singularities in the band structure that lead to discontinuous band geometry and Berry curvature. Specifically, we focus on the realization of the kinked response in an ultracold atomic setup. For a concrete experimental demonstration, we propose an ultracold atomic setup in a two-dimensional optical lattice with laser-induced loss such that response kinks can be observed without fine-tuning in the physical atomic cloud dynamics. Our results showcase unique non-monotonic behavior from non-Hermitian pumping beyond the non-Hermitian skin effect and suggest new avenues for investigating non-Hermitian dynamics on ultracold atomic platforms.

Submitted 29 May, 2024; v1 submitted 22 June, 2023; originally announced June 2023.

Comments: 19 pages, 9 figures, updated references, published version

Journal ref: Phys. Rev. A 109, 053311 (2024)

Showing 1–50 of 259 results for author: Shen, R