Search | arXiv e-print repository

When LLMs Go Online: The Emerging Threat of Web-Enabled LLMs

Authors: Hanna Kim, Minkyoo Song, Seung Ho Na, Seungwon Shin, Kimin Lee

Abstract: Recent advancements in Large Language Models (LLMs) have established them as agentic systems capable of planning and interacting with various tools. These LLM agents are often paired with web-based tools, enabling access to diverse sources and real-time information. Although these advancements offer significant benefits across various applications, they also increase the risk of malicious use, particularly in cyberattacks involving personal information. In this work, we investigate the risks associated with misuse of LLM agents in cyberattacks involving personal data. Specifically, we aim to understand: 1) how potent LLM agents can be when directed to conduct cyberattacks, 2) how cyberattacks are enhanced by web-based tools, and 3) how affordable and easy it becomes to launch cyberattacks using LLM agents. We examine three attack scenarios: the collection of Personally Identifiable Information (PII), the generation of impersonation posts, and the creation of spear-phishing emails. Our experiments reveal the effectiveness of LLM agents in these attacks: LLM agents achieved a precision of up to 95.9% in collecting PII, up to 93.9% of impersonation posts created by LLM agents were evaluated as authentic, and the click rate for links in spear phishing emails created by LLM agents reached up to 46.67%. Additionally, our findings underscore the limitations of existing safeguards in contemporary commercial LLMs, emphasizing the urgent need for more robust security measures to prevent the misuse of LLM agents.

Submitted 18 October, 2024; originally announced October 2024.

arXiv:2410.13504 [pdf, ps, other]

Local Intertwining Relations and Co-tempered $A$-packets of Classical Groups

Authors: Hiraku Atobe, Wee Teck Gan, Atsushi Ichino, Tasho Kaletha, Alberto Mínguez, Sug Woo Shin

Abstract: The local intertwining relation is an identity that gives precise information about the action of normalized intertwining operators on parabolically induced representations. We prove several instances of the local intertwining relation for quasi-split classical groups and the twisted general linear group, as they are required in the inductive proof of the endoscopic classification for quasi-split classical groups due to Arthur and Mok. In addition, we construct the co-tempered local $A$-packets by Aubert duality and verify their key properties by purely local means, which provide the seed cases needed as an input to the inductive proof. Together with further technical results that we establish, this makes the endoscopic classification conditional only on the validity of the twisted weighted fundamental lemma.

Submitted 17 October, 2024; originally announced October 2024.

Comments: 190 pages

arXiv:2410.12744 [pdf, other]

Drillboards: Adaptive Visualization Dashboards for Dynamic Personalization of Visualization Experiences

Authors: Sungbok Shin, Inyoup Na, Niklas Elmqvist

Abstract: We present drillboards, a technique for adaptive visualization dashboards consisting of a hierarchy of coordinated charts that the user can drill down to reach a desired level of detail depending on their expertise, interest, and desired effort. This functionality allows different users to personalize the same dashboard to their specific needs and expertise. The technique is based on a formal vocabulary of chart representations and rules for merging multiple charts of different types and data into single composite representations. The drillboard hierarchy is created by iteratively applying these rules starting from a baseline dashboard, with each consecutive operation yielding a new dashboard with fewer charts and progressively more abstract and simplified views. We also present an authoring tool for building drillboards and show how expert users can use it to build and deliver personalized experiences to a wide audience. Our evaluation asked three domain experts to author drillboards for their own datasets, which we then showed to casual end-users with favorable outcomes.

Submitted 17 October, 2024; v1 submitted 16 October, 2024; originally announced October 2024.

Comments: Submitted to TVCG

arXiv:2410.04806 [pdf]

Topological beaming of light: Proof-of-concept experiment

Authors: Yu Sung Choi, Ki Young Lee, Soo-Chan An, Minchul Jang, Youngjae Kim, Seung Han Shin, Jae Woong Yoon

Abstract: Beam shaping in nanophotonic systems remains a challenge due to the reliance on complex heuristic optimization procedures. In this work, we experimentally demonstrate a novel approach to topological beam shaping using Jackiw-Rebbi states in metasurfaces. By fabricating thin-film dielectric structures with engineered Dirac-mass distributions, we create domain walls that allow precise control over beam profiles. We observe the emergence of Jackiw-Rebbi states and confirm their localized characteristics. Notably, we achieve a flat-top beam profile by carefully tailoring the Dirac mass distribution, highlighting the potential of this method for customized beam shaping. This experimental realization establishes our approach as a new mechanism for beam control, rooted in topological physics, and offers an efficient strategy for nanophotonic design.

Submitted 7 October, 2024; originally announced October 2024.

arXiv:2410.01866 [pdf, other]

House of Cards: Massive Weights in LLMs

Authors: Jaehoon Oh, Seungjun Shin, Dokwan Oh

Abstract: Massive activations, which manifest in specific feature dimensions of hidden states, introduce a significant bias in large language models (LLMs), leading to an overemphasis on the corresponding token. In this paper, we identify that massive activations originate not from the hidden state but from the intermediate state of a feed-forward network module in an early layer. Expanding on the previous observation that massive activations occur only in specific feature dimensions, we dive deep into the weights that cause massive activations. Specifically, we define top-$k$ massive weights as the weights that contribute to the dimensions with the top-$k$ magnitudes in the intermediate state. When these massive weights are set to zero, the functionality of LLMs is entirely disrupted. However, when all weights except for massive weights are set to zero, it results in a relatively minor performance drop, even though a much larger number of weights are set to zero. This implies that during the pre-training process, learning is dominantly focused on massive weights. Building on this observation, we propose a simple plug-and-play method called MacDrop (massive weights curriculum dropout), to rely less on massive weights during parameter-efficient fine-tuning. This method applies dropout to the pre-trained massive weights, starting with a high dropout probability and gradually decreasing it as fine-tuning progresses. Through experiments, we demonstrate that MacDrop generally improves performance across zero-shot downstream tasks and generation tasks.

Submitted 2 October, 2024; originally announced October 2024.

Comments: Under review

arXiv:2410.01219 [pdf, other]

Organization and Dynamics of Chromosomes

Authors: D. Thirumalai, Guang Shi, Sucheol Shin, Changbong Hyeon

Abstract: How long threadlike eukaryotic chromosomes fit tidily in the small volume of the nucleus without significant entanglement is just beginning to be understood, thanks to major advances in experimental techniques. Several polymer models, which reproduce contact maps that measure the probabilities that two loci are in spatial contact, have predicted the three-dimensional structures of interphase chromosomes. Data-driven approaches, using contact maps as input, predict that mitotic helical chromosomes are characterized by a switch in handedness, referred to as "perversion". By using experimentally derived effective interactions between chromatin loci in simulations, structures of conventional and inverted nuclei have been accurately predicted. Polymer theory and simulations show that the dynamics of individual loci in chromatin exhibit subdiffusive behavior but the diffusion exponents are broadly distributed, which accords well with experiments. Although coarse-grained models are successful, many challenging problems remain, which require the creation of new experimental and computational tools to understand genome biology.

Submitted 1 October, 2024; originally announced October 2024.

arXiv:2409.18288 [pdf, other]

The hypothetical track-length fitting algorithm for energy measurement in liquid argon TPCs

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, F. Akbar, N. S. Alex, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, C. Andreopoulos, et al. (1348 additional authors not shown)

Abstract: This paper introduces the hypothetical track-length fitting algorithm, a novel method for measuring the kinetic energies of ionizing particles in liquid argon time projection chambers (LArTPCs). The algorithm finds the most probable offset in track length for a track-like object by comparing the measured ionization density as a function of position with a theoretical prediction of the energy loss as a function of the energy, including models of electron recombination and detector response. The algorithm can be used to measure the energies of particles that interact before they stop, such as charged pions that are absorbed by argon nuclei. The algorithm's energy measurement resolutions and fractional biases are presented as functions of particle kinetic energy and number of track hits using samples of stopping secondary charged pions in data collected by the ProtoDUNE-SP detector, and also in a detailed simulation. Additional studies describe the impact of the dE/dx model on energy measurement performance. The method described in this paper to characterize the energy measurement performance can be repeated in any LArTPC experiment using stopping secondary charged pions.

Submitted 1 October, 2024; v1 submitted 26 September, 2024; originally announced September 2024.

Report number: FERMILAB-PUB-24-0561-LBNF-PPD, CERN-EP-2024-256

arXiv:2409.18132 [pdf, ps, other]

Decomposition of one-layer neural networks via the infinite sum of reproducing kernel Banach spaces

Authors: Seungcheol Shin, Myungjoo Kang

Abstract: In this paper, we define the sum of RKBSs using the characterization theorem of RKBSs and show that the sum of RKBSs is compatible with the direct sum of feature spaces. Moreover, we decompose the integral RKBS into the sum of $p$-norm RKBSs. Finally, we provide applications for the structural understanding of the integral RKBS class.

Submitted 9 August, 2024; originally announced September 2024.

Comments: 13 pages

arXiv:2409.16651 [pdf, other]

Learning Representation for Multitask learning through Self Supervised Auxiliary learning

Authors: Seokwon Shin, Hyungrok Do, Youngdoo Son

Abstract: Multi-task learning is a popular machine learning approach that enables simultaneous learning of multiple related tasks, improving algorithmic efficiency and effectiveness. In the hard parameter sharing approach, an encoder shared through multiple tasks generates data representations passed to task-specific predictors. Therefore, it is crucial to have a shared encoder that provides decent representations for each and every task. However, despite recent advances in multi-task learning, the question of how to improve the quality of representations generated by the shared encoder remains open. To address this gap, we propose a novel approach called Dummy Gradient norm Regularization (DGR) that aims to improve the universality of the representations generated by the shared encoder. Specifically, the method decreases the norm of the gradient of the loss function with respect to dummy task-specific predictors to improve the universality of the shared encoder's representations. Through experiments on multiple multi-task learning benchmark datasets, we demonstrate that DGR effectively improves the quality of the shared representations, leading to better multi-task prediction performance. Applied to various classifiers, the shared representations generated by DGR also show superior performance compared to existing multi-task learning methods. Moreover, our approach is computationally efficient due to its simplicity. The simplicity also allows us to seamlessly integrate DGR with the existing multi-task learning algorithms.

Submitted 25 September, 2024; originally announced September 2024.

arXiv:2409.16618 [pdf, other]

Claim-Guided Textual Backdoor Attack for Practical Applications

Authors: Minkyoo Song, Hanna Kim, Jaehan Kim, Youngjin Jin, Seungwon Shin

Abstract: Recent advances in natural language processing and the increased use of large language models have exposed new security vulnerabilities, such as backdoor attacks. Previous backdoor attacks require input manipulation after model distribution to activate the backdoor, posing limitations in real-world applicability. Addressing this gap, we introduce a novel Claim-Guided Backdoor Attack (CGBA), which eliminates the need for such manipulations by utilizing inherent textual claims as triggers. CGBA leverages claim extraction, clustering, and targeted training to trick models to misbehave on targeted claims without affecting their performance on clean data. CGBA demonstrates its effectiveness and stealthiness across various datasets and models, significantly enhancing the feasibility of practical backdoor attacks. Our code and data will be available at https://github.com/PaperCGBA/CGBA.

Submitted 25 September, 2024; originally announced September 2024.

Comments: Under Review

arXiv:2409.15087 [pdf]

Towards Accountable AI-Assisted Eye Disease Diagnosis: Workflow Design, External Validation, and Continual Learning

Authors: Qingyu Chen, Tiarnan D L Keenan, Elvira Agron, Alexis Allot, Emily Guan, Bryant Duong, Amr Elsawy, Benjamin Hou, Cancan Xue, Sanjeeb Bhandari, Geoffrey Broadhead, Chantal Cousineau-Krieger, Ellen Davis, William G Gensheimer, David Grasic, Seema Gupta, Luis Haddock, Eleni Konstantinou, Tania Lamba, Michele Maiberger, Dimosthenis Mantopoulos, Mitul C Mehta, Ayman G Nahri, Mutaz AL-Nawaflh, Arnold Oshinsky, et al. (13 additional authors not shown)

Abstract: Timely disease diagnosis is challenging due to increasing disease burdens and limited clinician availability. AI shows promise in diagnosis accuracy but faces real-world application issues due to insufficient validation in clinical workflows and diverse populations. This study addresses gaps in medical AI downstream accountability through a case study on age-related macular degeneration (AMD) diagnosis and severity classification. We designed and implemented an AI-assisted diagnostic workflow for AMD, comparing diagnostic performance with and without AI assistance among 24 clinicians from 12 institutions with real patient data sampled from the Age-Related Eye Disease Study (AREDS). Additionally, we demonstrated continual enhancement of an existing AI model by incorporating approximately 40,000 additional medical images (named AREDS2 dataset). The improved model was then systematically evaluated using both AREDS and AREDS2 test sets, as well as an external test set from Singapore. AI assistance markedly enhanced diagnostic accuracy and classification for 23 out of 24 clinicians, with the average F1-score increasing by 20% from 37.71 (Manual) to 45.52 (Manual + AI) (P-value < 0.0001), achieving an improvement of over 50% in some cases. In terms of efficiency, AI assistance reduced diagnostic times for 17 out of the 19 clinicians tracked, with time savings of up to 40%. Furthermore, a model equipped with continual learning showed robust performance across three independent datasets, recording a 29% increase in accuracy, and elevating the F1-score from 42 to 54 in the Singapore population.

Submitted 23 September, 2024; originally announced September 2024.

arXiv:2409.14119 [pdf, other]

Obliviate: Neutralizing Task-agnostic Backdoors within the Parameter-efficient Fine-tuning Paradigm

Authors: Jaehan Kim, Minkyoo Song, Seung Ho Na, Seungwon Shin

Abstract: Parameter-efficient fine-tuning (PEFT) has become a key training strategy for large language models. However, its reliance on fewer trainable parameters poses security risks, such as task-agnostic backdoors. Despite their severe impact on a wide range of tasks, there is no practical defense solution available that effectively counters task-agnostic backdoors within the context of PEFT. In this study, we introduce Obliviate, a PEFT-integrable backdoor defense. We develop two techniques aimed at amplifying benign neurons within PEFT layers and penalizing the influence of trigger tokens. Our evaluations across three major PEFT architectures show that our method can significantly reduce the attack success rate of the state-of-the-art task-agnostic backdoors (83.6%$\downarrow$). Furthermore, our method exhibits robust defense capabilities against both task-specific backdoors and adaptive attacks. Source code will be available at https://github.com/obliviateARR/Obliviate.

Submitted 6 October, 2024; v1 submitted 21 September, 2024; originally announced September 2024.

Comments: Under Review

arXiv:2409.13109 [pdf, other]

Visualizationary: Automating Design Feedback for Visualization Designers using LLMs

Authors: Sungbok Shin, Sanghyun Hong, Niklas Elmqvist

Abstract: Interactive visualization editors empower people to author visualizations without writing code, but do not guide them in the art and craft of effective visual communication. In this paper, we explore the potential for using an off-the-shelf Large Language Model (LLM) to provide actionable and customized feedback to visualization designers. Our implementation, called VISUALIZATIONARY, showcases how ChatGPT can be used in this manner using two components: a preamble of visualization design guidelines and a suite of perceptual filters extracting salient metrics from a visualization image. We present findings from a longitudinal user study involving 13 visualization designers - 6 novices, 4 intermediate ones, and 3 experts - authoring a new visualization from scratch over the course of several days. Our results indicate that providing guidance in natural language using an LLM can aid even seasoned designers in refining their visualizations. All supplemental materials accompanying this paper are available at https://osf.io/v7hu8.

Submitted 19 September, 2024; originally announced September 2024.

arXiv:2409.10327 [pdf, other]

Baking Relightable NeRF for Real-time Direct/Indirect Illumination Rendering

Authors: Euntae Choi, Vincent Carpentier, Seunghun Shin, Sungjoo Yoo

Abstract: Relighting, which synthesizes a novel view under a given lighting condition (unseen in training time), is a must-have feature for an immersive photo-realistic experience. However, real-time relighting is challenging due to the high computation cost of the rendering equation, which requires shape and material decomposition and a visibility test to model shadows. Additionally, for indirect illumination, additional computation of the rendering equation on each secondary surface point (where reflection occurs) is required, rendering real-time relighting challenging. We propose a novel method that executes a CNN renderer to compute primary surface points and rendering parameters, required for direct illumination. We also present a lightweight hash grid-based renderer, for indirect illumination, which is recursively executed to perform the secondary ray tracing process. Both renderers are trained in a distillation from a pre-trained teacher model and provide real-time physically-based rendering under unseen lighting conditions at a negligible loss of rendering quality.

Submitted 16 September, 2024; originally announced September 2024.

Comments: Under review

arXiv:2409.08221 [pdf, other]

Tweezers: A Framework for Security Event Detection via Event Attribution-centric Tweet Embedding

Authors: Jian Cui, Hanna Kim, Eugene Jang, Dayeon Yim, Kicheol Kim, Yongjae Lee, Jin-Woo Chung, Seungwon Shin, Xiaojing Liao

Abstract: Twitter is recognized as a crucial platform for the dissemination and gathering of Cyber Threat Intelligence (CTI). Its capability to provide real-time, actionable intelligence makes it an indispensable tool for detecting security events, helping security professionals cope with ever-growing threats. However, the large volume of tweets and the inherent noise of human-crafted tweets pose significant challenges in accurately identifying security events. While many studies tried to filter out event-related tweets based on keywords, they are not effective due to their limitation in understanding the semantics of tweets. Another challenge in security event detection from Twitter is the comprehensive coverage of security events. Previous studies emphasized the importance of early detection of security events, but they overlooked the importance of event coverage. To cope with these challenges, in our study, we introduce a novel event attribution-centric tweet embedding method to enable the high precision and coverage of events. Our experimental results show that the proposed method outperforms existing text and graph-based tweet embedding methods in identifying security events. Leveraging this novel embedding approach, we have developed and implemented a framework, Tweezers, that is applicable to security event detection from Twitter for CTI gathering. This framework has demonstrated its effectiveness, detecting twice as many events compared to established baselines. Additionally, we have showcased two applications, built on Tweezers for the integration and inspection of security events, i.e., security event trend analysis and informative security user identification.

Submitted 12 September, 2024; originally announced September 2024.

arXiv:2409.01547 [pdf, ps, other]

The R package psvmSDR: A Unified Algorithm for Sufficient Dimension Reduction via Principal Machines

Authors: Jungmin Shin, Seung Jun Shin, Andreas Artemiou

Abstract: Sufficient dimension reduction (SDR), which seeks a lower-dimensional subspace of the predictors containing regression or classification information, has been popular in the machine learning community. In this work, we present a new R software package psvmSDR that implements a new class of SDR estimators, which we call the principal machine (PM) generalized from the principal support vector machine (PSVM). The package covers both linear and nonlinear SDR and provides a function applicable to realtime update scenarios. The package implements the descent algorithm for the PMs to efficiently compute the SDR estimators in various situations. This easy-to-use package will be an attractive alternative to the dr R package that implements classical SDR methods.

Submitted 4 September, 2024; v1 submitted 2 September, 2024; originally announced September 2024.

Comments: version 2.0

arXiv:2408.17066 [pdf, other]

Non-verbal Interaction and Interface with a Quadruped Robot using Body and Hand Gestures: Design and User Experience Evaluation

Authors: Soohyun Shin, Trevor Evetts, Hunter Saylor, Hyunji Kim, Soojin Woo, Wonhwha Rhee, Seong-Woo Kim

Abstract: In recent years, quadruped robots have attracted significant attention due to their practical advantages in maneuverability, particularly when navigating rough terrain and climbing stairs. As these robots become more integrated into various industries, including construction and healthcare, researchers have increasingly focused on developing intuitive interaction methods such as speech and gestures that do not require separate devices such as keyboards or joysticks. This paper aims at investigating a comfortable and efficient interaction method with quadruped robots that possess a familiar form factor. To this end, we conducted two preliminary studies to observe how individuals naturally interact with a quadruped robot in natural and controlled settings, followed by a prototype experiment to examine human preferences for body-based and hand-based gesture controls using a Unitree Go1 Pro quadruped robot. We assessed the user experience of 13 participants using the User Experience Questionnaire and measured the time taken to complete specific tasks. The findings of our preliminary results indicate that humans have a natural preference for communicating with robots through hand and body gestures rather than speech. In addition, participants reported higher satisfaction and completed tasks more quickly when using body gestures to interact with the robot. This contradicts the fact that most gesture-based control technologies for quadruped robots are hand-based. The video is available at https://youtu.be/rysv1p1zvp4.

Submitted 30 August, 2024; originally announced August 2024.

Comments: 16 pages

arXiv:2408.12725 [pdf, other]

DUNE Phase II: Scientific Opportunities, Detector Concepts, Technological Solutions

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, C. Andreopoulos, M. Andreotti, et al. (1347 additional authors not shown)

Abstract: The international collaboration designing and constructing the Deep Underground Neutrino Experiment (DUNE) at the Long-Baseline Neutrino Facility (LBNF) has developed a two-phase strategy toward the implementation of this leading-edge, large-scale science project. The 2023 report of the US Particle Physics Project Prioritization Panel (P5) reaffirmed this vision and strongly endorsed DUNE Phase I and Phase II, as did the European Strategy for Particle Physics. While the construction of the DUNE Phase I is well underway, this White Paper focuses on DUNE Phase II planning. DUNE Phase-II consists of a third and fourth far detector (FD) module, an upgraded near detector complex, and an enhanced 2.1 MW beam. The fourth FD module is conceived as a "Module of Opportunity", aimed at expanding the physics opportunities, in addition to supporting the core DUNE science program, with more advanced technologies. This document highlights the increased science opportunities offered by the DUNE Phase II near and far detectors, including long-baseline neutrino oscillation physics, neutrino astrophysics, and physics beyond the standard model. It describes the DUNE Phase II near and far detector technologies and detector design concepts that are currently under consideration. A summary of key R&D goals and prototyping phases needed to realize the Phase II detector technical designs is also provided. DUNE's Phase II detectors, along with the increased beam power, will complete the full scope of DUNE, enabling a multi-decadal program of groundbreaking science with neutrinos.

Submitted 22 August, 2024; originally announced August 2024.

Report number: FERMILAB-TM-2833-LBNF

arXiv:2408.01024 [pdf, other]

Semantic Skill Grounding for Embodied Instruction-Following in Cross-Domain Environments

Authors: Sangwoo Shin, Seunghyun Kim, Youngsoo Jang, Moontae Lee, Honguk Woo

Abstract: In embodied instruction-following (EIF), the integration of pretrained language models (LMs) as task planners emerges as a significant branch, where tasks are planned at the skill level by prompting LMs with pretrained skills and user instructions. However, grounding these pretrained skills in different domains remains challenging due to their intricate entanglement with the domain-specific knowledge. To address this challenge, we present a semantic skill grounding (SemGro) framework that leverages the hierarchical nature of semantic skills. SemGro recognizes the broad spectrum of these skills, ranging from short-horizon low-semantic skills that are universally applicable across domains to long-horizon rich-semantic skills that are highly specialized and tailored for particular domains. The framework employs an iterative skill decomposition approach, starting from the higher levels of semantic skill hierarchy and then moving downwards, so as to ground each planned skill to an executable level within the target domain. To do so, we use the reasoning capabilities of LMs for composing and decomposing semantic skills, as well as their multi-modal extension for assessing the skill feasibility in the target domain. Our experiments in the VirtualHome benchmark show the efficacy of SemGro in 300 cross-domain EIF scenarios.

Submitted 20 August, 2024; v1 submitted 2 August, 2024; originally announced August 2024.

Comments: Findings of ACL-2024 Camera Ready Version

arXiv:2408.00731 [pdf, other]

doi 10.1145/3622781.3674181

Litmus: Fair Pricing for Serverless Computing

Authors: Qi Pei, Yipeng Wang, Seunghee Shin

Abstract: Serverless computing has emerged as a market-dominant paradigm in modern cloud computing, benefiting both cloud providers and tenants. While service providers can optimize their machine utilization, tenants only need to pay for the resources they use. To maximize resource utilization, these serverless systems co-run numerous short-lived functions, bearing frequent system condition shifts. When the system gets overcrowded, a tenant's function may suffer from disturbing slowdowns. Ironically, tenants also incur higher costs during these slowdowns, as commercial serverless platforms determine costs proportional to their execution times. This paper argues that cloud providers should compensate tenants for losses incurred when the server is over-provisioned. However, estimating tenants' losses is challenging without pre-profiled information about their functions. Prior studies have indicated that assessing tenant losses leads to heavy overheads. As a solution, this paper introduces a new pricing model that offers discounts based on the machine's state while presuming the tenant's loss under that state. To monitor the machine state accurately, Litmus pricing frequently conducts Litmus tests, an effective and lightweight solution for measuring system congestion. Our experiments show that Litmus pricing can accurately gauge the impact of system congestion and offer nearly ideal prices, with only a 0.2% price difference on average, in a heavily congested system.

Submitted 1 August, 2024; originally announced August 2024.

arXiv:2408.00582 [pdf, other]

First Measurement of the Total Inelastic Cross-Section of Positively-Charged Kaons on Argon at Energies Between 5.0 and 7.5 GeV

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, C. Andreopoulos, M. Andreotti, et al. (1341 additional authors not shown)

Abstract: ProtoDUNE Single-Phase (ProtoDUNE-SP) is a 770-ton liquid argon time projection chamber that operated in a hadron test beam at the CERN Neutrino Platform in 2018. We present a measurement of the total inelastic cross section of charged kaons on argon as a function of kaon energy using 6 and 7 GeV/$c$ beam momentum settings. The flux-weighted average of the extracted inelastic cross section at each beam momentum setting was measured to be 380$\pm$26 mbarns for the 6 GeV/$c$ setting and 379$\pm$35 mbarns for the 7 GeV/$c$ setting.

Submitted 1 August, 2024; originally announced August 2024.

Report number: CERN-EP-2024-211, FERMILAB-PUB-24-0216-V

arXiv:2407.20643 [pdf]

Generalizing AI-driven Assessment of Immunohistochemistry across Immunostains and Cancer Types: A Universal Immunohistochemistry Analyzer

Authors: Biagio Brattoli, Mohammad Mostafavi, Taebum Lee, Wonkyung Jung, Jeongun Ryu, Seonwook Park, Jongchan Park, Sergio Pereira, Seunghwan Shin, Sangjoon Choi, Hyojin Kim, Donggeun Yoo, Siraj M. Ali, Kyunghyun Paeng, Chan-Young Ock, Soo Ick Cho, Seokhwi Kim

Abstract: Despite advancements in methodologies, immunohistochemistry (IHC) remains the most utilized ancillary test for histopathologic and companion diagnostics in targeted therapies. However, objective IHC assessment poses challenges. Artificial intelligence (AI) has emerged as a potential solution, yet its development requires extensive training for each cancer and IHC type, limiting versatility. We developed a Universal IHC (UIHC) analyzer, an AI model for interpreting IHC images regardless of tumor or IHC types, using training datasets from various cancers stained for PD-L1 and/or HER2. This multi-cohort trained model outperforms conventional single-cohort models in interpreting unseen IHCs (Kappa score 0.578 vs. up to 0.509) and consistently shows superior performance across different positive staining cutoff values. Qualitative analysis reveals that UIHC effectively clusters patches based on expression levels. The UIHC model also quantitatively assesses c-MET expression with MET mutations, representing a significant advancement in AI application in the era of personalized medicine and accumulating novel biomarkers.

Submitted 30 July, 2024; originally announced July 2024.

arXiv:2407.19746 [pdf, other]

Octave-YOLO: Cross frequency detection network with octave convolution

Authors: Sangjune Shin, Dongkun Shin

Abstract: Despite the rapid advancement of object detection algorithms, processing high-resolution images on embedded devices remains a significant challenge. Theoretically, the fully convolutional network architecture used in current real-time object detectors can handle all input resolutions. However, the substantial computational demands required to process high-resolution images render them impractical for real-time applications. To address this issue, real-time object detection models typically downsample the input image for inference, leading to a loss of detail and decreased accuracy. In response, we developed Octave-YOLO, designed to process high-resolution images in real-time within the constraints of embedded systems. We achieved this through the introduction of the cross frequency partial network (CFPNet), which divides the input feature map into low-resolution, low-frequency, and high-resolution, high-frequency sections. This configuration enables complex operations such as convolution bottlenecks and self-attention to be conducted exclusively on low-resolution feature maps while simultaneously preserving the details in high-resolution maps. Notably, this approach not only dramatically reduces the computational demands of convolution tasks but also allows for the integration of attention modules, which are typically challenging to implement in real-time applications, with minimal additional cost. Additionally, we have incorporated depthwise separable convolution into the core building blocks and downsampling layers to further decrease latency.
Experimental results have shown that Octave-YOLO matches the performance of YOLOv8 while significantly reducing computational demands. For example, in 1080x1080 resolution, Octave-YOLO-N is 1.56 times faster than YOLOv8, achieving nearly the same accuracy on the COCO dataset with approximately 40 percent fewer parameters and FLOPs.

Submitted 29 July, 2024; originally announced July 2024.

arXiv:2407.17015 [pdf]

Unveiling van Hove singularity modulation and fluctuated charge order in kagome superconductor $\rm{CsV_3Sb_5}$ via time-resolved ARPES

Authors: Yigui Zhong, Takeshi Suzuki, Hongxiong Liu, Kecheng Liu, Zhengwei Nie, Youguo Shi, Sheng Meng, Baiqing Lv, Hong Ding, Teruto Kanai, Jiro Itatani, Shik Shin, Kozo Okazaki

Abstract: Kagome superconductor CsV3Sb5, which exhibits intertwined unconventional charge density wave (CDW) and superconductivity, has garnered significant attention recently. Despite extensive static studies, the nature of these exotic electronic orders remains elusive. In this study, we investigate the non-equilibrium electronic structure of CsV3Sb5 via time- and angle-resolved photoemission spectroscopy. Our results reveal that upon laser excitation, the van Hove singularities immediately shift towards the Fermi level and subsequently oscillate in sync with a 1.3 THz coherent phonon mode. By analyzing the coherent intensity oscillations in the energy-momentum (E-k) map, we find that this coherent phonon is strongly coupled with electronic bands from both Sb and V orbitals. While typically observable only in the CDW state, remarkably, we find that the 1.3-THz coherent phonon mode can be persistently excited at temperatures above T_CDW, suggesting the potential existence of fluctuated CDW in CsV3Sb5. These findings enhance our understanding of the unconventional CDW control of kagome superconductivity.

Submitted 24 July, 2024; originally announced July 2024.

Comments: 11 pages, 5 figures

arXiv:2407.16533 [pdf, other]

HAPFI: History-Aware Planning based on Fused Information

Authors: Sujin Jeon, Suyeon Shin, Byoung-Tak Zhang

Abstract: Embodied Instruction Following (EIF) is a task of planning a long sequence of sub-goals given high-level natural language instructions, such as "Rinse a slice of lettuce and place on the white table next to the fork". To successfully execute these long-term horizon tasks, we argue that an agent must consider its past, i.e., historical data, when making decisions in each step. Nevertheless, recent approaches in EIF often neglect the knowledge from historical data and also do not effectively utilize information across the modalities. To this end, we propose History-Aware Planning based on Fused Information (HAPFI), effectively leveraging the historical data from diverse modalities that agents collect while interacting with the environment. Specifically, HAPFI integrates multiple modalities, including historical RGB observations, bounding boxes, sub-goals, and high-level instructions, by effectively fusing modalities via our Mutually Attentive Fusion method. Through experiments with diverse comparisons, we show that an agent utilizing historical multi-modal information surpasses all the compared methods that neglect the historical data in terms of action planning capability, enabling the generation of well-informed action plans for the next step. Moreover, we provided qualitative evidence highlighting the significance of leveraging historical multi-modal data, particularly in scenarios where the agent encounters intermediate failures, showcasing its robust re-planning capabilities.

Submitted 23 July, 2024; originally announced July 2024.

Comments: 7 pages, 3 figures, published to ICRA 2024

arXiv:2407.10476 [pdf, other]

Kinetic Typography Diffusion Model

Authors: Seonmi Park, Inhwan Bae, Seunghyun Shin, Hae-Gon Jeon

Abstract: This paper introduces a method for realistic kinetic typography that generates user-preferred animatable 'text content'. We draw on recent advances in guided video diffusion models to achieve visually-pleasing text appearances. To do this, we first construct a kinetic typography dataset, comprising about 600K videos. Our dataset is made from a variety of combinations in 584 templates designed by professional motion graphics designers and involves changing each letter's position, glyph, and size (i.e., flying, glitches, chromatic aberration, reflecting effects, etc.). Next, we propose a video diffusion model for kinetic typography. For this, there are three requirements: aesthetic appearances, motion effects, and readable letters. This paper identifies the requirements. For this, we present static and dynamic captions used as spatial and temporal guidance of a video diffusion model, respectively. The static caption describes the overall appearance of the video, such as colors, texture and glyph which represent a shape of each letter. The dynamic caption accounts for the movements of letters and backgrounds. We add one more guidance with zero convolution to determine which text content should be visible in the video. We apply the zero convolution to the text content, and impose it on the diffusion model. Lastly, our glyph loss, only minimizing a difference between the predicted word and its ground-truth, is proposed to make the prediction letters readable.
Experiments show that our model generates kinetic typography videos with legible and artistic letter motions based on text prompts.

Submitted 15 July, 2024; originally announced July 2024.

Comments: Accepted at ECCV 2024, Project page: https://seonmip.github.io/kinety

arXiv:2407.10339 [pdf, other]

Supernova Pointing Capabilities of DUNE

Authors: DUNE Collaboration, A. Abed Abud, B. Abi, R. Acciarri, M. A. Acero, M. R. Adames, G. Adamov, M. Adamowski, D. Adams, M. Adinolfi, C. Adriano, A. Aduszkiewicz, J. Aguilar, B. Aimard, F. Akbar, K. Allison, S. Alonso Monsalve, M. Alrashed, A. Alton, R. Alvarez, T. Alves, H. Amar, P. Amedo, J. Anderson, D. A. Andrade, et al. (1340 additional authors not shown)

Abstract: The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electron-neutrino charged-current absorption on $^{40}$Ar and elastic scattering of neutrinos on electrons. Procedures to reconstruct individual interactions, including a newly developed technique called ``brems flipping'', as well as the burst direction from an ensemble of interactions are described. Performance of the burst direction reconstruction is evaluated for supernovae happening at a distance of 10 kpc for a specific supernova burst flux model. The pointing resolution is found to be 3.4 degrees at 68% coverage for a perfect interaction-channel classification and a fiducial mass of 40 kton, and 6.6 degrees for a 10 kton fiducial mass respectively. Assuming a 4% rate of charged-current interactions being misidentified as elastic scattering, DUNE's burst pointing resolution is found to be 4.3 degrees (8.7 degrees) at 68% coverage.

Submitted 14 July, 2024; originally announced July 2024.

Comments: 25 pages, 16 figures

Report number: FERMILAB-PUB-24-0319-LBNF

arXiv:2406.19287 [pdf, other]

Isotropy of cosmic rays beyond $10^{20}$ eV favors their heavy mass composition

Authors: Telescope Array Collaboration, R. U. Abbasi, Y. Abe, T. Abu-Zayyad, M. Allen, Y. Arai, R. Arimura, E. Barcikowski, J. W. Belz, D. R. Bergman, S. A. Blake, I. Buckland, B. G. Cheon, M. Chikawa, T. Fujii, K. Fujisue, K. Fujita, R. Fujiwara, M. Fukushima, G. Furlich, N. Globus, R. Gonzalez, W. Hanlon, N. Hayashida, H. He, et al. (118 additional authors not shown)

Abstract: We report an estimation of the injected mass composition of ultra-high energy cosmic rays (UHECRs) at energies higher than 10 EeV. The composition is inferred from an energy-dependent sky distribution of UHECR events observed by the Telescope Array surface detector by comparing it to the Large Scale Structure of the local Universe. In the case of negligible extra-galactic magnetic fields the results are consistent with a relatively heavy injected composition at E ~ 10 EeV that becomes lighter up to E ~ 100 EeV, while the composition at E > 100 EeV is very heavy. The latter is true even in the presence of highest experimentally allowed extra-galactic magnetic fields, while the composition at lower energies can be light if a strong EGMF is present. The effect of the uncertainty in the galactic magnetic field on these results is subdominant.

Submitted 3 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

Comments: 8 pages, 3 figures, accepted for publication in PRL

arXiv:2406.19286 [pdf, other]

Mass composition of ultra-high energy cosmic rays from distribution of their arrival directions with the Telescope Array

Authors: Telescope Array Collaboration, R. U. Abbasi, Y. Abe, T. Abu-Zayyad, M. Allen, Y. Arai, R. Arimura, E. Barcikowski, J. W. Belz, D. R. Bergman, S. A. Blake, I. Buckland, B. G. Cheon, M. Chikawa, T. Fujii, K. Fujisue, K. Fujita, R. Fujiwara, M. Fukushima, G. Furlich, N. Globus, R. Gonzalez, W. Hanlon, N. Hayashida, H. He, et al. (118 additional authors not shown)

Abstract: We use a new method to estimate the injected mass composition of ultrahigh cosmic rays (UHECRs) at energies higher than 10 EeV. The method is based on comparison of the energy-dependent distribution of cosmic ray arrival directions as measured by the Telescope Array experiment (TA) with that calculated in a given putative model of UHECR under the assumption that sources trace the large-scale structure (LSS) of the Universe. As we report in the companion letter, the TA data show large deflections with respect to the LSS which can be explained, assuming small extra-galactic magnetic fields (EGMF), by an intermediate composition changing to a heavy one (iron) in the highest energy bin. Here we show that these results are robust to uncertainties in UHECR injection spectra, the energy scale of the experiment and galactic magnetic fields (GMF). The assumption of weak EGMF, however, strongly affects this interpretation at all but the highest energies E > 100 EeV, where the remarkable isotropy of the data implies a heavy injected composition even in the case of strong EGMF. This result also holds if UHECR sources are as rare as $2 \times 10^{-5}$ Mpc$^{-3}$, that is the conservative lower limit for the source number density.

Submitted 3 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

Comments: 18 pages, 11 figures, accepted for publication in PRD

arXiv:2406.15524 [pdf, other]

Rethinking Pruning Large Language Models: Benefits and Pitfalls of Reconstruction Error Minimization

Authors: Sungbin Shin, Wonpyo Park, Jaeho Lee, Namhoon Lee

Abstract: This work suggests fundamentally rethinking the current practice of pruning large language models (LLMs). The way it is done is by divide and conquer: split the model into submodels, sequentially prune them, and reconstruct predictions of the dense counterparts on small calibration data one at a time; the final model is obtained simply by putting the resulting sparse submodels together. While this approach enables pruning under memory constraints, it generates high reconstruction errors. In this work, we first present an array of reconstruction techniques that can significantly reduce this error by more than $90\%$. Unwittingly, however, we discover that minimizing reconstruction error is not always ideal and can overfit the given calibration data, resulting in rather increased language perplexity and poor performance at downstream tasks. We find out that a strategy of self-generating calibration data can mitigate this trade-off between reconstruction and generalization, suggesting new directions in the presence of both benefits and pitfalls of reconstruction for pruning LLMs.

Submitted 10 October, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

Comments: EMNLP 2024 main

arXiv:2406.13900 [pdf, other]

doi 10.1103/PhysRevB.109.L241107

Anomalous Fermi pockets on Hund's metal surface of Sr2RuO4 induced by the correlation-enhanced spin-orbit coupling

Authors: Takeshi Kondo, Masayuki Ochi, Shuntaro Akebi, Yuyang Dong, Haruka Taniguchi, Yoshiteru Maeno, Shik Shin

Abstract: The electronic structure of the topmost layer in Sr2RuO4 in the close vicinity of the Fermi level is investigated by angle-resolved photoemission spectroscopy (ARPES) with a 7-eV laser. We find that the spin-orbit coupling (SOC) predicted as 100 meV by the density functional theory (DFT) calculations is enormously enhanced in a real material up to 250 meV, even more than that of bulk state (200 meV), by the electron-correlation effect increased by the octahedral rotation in the crystal structure. This causes the formation of highly orbital-mixing small Fermi pockets and reasonably explains why the orbital-selective Mott transition (OSMT) is not realized in perovskite oxides with crystal distortion. Interestingly, Hund's metal feature allows the quasiparticle generation only near EF, restricting the spectral gap opening derived by band hybridization within an extremely small binding energy (< 10 meV). Furthermore, it causes coherent-incoherent crossover, making the Fermi pockets disappear at elevated temperatures. The anomalous Fermi pockets are characterized by the dichotomy of the orbital-isolating Hund's coupling and the orbital-mixing SOC, which is key to understanding the nature of Sr2RuO4.

Submitted 19 June, 2024; originally announced June 2024.

Journal ref: Phys. Rev. B 109, L241107 (2024)

arXiv:2406.12705 [pdf, ps, other]

doi 10.7566/JPSJ.93.103702

Coherence Length of Electronic Nematicity in Iron-Based Superconductors

Authors: Yoichi Kageyama, Asato Onishi, Cédric Bareille, Kousuke Ishida, Yuta Mizukami, Shigeyuki Ishida, Hiroshi Eisaki, Kenichiro Hashimoto, Toshiyuki Taniuchi, Shik Shin, Hiroshi Kontani, Takasada Shibauchi

Abstract: Recent developments in laser-excited photoemission electron microscopy (laser-PEEM) advance the visualization of electronic nematicity and nematic domain structures in iron-based superconductors. In FeSe and BaFe$_2$(As$_{0.87}$P$_{0.13}$)$_2$ superconductors, it has been reported that the thickness of the electronic nematic domain walls is unexpectedly long, leading to the formation of mesoscopic nematicity wave [T. Shimojima $\textit{et al.}$, Science $\textbf{373}$ (2021) 1122]. This finding demonstrates that the nematic coherence length $\xi_{\rm nem}$ can be decoupled from the lattice domain wall. Here, we report that the electronic domain wall thickness shows a distinct variation in related materials: it is similarly long in FeSe$_{0.9}$S$_{0.1}$ whereas it is much shorter in undoped BaFe$_2$As$_2$. We find a correlation between the thick domain walls and the non-Fermi liquid properties of normal-state resistivity above the nematic transition temperature. This suggests that the nematic coherence length can be enhanced by underlying spin-orbital fluctuations responsible for the anomalous transport properties.

Submitted 3 August, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

Comments: 5 pages, 3 figures

arXiv:2406.11006 [pdf, other]

SPEAR: Receiver-to-Receiver Acoustic Neural Warping Field

Authors: Yuhang He, Shitong Xu, Jia-Xing Zhong, Sangyun Shin, Niki Trigoni, Andrew Markham

Abstract: We present SPEAR, a continuous receiver-to-receiver acoustic neural warping field for spatial acoustic effects prediction in an acoustic 3D space with a single stationary audio source. Unlike traditional source-to-receiver modelling methods that require prior space acoustic properties knowledge to rigorously model audio propagation from source to receiver, we propose to predict by warping the spatial acoustic effects from one reference receiver position to another target receiver position, so that the warped audio essentially accommodates all spatial acoustic effects belonging to the target position. SPEAR can be trained in a data much more readily accessible manner, in which we simply ask two robots to independently record spatial audio at different positions. We further theoretically prove the universal existence of the warping field if and only if one audio source presents. Three physical principles are incorporated to guide SPEAR network design, leading to the learned warping field physically meaningful. We demonstrate SPEAR superiority on both synthetic, photo-realistic and real-world dataset, showing the huge potential of SPEAR to various down-stream robotic tasks.

Submitted 16 June, 2024; originally announced June 2024.

Comments: 9 pages, 5 figures in main paper

arXiv:2406.09459

Ad Auctions for LLMs via Retrieval Augmented Generation

Authors: MohammadTaghi Hajiaghayi, Sébastien Lahaie, Keivan Rezaei, Suho Shin

Abstract: In the field of computational advertising, the integration of ads into the outputs of large language models (LLMs) presents an opportunity to support these services without compromising content integrity. This paper introduces novel auction mechanisms for ad allocation and pricing within the textual outputs of LLMs, leveraging retrieval-augmented generation (RAG). We propose a segment auction where an ad is probabilistically retrieved for each discourse segment (paragraph, section, or entire output) according to its bid and relevance, following the RAG framework, and priced according to competing bids. We show that our auction maximizes logarithmic social welfare, a new notion of welfare that balances allocation efficiency and fairness, and we characterize the associated incentive-compatible pricing rule. These results are extended to multi-ad allocation per segment. An empirical evaluation validates the feasibility and effectiveness of our approach over several ad auction scenarios, and exhibits inherent tradeoffs in metrics as we allow the LLM more flexibility to allocate ads.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.08612

Observation of Declination Dependence in the Cosmic Ray Energy Spectrum

Authors: The Telescope Array Collaboration, R. U. Abbasi, T. Abu-Zayyad, M. Allen, J. W. Belz, D. R. Bergman, I. Buckland, W. Campbell, B. G. Cheon, K. Endo, A. Fedynitch, T. Fujii, K. Fujisue, K. Fujita, M. Fukushima, G. Furlich, Z. Gerber, N. Globus, W. Hanlon, N. Hayashida, H. He, K. Hibino, R. Higuchi, D. Ikeda, T. Ishii, et al. (101 additional authors not shown)

Abstract: We report on an observation of the difference between northern and southern skies of the ultrahigh energy cosmic ray energy spectrum with a significance of ${\sim}8σ$. We use measurements from the two largest experiments: the Telescope Array observing the northern hemisphere and the Pierre Auger Observatory viewing the southern hemisphere. Since the comparison of two measurements from different observatories introduces the issue of possible systematic differences between detectors and analyses, we validate the methodology of the comparison by examining the region of the sky where the apertures of the two observatories overlap. Although the spectra differ in this region, we find that there is only a $1.8σ$ difference between the spectrum measurements when anisotropic regions are removed and a fiducial cut in the aperture is applied.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 8 pages, 6 figures

arXiv:2406.03063

In-operando microwave scattering-parameter calibrated measurement of a Josephson travelling wave parametric amplifier

Authors: S. H. Shin, M. Stanley, W. N. Wong, T. Sweetnam, A. Elarabi, T. Lindström, N. M. Ridler, S. E. de Graaf

Abstract: Superconducting travelling wave parametric amplifiers (TWPAs) are broadband near-quantum limited microwave amplifiers commonly used for qubit readout and a wide range of other applications in quantum technologies. The performance of these amplifiers depends on achieving impedance matching to minimise reflected signals. Here we apply a microwave calibration technique to extract the S-parameters of a Josephson junction based TWPA in-operando. This enables reflections occurring at the TWPA and its extended network of components to be quantified, and we find that the in-operation performance can be well described by the off-state measured S-parameters.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2405.20597

Double-sided van der Waals epitaxy of topological insulators across an atomically thin membrane

Authors: Joon Young Park, Young Jae Shin, Jeacheol Shin, Jehyun Kim, Janghyun Jo, Hyobin Yoo, Danial Haei, Chohee Hyun, Jiyoung Yun, Robert M. Huber, Arijit Gupta, Kenji Watanabe, Takashi Taniguchi, Wan Kyu Park, Hyeon Suk Shin, Miyoung Kim, Dohun Kim, Gyu-Chul Yi, Philip Kim

Abstract: Atomically thin van der Waals (vdW) films provide a novel material platform for epitaxial growth of quantum heterostructures. However, unlike the remote epitaxial growth of three-dimensional bulk crystals, the growth of two-dimensional (2D) material heterostructures across atomic layers has been limited due to the weak vdW interaction. Here, we report the double-sided epitaxy of vdW layered materials through atomic membranes. We grow vdW topological insulators (TIs) Sb$_2$Te$_3$ and Bi$_2$Se$_3$ by molecular beam epitaxy on both surfaces of atomically thin graphene or hBN, which serve as suspended 2D vdW "$\textit{substrate}$" layers. Both homo- and hetero- double-sided vdW TI tunnel junctions are fabricated, with the atomically thin hBN acting as a crystal-momentum-conserving tunnelling barrier with abrupt and epitaxial interface. By performing field-angle dependent magneto-tunnelling spectroscopy on these devices, we reveal the energy-momentum-spin resonant tunnelling of massless Dirac electrons between helical Landau levels developed in the topological surface states at the interface.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 24 pages, 4 main figures, 7 extended data figures

arXiv:2405.14236

Condensed-space methods for nonlinear programming on GPUs

Authors: François Pacaud, Sungho Shin, Alexis Montoison, Michel Schanen, Mihai Anitescu

Abstract: This paper explores two condensed-space interior-point methods to efficiently solve large-scale nonlinear programs on graphics processing units (GPUs). The interior-point method solves a sequence of symmetric indefinite linear systems, or Karush-Kuhn-Tucker (KKT) systems, which become increasingly ill-conditioned as we approach the solution. Solving a KKT system with traditional sparse factorization methods involve numerical pivoting, making parallelization difficult. A solution is to condense the KKT system into a symmetric positive-definite matrix and solve it with a Cholesky factorization, stable without pivoting. Although condensed KKT systems are more prone to ill-conditioning than the original ones, they exhibit structured ill-conditioning that mitigates the loss of accuracy. This paper compares the benefits of two recent condensed-space interior-point methods, HyKKT and LiftedKKT. We implement the two methods on GPUs using MadNLP.jl, an optimization solver interfaced with the NVIDIA sparse linear solver cuDSS and with the GPU-accelerated modeler ExaModels.jl. Our experiments on the PGLIB and the COPS benchmarks reveal that GPUs can attain up to a tenfold speed increase compared to CPUs when solving large-scale instances.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.14032

Scalable Multi-Period AC Optimal Power Flow Utilizing GPUs with High Memory Capacities

Authors: Sungho Shin, Vishwas Rao, Michel Schanen, D. Adrian Maldonado, Mihai Anitescu

Abstract: This paper demonstrates the scalability of open-source GPU-accelerated nonlinear programming (NLP) frameworks -- ExaModels.jl and MadNLP.jl -- for solving multi-period alternating current (AC) optimal power flow (OPF) problems on GPUs with high memory capacities (e.g., NVIDIA GH200 with 480 GB of unified memory). There has been a growing interest in solving multi-period AC OPF problems, as the increasingly fluctuating electricity market requires operation planning over multiple periods. These problems, formerly deemed intractable, are now becoming technologically feasible to solve thanks to the advent of high-memory GPU hardware and accelerated NLP tools. This study evaluates the capability of these tools to tackle previously unsolvable multi-period AC OPF instances. Our numerical experiments, run on an NVIDIA GH200, demonstrate that we can solve a multi-period OPF instance with more than 10 million variables up to $10^{-4}$ precision in less than 10 minutes. These results demonstrate the efficacy of the GPU-accelerated NLP frameworks for the solution of extreme-scale multi-period OPF. We provide ExaModelsPower.jl, an open-source modeling tool for multi-period AC OPF models for GPUs.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.11158

Dusk Till Dawn: Self-supervised Nighttime Stereo Depth Estimation using Visual Foundation Models

Authors: Madhu Vankadari, Samuel Hodgson, Sangyun Shin, Kaichen Zhou, Andrew Markham, Niki Trigoni

Abstract: Self-supervised depth estimation algorithms rely heavily on frame-warping relationships, exhibiting substantial performance degradation when applied in challenging circumstances, such as low-visibility and nighttime scenarios with varying illumination conditions. Addressing this challenge, we introduce an algorithm designed to achieve accurate self-supervised stereo depth estimation focusing on nighttime conditions. Specifically, we use pretrained visual foundation models to extract generalised features across challenging scenes and present an efficient method for matching and integrating these features from stereo frames. Moreover, to prevent pixels violating photometric consistency assumption from negatively affecting the depth predictions, we propose a novel masking approach designed to filter out such pixels. Lastly, addressing weaknesses in the evaluation of current depth estimation algorithms, we present novel evaluation metrics. Our experiments, conducted on challenging datasets including Oxford RobotCar and Multi-Spectral Stereo, demonstrate the robust improvements realized by our approach. Code is available at: https://github.com/madhubabuv/dtd

Submitted 17 May, 2024; originally announced May 2024.

Comments: The paper is published at ICRA 2024

arXiv:2405.08724

Revisiting Reactor Anti-Neutrino 5 MeV Bump with $^{13}$C Neutral-Current Interaction

Authors: Pouya Bakhti, Min-Gwa Park, Meshkat Rajaee, Chang Sub Shin, Seodong Shin

Abstract: For the first time, we systematically investigate the potential of neutrino-nucleus neutral current interactions with $^{13}$C to identify the origin of the 5 MeV bump observed in reactor anti-neutrino spectra in the inverse beta decay process. The distinctive signal is obtained from the de-excitation of $^{13}$C$^*$ into the ground state emitting a 3.685 MeV photon in various liquid scintillator detectors. Such an interaction predominantly occurs for the reactor anti-neutrinos within the energy range coinciding with the 5 MeV bump. For a detector that has a capability of 95% level photon and electron separation and small thorium contamination below $5 \times 10^{-17}$ gr/gr located in a site with an overburden of about a few hundred m.w.e, such as the location of near detectors of RENO and Daya Bay will have a great sensitivity to resolve the 5 MeV bump. In addition, we propose a novel approach to track the time evolution of reactor isotopes by analyzing our $^{13}$C signal shedding light on the contributions from $^{235}$U or $^{239}$Pu to the observed bump. This provides an extra powerful tool in both discriminating the flux models and testing any new physics possibilities for the 5 MeV bump at 3$σ$ to 5$σ$ level with much less systematic uncertainties and assuming 10 kt.year of data collection. Our detector requirements are realistic, aligning well with recent studies conducted for existing or forthcoming experiments.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 6 pages, 3 figures

arXiv:2404.15542

doi 10.1021/acs.nanolett.4c01764

Twisted MoSe2 Homobilayer Behaving as a Heterobilayer

Authors: Arka Karmakar, Abdullah Al-Mahboob, Natalia Zawadzka, Mateusz Raczyński, Weiguang Yang, Mehdi Arfaoui, Gayatri, Julia Kucharek, Jerzy T. Sadowski, Hyeon Suk Shin, Adam Babiński, Wojciech Pacuski, Tomasz Kazimierczuk, Maciej R Molas

Abstract: Heterostructures (HSs) formed by the transition-metal dichalcogenides (TMDCs) materials have shown great promise in next-generation optoelectronic and photonic applications. An artificially twisted HS, allows us to manipulate the optical, and electronic properties. With this work, we introduce the understanding of the complex energy transfer (ET) process governed by the dipolar interaction in a twisted molybdenum diselenide (MoSe2) homobilayer without any charge-blocking interlayer. We fabricated an unconventional homobilayer (i.e., HS) with a large twist angle by combining the chemical vapor deposition (CVD) and mechanical exfoliation (Exf.) techniques to fully exploit the lattice parameters mismatch and indirect/direct (CVD/Exf.) bandgap nature. This effectively weaken the charge transfer (CT) process and allows the ET process to take over the carrier recombination channels. We utilize a series of optical and electron spectroscopy techniques complementing by the density functional theory calculations, to describe a massive photoluminescence enhancement from the HS area due to an efficient ET process. Our results show that the electronically decoupled MoSe2 homobilayer is coupled by the ET process, mimicking a 'true' heterobilayer nature.

Submitted 7 June, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

Comments: 4 figures

arXiv:2404.15190

Socratic Planner: Inquiry-Based Zero-Shot Planning for Embodied Instruction Following

Authors: Suyeon Shin, Sujin Jeon, Junghyun Kim, Gi-Cheon Kang, Byoung-Tak Zhang

Abstract: Embodied Instruction Following (EIF) is the task of executing natural language instructions by navigating and interacting with objects in 3D environments. One of the primary challenges in EIF is compositional task planning, which is often addressed with supervised or in-context learning with labeled data. To this end, we introduce the Socratic Planner, the first zero-shot planning method that infers without the need for any training data. Socratic Planner first decomposes the instructions into substructural information of the task through self-questioning and answering, translating it into a high-level plan, i.e., a sequence of subgoals. Subgoals are executed sequentially, with our visually grounded re-planning mechanism adjusting plans dynamically through a dense visual feedback. We also introduce an evaluation metric of high-level plans, RelaxedHLP, for a more comprehensive evaluation. Experiments demonstrate the effectiveness of the Socratic Planner, achieving competitive performance on both zero-shot and few-shot task planning in the ALFRED benchmark, particularly excelling in tasks requiring higher-dimensional inference. Additionally, a precise adjustments in the plan were achieved by incorporating environmental visual information.

Submitted 21 April, 2024; originally announced April 2024.

Comments: 14 pages, 6 figures

MSC Class: 68T01 (Primary) 68T40; 68T50; 68T45 (Secondary)

arXiv:2404.10966

Domain-Specific Block Selection and Paired-View Pseudo-Labeling for Online Test-Time Adaptation

Authors: Yeonguk Yu, Sungho Shin, Seunghyeok Back, Minhwan Ko, Sangjun Noh, Kyoobin Lee

Abstract: Test-time adaptation (TTA) aims to adapt a pre-trained model to a new test domain without access to source data after deployment. Existing approaches typically rely on self-training with pseudo-labels since ground-truth cannot be obtained from test data. Although the quality of pseudo labels is important for stable and accurate long-term adaptation, it has not been previously addressed. In this work, we propose DPLOT, a simple yet effective TTA framework that consists of two components: (1) domain-specific block selection and (2) pseudo-label generation using paired-view images. Specifically, we select blocks that involve domain-specific feature extraction and train these blocks by entropy minimization. After blocks are adjusted for current test domain, we generate pseudo-labels by averaging given test images and corresponding flipped counterparts. By simply using flip augmentation, we prevent a decrease in the quality of the pseudo-labels, which can be caused by the domain gap resulting from strong augmentation. Our experimental results demonstrate that DPLOT outperforms previous TTA methods in CIFAR10-C, CIFAR100-C, and ImageNet-C benchmarks, reducing error by up to 5.4%, 9.1%, and 2.9%, respectively. Also, we provide an extensive analysis to demonstrate effectiveness of our framework. Code is available at https://github.com/gist-ailab/domain-specific-block-selection-and-paired-view-pseudo-labeling-for-online-TTA.

Submitted 7 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

Comments: Accepted at CVPR 2024

arXiv:2404.09712

1/2$^-$ $α$ cluster resonances of $^{13}$C studied by the analytic continuation in the coupling constant

Authors: Seungheon Shin, Masaaki Kimura, Bo Zhou, Qing Zhao

Abstract: The 1/2$^-$ resonant states in $^{13}{\rm C}$ are investigated to search for the Hoyle-analog state. In order to treat the resonance states located around the 3$α+n$ threshold, the analytic continuation in the coupling constant (ACCC) has been combined with the real-time evolution method (REM). The properties of the 1/2$^-$ resonance states such as the radii and monopole transition probabilities are calculated. We show the 1/2$^-_3$ and 1/2$^-_4$ states are well-developed $α$ cluster states, and the 1/2$^-_4$ state is a candidate of the Hoyle-analog state.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.09196

Surfactant-laden bubble bursting: dynamics of capillary waves and Worthington jet at large Bond number

Authors: Paula Pico, Lyes Kahouadji, Seungwon Shin, Jalel Chergui, Damir Juric, Omar K. Matar

Abstract: We present a numerical study of the main sub-stages preceding aerosol formation via bursting bubbles: capillary wave propagation along the bubble, convergence at the bubble's apex, the ascent of a Worthington jet and its break-up to release liquid drops. We focus on two crucial yet overlooked aspects of the system: the presence of surface-active agents and dynamics driven by non-negligible gravitational effects, quantified by the Bond number. Our results propose, for the first time, a mechanism explaining capillary wave retardation in the presence of surfactants, involving the transition from bi- to uni-directional Marangoni stresses, which pull the interface upwards, countering the motion of the waves. We also quantitatively elucidate the variable nature of the waves' velocity with various surfactant parameters, including surfactant solubility and elasticity, a departure from the constant behaviour well-documented in clean interfaces.

Submitted 14 April, 2024; originally announced April 2024.

Comments: 22 pages, 14 figures

arXiv:2404.08274

doi 10.1103/PhysRevB.110.094101

(C$_5$H$_9$NH$_3$)$_2$CuBr$_4$: a metal-organic two-ladder quantum magnet

Authors: J. Philippe, F. Elson, M. P. N. Casati, S. Sanz, M. Metzelaars, O. Shliakhtun, O. K. Forslund, J. Lass, T. Shiroka, A. Linden, D. G. Mazzone, J. Ollivier, S. Shin, M. Medarde, B. Lake, M. Mansson, M. Bartkowiak, B. Normand, P. Kögerler, Y. Sassa, M. Janoschek, G. Simutis

Abstract: Low-dimensional quantum magnets are a versatile materials platform for studying the emergent many-body physics and collective excitations that can arise even in systems with only short-range interactions. Understanding their low-temperature structure and spin Hamiltonian is key to explaining their magnetic properties, including unconventional quantum phases, phase transitions, and excited states. We study the metal-organic coordination compound (C$_5$H$_9$NH$_3$)$_2$CuBr$_4$ and its deuterated counterpart, which upon its discovery was identified as a candidate two-leg quantum ($S = 1/2$) spin ladder in the strong-leg coupling regime. By growing large single crystals and probing them with both bulk and microscopic techniques, we deduce that two previously unknown structural phase transitions take place between 136 K and 113 K. The low-temperature structure has a monoclinic unit cell giving rise to two inequivalent spin ladders. We further confirm the absence of long-range magnetic order down to 30 mK and discuss the implications of this two-ladder structure for the magnetic properties of (C$_5$H$_9$NH$_3$)$_2$CuBr$_4$.

Submitted 6 September, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

Journal ref: Phys. Rev. B 110, 094101 (2024)

arXiv:2404.02147

Harder, Better, Faster, Stronger: Interactive Visualization for Human-Centered AI Tools

Authors: Md Naimul Hoque, Sungbok Shin, Niklas Elmqvist

Abstract: Human-centered AI (HCAI), rather than replacing the human, puts the human user in the driver's seat of so-called human-centered AI-infused tools (HCAI tools): interactive software tools that amplify, augment, empower, and enhance human performance using AI models; often novel generative or foundation AI ones. In this paper, we discuss how interactive visualization can be a key enabling technology for creating such human-centered AI tools. Visualization has already been shown to be a fundamental component in explainable AI models, and coupling this with data-driven, semantic, and unified interaction feedback loops will enable a human-centered approach to integrating AI models in the loop with human users. We present several examples of our past and current work on such HCAI tools, including for creative writing, temporal prediction, and user experience analysis. We then draw parallels between these tools to suggest common themes on how interactive visualization can support the design of future HCAI tools.

Submitted 2 April, 2024; originally announced April 2024.

arXiv:2404.01954

HyperCLOVA X Technical Report

Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding, math, chatting, instruction-following, and harmlessness, in both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in Korean backed by a deep understanding of the language and cultural nuances. Further analysis of the inherent bilingual nature and its extension to multilingualism highlights the model's cross-lingual proficiency and strong generalization ability to untargeted languages, including machine translation between several language pairs and cross-lingual inference tasks. We believe that HyperCLOVA X can provide helpful guidance for regions or countries in developing their sovereign LLMs.

Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: 44 pages; updated authors list and fixed author names

arXiv:2403.15913

GPU-accelerated dynamic nonlinear optimization with ExaModels and MadNLP

Authors: François Pacaud, Sungho Shin

Abstract: We investigate the potential of Graphics Processing Units (GPUs) to solve large-scale nonlinear programs with a dynamic structure. Using ExaModels, a GPU-accelerated automatic differentiation tool, and the interior-point solver MadNLP, we significantly reduce the time to solve dynamic nonlinear optimization problems. The sparse linear systems formulated in the interior-point method is solved on the GPU using a hybrid solver combining an iterative method with a sparse Cholesky factorization, which harness the newly released NVIDIA cuDSS solver. Our results on the classical distillation column instance show that despite a significant pre-processing time, the hybrid solver allows to reduce the time per iteration by a factor of 25 for the largest instance.

Submitted 12 September, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

Comments: 6 pages, 1 figure, 2 tables

Showing 1–50 of 827 results for author: Shin, S