Search | arXiv e-print repository

An Active Learning Framework for Inclusive Generation by Large Language Models

Authors: Sabit Hassan, Anthony Sicilia, Malihe Alikhani

Abstract: Ensuring that Large Language Models (LLMs) generate text representative of diverse sub-populations is essential, particularly when key concepts related to under-represented groups are scarce in the training data. We address this challenge with a novel clustering-based active learning framework, enhanced with knowledge distillation. The proposed framework transforms the intermediate outputs of the… ▽ More Ensuring that Large Language Models (LLMs) generate text representative of diverse sub-populations is essential, particularly when key concepts related to under-represented groups are scarce in the training data. We address this challenge with a novel clustering-based active learning framework, enhanced with knowledge distillation. The proposed framework transforms the intermediate outputs of the learner model, enabling effective active learning for generative tasks for the first time. Integration of clustering and knowledge distillation yields more representative models without prior knowledge of underlying data distribution and overbearing human efforts. We validate our approach in practice through case studies in counter-narration and style transfer. We construct two new datasets in tandem with model training, showing a performance improvement of 2%-10% over baseline models. Our results also show more consistent performance across various data subgroups and increased lexical diversity, underscoring our model's resilience to skewness in available data. Further, our results show that the data acquired via our approach improves the performance of secondary models not involved in the learning loop, showcasing practical utility of the framework. △ Less

Submitted 17 October, 2024; originally announced October 2024.

arXiv:2410.11114 [pdf, other]

Active Learning for Robust and Representative LLM Generation in Safety-Critical Scenarios

Authors: Sabit Hassan, Anthony Sicilia, Malihe Alikhani

Abstract: Ensuring robust safety measures across a wide range of scenarios is crucial for user-facing systems. While Large Language Models (LLMs) can generate valuable data for safety measures, they often exhibit distributional biases, focusing on common scenarios and neglecting rare but critical cases. This can undermine the effectiveness of safety protocols developed using such data. To address this, we p… ▽ More Ensuring robust safety measures across a wide range of scenarios is crucial for user-facing systems. While Large Language Models (LLMs) can generate valuable data for safety measures, they often exhibit distributional biases, focusing on common scenarios and neglecting rare but critical cases. This can undermine the effectiveness of safety protocols developed using such data. To address this, we propose a novel framework that integrates active learning with clustering to guide LLM generation, enhancing their representativeness and robustness in safety scenarios. We demonstrate the effectiveness of our approach by constructing a dataset of 5.4K potential safety violations through an iterative process involving LLM generation and an active learner model's feedback. Our results show that the proposed framework produces a more representative set of safety scenarios without requiring prior knowledge of the underlying data distribution. Additionally, data acquired through our method improves the accuracy and F1 score of both the active learner model as well models outside the scope of active learning process, highlighting its broad applicability. △ Less

Submitted 14 October, 2024; originally announced October 2024.

arXiv:2409.18718 [pdf, other]

Enhancing Spectrum Efficiency in 6G Satellite Networks: A GAIL-Powered Policy Learning via Asynchronous Federated Inverse Reinforcement Learning

Authors: Sheikh Salman Hassan, Yu Min Park, Yan Kyaw Tun, Walid Saad, Zhu Han, Choong Seon Hong

Abstract: In this paper, a novel generative adversarial imitation learning (GAIL)-powered policy learning approach is proposed for optimizing beamforming, spectrum allocation, and remote user equipment (RUE) association in NTNs. Traditional reinforcement learning (RL) methods for wireless network optimization often rely on manually designed reward functions, which can require extensive parameter tuning. To… ▽ More In this paper, a novel generative adversarial imitation learning (GAIL)-powered policy learning approach is proposed for optimizing beamforming, spectrum allocation, and remote user equipment (RUE) association in NTNs. Traditional reinforcement learning (RL) methods for wireless network optimization often rely on manually designed reward functions, which can require extensive parameter tuning. To overcome these limitations, we employ inverse RL (IRL), specifically leveraging the GAIL framework, to automatically learn reward functions without manual design. We augment this framework with an asynchronous federated learning approach, enabling decentralized multi-satellite systems to collaboratively derive optimal policies. The proposed method aims to maximize spectrum efficiency (SE) while meeting minimum information rate requirements for RUEs. To address the non-convex, NP-hard nature of this problem, we combine the many-to-one matching theory with a multi-agent asynchronous federated IRL (MA-AFIRL) framework. This allows agents to learn through asynchronous environmental interactions, improving training efficiency and scalability. The expert policy is generated using the Whale optimization algorithm (WOA), providing data to train the automatic reward function within GAIL. Simulation results show that the proposed MA-AFIRL method outperforms traditional RL approaches, achieving a $14.6\%$ improvement in convergence and reward value. The novel GAIL-driven policy learning establishes a novel benchmark for 6G NTN optimization. △ Less

Submitted 27 September, 2024; originally announced September 2024.

Comments: Submitted to IEEE Transactions on Mobile Computing (16 pages, 10 figures)

arXiv:2409.18715 [pdf, other]

doi 10.1109/ICIP51287.2024.10648275

Multi-modal Medical Image Fusion For Non-Small Cell Lung Cancer Classification

Authors: Salma Hassan, Hamad Al Hammadi, Ibrahim Mohammed, Muhammad Haris Khan

Abstract: The early detection and nuanced subtype classification of non-small cell lung cancer (NSCLC), a predominant cause of cancer mortality worldwide, is a critical and complex issue. In this paper, we introduce an innovative integration of multi-modal data, synthesizing fused medical imaging (CT and PET scans) with clinical health records and genomic data. This unique fusion methodology leverages advan… ▽ More The early detection and nuanced subtype classification of non-small cell lung cancer (NSCLC), a predominant cause of cancer mortality worldwide, is a critical and complex issue. In this paper, we introduce an innovative integration of multi-modal data, synthesizing fused medical imaging (CT and PET scans) with clinical health records and genomic data. This unique fusion methodology leverages advanced machine learning models, notably MedClip and BEiT, for sophisticated image feature extraction, setting a new standard in computational oncology. Our research surpasses existing approaches, as evidenced by a substantial enhancement in NSCLC detection and classification precision. The results showcase notable improvements across key performance metrics, including accuracy, precision, recall, and F1-score. Specifically, our leading multi-modal classifier model records an impressive accuracy of 94.04%. We believe that our approach has the potential to transform NSCLC diagnostics, facilitating earlier detection and more effective treatment planning and, ultimately, leading to superior patient outcomes in lung cancer care. △ Less

Submitted 27 September, 2024; originally announced September 2024.

arXiv:2409.15724 [pdf, other]

LLM-Cure: LLM-based Competitor User Review Analysis for Feature Enhancement

Authors: Maram Assi, Safwat Hassan, Ying Zou

Abstract: The exponential growth of the mobile app market underscores the importance of constant innovation and rapid response to user demands. As user satisfaction is paramount to the success of a mobile application (app), developers typically rely on user reviews, which represent user feedback that includes ratings and comments to identify areas for improvement. However, the sheer volume of user reviews p… ▽ More The exponential growth of the mobile app market underscores the importance of constant innovation and rapid response to user demands. As user satisfaction is paramount to the success of a mobile application (app), developers typically rely on user reviews, which represent user feedback that includes ratings and comments to identify areas for improvement. However, the sheer volume of user reviews poses challenges in manual analysis, necessitating automated approaches. Existing automated approaches either analyze only the target apps reviews, neglecting the comparison of similar features to competitors or fail to provide suggestions for feature enhancement. To address these gaps, we propose a Large Language Model (LLM)-based Competitive User Review Analysis for Feature Enhancement) (LLM-Cure), an approach powered by LLMs to automatically generate suggestion s for mobile app feature improvements. More specifically, LLM-Cure identifies and categorizes features within reviews by applying LLMs. When provided with a complaint in a user review, LLM-Cure curates highly rated (4 and 5 stars) reviews in competing apps related to the complaint and proposes potential improvements tailored to the target application. We evaluate LLM-Cure on 1,056,739 reviews of 70 popular Android apps. Our evaluation demonstrates that LLM-Cure significantly outperforms the state-of-the-art approaches in assigning features to reviews by up to 13% in F1-score, up to 16% in recall and up to 11% in precision. Additionally, LLM-Cure demonstrates its capability to provide suggestions for resolving user complaints. We verify the suggestions using the release notes that reflect the changes of features in the target mobile app. LLM-Cure achieves a promising average of 73% of the implementation of the provided suggestions. △ Less

Submitted 24 September, 2024; originally announced September 2024.

Comments: 25 pages

arXiv:2409.14726 [pdf, other]

Semantic Communication Enabled 6G-NTN Framework: A Novel Denoising and Gateway Hop Integration Mechanism

Authors: Loc X. Nguyen, Sheikh Salman Hassan, Yan Kyaw Tun, Kitae Kim, Zhu Han, Choong Seon Hong

Abstract: The sixth-generation (6G) non-terrestrial networks (NTNs) are crucial for real-time monitoring in critical applications like disaster relief. However, limited bandwidth, latency, rain attenuation, long propagation delays, and co-channel interference pose challenges to efficient satellite communication. Therefore, semantic communication (SC) has emerged as a promising solution to improve transmissi… ▽ More The sixth-generation (6G) non-terrestrial networks (NTNs) are crucial for real-time monitoring in critical applications like disaster relief. However, limited bandwidth, latency, rain attenuation, long propagation delays, and co-channel interference pose challenges to efficient satellite communication. Therefore, semantic communication (SC) has emerged as a promising solution to improve transmission efficiency and address these issues. In this paper, we explore the potential of SC as a bandwidth-efficient, latency-minimizing strategy specifically suited to 6G satellite communications. While existing SC methods have demonstrated efficacy in direct satellite-terrestrial transmissions, they encounter limitations in satellite networks due to distortion accumulation across gateway hop-relays. Additionally, certain ground users (GUs) experience poor signal-to-noise ratios (SNR), making direct satellite communication challenging. To address these issues, we propose a novel framework that optimizes gateway hop-relay selection for GUs with low SNR and integrates gateway-based denoising mechanisms to ensure high-quality-of-service (QoS) in satellite-based SC networks. This approach directly mitigates distortion, leading to significant improvements in satellite service performance by delivering customized services tailored to the unique signal conditions of each GU. Our findings represent a critical advancement in reliable and efficient data transmission from the Earth observation satellites, thereby enabling fast and effective responses to urgent events. Simulation results demonstrate that our proposed strategy significantly enhances overall network performance, outperforming conventional methods by offering tailored communication services based on specific GU conditions. △ Less

Submitted 23 September, 2024; originally announced September 2024.

Comments: 13 pages, 8 figures, 2 tables

arXiv:2408.07997 [pdf, other]

Enhanced Quantum Energy Teleportation using a 3-Qubit System

Authors: Md Shoyib Hassan, Syed Emad Uddin Shubha, M. R. C Mahdy

Abstract: Quantum Energy Teleportation (QET) is a novel method that leverages quantum entanglement to transfer energy between two distant locations without any physical movement of the energy. The first realization of QET on superconducting hardware, utilizing a 2-qubit system, demonstrated an average energy retrieval efficiency of 35.4% (observing only V ) by the receiver, Bob. In this paper, we present a… ▽ More Quantum Energy Teleportation (QET) is a novel method that leverages quantum entanglement to transfer energy between two distant locations without any physical movement of the energy. The first realization of QET on superconducting hardware, utilizing a 2-qubit system, demonstrated an average energy retrieval efficiency of 35.4% (observing only V ) by the receiver, Bob. In this paper, we present a new approach using a 3-qubit system to enhance the energy efficiency of QET. We have incorporated a novel 3-qubit ground state Hamiltonian H to achieve this, which conforms to the constraints of Zero mean energy and anti-commutative properties of the operations on the observable of the senders and receiver. Our experimental results show a significant improvement in terms of energy retrieval. Though the Multiple-Input Single-Output (MISO) model demonstrates a similar result achieving an average efficiency of 32.5% (observing only V ), the Single-Input Multiple-Output (SIMO) model shows a significantly higher result than that of the 2-qubit system considering practical usage, which is 58.2% △ Less

Submitted 14 October, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

Comments: 13 pages, 13 figures, 2 table, 50+ equations

arXiv:2408.03959 [pdf, other]

Semantic Enabled 6G LEO Satellite Communication for Earth Observation: A Resource-Constrained Network Optimization

Authors: Sheikh Salman Hassan, Loc X. Nguyen, Yan Kyaw Tun, Zhu Han, Choong Seon Hong

Abstract: Earth observation satellites generate large amounts of real-time data for monitoring and managing time-critical events such as disaster relief missions. This presents a major challenge for satellite-to-ground communications operating under limited bandwidth capacities. This paper explores semantic communication (SC) as a potential alternative to traditional communication methods. The rationality f… ▽ More Earth observation satellites generate large amounts of real-time data for monitoring and managing time-critical events such as disaster relief missions. This presents a major challenge for satellite-to-ground communications operating under limited bandwidth capacities. This paper explores semantic communication (SC) as a potential alternative to traditional communication methods. The rationality for adopting SC is its inherent ability to reduce communication costs and make spectrum efficient for 6G non-terrestrial networks (6G-NTNs). We focus on the critical satellite imagery downlink communications latency optimization for Earth observation through SC techniques. We formulate the latency minimization problem with SC quality-of-service (SC-QoS) constraints and address this problem with a meta-heuristic discrete whale optimization algorithm (DWOA) and a one-to-one matching game. The proposed approach for captured image processing and transmission includes the integration of joint semantic and channel encoding to ensure downlink sum-rate optimization and latency minimization. Empirical results from experiments demonstrate the efficiency of the proposed framework for latency optimization while preserving high-quality data transmission when compared to baselines. △ Less

Submitted 31 July, 2024; originally announced August 2024.

Comments: Accepted in GLOBECOM 2024

arXiv:2407.15806 [pdf, other]

FSboard: Over 3 million characters of ASL fingerspelling collected via smartphones

Authors: Manfred Georg, Garrett Tanzer, Saad Hassan, Maximus Shengelia, Esha Uboweja, Sam Sepah, Sean Forbes, Thad Starner

Abstract: Progress in machine understanding of sign languages has been slow and hampered by limited data. In this paper, we present FSboard, an American Sign Language fingerspelling dataset situated in a mobile text entry use case, collected from 147 paid and consenting Deaf signers using Pixel 4A selfie cameras in a variety of environments. Fingerspelling recognition is an incomplete solution that is only… ▽ More Progress in machine understanding of sign languages has been slow and hampered by limited data. In this paper, we present FSboard, an American Sign Language fingerspelling dataset situated in a mobile text entry use case, collected from 147 paid and consenting Deaf signers using Pixel 4A selfie cameras in a variety of environments. Fingerspelling recognition is an incomplete solution that is only one small part of sign language translation, but it could provide some immediate benefit to Deaf/Hard of Hearing signers as more broadly capable technology develops. At >3 million characters in length and >250 hours in duration, FSboard is the largest fingerspelling recognition dataset to date by a factor of >10x. As a simple baseline, we finetune 30 Hz MediaPipe Holistic landmark inputs into ByT5-Small and achieve 11.1% Character Error Rate (CER) on a test set with unique phrases and signers. This quality degrades gracefully when decreasing frame rate and excluding face/body landmarks: plausible optimizations to help models run on device in real time. △ Less

Submitted 22 July, 2024; originally announced July 2024.

Comments: Access FSboard at https://www.kaggle.com/datasets/googleai/fsboard

arXiv:2407.12941 [pdf, other]

Robotic Arm Manipulation with Inverse Reinforcement Learning & TD-MPC

Authors: Md Shoyib Hassan, Sabir Md Sanaullah

Abstract: One unresolved issue is how to scale model-based inverse reinforcement learning (IRL) to actual robotic manipulation tasks with unpredictable dynamics. The ability to learn from both visual and proprioceptive examples, creating algorithms that scale to high-dimensional state-spaces, and mastering strong dynamics models are the main obstacles. In this work, we provide a gradient-based inverse reinf… ▽ More One unresolved issue is how to scale model-based inverse reinforcement learning (IRL) to actual robotic manipulation tasks with unpredictable dynamics. The ability to learn from both visual and proprioceptive examples, creating algorithms that scale to high-dimensional state-spaces, and mastering strong dynamics models are the main obstacles. In this work, we provide a gradient-based inverse reinforcement learning framework that learns cost functions purely from visual human demonstrations. The shown behavior and the trajectory is then optimized using TD visual model predictive control(MPC) and the learned cost functions. We test our system using fundamental object manipulation tasks on hardware. △ Less

Submitted 7 August, 2024; v1 submitted 17 July, 2024; originally announced July 2024.

Comments: 10 pages, 13 figures

ACM Class: I.2.9

arXiv:2407.04542 [pdf, other]

Rethinking Image Compression on the Web with Generative AI

Authors: Shayan Ali Hassan, Danish Humair, Ihsan Ayyub Qazi, Zafar Ayyub Qazi

Abstract: The rapid growth of the Internet, driven by social media, web browsing, and video streaming, has made images central to the Web experience, resulting in significant data transfer and increased webpage sizes. Traditional image compression methods, while reducing bandwidth, often degrade image quality. This paper explores a novel approach using generative AI to reconstruct images at the edge or clie… ▽ More The rapid growth of the Internet, driven by social media, web browsing, and video streaming, has made images central to the Web experience, resulting in significant data transfer and increased webpage sizes. Traditional image compression methods, while reducing bandwidth, often degrade image quality. This paper explores a novel approach using generative AI to reconstruct images at the edge or client-side. We develop a framework that leverages text prompts and provides additional conditioning inputs like Canny edges and color palettes to a text-to-image model, achieving up to 99.8% bandwidth savings in the best cases and 92.6% on average, while maintaining high perceptual similarity. Empirical analysis and a user study show that our method preserves image meaning and structure more effectively than traditional compression methods, offering a promising solution for reducing bandwidth usage and improving Internet affordability with minimal degradation in image quality. △ Less

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2406.13280 [pdf, other]

Design Optimization of NOMA Aided Multi-STAR-RIS for Indoor Environments: A Convex Approximation Imitated Reinforcement Learning Approach

Authors: Yu Min Park, Sheikh Salman Hassan, Yan Kyaw Tun, Eui-Nam Huh, Walid Saad, Choong Seon Hong

Abstract: Non-orthogonal multiple access (NOMA) enables multiple users to share the same frequency band, and simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) provides 360-degree full-space coverage, optimizing both transmission and reflection for improved network performance and dynamic control of the indoor environment. However, deploying STAR-RIS indoors presents ch… ▽ More Non-orthogonal multiple access (NOMA) enables multiple users to share the same frequency band, and simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) provides 360-degree full-space coverage, optimizing both transmission and reflection for improved network performance and dynamic control of the indoor environment. However, deploying STAR-RIS indoors presents challenges in interference mitigation, power consumption, and real-time configuration. In this work, a novel network architecture utilizing multiple access points (APs), STAR-RISs, and NOMA is proposed for indoor communication. To address these, we formulate an optimization problem involving user assignment, access point (AP) beamforming, and STAR-RIS phase control. A decomposition approach is used to solve the complex problem efficiently, employing a many-to-one matching algorithm for user-AP assignment and K-means clustering for resource management. Additionally, multi-agent deep reinforcement learning (MADRL) is leveraged to optimize the control of the STAR-RIS. Within the proposed MADRL framework, a novel approach is introduced in which each decision variable acts as an independent agent, enabling collaborative learning and decision making. The MADRL framework is enhanced by incorporating convex approximation (CA), which accelerates policy learning through suboptimal solutions from successive convex approximation (SCA), leading to faster adaptation and convergence. Simulations demonstrate significant improvements in network utility compared to baseline approaches. △ Less

Submitted 17 September, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

Comments: 37 pages, 11 figures. arXiv admin note: text overlap with arXiv:2311.08708

arXiv:2406.03773 [pdf, other]

Optimizing Multi-User Semantic Communication via Transfer Learning and Knowledge Distillation

Authors: Loc X. Nguyen, Kitae Kim, Ye Lin Tun, Sheikh Salman Hassan, Yan Kyaw Tun, Zhu Han, Choong Seon Hong

Abstract: Semantic communication, notable for ensuring quality of service by jointly optimizing source and channel coding, effectively extracts data semantics, reduces transmission length, and mitigates channel noise. However, most studies overlook multi-user scenarios and resource availability, limiting real-world application. This paper addresses this gap by focusing on downlink communication from a base… ▽ More Semantic communication, notable for ensuring quality of service by jointly optimizing source and channel coding, effectively extracts data semantics, reduces transmission length, and mitigates channel noise. However, most studies overlook multi-user scenarios and resource availability, limiting real-world application. This paper addresses this gap by focusing on downlink communication from a base station to multiple users with varying computing capacities. Users employ variants of Swin transformer models for source decoding and a simple architecture for channel decoding. We propose a novel training regimen, incorporating transfer learning and knowledge distillation to improve low-computing users' performance. Extensive simulations validate the proposed methods. △ Less

Submitted 6 June, 2024; originally announced June 2024.

Comments: 5 pages, 5 figures

arXiv:2405.07189 [pdf]

A hybrid meta-heuristic approach for channel estimation in OFDM MIMO

Authors: Shahriar Hassan, Umme Farhana, Md Karam Newaz

Abstract: In wireless communication Multiple Input Multiple Output (MIMO) technology has brought significant improvement in service by adopting Orthogonal Frequency Division Multiplexing (OFDM), a digital modulation technique. To achieve great performance with MIMO efficiently gathering channel state information (CSI) plays a vital role. Among different approach of channel estimation techniques data-aided c… ▽ More In wireless communication Multiple Input Multiple Output (MIMO) technology has brought significant improvement in service by adopting Orthogonal Frequency Division Multiplexing (OFDM), a digital modulation technique. To achieve great performance with MIMO efficiently gathering channel state information (CSI) plays a vital role. Among different approach of channel estimation techniques data-aided channel estimation is more reliable. The existing methods of data-aided channel estimation are Least Square (LS) and Minimum Mean Square Error (MMSE) methods which do not achieve a great performance. Moreover, MMSE is little complex and has higher computational cost. That is why many attempts have been done previously to optimize the methods with help of meta heuristics and also other ways. In this paper we have tried to optimize LS estimation with a combined algorithm of Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). The proposed algorithm has outperformed LS and MMSE. And it gives similar result if we optimize LS with standard PSO but in less numbers of iteration. △ Less

Submitted 12 May, 2024; originally announced May 2024.

Journal ref: Journal of Gono Bishwabidyalay, Vol. 4, Issue. 1, PP. 224-236, 2023

arXiv:2404.17046 [pdf, other]

Unraveling Code Clone Dynamics in Deep Learning Frameworks

Authors: Maram Assi, Safwat Hassan, Ying Zou

Abstract: Deep Learning (DL) frameworks play a critical role in advancing artificial intelligence, and their rapid growth underscores the need for a comprehensive understanding of software quality and maintainability. DL frameworks, like other systems, are prone to code clones. Code clones refer to identical or highly similar source code fragments within the same project or even across different projects. C… ▽ More Deep Learning (DL) frameworks play a critical role in advancing artificial intelligence, and their rapid growth underscores the need for a comprehensive understanding of software quality and maintainability. DL frameworks, like other systems, are prone to code clones. Code clones refer to identical or highly similar source code fragments within the same project or even across different projects. Code cloning can have positive and negative implications for software development, influencing maintenance, readability, and bug propagation. In this paper, we aim to address the knowledge gap concerning the evolutionary dimension of code clones in DL frameworks and the extent of code reuse across these frameworks. We empirically analyze code clones in nine popular DL frameworks, i.e., TensorFlow, Paddle, PyTorch, Aesara, Ray, MXNet, Keras, Jax and BentoML, to investigate (1) the characteristics of the long-term code cloning evolution over releases in each framework, (2) the short-term, i.e., within-release, code cloning patterns and their influence on the long-term trends, and (3) the file-level code clones within the DL frameworks. Our findings reveal that DL frameworks adopt four distinct cloning trends and that these trends present some common and distinct characteristics. For instance, bug-fixing activities persistently happen in clones irrespective of the clone evolutionary trend but occur more in the "Serpentine" trend. Moreover, the within release level investigation demonstrates that short-term code cloning practices impact long-term cloning trends. The cross-framework code clone investigation reveals the presence of functional and architectural adaptation file-level cross-framework code clones across the nine studied frameworks. We provide insights that foster robust clone practices and collaborative maintenance in the development of DL frameworks. △ Less

Submitted 25 April, 2024; originally announced April 2024.

Comments: 37 pages

arXiv:2404.16208 [pdf, other]

GPU-RANC: A CUDA Accelerated Simulation Framework for Neuromorphic Architectures

Authors: Sahil Hassan, Michael Inouye, Miguel C. Gonzalez, Ilkin Aliyev, Joshua Mack, Maisha Hafiz, Ali Akoglu

Abstract: Open-source simulation tools play a crucial role for neuromorphic application engineers and hardware architects to investigate performance bottlenecks and explore design optimizations before committing to silicon. Reconfigurable Architecture for Neuromorphic Computing (RANC) is one such tool that offers ability to execute pre-trained Spiking Neural Network (SNN) models within a unified ecosystem t… ▽ More Open-source simulation tools play a crucial role for neuromorphic application engineers and hardware architects to investigate performance bottlenecks and explore design optimizations before committing to silicon. Reconfigurable Architecture for Neuromorphic Computing (RANC) is one such tool that offers ability to execute pre-trained Spiking Neural Network (SNN) models within a unified ecosystem through both software-based simulation and FPGA-based emulation. RANC has been utilized by the community with its flexible and highly parameterized design to study implementation bottlenecks, tune architectural parameters or modify neuron behavior based on application insights and study the trade space on hardware performance and network accuracy. In designing architectures for use in neuromorphic computing, there are an incredibly large number of configuration parameters such as number and precision of weights per neuron, neuron and axon counts per core, network topology, and neuron behavior. To accelerate such studies and provide users with a streamlined productive design space exploration, in this paper we introduce the GPU-based implementation of RANC. We summarize our parallelization approach and quantify the speedup gains achieved with GPU-based tick-accurate simulations across various use cases. We demonstrate up to 780 times speedup compared to serial version of the RANC simulator based on a 512 neuromorphic core MNIST inference application. We believe that the RANC ecosystem now provides a much more feasible avenue in the research of exploring different optimizations for accelerating SNNs and performing richer studies by enabling rapid convergence to optimized neuromorphic architectures. △ Less

Submitted 24 April, 2024; originally announced April 2024.

Comments: Accepted for publication in Neuro-Inspired Computational Elements (NICE) Workshop 2024

arXiv:2403.16609 [pdf, other]

Conversational Grounding: Annotation and Analysis of Grounding Acts and Grounding Units

Authors: Biswesh Mohapatra, Seemab Hassan, Laurent Romary, Justine Cassell

Abstract: Successful conversations often rest on common understanding, where all parties are on the same page about the information being shared. This process, known as conversational grounding, is crucial for building trustworthy dialog systems that can accurately keep track of and recall the shared information. The proficiencies of an agent in grounding the conveyed information significantly contribute to… ▽ More Successful conversations often rest on common understanding, where all parties are on the same page about the information being shared. This process, known as conversational grounding, is crucial for building trustworthy dialog systems that can accurately keep track of and recall the shared information. The proficiencies of an agent in grounding the conveyed information significantly contribute to building a reliable dialog system. Despite recent advancements in dialog systems, there exists a noticeable deficit in their grounding capabilities. Traum provided a framework for conversational grounding introducing Grounding Acts and Grounding Units, but substantial progress, especially in the realm of Large Language Models, remains lacking. To bridge this gap, we present the annotation of two dialog corpora employing Grounding Acts, Grounding Units, and a measure of their degree of grounding. We discuss our key findings during the annotation and also provide a baseline model to test the performance of current Language Models in categorizing the grounding acts of the dialogs. Our work aims to provide a useful resource for further research in making conversations with machines better understood and more reliable in natural day-to-day collaborative dialogs. △ Less

Submitted 25 March, 2024; originally announced March 2024.

Journal ref: LREC-COLING 2024

arXiv:2403.14120 [pdf, other]

Advancing IIoT with Over-the-Air Federated Learning: The Role of Iterative Magnitude Pruning

Authors: Fazal Muhammad Ali Khan, Hatem Abou-Zeid, Aryan Kaushik, Syed Ali Hassan

Abstract: The industrial Internet of Things (IIoT) under Industry 4.0 heralds an era of interconnected smart devices where data-driven insights and machine learning (ML) fuse to revolutionize manufacturing. A noteworthy development in IIoT is the integration of federated learning (FL), which addresses data privacy and security among devices. FL enables edge sensors, also known as peripheral intelligence uni… ▽ More The industrial Internet of Things (IIoT) under Industry 4.0 heralds an era of interconnected smart devices where data-driven insights and machine learning (ML) fuse to revolutionize manufacturing. A noteworthy development in IIoT is the integration of federated learning (FL), which addresses data privacy and security among devices. FL enables edge sensors, also known as peripheral intelligence units (PIUs) to learn and adapt using their data locally, without explicit sharing of confidential data, to facilitate a collaborative yet confidential learning process. However, the lower memory footprint and computational power of PIUs inherently require deep neural network (DNN) models that have a very compact size. Model compression techniques such as pruning can be used to reduce the size of DNN models by removing unnecessary connections that have little impact on the model's performance, thus making the models more suitable for the limited resources of PIUs. Targeting the notion of compact yet robust DNN models, we propose the integration of iterative magnitude pruning (IMP) of the DNN model being trained in an over-the-air FL (OTA-FL) environment for IIoT. We provide a tutorial overview and also present a case study of the effectiveness of IMP in OTA-FL for an IIoT environment. Finally, we present future directions for enhancing and optimizing these deep compression techniques further, aiming to push the boundaries of IIoT capabilities in acquiring compact yet robust and high-performing DNN models. △ Less

Submitted 21 March, 2024; originally announced March 2024.

Comments: 6 pages, 6 figures

arXiv:2403.05530 [pdf, other]

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1110 additional authors not shown)

Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content. △ Less

Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

arXiv:2402.06803 [pdf, ps, other]

On graphs with well-distributed edge density

Authors: Syed Mujtaba Hassan, Shahid Hussain

Abstract: In this paper, we introduce a class of graphs which we call average hereditary graphs. Most graphs that occur in the usual graph theory applications belong to this class of graphs. Many popular types of graphs fall under this class, such as regular graphs, trees and other popular classes of graphs. We prove a new upper bound for the chromatic number of a graph in terms of its maximum average degre… ▽ More In this paper, we introduce a class of graphs which we call average hereditary graphs. Most graphs that occur in the usual graph theory applications belong to this class of graphs. Many popular types of graphs fall under this class, such as regular graphs, trees and other popular classes of graphs. We prove a new upper bound for the chromatic number of a graph in terms of its maximum average degree and show that this bound is an improvement on previous bounds. From this, we show a relationship between the average degree and the chromatic number of an average hereditary graph. This class of graphs is explored further by proving some interesting properties regarding the class of average hereditary graphs. An equivalent condition is provided for a graph to be average hereditary, through which we show that we can decide if a given graph is average hereditary in polynomial time. We then provide a construction for average hereditary graphs, using which an average hereditary graph can be recursively constructed. We also show that this class of graphs is closed under a binary operation, from this another construction is obtained for average hereditary graphs, and we see some interesting algebraic properties this class of graphs has. We then explore the effect on the complexity of graph 3-coloring problem when the input is restricted to average hereditary graphs. △ Less

Submitted 21 March, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

Comments: 12 pages, 2 figures

MSC Class: 05C15; 05C42; 05C07 (Primary) 05C75; 05C69; 05C25 (Secondary)

arXiv:2312.11805 [pdf, other]

Gemini: A Family of Highly Capable Multimodal Models

Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee , et al. (1325 additional authors not shown)

Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr… ▽ More This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI. △ Less

Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

arXiv:2312.10647 [pdf, other]

Single-Stage Optimization of Open-loop Stable Limit Cycles with Smooth, Symbolic Derivatives

Authors: Muhammad Saud Ul Hassan, Christian Hubicki

Abstract: Open-loop stable limit cycles are foundational to legged robotics, providing inherent self-stabilization that minimizes the need for computationally intensive feedback-based gait correction. While previous methods have primarily targeted specific robotic models, this paper introduces a general framework for rapidly generating limit cycles across various dynamical systems, with the flexibility to i… ▽ More Open-loop stable limit cycles are foundational to legged robotics, providing inherent self-stabilization that minimizes the need for computationally intensive feedback-based gait correction. While previous methods have primarily targeted specific robotic models, this paper introduces a general framework for rapidly generating limit cycles across various dynamical systems, with the flexibility to impose arbitrarily tight stability bounds. We formulate the problem as a single-stage constrained optimization problem and use Direct Collocation to transcribe it into a nonlinear program with closed-form expressions for constraints, objectives, and their gradients. Our method supports multiple stability formulations. In particular, we tested two popular formulations for limit cycle stability in robotics: (1) based on the spectral radius of a discrete return map, and (2) based on the spectral radius of the monodromy matrix, and tested five different constraint-satisfaction formulations of the eigenvalue problem to bound the spectral radius. We compare the performance and solution quality of the various formulations on a robotic swing-leg model, highlighting the Schur decomposition of the monodromy matrix as a method with broader applicability due to weaker assumptions and stronger numerical convergence properties. As a case study, we apply our method on a hopping robot model, generating open-loop stable gaits in under 2 seconds on an Intel Core i7-6700K, while simultaneously minimizing energy consumption even under tight stability constraints. △ Less

Submitted 17 September, 2024; v1 submitted 17 December, 2023; originally announced December 2023.

Comments: 7 pages, 7 figures, submitted to ICRA-2025

arXiv:2311.18147 [pdf, other]

DisCGen: A Framework for Discourse-Informed Counterspeech Generation

Authors: Sabit Hassan, Malihe Alikhani

Abstract: Counterspeech can be an effective method for battling hateful content on social media. Automated counterspeech generation can aid in this process. Generated counterspeech, however, can be viable only when grounded in the context of topic, audience and sensitivity as these factors influence both the efficacy and appropriateness. In this work, we propose a novel framework based on theories of discou… ▽ More Counterspeech can be an effective method for battling hateful content on social media. Automated counterspeech generation can aid in this process. Generated counterspeech, however, can be viable only when grounded in the context of topic, audience and sensitivity as these factors influence both the efficacy and appropriateness. In this work, we propose a novel framework based on theories of discourse to study the inferential links that connect counter speeches to the hateful comment. Within this framework, we propose: i) a taxonomy of counterspeech derived from discourse frameworks, and ii) discourse-informed prompting strategies for generating contextually-grounded counterspeech. To construct and validate this framework, we present a process for collecting an in-the-wild dataset of counterspeech from Reddit. Using this process, we manually annotate a dataset of 3.9k Reddit comment pairs for the presence of hatespeech and counterspeech. The positive pairs are annotated for 10 classes in our proposed taxonomy. We annotate these pairs with paraphrased counterparts to remove offensiveness and first-person references. We show that by using our dataset and framework, large language models can generate contextually-grounded counterspeech informed by theories of discourse. According to our human evaluation, our approaches can act as a safeguard against critical failures of discourse-agnostic models. △ Less

Submitted 29 November, 2023; originally announced November 2023.

Comments: IJCNLP-AACL, 2023

arXiv:2311.18007 [pdf, other]

Towards out-of-distribution generalization in large-scale astronomical surveys: robust networks learn similar representations

Authors: Yash Gondhalekar, Sultan Hassan, Naomi Saphra, Sambatra Andrianomena

Abstract: The generalization of machine learning (ML) models to out-of-distribution (OOD) examples remains a key challenge in extracting information from upcoming astronomical surveys. Interpretability approaches are a natural way to gain insights into the OOD generalization problem. We use Centered Kernel Alignment (CKA), a similarity measure metric of neural network representations, to examine the relatio… ▽ More The generalization of machine learning (ML) models to out-of-distribution (OOD) examples remains a key challenge in extracting information from upcoming astronomical surveys. Interpretability approaches are a natural way to gain insights into the OOD generalization problem. We use Centered Kernel Alignment (CKA), a similarity measure metric of neural network representations, to examine the relationship between representation similarity and performance of pre-trained Convolutional Neural Networks (CNNs) on the CAMELS Multifield Dataset. We find that when models are robust to a distribution shift, they produce substantially different representations across their layers on OOD data. However, when they fail to generalize, these representations change less from layer to layer on OOD data. We discuss the potential application of similarity representation in guiding model design, training strategy, and mitigating the OOD problem by incorporating CKA as an inductive bias during training. △ Less

Submitted 29 November, 2023; originally announced November 2023.

Comments: Accepted to Machine Learning and the Physical Sciences Workshop, NeurIPS 2023

arXiv:2311.12179 [pdf, other]

Leveraging Closed-Access Multilingual Embedding for Automatic Sentence Alignment in Low Resource Languages

Authors: Idris Abdulmumin, Auwal Abubakar Khalid, Shamsuddeen Hassan Muhammad, Ibrahim Said Ahmad, Lukman Jibril Aliyu, Babangida Sani, Bala Mairiga Abduljalil, Sani Ahmad Hassan

Abstract: The importance of qualitative parallel data in machine translation has long been determined but it has always been very difficult to obtain such in sufficient quantity for the majority of world languages, mainly because of the associated cost and also the lack of accessibility to these languages. Despite the potential for obtaining parallel datasets from online articles using automatic approaches,… ▽ More The importance of qualitative parallel data in machine translation has long been determined but it has always been very difficult to obtain such in sufficient quantity for the majority of world languages, mainly because of the associated cost and also the lack of accessibility to these languages. Despite the potential for obtaining parallel datasets from online articles using automatic approaches, forensic investigations have found a lot of quality-related issues such as misalignment, and wrong language codes. In this work, we present a simple but qualitative parallel sentence aligner that carefully leveraged the closed-access Cohere multilingual embedding, a solution that ranked second in the just concluded #CoHereAIHack 2023 Challenge (see https://ai6lagos.devpost.com). The proposed approach achieved $94.96$ and $54.83$ f1 scores on FLORES and MAFAND-MT, compared to $3.64$ and $0.64$ of LASER respectively. Our method also achieved an improvement of more than 5 BLEU scores over LASER, when the resulting datasets were used with MAFAND-MT dataset to train translation models. Our code and data are available for research purposes here (https://github.com/abumafrim/Cohere-Align). △ Less

Submitted 20 November, 2023; originally announced November 2023.

Comments: To appear in the proceedings of ICCAIT 2023. 6 pages, 2 figures

arXiv:2310.02104 [pdf, other]

An empirical study of ChatGPT-3.5 on question answering and code maintenance

Authors: Md Mahir Asef Kabir, Sk Adnan Hassan, Xiaoyin Wang, Ying Wang, Hai Yu, Na Meng

Abstract: Ever since the launch of ChatGPT in 2022, a rising concern is whether ChatGPT will replace programmers and kill jobs. Motivated by this widespread concern, we conducted an empirical study to systematically compare ChatGPT against programmers in question-answering and software-maintaining. We reused a dataset introduced by prior work, which includes 130 StackOverflow (SO) discussion threads referre… ▽ More Ever since the launch of ChatGPT in 2022, a rising concern is whether ChatGPT will replace programmers and kill jobs. Motivated by this widespread concern, we conducted an empirical study to systematically compare ChatGPT against programmers in question-answering and software-maintaining. We reused a dataset introduced by prior work, which includes 130 StackOverflow (SO) discussion threads referred to by the Java developers of 357 GitHub projects. We mainly investigated three research questions (RQs). First, how does ChatGPT compare with programmers when answering technical questions? Second, how do developers perceive the differences between ChatGPT's answers and SO answers? Third, how does ChatGPT compare with humans when revising code for maintenance requests? For RQ1, we provided the 130 SO questions to ChatGPT, and manually compared ChatGPT answers with the accepted/most popular SO answers in terms of relevance, readability, informativeness, comprehensiveness, and reusability. For RQ2, we conducted a user study with 30 developers, asking each developer to assess and compare 10 pairs of answers, without knowing the information source (i.e., ChatGPT or SO). For RQ3, we distilled 48 software maintenance tasks from 48 GitHub projects citing the studied SO threads. We queried ChatGPT to revise a given Java file, and to incorporate the code implementation for any prescribed maintenance requirement. Our study reveals interesting phenomena: For the majority of SO questions (97/130), ChatGPT provided better answers; in 203 of 300 ratings, developers preferred ChatGPT answers to SO answers; ChatGPT revised code correctly for 22 of the 48 tasks. Our research will expand people's knowledge of ChatGPT capabilities, and shed light on future adoption of ChatGPT by the software industry. △ Less

Submitted 3 October, 2023; originally announced October 2023.

arXiv:2307.15469 [pdf, other]

SpaceRIS: LEO Satellite Coverage Maximization in 6G Sub-THz Networks by MAPPO DRL and Whale Optimization

Authors: Sheikh Salman Hassan, Yu Min Park, Yan Kyaw Tun, Walid Saad, Zhu Han, Choong Seon Hong

Abstract: Satellite systems face a significant challenge in effectively utilizing limited communication resources to meet the demands of ground network traffic, characterized by asymmetrical spatial distribution and time-varying characteristics. Moreover, the coverage range and signal transmission distance of low Earth orbit (LEO) satellites are restricted by notable propagation attenuation, molecular absor… ▽ More Satellite systems face a significant challenge in effectively utilizing limited communication resources to meet the demands of ground network traffic, characterized by asymmetrical spatial distribution and time-varying characteristics. Moreover, the coverage range and signal transmission distance of low Earth orbit (LEO) satellites are restricted by notable propagation attenuation, molecular absorption, and space losses in sub-terahertz (THz) frequencies. This paper introduces a novel approach to maximize LEO satellite coverage by leveraging reconfigurable intelligent surfaces (RISs) within 6G sub-THz networks. The optimization objectives encompass enhancing the end-to-end data rate, optimizing satellite-remote user equipment (RUE) associations, data packet routing within satellite constellations, RIS phase shift, and ground base station (GBS) transmit power (i.e., active beamforming). The formulated joint optimization problem poses significant challenges owing to its time-varying environment, non-convex characteristics, and NP-hard complexity. To address these challenges, we propose a block coordinate descent (BCD) algorithm that integrates balanced K-means clustering, multi-agent proximal policy optimization (MAPPO) deep reinforcement learning (DRL), and whale optimization (WOA) techniques. The performance of the proposed approach is demonstrated through comprehensive simulation results, exhibiting its superiority over existing baseline methods in the literature. △ Less

Submitted 28 July, 2023; originally announced July 2023.

arXiv:2307.14623 [pdf, other]

BubbleML: A Multi-Physics Dataset and Benchmarks for Machine Learning

Authors: Sheikh Md Shakeel Hassan, Arthur Feeney, Akash Dhruv, Jihoon Kim, Youngjoon Suh, Jaiyoung Ryu, Yoonjin Won, Aparna Chandramowlishwaran

Abstract: In the field of phase change phenomena, the lack of accessible and diverse datasets suitable for machine learning (ML) training poses a significant challenge. Existing experimental datasets are often restricted, with limited availability and sparse ground truth data, impeding our understanding of this complex multiphysics phenomena. To bridge this gap, we present the BubbleML Dataset \footnote{\la… ▽ More In the field of phase change phenomena, the lack of accessible and diverse datasets suitable for machine learning (ML) training poses a significant challenge. Existing experimental datasets are often restricted, with limited availability and sparse ground truth data, impeding our understanding of this complex multiphysics phenomena. To bridge this gap, we present the BubbleML Dataset \footnote{\label{git_dataset}\url{https://github.com/HPCForge/BubbleML}} which leverages physics-driven simulations to provide accurate ground truth information for various boiling scenarios, encompassing nucleate pool boiling, flow boiling, and sub-cooled boiling. This extensive dataset covers a wide range of parameters, including varying gravity conditions, flow rates, sub-cooling levels, and wall superheat, comprising 79 simulations. BubbleML is validated against experimental observations and trends, establishing it as an invaluable resource for ML research. Furthermore, we showcase its potential to facilitate exploration of diverse downstream tasks by introducing two benchmarks: (a) optical flow analysis to capture bubble dynamics, and (b) operator networks for learning temperature dynamics. The BubbleML dataset and its benchmarks serve as a catalyst for advancements in ML-driven research on multiphysics phase change phenomena, enabling the development and comparison of state-of-the-art techniques and models. △ Less

Submitted 24 August, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

Comments: Submitted to Neurips Datasets and Benchmarks Track 2023

arXiv:2306.14476 [pdf, other]

STEF-DHNet: Spatiotemporal External Factors Based Deep Hybrid Network for Enhanced Long-Term Taxi Demand Prediction

Authors: Sheraz Hassan, Muhammad Tahir, Momin Uppal, Zubair Khalid, Ivan Gorban, Selim Turki

Abstract: Accurately predicting the demand for ride-hailing services can result in significant benefits such as more effective surge pricing strategies, improved driver positioning, and enhanced customer service. By understanding the demand fluctuations, companies can anticipate and respond to consumer requirements more efficiently, leading to increased efficiency and revenue. However, forecasting demand in… ▽ More Accurately predicting the demand for ride-hailing services can result in significant benefits such as more effective surge pricing strategies, improved driver positioning, and enhanced customer service. By understanding the demand fluctuations, companies can anticipate and respond to consumer requirements more efficiently, leading to increased efficiency and revenue. However, forecasting demand in a particular region can be challenging, as it is influenced by several external factors, such as time of day, weather conditions, and location. Thus, understanding and evaluating these factors is essential for predicting consumer behavior and adapting to their needs effectively. Grid-based deep learning approaches have proven effective in predicting regional taxi demand. However, these models have limitations in integrating external factors in their spatiotemporal complexity and maintaining high accuracy over extended time horizons without continuous retraining, which makes them less suitable for practical and commercial applications. To address these limitations, this paper introduces STEF-DHNet, a demand prediction model that combines Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) to integrate external features as spatiotemporal information and capture their influence on ride-hailing demand. The proposed model is evaluated using a long-term performance metric called the rolling error, which assesses its ability to maintain high accuracy over long periods without retraining. The results show that STEF-DHNet outperforms existing state-of-the-art methods on three diverse datasets, demonstrating its potential for practical use in real-world scenarios. △ Less

Submitted 26 June, 2023; originally announced June 2023.

Comments: 8 pages, 3 Figures

arXiv:2306.04010 [pdf, other]

A Novel Implementation Methodology for Error Correction Codes on a Neuromorphic Architecture

Authors: Sahil Hassan, Parker Dattilo, Ali Akoglu

Abstract: The Internet of Things infrastructure connects a massive number of edge devices with an increasing demand for intelligent sensing and inferencing capability. Such data-sensitive functions necessitate energy-efficient and programmable implementations of Error Correction Codes (ECC) and decoders. The algorithmic flow of ECCs with concurrent accumulation and comparison types of operations are innatel… ▽ More The Internet of Things infrastructure connects a massive number of edge devices with an increasing demand for intelligent sensing and inferencing capability. Such data-sensitive functions necessitate energy-efficient and programmable implementations of Error Correction Codes (ECC) and decoders. The algorithmic flow of ECCs with concurrent accumulation and comparison types of operations are innately exploitable by neuromorphic architectures for energy efficient execution -- an area that is relatively unexplored outside of machine learning applications. For the first time, we propose a methodology to map the hard-decision class of decoder algorithms on a neuromorphic architecture. We present the implementation of the Gallager B (GaB) decoding algorithm on a TrueNorth-inspired architecture that is emulated on the Xilinx Zynq ZCU102 MPSoC. Over this reference implementation, we propose architectural modifications at the neuron block level that result in a reduction of energy consumption by 31% with a negligible increase in resource usage while achieving the same error correction performance. △ Less

Submitted 6 June, 2023; originally announced June 2023.

Comments: To be published in IEEE Transactions On Computer-Aided Design Of Integrated Circuits And Systems (TCAD)

arXiv:2305.19981 [pdf, other]

MedNgage: A Dataset for Understanding Engagement in Patient-Nurse Conversations

Authors: Yan Wang, Heidi Ann Scharf Donovan, Sabit Hassan, Mailhe Alikhani

Abstract: Patients who effectively manage their symptoms often demonstrate higher levels of engagement in conversations and interventions with healthcare practitioners. This engagement is multifaceted, encompassing cognitive and socio-affective dimensions. Consequently, it is crucial for AI systems to understand the engagement in natural conversations between patients and practitioners to better contribute… ▽ More Patients who effectively manage their symptoms often demonstrate higher levels of engagement in conversations and interventions with healthcare practitioners. This engagement is multifaceted, encompassing cognitive and socio-affective dimensions. Consequently, it is crucial for AI systems to understand the engagement in natural conversations between patients and practitioners to better contribute toward patient care. In this paper, we present a novel dataset (MedNgage), which consists of patient-nurse conversations about cancer symptom management. We manually annotate the dataset with a novel framework of categories of patient engagement from two different angles, namely: i) socio-affective (3.1K spans), and ii) cognitive use of language (1.8K spans). Through statistical analysis of the data that is annotated using our framework, we show a positive correlation between patient symptom management outcomes and their engagement in conversations. Additionally, we demonstrate that pre-trained transformer models fine-tuned on our dataset can reliably predict engagement classes in patient-nurse conversations. Lastly, we use LIME (Ribeiro et al., 2016) to analyze the underlying challenges of the tasks that state-of-the-art transformer models encounter. The de-identified data is available for research purposes upon request. △ Less

Submitted 20 June, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

Comments: ACL Findings 2023

arXiv:2305.17013 [pdf, other]

D-CALM: A Dynamic Clustering-based Active Learning Approach for Mitigating Bias

Authors: Sabit Hassan, Malihe Alikhani

Abstract: Despite recent advancements, NLP models continue to be vulnerable to bias. This bias often originates from the uneven distribution of real-world data and can propagate through the annotation process. Escalated integration of these models in our lives calls for methods to mitigate bias without overbearing annotation costs. While active learning (AL) has shown promise in training models with a small… ▽ More Despite recent advancements, NLP models continue to be vulnerable to bias. This bias often originates from the uneven distribution of real-world data and can propagate through the annotation process. Escalated integration of these models in our lives calls for methods to mitigate bias without overbearing annotation costs. While active learning (AL) has shown promise in training models with a small amount of annotated data, AL's reliance on the model's behavior for selective sampling can lead to an accumulation of unwanted bias rather than bias mitigation. However, infusing clustering with AL can overcome the bias issue of both AL and traditional annotation methods while exploiting AL's annotation efficiency. In this paper, we propose a novel adaptive clustering-based active learning algorithm, D-CALM, that dynamically adjusts clustering and annotation efforts in response to an estimated classifier error-rate. Experiments on eight datasets for a diverse set of text classification tasks, including emotion, hatespeech, dialog act, and book type detection, demonstrate that our proposed algorithm significantly outperforms baseline AL approaches with both pretrained transformers and traditional Support Vector Machines. D-CALM showcases robustness against different measures of information gain and, as evident from our analysis of label and error distribution, can significantly reduce unwanted model bias. △ Less

Submitted 26 May, 2023; originally announced May 2023.

Comments: ACL FINDINGS 2023

arXiv:2304.12396 [pdf, other]

CEDR-API: Productive, Performant Programming of Domain-Specific Embedded Systems

Authors: Joshua Mack, Serhan Gener, Sahil Hassan, H. Umut Suluhan, Ali Akoglu

Abstract: As the computing landscape evolves, system designers continue to explore design methodologies that leverage increased levels of heterogeneity to push performance within limited size, weight, power, and cost budgets. One such methodology is to build Domain-Specific System on Chips (DSSoCs) that promise increased productivity through narrowed scope of their target application domain. In previous wor… ▽ More As the computing landscape evolves, system designers continue to explore design methodologies that leverage increased levels of heterogeneity to push performance within limited size, weight, power, and cost budgets. One such methodology is to build Domain-Specific System on Chips (DSSoCs) that promise increased productivity through narrowed scope of their target application domain. In previous works, we have proposed CEDR, an open source, unified compilation and runtime framework for DSSoC architectures that allows applications, scheduling heuristics, and accelerators to be co-designed in a cohesive manner that maximizes system performance. In this work, we present changes to the application development workflow that enable a more productive and expressive API-based programming methodology. These changes allow for more rapid integration of new applications without sacrificing application performance. Towards the design of heterogeneous SoCs with rich set of accelerators, in this study we experimentally study the impact of increase in workload complexity and growth in the pool of compute resources on execution time of dynamically arriving workloads composed of real-life applications executed over architectures emulated on Xilinx ZCU102 MPSoC and Nvidia Jetson AGX Xavier. We expand CEDR into the application domain of autonomous vehicles, and we find that API-based CEDR achieves a runtime overhead reduction of 19.5% with respect to the original CEDR. △ Less

Submitted 24 April, 2023; originally announced April 2023.

Comments: 10 pages, 10 figures. Accepted for publication in the 2023 International Parallel and Distributed Processing Symposium (IPDPS) Heterogeneity in Computing Workshop (HCW)

arXiv:2302.09618 [pdf, other]

Multilingual Content Moderation: A Case Study on Reddit

Authors: Meng Ye, Karan Sikka, Katherine Atwell, Sabit Hassan, Ajay Divakaran, Malihe Alikhani

Abstract: Content moderation is the process of flagging content based on pre-defined platform rules. There has been a growing need for AI moderators to safeguard users as well as protect the mental health of human moderators from traumatic content. While prior works have focused on identifying hateful/offensive language, they are not adequate for meeting the challenges of content moderation since 1) moderat… ▽ More Content moderation is the process of flagging content based on pre-defined platform rules. There has been a growing need for AI moderators to safeguard users as well as protect the mental health of human moderators from traumatic content. While prior works have focused on identifying hateful/offensive language, they are not adequate for meeting the challenges of content moderation since 1) moderation decisions are based on violation of rules, which subsumes detection of offensive speech, and 2) such rules often differ across communities which entails an adaptive solution. We propose to study the challenges of content moderation by introducing a multilingual dataset of 1.8 Million Reddit comments spanning 56 subreddits in English, German, Spanish and French. We perform extensive experimental analysis to highlight the underlying challenges and suggest related research problems such as cross-lingual transfer, learning under label noise (human biases), transfer of moderation models, and predicting the violated rule. Our dataset and analysis can help better prepare for the challenges and opportunities of auto moderation. △ Less

Submitted 19 February, 2023; originally announced February 2023.

arXiv:2212.07979 [pdf, other]

Improving Developers' Understanding of Regex Denial of Service Tools through Anti-Patterns and Fix Strategies

Authors: Sk Adnan Hassan, Zainab Aamir, Dongyoon Lee, James C. Davis, Francisco Servant

Abstract: Regular expressions are used for diverse purposes, including input validation and firewalls. Unfortunately, they can also lead to a security vulnerability called ReDoS (Regular Expression Denial of Service), caused by a super-linear worst-case execution time during regex matching. Due to the severity and prevalence of ReDoS, past work proposed automatic tools to detect and fix regexes. Although th… ▽ More Regular expressions are used for diverse purposes, including input validation and firewalls. Unfortunately, they can also lead to a security vulnerability called ReDoS (Regular Expression Denial of Service), caused by a super-linear worst-case execution time during regex matching. Due to the severity and prevalence of ReDoS, past work proposed automatic tools to detect and fix regexes. Although these tools were evaluated in automatic experiments, their usability has not yet been studied; usability has not been a focus of prior work. Our insight is that the usability of existing tools to detect and fix regexes will improve if we complement them with anti-patterns and fix strategies of vulnerable regexes. We developed novel anti-patterns for vulnerable regexes, and a collection of fix strategies to fix them. We derived our anti-patterns and fix strategies from a novel theory of regex infinite ambiguity - a necessary condition for regexes vulnerable to ReDoS. We proved the soundness and completeness of our theory. We evaluated the effectiveness of our anti-patterns, both in an automatic experiment and when applied manually. Then, we evaluated how much our anti-patterns and fix strategies improve developers' understanding of the outcome of detection and fixing tools. Our evaluation found that our anti-patterns were effective over a large dataset of regexes (N=209,188): 100% precision and 99% recall, improving the state of the art 50% precision and 87% recall. Our anti-patterns were also more effective than the state of the art when applied manually (N=20): 100% developers applied them effectively vs. 50% for the state of the art. Finally, our anti-patterns and fix strategies increased developers' understanding using automatic tools (N=9): from median "Very weakly" to median "Strongly" when detecting vulnerabilities, and from median "Very weakly" to median "Very strongly" when fixing them. △ Less

Submitted 15 December, 2022; originally announced December 2022.

Comments: IEEE Security & Privacy 2023

arXiv:2212.05757 [pdf, other]

Satellite-based ITS Data Offloading & Computation in 6G Networks: A Cooperative Multi-Agent Proximal Policy Optimization DRL with Attention Approach

Authors: Sheikh Salman Hassan, Yu Min Park, Yan Kyaw Tun, Walid Saad, Zhu Han, Choong Seon Hong

Abstract: The proliferation of intelligent transportation systems (ITS) has led to increasing demand for diverse network applications. However, conventional terrestrial access networks (TANs) are inadequate in accommodating various applications for remote ITS nodes, i.e., airplanes and ships. In contrast, satellite access networks (SANs) offer supplementary support for TANs, in terms of coverage flexibility… ▽ More The proliferation of intelligent transportation systems (ITS) has led to increasing demand for diverse network applications. However, conventional terrestrial access networks (TANs) are inadequate in accommodating various applications for remote ITS nodes, i.e., airplanes and ships. In contrast, satellite access networks (SANs) offer supplementary support for TANs, in terms of coverage flexibility and availability. In this study, we propose a novel approach to ITS data offloading and computation services based on SANs. We use low-Earth orbit (LEO) and cube satellites (CubeSats) as independent mobile edge computing (MEC) servers that schedule the processing of data generated by ITS nodes. To optimize offloading task selection, computing, and bandwidth resource allocation for different satellite servers, we formulate a joint delay and rental price minimization problem that is mixed-integer non-linear programming (MINLP) and NP-hard. We propose a cooperative multi-agent proximal policy optimization (Co-MAPPO) deep reinforcement learning (DRL) approach with an attention mechanism to deal with intelligent offloading decisions. We also decompose the remaining subproblem into three independent subproblems for resource allocation and use convex optimization techniques to obtain their optimal closed-form analytical solutions. We conduct extensive simulations and compare our proposed approach to baselines, resulting in performance improvements of 9.9%, 5.2%, and 4.2%, respectively. △ Less

Submitted 14 June, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

Comments: 18 Pages, 20 Figures, Submitted to IEEE Transactions on Mobile Computing (TMC)-(Under Major Revision)

arXiv:2211.01675 [pdf, other]

doi 10.1109/IEMCON.2019.8936148

Spam Review Detection Using Deep Learning

Authors: G. M. Shahariar, Swapnil Biswas, Faiza Omar, Faisal Muhammad Shah, Samiha Binte Hassan

Abstract: A robust and reliable system of detecting spam reviews is a crying need in todays world in order to purchase products without being cheated from online sites. In many online sites, there are options for posting reviews, and thus creating scopes for fake paid reviews or untruthful reviews. These concocted reviews can mislead the general public and put them in a perplexity whether to believe the rev… ▽ More A robust and reliable system of detecting spam reviews is a crying need in todays world in order to purchase products without being cheated from online sites. In many online sites, there are options for posting reviews, and thus creating scopes for fake paid reviews or untruthful reviews. These concocted reviews can mislead the general public and put them in a perplexity whether to believe the review or not. Prominent machine learning techniques have been introduced to solve the problem of spam review detection. The majority of current research has concentrated on supervised learning methods, which require labeled data - an inadequacy when it comes to online review. Our focus in this article is to detect any deceptive text reviews. In order to achieve that we have worked with both labeled and unlabeled data and proposed deep learning methods for spam review detection which includes Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN) and a variant of Recurrent Neural Network (RNN) that is Long Short-Term Memory (LSTM). We have also applied some traditional machine learning classifiers such as Nave Bayes (NB), K Nearest Neighbor (KNN) and Support Vector Machine (SVM) to detect spam reviews and finally, we have shown the performance comparison for both traditional and deep learning classifiers. △ Less

Submitted 3 November, 2022; originally announced November 2022.

Journal ref: 2019 IEEE 10th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON). IEEE, 2019

arXiv:2209.12146 [pdf]

doi 10.1007/978-3-030-72322-4_149-1

Machine Learning and Artificial Intelligence-Driven Multi-Scale Modeling for High Burnup Accident-Tolerant Fuels for Light Water-Based SMR Applications

Authors: Md. Shamim Hassan, Abid Hossain Khan, Richa Verma, Dinesh Kumar, Kazuma Kobayashi, Shoaib Usman, Syed Alam

Abstract: The concept of small modular reactor has changed the outlook for tackling future energy crises. This new reactor technology is very promising considering its lower investment requirements, modularity, design simplicity, and enhanced safety features. The application of artificial intelligence-driven multi-scale modeling (neutronics, thermal hydraulics, fuel performance, etc.) incorporating Digital… ▽ More The concept of small modular reactor has changed the outlook for tackling future energy crises. This new reactor technology is very promising considering its lower investment requirements, modularity, design simplicity, and enhanced safety features. The application of artificial intelligence-driven multi-scale modeling (neutronics, thermal hydraulics, fuel performance, etc.) incorporating Digital Twin and associated uncertainties in the research of small modular reactors is a recent concept. In this work, a comprehensive study is conducted on the multiscale modeling of accident-tolerant fuels. The application of these fuels in the light water-based small modular reactors is explored. This chapter also focuses on the application of machine learning and artificial intelligence in the design optimization, control, and monitoring of small modular reactors. Finally, a brief assessment of the research gap on the application of artificial intelligence to the development of high burnup composite accident-tolerant fuels is provided. Necessary actions to fulfill these gaps are also discussed. △ Less

Submitted 25 September, 2022; originally announced September 2022.

Journal ref: Handbook of Smart Energy Systems, 2022

arXiv:2209.08207 [pdf, other]

APPDIA: A Discourse-aware Transformer-based Style Transfer Model for Offensive Social Media Conversations

Authors: Katherine Atwell, Sabit Hassan, Malihe Alikhani

Abstract: Using style-transfer models to reduce offensiveness of social media comments can help foster a more inclusive environment. However, there are no sizable datasets that contain offensive texts and their inoffensive counterparts, and fine-tuning pretrained models with limited labeled data can lead to the loss of original meaning in the style-transferred text. To address this issue, we provide two maj… ▽ More Using style-transfer models to reduce offensiveness of social media comments can help foster a more inclusive environment. However, there are no sizable datasets that contain offensive texts and their inoffensive counterparts, and fine-tuning pretrained models with limited labeled data can lead to the loss of original meaning in the style-transferred text. To address this issue, we provide two major contributions. First, we release the first publicly-available, parallel corpus of offensive Reddit comments and their style-transferred counterparts annotated by expert sociolinguists. Then, we introduce the first discourse-aware style-transfer models that can effectively reduce offensiveness in Reddit text while preserving the meaning of the original text. These models are the first to examine inferential links between the comment and the text it is replying to when transferring the style of offensive Reddit text. We propose two different methods of integrating discourse relations with pretrained transformer models and evaluate them on our dataset of offensive comments from Reddit and their inoffensive counterparts. Improvements over the baseline with respect to both automatic metrics and human evaluation indicate that our discourse-aware models are better at preserving meaning in style-transferred text when compared to the state-of-the-art discourse-agnostic models. △ Less

Submitted 16 September, 2022; originally announced September 2022.

Comments: To be published in Proceedings of COLING 2022, the 29th International Conference on Computational Linguistics

arXiv:2209.07228 [pdf, other]

doi 10.1109/TVT.2023.3311537

Joint Trajectory and Resource Optimization of MEC-Assisted UAVs in Sub-THz Networks: A Resources-based Multi-Agent Proximal Policy Optimization DRL with Attention Mechanism

Authors: Yu Min Park, Sheikh Salman Hassan, Yan Kyaw Tun, Zhu Han, Choong Seon Hong

Abstract: THz band communication technology will be used in the 6G networks to enable high-speed and high-capacity data service demands. However, THz-communication losses arise owing to limitations, i.e., molecular absorption, rain attenuation, and coverage range. Furthermore, to maintain steady THz-communications and overcome coverage distances in rural and suburban regions, the required number of BSs is v… ▽ More THz band communication technology will be used in the 6G networks to enable high-speed and high-capacity data service demands. However, THz-communication losses arise owing to limitations, i.e., molecular absorption, rain attenuation, and coverage range. Furthermore, to maintain steady THz-communications and overcome coverage distances in rural and suburban regions, the required number of BSs is very high. Consequently, a new communication platform that enables aerial communication services is required. Furthermore, the airborne platform supports LoS communications rather than NLoS communications, which helps overcome these losses. Therefore, in this work, we investigate the deployment and resource optimization for MEC-enabled UAVs, which can provide THz-based communications in remote regions. To this end, we formulate an optimization problem to minimize the sum of the energy consumption of both MEC-UAV and MUs and the delay incurred by MUs under the given task information. The formulated problem is a MINLP problem, which is NP-hard. We decompose the main problem into two subproblems to address the formulated problem. We solve the first subproblem with a standard optimization solver, i.e., CVXPY, due to its convex nature. To solve the second subproblem, we design a RMAPPO DRL algorithm with an attention mechanism. The considered attention mechanism is utilized for encoding a diverse number of observations. This is designed by the network coordinator to provide a differentiated fit reward to each agent in the network. The simulation results show that the proposed algorithm outperforms the benchmark and yields a network utility which is $2.22\%$, $15.55\%$, and $17.77\%$ more than the benchmarks. △ Less

Submitted 15 September, 2022; originally announced September 2022.

Comments: 13 pages, 12 figures

arXiv:2208.11951 [pdf, other]

Design of an Efficient CSI Feedback Mechanism in Massive MIMO Systems: A Machine Learning Approach using Empirical Data

Authors: Muhammad Karam Shehzad, Luca Rose, Stefan Wesemann, Mohamad Assaad, Syed Ali Hassan

Abstract: Massive multiple-input multiple-output (mMIMO) regime reaps the benefits of spatial diversity and multiplexing gains, subject to precise channel state information (CSI) acquisition. In the current communication architecture, the downlink CSI is estimated by the user equipment (UE) via dedicated pilots and then fed back to the gNodeB (gNB). The feedback information is compressed with the goal of re… ▽ More Massive multiple-input multiple-output (mMIMO) regime reaps the benefits of spatial diversity and multiplexing gains, subject to precise channel state information (CSI) acquisition. In the current communication architecture, the downlink CSI is estimated by the user equipment (UE) via dedicated pilots and then fed back to the gNodeB (gNB). The feedback information is compressed with the goal of reducing over-the-air overhead. This compression increases the inaccuracy of acquired CSI, thus degrading the overall spectral efficiency. This paper proposes a computationally inexpensive machine learning (ML)-based CSI feedback algorithm, which exploits twin channel predictors. The proposed approach can work for both time-division duplex (TDD) and frequency-division duplex (FDD) systems, and it allows to reduce feedback overhead and improves the acquired CSI accuracy. To observe real benefits, we demonstrate the performance of the proposed approach using the empirical data recorded at the Nokia campus in Stuttgart, Germany. Numerical results show the effectiveness of the proposed approach in terms of reducing overhead, minimizing quantization errors, increasing spectral efficiency, cosine similarity, and precoding gain compared to the traditional CSI feedback mechanism. △ Less

Submitted 25 August, 2022; originally announced August 2022.

arXiv:2208.11852 [pdf]

An Empirical Analysis of the Efficacy of Different Sampling Techniques for Imbalanced Classification

Authors: Asif Newaz, Shahriar Hassan, Farhan Shahriyar Haq

Abstract: Learning from imbalanced data is a challenging task. Standard classification algorithms tend to perform poorly when trained on imbalanced data. Some special strategies need to be adopted, either by modifying the data distribution or by redesigning the underlying classification algorithm to achieve desirable performance. The prevalence of imbalance in real-world datasets has led to the creation of… ▽ More Learning from imbalanced data is a challenging task. Standard classification algorithms tend to perform poorly when trained on imbalanced data. Some special strategies need to be adopted, either by modifying the data distribution or by redesigning the underlying classification algorithm to achieve desirable performance. The prevalence of imbalance in real-world datasets has led to the creation of a multitude of strategies for the class imbalance issue. However, not all the strategies are useful or provide good performance in different imbalance scenarios. There are numerous approaches to dealing with imbalanced data, but the efficacy of such techniques or an experimental comparison among those techniques has not been conducted. In this study, we present a comprehensive analysis of 26 popular sampling techniques to understand their effectiveness in dealing with imbalanced data. Rigorous experiments have been conducted on 50 datasets with different degrees of imbalance to thoroughly investigate the performance of these techniques. A detailed discussion of the advantages and limitations of the techniques, as well as how to overcome such limitations, has been presented. We identify some critical factors that affect the sampling strategies and provide recommendations on how to choose an appropriate sampling technique for a particular application. △ Less

Submitted 24 August, 2022; originally announced August 2022.

Comments: Submitted to "Information Sciences" (Elsevier)

arXiv:2207.11360 [pdf, other]

doi 10.1109/VLSI-SoC54400.2022.9939623

A Hardware-based HEFT Scheduler Implementation for Dynamic Workloads on Heterogeneous SoCs

Authors: Alexander Fusco, Sahil Hassan, Joshua Mack, Ali Akoglu

Abstract: Non-uniform performance and power consumption across the processing elements (PEs) of heterogeneous SoCs increase the computation complexity of the task scheduling problem compared to homogeneous architectures. Latency of a software-based scheduler with the increased heterogeneity level in terms of number and types of PEs creates the necessity of deploying a scheduler as an overlay processor in ha… ▽ More Non-uniform performance and power consumption across the processing elements (PEs) of heterogeneous SoCs increase the computation complexity of the task scheduling problem compared to homogeneous architectures. Latency of a software-based scheduler with the increased heterogeneity level in terms of number and types of PEs creates the necessity of deploying a scheduler as an overlay processor in hardware to be able to make scheduling decisions rapidly and enable deployment of real-life applications on heterogeneous SoCs. In this study we present the design trade-offs involved for implementing and deploying the runtime variant of the heterogeneous earliest finish time algorithm (HEFT_RT) on the FPGA. We conduct performance evaluations on a SoC configuration emulated over the Xilinx Zynq ZCU102 platform. In a runtime environment we demonstrate hardware-based HEFT_RT's ability to make scheduling decisions with 9.144 ns latency on average, process 26.7% more tasks per second compared to its software counterpart, and reduce the scheduling latency by up to a factor of 183x based on workloads composed of mixture of dynamically arriving real-life signal processing applications. △ Less

Submitted 13 November, 2022; v1 submitted 22 July, 2022; originally announced July 2022.

Comments: Presented at 2022 IFIP/IEEE 30th International Conference on Very Large Scale Integration (October 3-5)

Journal ref: IFIP/IEEE 30th Int. Conf. on Very Large Scale Integr. (VLSI-SoC), 2022, pp. 1-6

arXiv:2207.04021 [pdf, ps, other]

ASL-Homework-RGBD Dataset: An annotated dataset of 45 fluent and non-fluent signers performing American Sign Language homeworks

Authors: Saad Hassan, Matthew Seita, Larwan Berke, Yingli Tian, Elaine Gale, Sooyeon Lee, Matt Huenerfauth

Abstract: We are releasing a dataset containing videos of both fluent and non-fluent signers using American Sign Language (ASL), which were collected using a Kinect v2 sensor. This dataset was collected as a part of a project to develop and evaluate computer vision algorithms to support new technologies for automatic detection of ASL fluency attributes. A total of 45 fluent and non-fluent participants were… ▽ More We are releasing a dataset containing videos of both fluent and non-fluent signers using American Sign Language (ASL), which were collected using a Kinect v2 sensor. This dataset was collected as a part of a project to develop and evaluate computer vision algorithms to support new technologies for automatic detection of ASL fluency attributes. A total of 45 fluent and non-fluent participants were asked to perform signing homework assignments that are similar to the assignments used in introductory or intermediate level ASL courses. The data is annotated to identify several aspects of signing including grammatical features and non-manual markers. Sign language recognition is currently very data-driven and this dataset can support the design of recognition technologies, especially technologies that can benefit ASL learners. This dataset might also be interesting to ASL education researchers who want to contrast fluent and non-fluent signing. △ Less

Submitted 8 July, 2022; originally announced July 2022.

arXiv:2207.00426 [pdf, ps, other]

Parallel square-root statistical linear regression for inference in nonlinear state space models

Authors: Fatemeh Yaghoobi, Adrien Corenflos, Sakira Hassan, Simo S�rkk�

Abstract: In this article, we introduce parallel-in-time methods for state and parameter estimation in general nonlinear non-Gaussian state-space models using the statistical linear regression and the iterated statistical posterior linearization paradigms. We also reformulate the proposed methods in a square-root form, resulting in improved numerical stability while preserving the parallelization capabiliti… ▽ More In this article, we introduce parallel-in-time methods for state and parameter estimation in general nonlinear non-Gaussian state-space models using the statistical linear regression and the iterated statistical posterior linearization paradigms. We also reformulate the proposed methods in a square-root form, resulting in improved numerical stability while preserving the parallelization capabilities. We then leverage the fixed-point structure of our methods to perform likelihood-based parameter estimation in logarithmic time with respect to the number of observations. Finally, we demonstrate the practical performance of the methodology with numerical experiments run on a graphics processing unit (GPU). △ Less

Submitted 5 April, 2023; v1 submitted 29 June, 2022; originally announced July 2022.

arXiv:2206.12368 [pdf, other]

doi 10.18653/v1/2022.ltedi-1.5

Using BERT Embeddings to Model Word Importance in Conversational Transcripts for Deaf and Hard of Hearing Users

Authors: Akhter Al Amin, Saad Hassan, Cecilia O. Alm, Matt Huenerfauth

Abstract: Deaf and hard of hearing individuals regularly rely on captioning while watching live TV. Live TV captioning is evaluated by regulatory agencies using various caption evaluation metrics. However, caption evaluation metrics are often not informed by preferences of DHH users or how meaningful the captions are. There is a need to construct caption evaluation metrics that take the relative importance… ▽ More Deaf and hard of hearing individuals regularly rely on captioning while watching live TV. Live TV captioning is evaluated by regulatory agencies using various caption evaluation metrics. However, caption evaluation metrics are often not informed by preferences of DHH users or how meaningful the captions are. There is a need to construct caption evaluation metrics that take the relative importance of words in a transcript into account. We conducted correlation analysis between two types of word embeddings and human-annotated labeled word-importance scores in existing corpus. We found that normalized contextualized word embeddings generated using BERT correlated better with manually annotated importance scores than word2vec-based word embeddings. We make available a pairing of word embeddings and their human-annotated importance scores. We also provide proof-of-concept utility by training word importance models, achieving an F1-score of 0.57 in the 6-class word importance classification task. △ Less

Submitted 24 June, 2022; originally announced June 2022.

Comments: 5 pages, 3 tables, 1 figure

arXiv:2205.09084 [pdf, other]

Industry 5.0 is Coming: A Survey on Intelligent NextG Wireless Networks as Technological Enablers

Authors: Shah Zeb, Aamir Mahmood, Sunder Ali Khowaja, Kapal Dev, Syed Ali Hassan, Nawab Muhammad Faseeh Qureshi, Mikael Gidlund, Paolo Bellavista

Abstract: Industry 5.0 vision, a step toward the next industrial revolution and enhancement to Industry 4.0, envisioned the new goals of resilient, sustainable, and human-centric approaches in diverse emerging applications, e.g., factories-of-the-future, digital society. The vision seeks to leverage human intelligence and creativity in nexus with intelligent, efficient, and reliable cognitive collaborating… ▽ More Industry 5.0 vision, a step toward the next industrial revolution and enhancement to Industry 4.0, envisioned the new goals of resilient, sustainable, and human-centric approaches in diverse emerging applications, e.g., factories-of-the-future, digital society. The vision seeks to leverage human intelligence and creativity in nexus with intelligent, efficient, and reliable cognitive collaborating robots (cobots) to achieve zero waste, zerodefect, and mass customization-based manufacturing solutions. However, the vision requires the merging of cyber-physical worlds through utilizing Industry 5.0 technological enablers, e.g., cognitive cobots, person-centric artificial intelligence (AI), cyberphysical systems, digital twins, hyperconverged data storage and computing, communication infrastructure, and others. In this regard, the convergence of the emerging computational intelligence (CI) paradigm and next-generation wireless networks (NGWNs) can fulfill the stringent communication and computation requirements of the technological enablers in the Industry 5.0 vision, which is the aim of this survey-based tutorial. In this article, we address this issue by reviewing and analyzing current emerging concepts and technologies, e.g., CI tools and frameworks, network-in-box architecture, open radio access networks, softwarized service architectures, potential enabling services, and others, essential for designing the objectives of CINGWNs to fulfill the Industry 5.0 vision requirements. Finally, we provide a list of lessons learned from our detailed review, research challenges, and open issues that should be addressed in CI-NGWNs to realize Industry 5.0. △ Less

Submitted 18 May, 2022; originally announced May 2022.

arXiv:2204.12526 [pdf, other]

Identification of feasible pathway information for c-di-GMP binding proteins in cellulose production

Authors: Syeda Sakira Hassan, Rahul Mangayil, Tommi Aho, Olli Yli-Harja, Matti Karp

Abstract: In this paper, we utilize a machine learning approach to identify the significant pathways for c-di-GMP signaling proteins. The dataset involves gene counts from 12 pathways and 5 essential c-di-GMP binding domains for 1024 bacterial genomes. Two novel approaches, Least absolute shrinkage and selection operator (Lasso) and Random forests, have been applied for analyzing and modeling the dataset. B… ▽ More In this paper, we utilize a machine learning approach to identify the significant pathways for c-di-GMP signaling proteins. The dataset involves gene counts from 12 pathways and 5 essential c-di-GMP binding domains for 1024 bacterial genomes. Two novel approaches, Least absolute shrinkage and selection operator (Lasso) and Random forests, have been applied for analyzing and modeling the dataset. Both approaches show that bacterial chemotaxis is the most essential pathway for c-di-GMP encoding domains. Though popular for feature selection, the strong regularization of Lasso method fails to associate any pathway to MshE domain. Results from the analysis may help to understand and emphasize the supporting pathways involved in bacterial cellulose production. These findings demonstrate the need for a chassis to restrict the behavior or functionality by deactivating the selective pathways in cellulose production. △ Less

Submitted 26 April, 2022; originally announced April 2022.

Journal ref: EMBEC & NBC 2017. EMBEC NBC 2017 2017. IFMBE Proceedings, vol 65. Springer, Singapore

arXiv:2204.08962 [pdf, other]

doi 10.1145/3529257

CEDR -- A Compiler-integrated, Extensible DSSoC Runtime

Authors: Joshua Mack, Sahil Hassan, Nirmal Kumbhare, Miguel Castro-Gonzalez, Ali Akoglu

Abstract: In this work, we present CEDR, a Compiler-integrated, Extensible Domain Specific System on Chip Runtime ecosystem to facilitate research towards addressing the challenges of architecture, system software and application development with distinct plug-and-play integration points in a unified compile time and run time workflow. We demonstrate the utility of CEDR on the Xilinx Zynq MPSoC-ZCU102 for e… ▽ More In this work, we present CEDR, a Compiler-integrated, Extensible Domain Specific System on Chip Runtime ecosystem to facilitate research towards addressing the challenges of architecture, system software and application development with distinct plug-and-play integration points in a unified compile time and run time workflow. We demonstrate the utility of CEDR on the Xilinx Zynq MPSoC-ZCU102 for evaluating performance of pre-silicon hardware in the trade space of SoC configuration, scheduling policy and workload complexity based on dynamically arriving workload scenarios composed of real-life signal processing applications scaling to thousands of application instances with FFT and matrix multiply accelerators. We provide insights into the trade-offs present in this design space through a number of distinct case studies. CEDR is portable and has been deployed and validated on Odroid-XU3, X86 and Nvidia Jetson Xavier based SoC platforms. Taken together, CEDR is a capable environment for enabling research in exploring the boundaries of productive application development, resource management heuristic development, and hardware configuration analysis for heterogeneous architectures. △ Less

Submitted 15 April, 2022; originally announced April 2022.

Comments: 35 pages single column, 16 figures, 7 tables. Accepted for publication in the ACM Transactions on Embedded and Computing Systems

arXiv:2204.03794 [pdf]

On the Importance of Performing App Analysis Within Peer Groups

Authors: Safwat Hassan, Heng Li, Ahmed E. Hassan

Abstract: The competing nature of the app market motivates us to shift our focus on apps that provide similar functionalities and directly compete with each other (i.e., peer apps). In this work, we study the ratings and the review text of 100 Android apps across 10 peer app groups. We highlight the importance of performing peer-app analysis by showing that it can provide a unique perspective over performin… ▽ More The competing nature of the app market motivates us to shift our focus on apps that provide similar functionalities and directly compete with each other (i.e., peer apps). In this work, we study the ratings and the review text of 100 Android apps across 10 peer app groups. We highlight the importance of performing peer-app analysis by showing that it can provide a unique perspective over performing a global analysis of apps (i.e., mixing apps from multiple categories). First, we observe that comparing user ratings within peer groups can provide very different results from comparing user ratings from a global perspective. Then, we show that peer-app analysis provides a different perspective to spot the dominant topics in the user reviews, and to understand the impact of the topics on user ratings. Our findings suggest that future efforts may pay more attention to performing and supporting app analysis from a peer group context. For example, app store owners may consider an additional rating mechanism that normalizes app ratings within peer groups, and future research may help developers understand the characteristics of specific peer groups and prioritize their efforts. △ Less

Submitted 7 April, 2022; originally announced April 2022.

Showing 1–50 of 146 results for author: Hassan, S