Search | arXiv e-print repository

Learning Decomposable and Debiased Representations via Attribute-Centric Information Bottlenecks

Authors: Jinyung Hong, Eun Som Jeon, Changhoon Kim, Keun Hee Park, Utkarsh Nath, Yezhou Yang, Pavan Turaga, Theodore P. Pavlic

Abstract: Biased attributes, spuriously correlated with target labels in a dataset, can problematically lead to neural networks that learn improper shortcuts for classifications and limit their capabilities for out-of-distribution (OOD) generalization. Although many debiasing approaches have been proposed to ensure correct predictions from biased datasets, few studies have considered learning latent embeddi… ▽ More Biased attributes, spuriously correlated with target labels in a dataset, can problematically lead to neural networks that learn improper shortcuts for classifications and limit their capabilities for out-of-distribution (OOD) generalization. Although many debiasing approaches have been proposed to ensure correct predictions from biased datasets, few studies have considered learning latent embedding consisting of intrinsic and biased attributes that contribute to improved performance and explain how the model pays attention to attributes. In this paper, we propose a novel debiasing framework, Debiasing Global Workspace, introducing attention-based information bottlenecks for learning compositional representations of attributes without defining specific bias types. Based on our observation that learning shape-centric representation helps robust performance on OOD datasets, we adopt those abilities to learn robust and generalizable representations of decomposable latent embeddings corresponding to intrinsic and biasing attributes. We conduct comprehensive evaluations on biased datasets, along with both quantitative and qualitative analyses, to showcase our approach's efficacy in attribute-centric representation learning and its ability to differentiate between intrinsic and bias-related features. △ Less

Submitted 21 March, 2024; originally announced March 2024.

Comments: 24 pages, 16 figures, 3 tables

arXiv:2401.02437 [pdf, other]

Randomly Weighted Neuromodulation in Neural Networks Facilitates Learning of Manifolds Common Across Tasks

Authors: Jinyung Hong, Theodore P. Pavlic

Abstract: Geometric Sensitive Hashing functions, a family of Local Sensitive Hashing functions, are neural network models that learn class-specific manifold geometry in supervised learning. However, given a set of supervised learning tasks, understanding the manifold geometries that can represent each task and the kinds of relationships between the tasks based on them has received little attention. We explo… ▽ More Geometric Sensitive Hashing functions, a family of Local Sensitive Hashing functions, are neural network models that learn class-specific manifold geometry in supervised learning. However, given a set of supervised learning tasks, understanding the manifold geometries that can represent each task and the kinds of relationships between the tasks based on them has received little attention. We explore a formalization of this question by considering a generative process where each task is associated with a high-dimensional manifold, which can be done in brain-like models with neuromodulatory systems. Following this formulation, we define \emph{Task-specific Geometric Sensitive Hashing~(T-GSH)} and show that a randomly weighted neural network with a neuromodulation system can realize this function. △ Less

Submitted 17 November, 2023; originally announced January 2024.

Comments: 10 pages, 7 figures, 1 table, Appear in NeurIPS 2023

arXiv:2305.15775 [pdf, other]

Concept-Centric Transformers: Enhancing Model Interpretability through Object-Centric Concept Learning within a Shared Global Workspace

Authors: Jinyung Hong, Keun Hee Park, Theodore P. Pavlic

Abstract: Many interpretable AI approaches have been proposed to provide plausible explanations for a model's decision-making. However, configuring an explainable model that effectively communicates among computational modules has received less attention. A recently proposed shared global workspace theory showed that networks of distributed modules can benefit from sharing information with a bottlenecked me… ▽ More Many interpretable AI approaches have been proposed to provide plausible explanations for a model's decision-making. However, configuring an explainable model that effectively communicates among computational modules has received less attention. A recently proposed shared global workspace theory showed that networks of distributed modules can benefit from sharing information with a bottlenecked memory because the communication constraints encourage specialization, compositionality, and synchronization among the modules. Inspired by this, we propose Concept-Centric Transformers, a simple yet effective configuration of the shared global workspace for interpretability, consisting of: i) an object-centric-based memory module for extracting semantic concepts from input features, ii) a cross-attention mechanism between the learned concept and input embeddings, and iii) standard classification and explanation losses to allow human analysts to directly assess an explanation for the model's classification reasoning. We test our approach against other existing concept-based methods on classification tasks for various datasets, including CIFAR100, CUB-200-2011, and ImageNet, and we show that our model achieves better classification accuracy than all baselines across all problems but also generates more consistent concept-based explanations of classification output. △ Less

Submitted 7 November, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

Comments: 23 pages, 9 tables, 18 figures, Accepted at WACV2024

arXiv:2210.00377 [pdf, other]

CHARTOPOLIS: A Small-Scale Labor-art-ory for Research and Reflection on Autonomous Vehicles, Human-Robot Interaction, and Sociotechnical Imaginaries

Authors: Sangeet Sankaramangalam Ulhas, Aditya Ravichander, Kathryn A. Johnson, Theodore P. Pavlic, Lance Gharavi, Spring Berman

Abstract: CHARTOPOLIS is a multi-faceted sociotechnical testbed meant to aid in building connections among engineers, psychologists, anthropologists, ethicists, and artists. Superficially, it is an urban autonomous-vehicle testbed that includes both a physical environment for small-scale robotic vehicles as well as a high-fidelity virtual replica that provides extra flexibility by way of computer simulation… ▽ More CHARTOPOLIS is a multi-faceted sociotechnical testbed meant to aid in building connections among engineers, psychologists, anthropologists, ethicists, and artists. Superficially, it is an urban autonomous-vehicle testbed that includes both a physical environment for small-scale robotic vehicles as well as a high-fidelity virtual replica that provides extra flexibility by way of computer simulation. However, both environments have been developed to allow for participatory simulation with human drivers as well. Each physical vehicle can be remotely operated by human drivers that have a driver-seat point of view that immerses them within the small-scale testbed, and those same drivers can also pilot high-fidelity models of those vehicles in a virtual replica of the environment. Juxtaposing human driving performance across these two contexts will help identify to what extent human driving behaviors are sensorimotor responses or involve psychological engagement with a system that has physical, not virtual, side effects and consequences. Furthermore, through collaboration with artists, we have designed the physical testbed to make tangible the reality that technological advancement causes the history of a city to fork into multiple, parallel timelines that take place within populations whose increasing isolation effectively creates multiple independent cities in one. Ultimately, CHARTOPOLIS is meant to challenge engineers to take a more holistic view when designing autonomous systems, while also enabling them to gather novel data that will assist them in making these systems more trustworthy. △ Less

Submitted 1 October, 2022; originally announced October 2022.

Comments: Submission to 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022) Workshop on Miniature Robot Platforms for Full Scale Autonomous Vehicle Research

MSC Class: 93C85 (Primary) 91Cxx (Secondary) ACM Class: I.2.9; J.4; J.5

arXiv:2204.04297 [pdf, other]

Learning to Modulate Random Weights: Neuromodulation-inspired Neural Networks For Efficient Continual Learning

Authors: Jinyung Hong, Theodore P. Pavlic

Abstract: Existing Continual Learning (CL) approaches have focused on addressing catastrophic forgetting by leveraging regularization methods, replay buffers, and task-specific components. However, realistic CL solutions must be shaped not only by metrics of catastrophic forgetting but also by computational efficiency and running time. Here, we introduce a novel neural network architecture inspired by neuro… ▽ More Existing Continual Learning (CL) approaches have focused on addressing catastrophic forgetting by leveraging regularization methods, replay buffers, and task-specific components. However, realistic CL solutions must be shaped not only by metrics of catastrophic forgetting but also by computational efficiency and running time. Here, we introduce a novel neural network architecture inspired by neuromodulation in biological nervous systems to economically and efficiently address catastrophic forgetting and provide new avenues for interpreting learned representations. Neuromodulation is a biological mechanism that has received limited attention in machine learning; it dynamically controls and fine tunes synaptic dynamics in real time to track the demands of different behavioral contexts. Inspired by this, our proposed architecture learns a relatively small set of parameters per task context that \emph{neuromodulates} the activity of unchanging, randomized weights that transform the input. We show that this approach has strong learning performance per task despite the very small number of learnable parameters. Furthermore, because context vectors are so compact, multiple networks can be stored concurrently with no interference and little spatial footprint, thus completely eliminating catastrophic forgetting and accelerating the training process. △ Less

Submitted 9 October, 2023; v1 submitted 8 April, 2022; originally announced April 2022.

Comments: 13 pages, 13 figures, 5 tables

arXiv:2111.10686 [pdf, other]

Representing Prior Knowledge Using Randomly, Weighted Feature Networks for Visual Relationship Detection

Authors: Jinyung Hong, Theodore P. Pavlic

Abstract: The single-hidden-layer Randomly Weighted Feature Network (RWFN) introduced by Hong and Pavlic (2021) was developed as an alternative to neural tensor network approaches for relational learning tasks. Its relatively small footprint combined with the use of two randomized input projections -- an insect-brain-inspired input representation and random Fourier features -- allow it to achieve rich expre… ▽ More The single-hidden-layer Randomly Weighted Feature Network (RWFN) introduced by Hong and Pavlic (2021) was developed as an alternative to neural tensor network approaches for relational learning tasks. Its relatively small footprint combined with the use of two randomized input projections -- an insect-brain-inspired input representation and random Fourier features -- allow it to achieve rich expressiveness for relational learning with relatively low training cost. In particular, when Hong and Pavlic compared RWFN to Logic Tensor Networks (LTNs) for Semantic Image Interpretation (SII) tasks to extract structured semantic descriptions from images, they showed that the RWFN integration of the two hidden, randomized representations better captures relationships among inputs with a faster training process even though it uses far fewer learnable parameters. In this paper, we use RWFNs to perform Visual Relationship Detection (VRD) tasks, which are more challenging SII tasks. A zero-shot learning approach is used with RWFN that can exploit similarities with other seen relationships and background knowledge -- expressed with logical constraints between subjects, relations, and objects -- to achieve the ability to predict triples that do not appear in the training set. The experiments on the Visual Relationship Dataset to compare the performance between RWFNs and LTNs, one of the leading Statistical Relational Learning frameworks, show that RWFNs outperform LTNs for the predicate-detection task while using fewer number of adaptable parameters (1:56 ratio). Furthermore, background knowledge represented by RWFNs can be used to alleviate the incompleteness of training sets even though the space complexity of RWFNs is much smaller than LTNs (1:27 ratio). △ Less

Submitted 22 December, 2021; v1 submitted 20 November, 2021; originally announced November 2021.

Comments: 9 pages, 2 figures, Accepted to CLeaR2022 at AAAI2022

arXiv:2109.06663 [pdf, other]

An Insect-Inspired Randomly, Weighted Neural Network with Random Fourier Features For Neuro-Symbolic Relational Learning

Authors: Jinyung Hong, Theodore P. Pavlic

Abstract: Insects, such as fruit flies and honey bees, can solve simple associative learning tasks and learn abstract concepts such as "sameness" and "difference", which is viewed as a higher-order cognitive function and typically thought to depend on top-down neocortical processing. Empirical research with fruit flies strongly supports that a randomized representational architecture is used in olfactory pr… ▽ More Insects, such as fruit flies and honey bees, can solve simple associative learning tasks and learn abstract concepts such as "sameness" and "difference", which is viewed as a higher-order cognitive function and typically thought to depend on top-down neocortical processing. Empirical research with fruit flies strongly supports that a randomized representational architecture is used in olfactory processing in insect brains. Based on these results, we propose a Randomly Weighted Feature Network (RWFN) that incorporates randomly drawn, untrained weights in an encoder that uses an adapted linear model as a decoder. The randomized projections between input neurons and higher-order processing centers in the input brain is mimicked in RWFN by a single-hidden-layer neural network that specially structures latent representations in the hidden layer using random Fourier features that better represent complex relationships between inputs using kernel approximation. Because of this special representation, RWFNs can effectively learn the degree of relationship among inputs by training only a linear decoder model. We compare the performance of RWFNs to LTNs for Semantic Image Interpretation (SII) tasks that have been used as a representative example of how LTNs utilize reasoning over first-order logic to surpass the performance of solely data-driven methods. We demonstrate that compared to LTNs, RWFNs can achieve better or similar performance for both object classification and detection of the part-of relations between objects in SII tasks while using much far fewer learnable parameters (1:62 ratio) and a faster learning process (1:2 ratio of running speed). Furthermore, we show that because the randomized weights do not depend on the data, several decoders can share a single randomized encoder, giving RWFNs a unique economy of spatial scale for simultaneous classification tasks. △ Less

Submitted 11 September, 2021; originally announced September 2021.

Comments: 17 pages, 5 figures, 2 tables, submitted to NeSy20/21 @ IJCLR. arXiv admin note: text overlap with arXiv:2006.12392

arXiv:2108.09394 [pdf, other]

Beyond Tracking: Using Deep Learning to Discover Novel Interactions in Biological Swarms

Authors: Taeyeong Choi, Benjamin Pyenson, Juergen Liebig, Theodore P. Pavlic

Abstract: Most deep-learning frameworks for understanding biological swarms are designed to fit perceptive models of group behavior to individual-level data (e.g., spatial coordinates of identified features of individuals) that have been separately gathered from video observations. Despite considerable advances in automated tracking, these methods are still very expensive or unreliable when tracking large n… ▽ More Most deep-learning frameworks for understanding biological swarms are designed to fit perceptive models of group behavior to individual-level data (e.g., spatial coordinates of identified features of individuals) that have been separately gathered from video observations. Despite considerable advances in automated tracking, these methods are still very expensive or unreliable when tracking large numbers of animals simultaneously. Moreover, this approach assumes that the human-chosen features include sufficient features to explain important patterns in collective behavior. To address these issues, we propose training deep network models to predict system-level states directly from generic graphical features from the entire view, which can be relatively inexpensive to gather in a completely automated fashion. Because the resulting predictive models are not based on human-understood predictors, we use explanatory modules (e.g., Grad-CAM) that combine information hidden in the latent variables of the deep-network model with the video data itself to communicate to a human observer which aspects of observed individual behaviors are most informative in predicting group behavior. This represents an example of augmented intelligence in behavioral ecology -- knowledge co-creation in a human-AI team. As proof of concept, we utilize a 20-day video recording of a colony of over 50 Harpegnathos saltator ants to showcase that, without any individual annotations provided, a trained model can generate an "importance map" across the video frames to highlight regions of important behaviors, such as dueling (which the AI has no a priori knowledge of), that play a role in the resolution of reproductive-hierarchy re-formation. Based on the empirical results, we also discuss the potential use and current challenges. △ Less

Submitted 20 August, 2021; originally announced August 2021.

Comments: Best Paper Winner at the 4th International Symposium on Swarm Behavior and Bio-Inspired Robotics (SWARM 2021)

arXiv:2108.07554 [pdf, other]

KCNet: An Insect-Inspired Single-Hidden-Layer Neural Network with Randomized Binary Weights for Prediction and Classification Tasks

Authors: Jinyung Hong, Theodore P. Pavlic

Abstract: Fruit flies are established model systems for studying olfactory learning as they will readily learn to associate odors with both electric shock or sugar rewards. The mechanisms of the insect brain apparently responsible for odor learning form a relatively shallow neuronal architecture. Olfactory inputs are received by the antennal lobe (AL) of the brain, which produces an encoding of each odor mi… ▽ More Fruit flies are established model systems for studying olfactory learning as they will readily learn to associate odors with both electric shock or sugar rewards. The mechanisms of the insect brain apparently responsible for odor learning form a relatively shallow neuronal architecture. Olfactory inputs are received by the antennal lobe (AL) of the brain, which produces an encoding of each odor mixture across ~50 sub-units known as glomeruli. Each of these glomeruli then project its component of this feature vector to several of ~2000 so-called Kenyon Cells (KCs) in a region of the brain known as the mushroom body (MB). Fly responses to odors are generated by small downstream neuropils that decode the higher-order representation from the MB. Research has shown that there is no recognizable pattern in the glomeruli--KC connections (and thus the particular higher-order representations); they are akin to fingerprints~-- even isogenic flies have different projections. Leveraging insights from this architecture, we propose KCNet, a single-hidden-layer neural network that contains sparse, randomized, binary weights between the input layer and the hidden layer and analytically learned weights between the hidden layer and the output layer. Furthermore, we also propose a dynamic optimization algorithm that enables the KCNet to increase performance beyond its structural limits by searching a more efficient set of inputs. For odorant-perception tasks that predict perceptual properties of an odorant, we show that KCNet outperforms existing data-driven approaches, such as XGBoost. For image-classification tasks, KCNet achieves reasonable performance on benchmark datasets (MNIST, Fashion-MNIST, and EMNIST) without any data-augmentation methods or convolutional layers and shows particularly fast running time. Thus, neural networks inspired by the insect brain can be both economical and perform well. △ Less

Submitted 17 August, 2021; originally announced August 2021.

Comments: 27 pages, 46 figures, 3 tables

arXiv:2009.08626 [pdf, other]

Identification of Abnormal States in Videos of Ants Undergoing Social Phase Change

Authors: Taeyeong Choi, Benjamin Pyenson, Juergen Liebig, Theodore P. Pavlic

Abstract: Biology is both an important application area and a source of motivation for development of advanced machine learning techniques. Although much attention has been paid to large and complex data sets resulting from high-throughput sequencing, advances in high-quality video recording technology have begun to generate similarly rich data sets requiring sophisticated techniques from both computer visi… ▽ More Biology is both an important application area and a source of motivation for development of advanced machine learning techniques. Although much attention has been paid to large and complex data sets resulting from high-throughput sequencing, advances in high-quality video recording technology have begun to generate similarly rich data sets requiring sophisticated techniques from both computer vision and time-series analysis. Moreover, just as studying gene expression patterns in one organism can reveal general principles that apply to other organisms, the study of complex social interactions in an experimentally tractable model system, such as a laboratory ant colony, can provide general principles about the dynamics of other social groups. Here, we focus on one such example from the study of reproductive regulation in small laboratory colonies of more than 50 Harpegnathos ants. These ants can be artificially induced to begin a ~20 day process of hierarchy reformation. Although the conclusion of this process is conspicuous to a human observer, it remains unclear which behaviors during the transient period are contributing to the process. To address this issue, we explore the potential application of One-class Classification (OC) to the detection of abnormal states in ant colonies for which behavioral data is only available for the normal societal conditions during training. Specifically, we build upon the Deep Support Vector Data Description (DSVDD) and introduce the Inner-Outlier Generator (IO-GEN) that synthesizes fake "inner outlier" observations during training that are near the center of the DSVDD data description. We show that IO-GEN increases the reliability of the final OC classifier relative to other DSVDD baselines. This method can be used to screen video frames for which additional human observation is needed. △ Less

Submitted 14 August, 2021; v1 submitted 18 September, 2020; originally announced September 2020.

Comments: Published in the proceedings of AAAI 2021

Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 35(17), 15286-15292 (2021)

arXiv:2006.12392 [pdf, other]

Randomly Weighted, Untrained Neural Tensor Networks Achieve Greater Relational Expressiveness

Authors: Jinyung Hong, Theodore P. Pavlic

Abstract: Neural Tensor Networks (NTNs), which are structured to encode the degree of relationship among pairs of entities, are used in Logic Tensor Networks (LTNs) to facilitate Statistical Relational Learning (SRL) in first-order logic. In this paper, we propose Randomly Weighted Tensor Networks (RWTNs), which incorporate randomly drawn, untrained tensors into an NTN encoder network with a trained decoder… ▽ More Neural Tensor Networks (NTNs), which are structured to encode the degree of relationship among pairs of entities, are used in Logic Tensor Networks (LTNs) to facilitate Statistical Relational Learning (SRL) in first-order logic. In this paper, we propose Randomly Weighted Tensor Networks (RWTNs), which incorporate randomly drawn, untrained tensors into an NTN encoder network with a trained decoder network. We show that RWTNs meet or surpass the performance of traditionally trained LTNs for Semantic Image Interpretation (SII) tasks that have been used as a representative example of how LTNs utilize reasoning over first-order logic to exceed the performance of solely data-driven methods. We demonstrate that RWTNs outperform LTNs for the detection of the relevant part-of relations between objects, and we show that RWTNs can achieve similar performance as LTNs for object classification while using fewer parameters for learning. Furthermore, we demonstrate that because the randomized weights do not depend on the data, several decoder networks can share a single NTN, giving RWTNs a unique economy of spatial scale for simultaneous classification tasks. △ Less

Submitted 3 October, 2020; v1 submitted 1 June, 2020; originally announced June 2020.

Comments: 7 pages, 4 figures, 1 table, submitted to IAAI-21. arXiv admin note: text overlap with arXiv:1705.08968 by other authors

arXiv:1604.02122 [pdf, other]

Modeling Concurrency and Reconfiguration in Vehicular Systems: A $π$-calculus Approach

Authors: Joseph Campbell, Cumhur Erkan Tuncali, Theodore P. Pavlic, Georgios Fainekos

Abstract: As autonomous or semi-autonomous vehicles are deployed on the roads, they will have to eventually start communicating with each other in order to achieve increased efficiency and safety. Current approaches in the control of collaborative vehicles primarily consider homogeneous simplified vehicle dynamics and usually ignore any communication issues. This raises an important question of how systems… ▽ More As autonomous or semi-autonomous vehicles are deployed on the roads, they will have to eventually start communicating with each other in order to achieve increased efficiency and safety. Current approaches in the control of collaborative vehicles primarily consider homogeneous simplified vehicle dynamics and usually ignore any communication issues. This raises an important question of how systems without the aforementioned limiting assumptions can be modeled, analyzed and certified for safe operation by both industry and governmental agencies. In this work, we propose a modeling framework where communication and system reconfiguration is modeled through $π$-calculus expressions while the closed-loop control systems are modeled through hybrid automata. We demonstrate how the framework can be utilized for modeling and simulation of platooning behaviors of heterogeneous vehicles. △ Less

Submitted 7 April, 2016; originally announced April 2016.

arXiv:1405.4070 [pdf, ps, other]

doi 10.7551/978-0-262-32621-6-ch083

Self-referencing cellular automata: A model of the evolution of information control in biological systems

Authors: Theodore P. Pavlic, Alyssa M. Adams, Paul C. W. Davies, Sara Imari Walker

Abstract: Cellular automata have been useful artificial models for exploring how relatively simple rules combined with spatial memory can give rise to complex emergent patterns. Moreover, studying the dynamics of how rules emerge under artificial selection for function has recently become a powerful tool for understanding how evolution can innovate within its genetic rule space. However, conventional cellul… ▽ More Cellular automata have been useful artificial models for exploring how relatively simple rules combined with spatial memory can give rise to complex emergent patterns. Moreover, studying the dynamics of how rules emerge under artificial selection for function has recently become a powerful tool for understanding how evolution can innovate within its genetic rule space. However, conventional cellular automata lack the kind of state feedback that is surely present in natural evolving systems. Each new generation of a population leaves an indelible mark on its environment and thus affects the selective pressures that shape future generations of that population. To model this phenomenon, we have augmented traditional cellular automata with state-dependent feedback. Rather than generating automata executions from an initial condition and a static rule, we introduce mappings which generate iteration rules from the cellular automaton itself. We show that these new automata contain disconnected regions which locally act like conventional automata, thus encapsulating multiple functions into one structure. Consequently, we have provided a new model for processes like cell differentiation. Finally, by studying the size of these regions, we provide additional evidence that the dynamics of self-reference may be critical to understanding the evolution of natural language. In particular, the rules of elementary cellular automata appear to be distributed in the same way as words in the corpus of a natural language. △ Less

Submitted 16 May, 2014; originally announced May 2014.

Comments: Accepted to ALIFE 2014. 8 pages, 9 figures (20 subfigures), 2 tables

MSC Class: 03D10; 18B20; 20M35; 37B15; 37F99; 68Q70; 68Q05; 68Q80; ACM Class: F.1.1; G.2.1

Showing 1–13 of 13 results for author: Pavlic, T P