- research-article, April 2024
Fully Sparse Fusion for 3D Object Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 46, Issue 11, Pages 7217–7231, https://doi.org/10.1109/TPAMI.2024.3392303
Currently prevalent multi-modal 3D detection methods rely on dense detectors that usually use dense Bird’s-Eye-View (BEV) feature maps. However, the cost of such BEV feature maps is quadratic to the detection range, making it not scalable for long-...
- research-article, April 2024
VST++: Efficient and Stronger Visual Saliency Transformer
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 46, Issue 11, Pages 7300–7316, https://doi.org/10.1109/TPAMI.2024.3388153
While previous CNN-based models have exhibited promising results for salient object detection (SOD), their ability to explore global long-range dependencies is restricted. Our previous work, the Visual Saliency Transformer (VST), addressed this constraint ...
- research-article, April 2024
Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 46, Issue 11, Pages 7331–7347, https://doi.org/10.1109/TPAMI.2024.3387838
3D dense captioning requires a model to translate its understanding of an input 3D scene into several captions associated with different object regions. Existing methods adopt a sophisticated “detect-then-describe” pipeline, which builds ...
- research-article, April 2024
Representing Noisy Image Without Denoising
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 46, Issue 10, Pages 6713–6730, https://doi.org/10.1109/TPAMI.2024.3386985
A long-standing topic in artificial intelligence is the effective recognition of patterns from noisy images. In this regard, the recent data-driven paradigm considers 1) improving the representation robustness by adding noisy samples in training phase (...
- research-article, April 2024
PPDM++: Parallel Point Detection and Matching for Fast and Accurate HOI Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 46, Issue 10, Pages 6826–6841, https://doi.org/10.1109/TPAMI.2024.3386891
Human-Object Interaction (HOI) detection aims to understand human activities by detecting interaction triplets. Previous HOI detection methods adopt a two-stage instance-driven paradigm. Unfortunately, many non-interactive human-object pairs generated by ...
- research-article, March 2024
FeatAug-DETR: Enriching One-to-Many Matching for DETRs With Feature Augmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 46, Issue 9, Pages 6402–6415, https://doi.org/10.1109/TPAMI.2024.3381961
One-to-one matching is a crucial design in DETR-like object detection frameworks. It enables the DETR to perform end-to-end detection. However, it also faces challenges of lacking positive sample supervision and slow convergence speed. Several recent ...
- research-article, March 2024
Gradient-Based Instance-Specific Visual Explanations for Object Specification and Object Discrimination
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 46, Issue 9, Pages 5967–5985, https://doi.org/10.1109/TPAMI.2024.3380604
We propose the gradient-weighted Object Detector Activation Maps (ODAM), a visual explanation technique for interpreting the predictions of object detectors. Utilizing the gradients of detector targets flowing into the intermediate feature maps, ODAM ...
- research-article, March 2024
Turning a CLIP Model Into a Scene Text Spotter
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 46, Issue 9, Pages 6040–6054, https://doi.org/10.1109/TPAMI.2024.3379828
We exploit the potential of the large-scale Contrastive Language-Image Pretraining (CLIP) model to enhance scene text detection and spotting tasks, transforming it into a robust backbone, FastTCM-CR50. This backbone utilizes visual prompt learning and ...
- research-article, March 2024
On Boundary Discontinuity in Angle Regression Based Arbitrary Oriented Object Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 46, Issue 10, Pages 6494–6508, https://doi.org/10.1109/TPAMI.2024.3378777
With vigorous development e.g., in autonomous driving and remote sensing, oriented object detection has gradually been featured. The majority of existing methods directly perform regression on the rotation angle, which we argue has fundamental limitations ...
- research-article, November 2023
Unified Adversarial Patch for Visible-Infrared Cross-Modal Attacks in the Physical World
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 46, Issue 4, Pages 2348–2363, https://doi.org/10.1109/TPAMI.2023.3330769
Physical adversarial attacks have put a severe threat to DNN-based object detectors. To enhance security, a combination of visible and infrared sensors is deployed in various scenarios, which has proven effective in disabling existing single-modal ...
- research-article, September 2023
Mutual-Assistance Learning for Object Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 45, Issue 12, Pages 15171–15184, https://doi.org/10.1109/TPAMI.2023.3319634
Object detection is a fundamental yet challenging task in computer vision. Despite the great strides made over recent years, modern detectors may still produce unsatisfactory performance due to certain factors, such as non-universal object features and ...
- research-article, September 2023
Attribute-Guided Collaborative Learning for Partial Person Re-Identification
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 45, Issue 12, Pages 14144–14160, https://doi.org/10.1109/TPAMI.2023.3312302
Partial person re-identification (ReID) aims to solve the problem of image spatial misalignment due to occlusions or out-of-views. Despite significant progress through the introduction of additional information, such as human pose landmarks, mask maps, ...
- research-article, August 2023
QDTrack: Quasi-Dense Similarity Learning for Appearance-Only Multiple Object Tracking
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 45, Issue 12, Pages 15380–15393, https://doi.org/10.1109/TPAMI.2023.3301975
Similarity learning has been recognized as a crucial step for object tracking. However, existing multiple object tracking methods only use sparse ground truth matching as the training objective, while ignoring the majority of the informative regions in ...
- research-article, July 2023
Multiscale Dynamic Graph Representation for Biometric Recognition With Occlusions
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 45, Issue 12, Pages 15120–15136, https://doi.org/10.1109/TPAMI.2023.3298836
Occlusion is a common problem with biometric recognition in the wild. The generalization ability of CNNs greatly decreases due to the adverse effects of various occlusions. To this end, we propose a novel unified framework integrating the merits of both ...
- research-article, October 2022
End2End Occluded Face Recognition by Masking Corrupted Features
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 44, Issue 10_Part_2, Pages 6939–6952, https://doi.org/10.1109/TPAMI.2021.3098962
With the recent advancement of deep convolutional neural networks, significant progress has been made in general face recognition. However, the state-of-the-art general face recognition models do not generalize well to occluded face images, which are ...
- research-article, October 2022
Fast and Robust Multi-Person 3D Pose Estimation and Tracking From Multiple Views
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 44, Issue 10_Part_2, Pages 6981–6992, https://doi.org/10.1109/TPAMI.2021.3098052
This paper addresses the problem of reconstructing 3D poses of multiple people from a few calibrated camera views. The main challenge of this problem is to find the cross-view correspondences among noisy and incomplete 2D pose predictions. Most previous ...
- research-article, October 2022
Joint Detection and Matching of Feature Points in Multimodal Images
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 44, Issue 10_Part_1, Pages 6585–6593, https://doi.org/10.1109/TPAMI.2021.3092289
In this work, we propose a novel Convolutional Neural Network (CNN) architecture for the joint detection and matching of feature points in images acquired by different sensors using a single forward pass. The resulting feature detector is tightly coupled ...
- research-article, October 2022
Segment as Points for Efficient and Effective Online Multi-Object Tracking and Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 44, Issue 10_Part_1, Pages 6424–6437, https://doi.org/10.1109/TPAMI.2021.3087898
Current multi-object tracking and segmentation (MOTS) methods follow the tracking-by-detection paradigm and adopt 2D or 3D convolutions to extract instance embeddings for instance association. However, due to the large receptive field of deep ...
- research-article, October 2022
Concealed Object Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 44, Issue 10_Part_1, Pages 6024–6042, https://doi.org/10.1109/TPAMI.2021.3085766
We present the first systematic study on concealed object detection (COD), which aims to identify objects that are visually embedded in their background. The high intrinsic similarities between the concealed objects and their background make COD far more ...
- research-article, September 2022
Bayesian Embeddings for Few-Shot Open World Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 46, Issue 3, Pages 1513–1529, https://doi.org/10.1109/TPAMI.2022.3201541
As autonomous decision-making agents move from narrow operating environments to unstructured worlds, learning systems must move from a closed-world formulation to an open-world and few-shot setting in which agents continuously learn new classes from small ...