HTML conversions sometimes display errors due to content that did not convert correctly from the source. This paper uses the following packages that are not yet supported by the HTML conversion tool. Feedback on these issues are not necessary; they are known and are being worked on.

  • failed: bold-extra
  • failed: slantsc

Authors: achieve the best HTML results from your LaTeX submissions by following these best practices.

License: arXiv.org perpetual non-exclusive license
arXiv:2402.15232v1 [astro-ph.IM] 23 Feb 2024

Classification of compact radio sources in the Galactic plane
with supervised machine learning

S. Riggi [email protected] G. Umana C. Trigilio C. Bordiu F. Bufano A. Ingallinera F. Cavallaro Y. Gordon R.P. Norris G. Gürkan P. Leto C. Buemi S. Loru A.M. Hopkins M.D. Filipović T. Cecconello INAF - Osservatorio Astrofisico di Catania, Via Santa Sofia 78, 95123 Catania, Italy Department of Physics, University of Wisconsin-Madison, 1150 University Ave, Madison, WI 53706, USA Western Sydney University, Locked Bag 1797, Penrith South DC, NSW 2751, Australia CSIRO Space & Astronomy, P.O. Box 76, Epping, NSW 1710, Australia Th�ringer Landessternwarte Tautenburg (TLS), Sternwarte 5, D-07778 Tautenburg, Germany CSIRO Space & Astronomy, ATNF, PO Box 1130, Bentley WA 6102, Australia Australian Astronomical Optics, Macquarie University, 105 Delhi Rd, North Ryde, NSW 2113, Australia Department of Electrical, Electronic and Computer Engineering, University of Catania, Viale Andrea Doria 6, 95125 Catania, Italy
Abstract

Generation of science-ready data from processed data products is one of the major challenges in next-generation radio continuum surveys with the Square Kilometre Array (SKA) and its precursors, due to the expected data volume and the need to achieve a high degree of automated processing. Source extraction, characterization, and classification are the major stages involved in this process.
In this work we focus on the classification of compact radio sources in the Galactic plane using both radio and infrared images as inputs. To this aim, we produced a curated dataset of similar-to\sim20,000 images of compact sources of different astronomical classes, obtained from past radio and infrared surveys, and novel radio data from pilot surveys carried out with the Australian SKA Pathfinder (ASKAP). Radio spectral index information was also obtained for a subset of the data. We then trained two different classifiers on the produced dataset. The first model uses gradient-boosted decision trees and is trained on a set of pre-computed features derived from the data, which include radio-infrared colour indices and the radio spectral index. The second model is trained directly on multi-channel images, employing convolutional neural networks.
Using a completely supervised procedure, we obtained a high classification accuracy (F1-score>90%) for separating Galactic objects from the extragalactic background. Individual class discrimination performances, ranging from 60% to 75%, increased by 10% when adding far-infrared and spectral index information, with extragalactic objects, PNe and Hii regions identified with higher accuracies.
The implemented tools and trained models were publicly released, and made available to the radioastronomical community for future application on new radio data.

keywords:
Galactic radio sources, Radio source catalogs, Infrared sources, Classification, Astronomy image processing, Convolutional neural networks
journal: arXiv

1 Introduction

The Square Kilometre Array (SKA) (Dewdney et al., 2016) will open a golden era in radio astronomy due to its anticipated sensitivity, frequency coverage and angular resolution. While the SKA is currently in the construction phase, SKA precursor telescopes have already started their planned survey programs, delivering valuable scientific results even during the commissioning phase. Among them, the Evolutionary Map of the Universe (EMU) program (Norris et al., 2011) of the Australian SKA Pathfinder (ASKAP, Johnston et al. 2008; Hotan et al. 2021) will survey similar-to\sim75% of the sky at similar-to\sim940 MHz with an angular resolution of 10" and a target rms of 15 μ𝜇\muitalic_μJy/beam. As EMU is expected to detect similar-to\sim50 million sources, the cataloguing process will require a significant degree of automation and knowledge extraction compared to previous surveys. Source finding is a major stage involved in such post-processing of observations.

In the last years, several developments were made within the SKA precursor community, and new tools were produced to improve compact source extraction and measurement capabilities (e.g. completeness, reliability, positional and flux density accuracy) and processing speed, also employing parallel computing methodologies (e.g. see Riggi et al. 2019 and references therein). Fewer efforts, however, has been spent on source classification, particularly for Galactic science targets, as almost all source finders do not provide any information (e.g. labels or tags) on the extracted source class identity. The implication for Galactic plane observations is that, after taking out source classifications made through automated cross-matching to previously classified objects (e.g. similar-to\sim5% in the Scorpio field in Riggi et al. 2021a), the vast majority of the catalogued sources are unclassified. Of these, more than 90% are typically single-island and single-component sources111By ”island” we denote a group of 4-connected pixels in the analysed map having brightness above a threshold (typically 2.5--3.0 σrmssubscript𝜎𝑟𝑚𝑠\sigma_{rms}italic_σ start_POSTSUBSCRIPT italic_r italic_m italic_s end_POSTSUBSCRIPT), located around a seed pixel with brightness above a detection threshold (typically 5 σrmssubscript𝜎𝑟𝑚𝑠\sigma_{rms}italic_σ start_POSTSUBSCRIPT italic_r italic_m italic_s end_POSTSUBSCRIPT). An island can include multiple source ”components”, each typically modelled with a 2D Gaussian distribution.. From the number of previously classified objects, it is reasonable to expect that the majority of unknown sources are extragalactic (radio galaxies, quasars), and Hii regions, with a smaller fraction222PNe and pulsars are, respectively, similar-to\sim60 and 30% less numerous than Hii regions according to existing catalogue counts (see Section 3.1 and catalogue references therein). of Planetary Nebulae (PNe) and pulsars, and an even smaller fraction (<<<10%) of radio stars of different types and evolution stage (e.g. including evolved massive stars like Luminous Blue Variables or Wolf-Rayet), or even completely new or unexpected classes of objects. Classification tools could therefore significantly increase the number of known sources in the Galaxy, or at least guide science groups in proposing follow-up multi-wavelength observations for selected source samples. Machine learning, in general, and specifically deep learning techniques, have proven to be very powerful for this kind of analysis. We summarize here the developments made on radio source classification in recent works.
Most of the contributions focused on galaxy morphology classification for extragalactic science cases. For example, Aniyan & Thorat (2017) employed convolutional neural networks (CNNs) for classification of Fanaroff–Riley (FR) type I and type II (Fanaroff & Riley, 1974), and bent-tailed radio galaxies, using images from the Very Large Array (VLA) FIRST333FIRST: Faint Images of the Radio Sky at Twenty-cm (Becker et al., 1995) survey. Similar analysis were conducted using CNNs (Lukic et al., 2018; Wu et al., 2019; Maslej-Krešňáková et al., 2021; Rustige et al., 2022) or capsule networks (Lukic et al., 2019) on FIRST444The Radio Galaxy Zoo (RGZ) DR1 dataset (Banfield et al., 2015) is similar-to\sim99% made up by FIRST survey images. and LOFAR (Low Frequency Array) images (Alegre et al., 2022). Sadeghi et al. (2021) studied morphological-based classification of FRI/FRII radio galaxies with Support Vector Machine (SVM) (Cortes & Vapnik, 1995) models, using computed Zernike moments of source images from the FIRST survey. Radio galaxy morphology was also studied using semi-supervised (Slijepcevic et al, 2022) and unsupervised learning methods, for example employing Kohonen maps (Polsterer, 2016; Gupta et al, 2022; Galvin et al., 2020) or K-means clustering algorithm applied to compressed features learnt by convolutional autoencoders and Self-Organising Maps (SOMs) (Ralph et al., 2019).
Various works used ML techniques to target Galactic science objectives, such as the identification of Galactic objects or selected object classes from the dominant background of extragalactic sources, or the discovery of anomalous/unexpected objects. Among them, Akras et al. (2019) employed decision trees for classifying PNe against mimics (Hii regions, stars, YSO) using near- and mid-infrared colours. Awang Iskandar et al. (2020) tested several deep network architectures to identify PNe from rejected PNe listed in the HASH555HASH: Hong Kong/AAO/Strasbourg H-alpha (Parker et al., 2016) and Pan-STARRS666Pan-STARRS: Panoramic Survey Telescope and Rapid Response System (Flewelling et al, 2010) databases, using infrared (WISE777WISE: Wide-Field Infrared Survey Explorer (Wright et al., 2010)) and optical (IPHAS888IPHAS: INT Photometric Hα𝛼\alphaitalic_α Survey of the Northern Galactic Plane (Drew et al., 2005), VPHAS+999VPHAS+: VST/OmegaCAM Photometric Hα𝛼\alphaitalic_α Survey (Drew et al., 2014), SHS101010SHS: SuperCOSMOS Halpha Survey (Parker et al., 2005)/SSS111111SSS: SuperCOSMOS Sky Survey (Hambly et al., 2001)) images. Anderson et al. (2012) considered mid- and far-infrared colours, providing diagnostic selection criteria for discriminating PNe and Hii regions. Morello et al. (2018) also considered near-infrared (2MASS121212IRAC: Infrared Array Camera (Fazio et al., 2004)) colours to identify new Wolf-Rayet star candidates from other stellar populations contaminants (Young Stellar Objects (YSOs), asymptotic giant branch (AGB) candidates, Be/M--S type stars), using variants of the k-nearest neighbours algorithm. None of these studies, however, included radio data in their analysis or had the radio domain as their primary target. In this context, various ML applications were instead primarily developed for classification of radio sources in the Galactic plane. Among them, Liu et al. (2019) used radio data from different surveys (MGPS131313MGPS: Molonglo Galactic Plane Survey (Murphy et al., 2007), MAGPIS141414MAGPIS: Multi-Array Galactic Plane Imaging Survey (Helfand et al., 2006), NVSS151515NVSS: NRAO VLA Sky Survey (Condon et al., 1998), CGPS161616CGPS: Canadian Galactic Plane Survey (Taylor et al., 2003)) to train a deep CNN to identify Supernova Remnants (SNRs) from non-SNRs (e.g. regions surrounding the SNRs in their analysis). Several studies (Bates et al., 2012; Lyon et al., 2016; Tan et al., 2018) employed machine learning methodologies to classify pulsars from non-pulsars or to filter pulsar candidates. We also recently provided some contributions in this field. In Riggi et al. (2023) we have applied the Mask R-CNN object detection framework to detect and classify compact point-source, extended radio galaxies, and imaging artefacts, making use of radio data from the FIRST, Scorpio ATCA171717ATCA: Australia Telescope Compact Array(Umana et al., 2015a) and ASKAP EMU pilot surveys. In Riggi et al. (2021a) we have trained a decision tree to identify Galactic-like sources from extragalactic ones on the basis of their radio-infrared colours. The classifier was also applied to a set of 284 unclassified sources selected in the ASKAP Scorpio survey field, highlighting potential Galactic objects for future studies. This analysis was however limited by the size and reliability of the dataset used for model training, mostly based on past low-resolution Galactic plane surveys.
In this work, we made significant steps further, building a much larger and curated dataset of different Galactic and extragalactic compact objects, including previous and newest radio data in the Galactic plane, combined with mid- and far-infrared data, and measuring the radio spectral index for a portion of them. Such a dataset will be used as a reference for performing classification studies with different machine learning methodologies in a series of planned papers. The scope of this first paper, besides presenting the dataset, is firstly to explore and select suitable parameters for source classification, from more traditional science-aware features (e.g. radio-infrared colours, spectral indices), to more abstract features automatically learnt in convolutional neural network architectures. Secondly, we would like to build and test a supervised learning model able to predict a classification label for an input set of unknown sources, from the considered set of class categories, providing also the relative membership score. As a final goal, we aim to deliver the trained model and the classification tool/service to end users, supporting SKA and precursor science projects planned in the Galactic plane (e.g. production of added-value catalogues from pipeline catalogue products, source selection for follow-up analysis, and so forth). In future papers we will focus on testing unsupervised techniques for cluster search and anomaly detection on the same dataset.
This paper is organized as follows. In Section 2 we describe the observational radio data and supplementary surveys used to create our compact source image dataset. The source classes considered for the analysis, the methodology followed to prepare the dataset, and summary dataset information, are reported in Section 3. In Section 4 we describe the techniques explored to extract a set of sensitive features for classification from the produced dataset. The results of our classification analysis are reported in Section 5. Details on the analysis pipeline and the implemented tool are provided in Section 6. In Section 7, we summarized our findings, and highlighted future steps and analysis that are planned to be done with the produced dataset.

Table 1: Centres of the ASKAP EMU pilot phase 2 images used in this work. Each image covers an area of similar-to\sim40 deg22{}^{2}start_FLOATSUPERSCRIPT 2 end_FLOATSUPERSCRIPT. Column (1) indicates the observation scheduling blocks.
SB Right ascension (J2000) Declination (J2000)
(hh:mm:ss.ss) (dd:mm:ss.ss)
28280 16:50:44.24 --41:47:22.93
32043 16:45:03.28 --46:28:26.90
32145 17:18:08.79 --41:52:51.98
32526 17:15:03.15 --46:28:53.52
33284 15:53:57.03 --55:41:52.77

2 Observational data

2.1 ASKAP radio surveys

We searched for sources of different classes in ASKAP pilot survey observations, carried out both far and towards the Galactic plane. Details are reported in the following sections.

2.1.1 ASKAP EMU pilot survey data

The ASKAP-EMU survey (Norris et al., 2011) started observations at the end of 2022. This work makes use of different radio continuum maps that were produced with the ASKAP telescope during the commissioning and science preparation activities for EMU:

  • 1.

    Early Science data: The Scorpio field was the only region observed in the Galactic plane by ASKAP at multiple radio frequencies during the Early Science and pilot 1 phase. First observations, done in 2018 with 15 antennas at 912 MHz, and covering 40similar-toabsent40\sim\!40∼ 40 square degrees centred on (l𝑙litalic_l,b𝑏bitalic_b)=(343.5{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT, 0.75{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT), were described in Umana et al. (2021) along with data reduction, while scientific results on compact sources were presented in Riggi et al. (2021a).
    As the array was nearly completed, new observations of the same region were carried out with 30 antennas in band 1 (900 MHz), 2 (1250 MHz), and 3 (1550 MHz), each with a 288 MHz bandwidth, thus providing a much better sensitivity and an almost full frequency coverage from 0.75 to 1.7 GHz when combining all observations. Observation configuration and data reduction were described in more detail in Ingallinera et al. (2022). Final data products181818These data still have a parametrized primary beam correction in the three bands, affecting flux density measurement by similar-to\sim10% (Riggi et al., 2021a), as precise measurements of the beam shape became available afterwards at pilot 1 phase (Norris et al., 2021). include a total intensity map at the reference frequency of 1243 MHz and 5 sub-band maps (reference frequencies: 871 MHz, 1015 MHz, 1356 MHz, 1480 MHz, 1615 MHz).
    The synthesized beam of the total intensity maps is 9.4"×\times×7.7" at a position angle of 84{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT. The background rms noise in regions far from the Galactic plane and point-sources was found of the order of 50 μ𝜇\muitalic_μJy/beam.

  • 2.

    Pilot data: When the array was completed, a pilot survey program was undertaken within EMU. In Phase 1, the survey covered similar-to\sim270 deg22{}^{2}start_FLOATSUPERSCRIPT 2 end_FLOATSUPERSCRIPT of the Dark Energy Survey area, reaching an angular resolution of 11"--18" and similar-to\sim30 μ𝜇\muitalic_μJy/beam noise rms at 944 MHz (Norris et al., 2021). Observations towards the Galactic plane were carried out in Phase 2. They consist of 5 tiles, each covering similar-to\sim40 deg22{}^{2}start_FLOATSUPERSCRIPT 2 end_FLOATSUPERSCRIPT. Their coordinate centers and corresponding observation scheduling blocks are reported in Table 1. The achieved angular resolution of the total intensity maps varies from 14" to 20", and the noise rms is of the order of 200 μ𝜇\muitalic_μJy/beam far from the Galactic plane and from regions of diffuse emission.

2.1.2 The Rapid ASKAP Continuum Survey

The RACS survey (McConnell et al., 2020) is the first large area survey carried out at 887.5 MHz with ASKAP. It reached an angular resolution of 15"--25", a rms sensitivity of 0.2--0.4 mJy/beam, and source positional accuracy better than 1", delivering a catalogue of 2,123,638 sources, 95% complete above 3 mJy (Hale et al., 2021)191919Image products are publicly available through the CSIRO ASKAP Science Data Archive (CASDA) at https://data.csiro.au/domain/casdaObservation.

2.2 Previous radio surveys

We also searched for sources in the following previous radio surveys carried out between 1.4 and 5 GHz. Some of them cover a large portion of the Galactic plane in the first quadrant. Details are reported below:

  • 1.

    The HI/OH/Recombination line survey of the Milky Way: THOR (Wang et al., 2018) is a Galactic plane survey (14.0{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT<l𝑙litalic_l<67.4{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT, |b|𝑏|b|| italic_b |<1.25{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT) carried out with the VLA in C-configuration at 1.42 GHz. Observations achieved an angular resolution of 10"--25" with a noise rms of 0.3--1.0 mJy/beam202020THOR data products are available at https://www2.mpia-hd.mpg.de/thor/DATA/www/.

  • 2.

    The Global view on Star formation in the Milky Way: GLOSTAR (Brunthaler et al., 2021; Medina et al., 2019) is a Galactic plane survey (28{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT<l𝑙litalic_l<36{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT, |b|𝑏|b|| italic_b |<1{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT) carried out with the VLA in B and D configuration at 4--8 GHz. The integrated map has a resolution of 18" and a sensitivity of similar-to\sim60--150 μ𝜇\muitalic_μJy/beam at the effective frequency of 5.8 GHz212121GLOSTAR data products are available at https://glostar.mpifr-bonn.mpg.de/glostar/image_server.

  • 3.

    Multi-Array Galactic Plane Imaging Survey: MAGPIS (Helfand et al., 2006) is a Galactic plane survey (5{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT<l𝑙litalic_l<48.5{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT, |b|𝑏|b|| italic_b |<0.8{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT) carried out with the VLA in B, C, and D configurations at 1.4 GHz. Observations achieved an angular resolution of 6" with a noise rms of 0.3 mJy/beam222222MAGPIS data products can be downloaded from the public cutout web interface https://third.ucllnl.org/cgi-bin/gpscutout.

  • 4.

    The Coordinated Radio and Infrared Survey for High-Mass Star Formation: CORNISH (Hoare et al., 2012; Purcell, 2013) is a Galactic plane survey (10<l<{}^{\circ}<l<start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT < italic_l <65{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT, |b|<𝑏absent|b|<| italic_b | <1.1{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT) carried out with the VLA in B and BnA configurations at 5 GHz. Observations achieved an angular resolution of 1.5" with a noise rms of 0.4 mJy/beam232323CORNISH data products can be retrieved from the public cutout web interface https://cornish.leeds.ac.uk/public/img_server.php.

  • 5.

    Faint Images of the Radio Sky at Twenty cm (FIRST) survey: The FIRST survey (Becker et al., 1995) is a large area (similar-to\sim10,500 deg22{}^{2}start_FLOATSUPERSCRIPT 2 end_FLOATSUPERSCRIPT, similar-to\sim80% covering the north Galactic cap) carried out at 1.4 GHz with the NRAO VLA. It reached an angular resolution of similar-to\sim5.4", a rms sensitivity of 0.15 mJy/beam, and source positional accuracy better than 1", delivering a catalogue of 946,432 sources in its latest version (Helfand, White, & Becker, 2015), 95% complete above 2 mJy242424FIRST data products are publicly available at ftp://archive.stsci.edu/pub/vla_first/data or can be retrieved from the cutout web service interface https://third.ucllnl.org/cgi-bin/firstcutout.

Table 2: Reference catalogues considered for searching radio stars in our dataset.
Reference Nentriessubscript𝑁𝑒𝑛𝑡𝑟𝑖𝑒𝑠N_{entries}italic_N start_POSTSUBSCRIPT italic_e italic_n italic_t italic_r italic_i italic_e italic_s end_POSTSUBSCRIPT Type
Wendker (1995) 1128 mostly M-type stars
Benaglia (2010) 65 O--B2 stars with wind radio
emission
Kimball et al. (2009)(a) 112 similar-to\sim75% late-type stars (K--M)
Rosslowe & Crowther (2015)(b) 667 Wolf-Rayet (WR) stars
Richardson & Mehner (2018) 88(c) Luminous Blue Variables
Wachter et al. (2010) 71 Spitzer massive stars with
circumstellar shells, including
early and late-type stars,
16 LBVs candidates and 6 WRs
Leto et al. (2021) 50 magnetic chemically
Shultz et al (2022) peculiar (MCP) radio stars
Liu et al. (2007) 187 Low-mass X-ray binaries
in the Galaxy, LMC, and SMC,
4th ed.(d)
Liu et al. (2006) 114 High-mass X-ray binaries
in the Galaxy, 4th ed. (e)

2.3 Supplementary infrared data

In this study, we complemented the radio observations with mid- and far-infrared data from the following surveys:

  • 1.

    AllWISE (Cutri et al., 2013): This WISE survey is fully covering the Galactic plane in four bands at 3.4  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m (W1), 4.6  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m (W2), 12  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m (W3), and 22  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m (W4). The angular resolutions are 6.1", 6.4", 6.5" and 12" and the 5σ𝜎\sigmaitalic_σ flux sensitivities for point sources are 0.08 mJy, 0.11 mJy, 1 mJy and 6 mJy, respectively.

  • 2.

    GLIMPSE (Galactic Legacy Infrared MidPlane Survey Extraordinaire) 8.0  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m surveys (Churchwell et al., 2009) of the Spitzer Space Telescope (Werner et al., 2004): GLIMPSE (I, II) fully covers this Galactic coordinate range: 0{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT<l𝑙litalic_l<65{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT, 295{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT<l𝑙litalic_l<360{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT, |b|𝑏absent|b|\leq| italic_b | ≤1252525The exact sky coverage of all GLIPSE surveys (including GLIMPSE-3D) is summarized at https://irsa.ipac.caltech.edu/data/SPITZER/GLIMPSE/overview.html.. The angular resolution is 2" and the 5σ𝜎\sigmaitalic_σ flux sensitivity similar-to\sim0.4 mJy.

  • 3.

    Hi-GAL (Herschel infrared Galactic plane Survey) 70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m survey (Molinari et al., 2016) of the Herschel Space Observatory (Pilbratt et al., 2010): The survey covers |l|𝑙absent|l|\leq| italic_l | ≤60{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT, |b|𝑏absent|b|\leq| italic_b | ≤1{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT, with an angular resolution of similar-to\sim8.5" and a 1σ𝜎\sigmaitalic_σ flux sensitivity similar-to\sim20 MJy/sr.

Table 3: Summary information on the compact source data extracted from previous radio surveys (FIRST, THOR, GLOSTAR, MAGPIS, CORNISH). Columns (3) and (4) are the average radio source angular size and its standard deviation in arcsec. Column (5) lists the number of sources from previous radio surveys for each considered class or sub-class in columns (1) and (2) with available Mid-Infrared data (3.4  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 4.6  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 12  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 22  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m) from AllWISE survey. Column (6) lists the number of radio sources with available Mid-Infrared data (3.4  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 4.6  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 8  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 12  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 22  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m) from AllWISE and GLIMPSE surveys, and Far-Infrared data (70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m) from Hi-GAL survey. Column (7) reports the number of radio sources with average spectral index information available (see text). Columns (8) and (9) reports how many sources listed in columns (5) and (6), respectively, also have a measured spectral index.
class subclass rdelimited-⟨⟩r\langle\textsc{r}\rangle⟨ r ⟩ (") σrsubscript𝜎r\sigma_{\textsc{r}}italic_σ start_POSTSUBSCRIPT r end_POSTSUBSCRIPT (") nmirsubscript𝑛mirn_{\textsc{mir}}italic_n start_POSTSUBSCRIPT mir end_POSTSUBSCRIPT nmir+firsubscript𝑛mirfirn_{\textsc{mir}+\textsc{fir}}italic_n start_POSTSUBSCRIPT mir + fir end_POSTSUBSCRIPT nαsubscript𝑛𝛼n_{\alpha}italic_n start_POSTSUBSCRIPT italic_α end_POSTSUBSCRIPT nα+mirsubscript𝑛𝛼mirn_{\alpha+\textsc{mir}}italic_n start_POSTSUBSCRIPT italic_α + mir end_POSTSUBSCRIPT nα+mir+firsubscript𝑛𝛼mirfirn_{\alpha+\textsc{mir}+\textsc{fir}}italic_n start_POSTSUBSCRIPT italic_α + mir + fir end_POSTSUBSCRIPT
(1) (2) (3) (4) (5) (6) (7) (8) (9)
Hii 22.1 12.6 2295 2257 1214 1178 1168
pn 18.2 7.1 1411 1214 783 782 718
pulsar 10.8 2.9 645 534 221 221 57
yso 10.6 4.1 552 501 215 204 71
star wr 10.1 3.9 51 47 7 7 6
lbv 20.7 11.3 25 24 11 10 10
star xb 12.5 4.9 14 9 7 7 6
other 9.7 6.1 348 217 89 88 74
all 438 297 114 112 96
rg 15.2 4.4 6920 264 1285 1285 134
qso 18.6 4.8 5136 0 1045 1045 0
all 17397 5013 4877 4827 2244
Table 4: Summary information on the compact source data extracted from different ASKAP radio surveys (RACS, EMU pilot). See Table�3 caption for column description.
class subclass rdelimited-⟨⟩r\langle\textsc{r}\rangle⟨ r ⟩ (") σrsubscript𝜎r\sigma_{\textsc{r}}italic_σ start_POSTSUBSCRIPT r end_POSTSUBSCRIPT (") nmirsubscript𝑛mirn_{\textsc{mir}}italic_n start_POSTSUBSCRIPT mir end_POSTSUBSCRIPT nmir+firsubscript𝑛mirfirn_{\textsc{mir}+\textsc{fir}}italic_n start_POSTSUBSCRIPT mir + fir end_POSTSUBSCRIPT nαsubscript𝑛𝛼n_{\alpha}italic_n start_POSTSUBSCRIPT italic_α end_POSTSUBSCRIPT nα+mirsubscript𝑛𝛼mirn_{\alpha+\textsc{mir}}italic_n start_POSTSUBSCRIPT italic_α + mir end_POSTSUBSCRIPT nα+mir+firsubscript𝑛𝛼mirfirn_{\alpha+\textsc{mir}+\textsc{fir}}italic_n start_POSTSUBSCRIPT italic_α + mir + fir end_POSTSUBSCRIPT
(1) (2) (3) (4) (5) (6) (7) (8) (9)
Hii 26.8 13.9 369 364 57 56 55
pn 27.0 8.7 174 45 33 33 10
pulsar 18.5 6.6 166 72 11 11 9
yso 12.4 4.1 37 30 17 16 15
star wr 14.5 3.9 11 7 1 1 1
lbv 23.1 10.4 3 3 0 0 0
star xb 12.5 1.7 2 0 0 0 0
other 14.2 3.6 7 1 2 2 0
all 23 11 3 3 1
rg 22.3 6.0 1892 0 0 0 0
qso 26.1 7.7 1343 0 0 0 0
all 4004 522 121 119 90

3 Generating training/testing datasets

Refer to caption
(a) 3.4   µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m
Refer to caption
(b) 4.6   µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m
Refer to caption
(c) 8   µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m
Refer to caption
(d) 12   µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m
Refer to caption
(e) 22   µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m
Refer to caption
(f) 70   µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m
Refer to caption
(g) radio
Figure 1: Template source (G324.161+00.264, Hii region) from the dataset, observed in 7-bands (3.4  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 4.6  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 8  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 12  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 22  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, and ASKAP radio 944 MHz), shown in left to right panels, respectively.

3.1 Compact source sample

To build our dataset, we searched for compact sources in the available radio data (Section 2), using the following selection criteria:

  1. 1.

    Isolated single-island point-sources or slightly resolved sources. We assumed an upper threshold of 10×\times× synthesized beam size262626The average size of sources (e.g. Hii regions or PNe) in the dataset is similar-to\sim2.5×\times× synthesized beam size.;

  2. 2.

    No diffuse, extended or complex radio morphologies, e.g. no child point-sources or inner filaments inside source contour;

  3. 3.

    Source position cross-matching to known or candidate objects in reference catalogues, within a match radius equal to the synthesized beam size;

  4. 4.

    Source island clearly distinguishable from the background, e.g. peak flux larger than 3σ𝜎\sigmaitalic_σ and number of island pixels larger than 6;

  5. 5.

    Source island not located at radio map borders.

We considered possible associations to these classes of astrophysical objects (Galactic or Extragalactic), having a compact radio morphology (as defined above), in most of the cases (e.g. pulsars or radio stars), or in a considerable fraction of cases (e.g. Hii regions, PNe) compared to more extended morphologies:

  • 1.

    Radio stars: We included in this class stars of different spectral types and evolution stages, including late stages, like Wolf-Rayet (WR) stars or Luminous Blue Variables (LBVs), and X-ray binaries (hereafter abbreviated as XBs for brevity). The sensitivity of existing telescopes has been the major limitation in radio star searches, as the emission is rather faint, often below the mJy level. Furthermore, a limited angular resolution, e.g. above 1" (Helfand et al., 1999), makes cross-matching with densely populated optical catalogues ineffective. In fact, the number of reported radio stars is rather low, and no comprehensive catalogue, including all possible stellar types, is currently available. To build a sufficiently large dataset, we considered different reference catalogues of known and candidate radio stars, to be cross-matched with available radio data. References are reported in Table 2.

  • 2.

    Hii regions: We have used the WISE Catalogue of Galactic Hii regions (Anderson et al., 2014; Makai, 2017), as a reference for searching Hii region associations in our radio data. The catalogue is actively updated online272727http://astro.phys.wvu.edu/wise/. The version used for this work (v2.2) contains 8412 entries, similar-to\sim10% of them with measured radio flux information reported at 20--21 cm.

  • 3.

    Planetary Nebulae (PNe): We have used the Hong Kong/AAO/Strasbourg H-alpha (HASH) Planetary Nebula Database (Parker et al., 2016), representing the largest compilation to date, as a reference for searching PNe in our radio data. The HASH catalogue is actively updated online282828http://202.189.117.101:8999/gpne/dbMainPage.php. The version used for this work contains 5591 entries, similar-to\sim24% of them with measured radio flux density reported at 20 cm or 36 cm.

  • 4.

    Young Stellar Objects (YSOs): We carried out a search for possible associations to confirmed YSOs in our radio data, using the SIMBAD database 292929https://simbad.unistra.fr/simbad/ as a reference. No distinction is made among possible evolution or mass classes of YSOs. In the search, we discarded all matches found to compact radio sources, previously labelled as Hii regions and PNe.

  • 5.

    Pulsars: We have searched for pulsar matches in our radio data, using the ATNF Pulsar Catalogue303030https://www.atnf.csiro.au/research/pulsar/psrcat/ (Manchester et al., 2005) as a reference. The version used (version 1.63) for this work contains 2800 entries, 67% of them with measured radio flux density reported at 21 cm.

  • 6.

    Active Galactic Nuclei: For our analysis, we considered a catalogue of radio galaxies and quasars obtained by Kimball and Ivezić (2008) through cross-matching of different radio surveys (FIRST, primarily) with optical data from the Sloan Digital Sky Survey (SDSS) (York et al., 2000), providing source spectroscopic classification ("GALAXY", "QSO") (see Bolton et al. 2012 for details). After applying the criteria given by Kimball and Ivezić (2008) to select compact and unresolved sources, we selected 7967 radio galaxies (RG), and 5994 QSOs. By visual inspection, we removed residual extended sources passing the selection cuts, and sources found with incorrect/unclear position reported in the catalogue, as compared to FIRST images. The final selected sample includes: 6646 radio galaxies, and 5213 QSOs.

A brief description of physical properties for each of these source classes is reported in Appendix A. As we expect these types to be the most abundant classes of compact sources found in Galactic plane observations, we did not consider other rarer classes. Actually, star-forming galaxies (SFG) are expected to become dominant over AGNs at sub-mJy flux levels (<100 μ𝜇\muitalic_μJy) (Mancuso et al., 2017) but their counts should be very small in FIRST/ASKAP-RACS surveys, given their sensitivities. This is, however, not the case for future ASKAP-EMU observations, so future studies should aim to incorporate SFGs in our dataset, once reference labelled catalogues become available within EMU.
Sources detected in our considered radio maps are reported in Tables 3 and 4. The resulting dataset is not expected to be completely free of spurious associations, due to the cross-matching procedure and to possible object misclassifications affecting the reference catalogues. Indeed, one of the goal of this and future studies is to make these unlikely classifications discoverable by means of both supervised and unsupervised techniques. The uncertainty associated with the automated cross-matching procedure was evaluated on the ASKAP data by comparing the observed Hii regions matches (i.e. the most densely populated reference catalogue) against the expected number of matches purely arising by chance. Following Riggi et al. (2021a); Mauch & Sadler (2007); Ching et al. (2017), the latter was estimated by averaging the number of matches found with multiple random catalogues in which the measured source positions were uniformly randomized inside the radio map. We found that less than 3% of the selected matches are spurious. For each class, the obtained matches were all validated by visual inspection to reduce the number of spurious associations.

3.2 Image dataset preparation

Using the scutout tool313131https://github.com/SKA-INAF/scutout, we extracted postage-stamp images around each compact source detected in reference radio maps listed in Section 2. Additionally, source cutouts were extracted from the supplementary infrared survey maps described in Section 2.3. Cutout raw size was set to 10×\times× the source radius rssubscript𝑟𝑠r_{s}italic_r start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT323232rssubscript𝑟𝑠r_{s}italic_r start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT was computed as the radius of the circle circumscribed to the source bounding box obtained from source segmentation mask.. The image cutout set for each source, including the radio plus a configurable number of infrared bands (3.4  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 4.6  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 8  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 12  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 22  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m), were all re-processed (e.g. re-gridding/re-projection, re-scaling, cropping) to bring them to the same pixel size, sky coordinate system, resolution, flux density units (Jy/pixel), and final image size (2.5×rsabsentsubscript𝑟𝑠\times r_{s}× italic_r start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT). Final images have a different size in pixels, depending on the source size radius rssubscript𝑟𝑠r_{s}italic_r start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT. In the analysis reported in Section 5.2, all source images will be resized to a common size in pixels.
As the 8  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m and far-infrared surveys only cover the Galactic plane, in contrast to the full WISE sky coverage, we considered two possible radio-infrared combinations when making the image cutouts, denoted throughout the paper as follows:

  • 1.

    5-bands (or radio+MIR333333Mid-Infrared) dataset: comprising radio, 3.4  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 4.6  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 12  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, and 22  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m images;

  • 2.

    7-bands (or radio+MIR+FIR343434Far-Infrared) dataset: comprising radio, 3.4  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 4.6  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 8  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 12  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 22  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, and 70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m images.

In Fig. 1 we report the image data for a sample source (G324.161+00.264 Hii region) detected in 7 different channels. Infrared (3.4  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 4.6  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 8  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 12  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 22  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m) and radio (ASKAP) data are shown in left to right panels, respectively.
The number of available images finally selected in previous radio surveys (FIRST, THOR, GLOSTAR, MAGPIS, CORNISH) and in ASKAP surveys is reported for each source class in Tables 3 and 4, respectively. Columns (5) and (6) reports the number of sources detected in radio, for which MIR and FIR images are available353535Availability of MIR or FIR images does not imply that the source is actually detected in that infrared bands.. Overall, similar-to\sim17400 radio sources are available in the first dataset with MIR (5-bands) information, similar-to\sim30% of them with also FIR information (7-bands). Extragalactic sources are almost completely missing in our 7-bands dataset, due to the limited coverage of far-infrared surveys. A major consequence is that, unfortunately, galactic-extragalactic source separation studies can be carried out only with the 5-bands dataset. On the other hand, this is, to the best of our knowledge, the largest radio data compilation simultaneously including different classes of Galactic and extragalactic compact objects, suitable for machine-learning and other algorithmic studies.

Table 5: Summary of extracted color features used for classification analysis. See Section 4.1 for details.
Dataset Feature ID Description
5-band F11{}_{1}start_FLOATSUBSCRIPT 1 end_FLOATSUBSCRIPT, …, F44{}_{4}start_FLOATSUBSCRIPT 4 end_FLOATSUBSCRIPT cradio,jradio,j{}_{\text{radio,j}}start_FLOATSUBSCRIPT radio,j end_FLOATSUBSCRIPT (j𝑗jitalic_j=12, 22, 3.4, 4.6  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m)
(radio+MIR) F55{}_{5}start_FLOATSUBSCRIPT 5 end_FLOATSUBSCRIPT, …, F88{}_{8}start_FLOATSUBSCRIPT 8 end_FLOATSUBSCRIPT IoUradio,jradio,j{}_{\text{radio,j}}start_FLOATSUBSCRIPT radio,j end_FLOATSUBSCRIPT (j𝑗jitalic_j=12, 22, 3.4, 4.6  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m)
F99{}_{9}start_FLOATSUBSCRIPT 9 end_FLOATSUBSCRIPT, …, F1212{}_{12}start_FLOATSUBSCRIPT 12 end_FLOATSUBSCRIPT SSIMradio,jradio,j{}_{\text{radio,j}}start_FLOATSUBSCRIPT radio,j end_FLOATSUBSCRIPT (j𝑗jitalic_j=12, 22, 3.4, 4.6  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m)
7-band F11{}_{1}start_FLOATSUBSCRIPT 1 end_FLOATSUBSCRIPT, …, F66{}_{6}start_FLOATSUBSCRIPT 6 end_FLOATSUBSCRIPT cradio,jradio,j{}_{\text{radio,j}}start_FLOATSUBSCRIPT radio,j end_FLOATSUBSCRIPT (j𝑗jitalic_j=12, 22, 3.4, 4.6, 8, 70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m)
(radio+MIR+FIR) F77{}_{7}start_FLOATSUBSCRIPT 7 end_FLOATSUBSCRIPT, …, F1212{}_{12}start_FLOATSUBSCRIPT 12 end_FLOATSUBSCRIPT IoUradio,jradio,j{}_{\text{radio,j}}start_FLOATSUBSCRIPT radio,j end_FLOATSUBSCRIPT (j𝑗jitalic_j=12, 22, 3.4, 4.6, 8, 70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m)
F1313{}_{13}start_FLOATSUBSCRIPT 13 end_FLOATSUBSCRIPT, …, F1818{}_{18}start_FLOATSUBSCRIPT 18 end_FLOATSUBSCRIPT SSIMradio,jradio,j{}_{\text{radio,j}}start_FLOATSUBSCRIPT radio,j end_FLOATSUBSCRIPT (j𝑗jitalic_j=12, 22, 3.4, 4.6, 8, 70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m)
Table 6: Percentage of radio sources potentially detected (e.g. IoU>0) in each infrared band.
Class 3.4  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m 4.6  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m 8  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m 12  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m 22  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m 70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m
(%) (%) (%) (%) (%) (%)
Hii 37.6 49.2 63.0 58.1 64.5 67.7
PN 17.0 21.5 28.5 45.1 55.3 47.5
PULSAR 4.8 4.2 1.7 0.7 0.4 0.5
YSO 19.5 26.8 36.4 30.4 32.9 38.9
STAR 35.6 34.9 26.3 18.2 19.1 14.0
RG 86.5 83.6 7.6 37.1 21.1 0.0
QSO 69.9 75.6 -- 60.3 36.4 --
ALL 64.2 66.6 41.4 45.4 33.2 47.0

4 Feature extraction and data exploration

In this section, we describe the methods used to process our dataset and extract parameters suitable for data inspection and source classification.

Refer to caption
Refer to caption
Figure 2: Scatter plots of representative infrared/radio colour indices computed over the entire dataset for images with detected sources in both the radio and infrared channels (IoUs>0). Radio flux densities are obtained at different frequencies ranging from 0.912 GHz (ASKAP Early Science survey data) to 5.8 GHz (GLOSTAR). See Section 2 for details on survey frequencies.

4.1 Infrared-radio color parameters

Colour indices ci,jsubscript𝑐𝑖𝑗c_{i,j}italic_c start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT are defined as the magnitude difference between measured fluxes Fisubscript𝐹𝑖F_{i}italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and Fjsubscript𝐹𝑗F_{j}italic_F start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT in band i𝑖iitalic_i and j𝑗jitalic_j where λjsubscript𝜆𝑗\lambda_{j}italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT>λisubscript𝜆𝑖\lambda_{i}italic_λ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT (Nikutta et al., 2014), e.g. ci,j=log10(Fj/Fi)subscript𝑐𝑖𝑗subscript10subscript𝐹𝑗subscript𝐹𝑖c_{i,j}=\log_{10}(F_{j}/F_{i})italic_c start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT = roman_log start_POSTSUBSCRIPT 10 end_POSTSUBSCRIPT ( italic_F start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT / italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ). We considered these radio-infrared colour indices (cradio,3.4µmsubscript𝑐𝑟𝑎𝑑𝑖𝑜3.4micrometerc_{radio,3.4\leavevmode\nobreak\ $\mathrm{\SIUnitSymbolMicro m}$}italic_c start_POSTSUBSCRIPT italic_r italic_a italic_d italic_i italic_o , 3.4 start_ID roman_µ roman_m end_ID end_POSTSUBSCRIPT, cradio,4.6µmsubscript𝑐𝑟𝑎𝑑𝑖𝑜4.6micrometerc_{radio,4.6\leavevmode\nobreak\ $\mathrm{\SIUnitSymbolMicro m}$}italic_c start_POSTSUBSCRIPT italic_r italic_a italic_d italic_i italic_o , 4.6 start_ID roman_µ roman_m end_ID end_POSTSUBSCRIPT, cradio,8µmsubscript𝑐𝑟𝑎𝑑𝑖𝑜8micrometerc_{radio,8\leavevmode\nobreak\ $\mathrm{\SIUnitSymbolMicro m}$}italic_c start_POSTSUBSCRIPT italic_r italic_a italic_d italic_i italic_o , 8 start_ID roman_µ roman_m end_ID end_POSTSUBSCRIPT, cradio,12µmsubscript𝑐𝑟𝑎𝑑𝑖𝑜12micrometerc_{radio,12\leavevmode\nobreak\ $\mathrm{\SIUnitSymbolMicro m}$}italic_c start_POSTSUBSCRIPT italic_r italic_a italic_d italic_i italic_o , 12 start_ID roman_µ roman_m end_ID end_POSTSUBSCRIPT, cradio,22µmsubscript𝑐𝑟𝑎𝑑𝑖𝑜22micrometerc_{radio,22\leavevmode\nobreak\ $\mathrm{\SIUnitSymbolMicro m}$}italic_c start_POSTSUBSCRIPT italic_r italic_a italic_d italic_i italic_o , 22 start_ID roman_µ roman_m end_ID end_POSTSUBSCRIPT, cradio,70µmsubscript𝑐𝑟𝑎𝑑𝑖𝑜70micrometerc_{radio,70\leavevmode\nobreak\ $\mathrm{\SIUnitSymbolMicro m}$}italic_c start_POSTSUBSCRIPT italic_r italic_a italic_d italic_i italic_o , 70 start_ID roman_µ roman_m end_ID end_POSTSUBSCRIPT), in which source fluxes F𝐹Fitalic_F were computed for each band as follows:

  • 1.

    Compute background level B𝐵Bitalic_B and noise rms σrmssubscript𝜎𝑟𝑚𝑠\sigma_{rms}italic_σ start_POSTSUBSCRIPT italic_r italic_m italic_s end_POSTSUBSCRIPT from median and standard deviation of 3σ𝜎\sigmaitalic_σ-clipped pixel flux distribution;

  • 2.

    Find local maxima (or peaks) in image;

  • 3.

    Extract sources with a flood-fill algorithm, assuming a 5σ𝜎\sigmaitalic_σ and 2.5σ𝜎\sigmaitalic_σ seed and merge detection thresholds, respectively, with respect to previously computed background. Further, require at least one peak detected inside extracted source aperture;

  • 4.

    Compute flux information by standard aperture photometry, i.e. F=iNFiN×B𝐹superscriptsubscript𝑖𝑁subscript𝐹𝑖𝑁𝐵F=\sum_{i}^{N}F_{i}-N\times Bitalic_F = ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - italic_N × italic_B, where Fisubscript𝐹𝑖F_{i}italic_F start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and N𝑁Nitalic_N are the flux of i𝑖iitalic_i-th pixel and number of pixels in source aperture, respectively.

Besides colour indices, we also computed these additional parameters for radio-infrared band combinations (radio,j𝑗jitalic_j with j𝑗jitalic_j=[3.4  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 4.6  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 8  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 12  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 22  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m]) to quantify the likelihood of source cross-match association:

  • 1.

    𝐼𝑜𝑈radio,jsubscript𝐼𝑜𝑈radio𝑗\text{{IoU}}_{\text{radio},j}IoU start_POSTSUBSCRIPT radio , italic_j end_POSTSUBSCRIPT: Intersection-Over-Union (IoU) between source islands detected in radio and infrared band j𝑗jitalic_j. IoU is computed as:

    IoU=noverlapnunionIoUsubscript𝑛𝑜𝑣𝑒𝑟𝑙𝑎𝑝subscript𝑛𝑢𝑛𝑖𝑜𝑛\text{IoU}=\frac{n_{overlap}}{n_{union}}IoU = divide start_ARG italic_n start_POSTSUBSCRIPT italic_o italic_v italic_e italic_r italic_l italic_a italic_p end_POSTSUBSCRIPT end_ARG start_ARG italic_n start_POSTSUBSCRIPT italic_u italic_n italic_i italic_o italic_n end_POSTSUBSCRIPT end_ARG

    where noverlapsubscript𝑛𝑜𝑣𝑒𝑟𝑙𝑎𝑝n_{overlap}italic_n start_POSTSUBSCRIPT italic_o italic_v italic_e italic_r italic_l italic_a italic_p end_POSTSUBSCRIPT is the number of pixels that overlap in radio and infrared islands, while nunionsubscript𝑛𝑢𝑛𝑖𝑜𝑛n_{union}italic_n start_POSTSUBSCRIPT italic_u italic_n italic_i italic_o italic_n end_POSTSUBSCRIPT is the number of pixels of island union. IoU is set to 0 if no source is detected in band j𝑗jitalic_j;

  • 2.

    𝑆𝑆𝐼𝑀radio,jsubscript𝑆𝑆𝐼𝑀radio𝑗\text{{SSIM}}_{\text{radio},j}SSIM start_POSTSUBSCRIPT radio , italic_j end_POSTSUBSCRIPT: Average Structural Similarity Index (SSIM, Wang et al. 2004) computed between source image in radio and infrared band j𝑗jitalic_j. SSIM metric is computed on various image windows and measures the perceptual difference between two images. For two windows x𝑥xitalic_x and y𝑦yitalic_y of size K×K𝐾𝐾K\times Kitalic_K × italic_K, SSIM is computed as363636The SSIM implementation of the scikit-image library (Van der Walt, 2014) was used.:

    SSIMx,y=(2μxμy+c1)(2σxy+c2)(μx2+μy2+c1)(σx2+σy2+c2)subscriptSSIM𝑥𝑦2subscript𝜇𝑥subscript𝜇𝑦subscript𝑐12subscript𝜎𝑥𝑦subscript𝑐2subscriptsuperscript𝜇2𝑥subscriptsuperscript𝜇2𝑦subscript𝑐1subscriptsuperscript𝜎2𝑥subscriptsuperscript𝜎2𝑦subscript𝑐2\text{SSIM}_{x,y}=\frac{(2\mu_{x}\mu_{y}+c_{1})(2\sigma_{xy}+c_{2})}{(\mu^{2}_% {x}+\mu^{2}_{y}+c_{1})(\sigma^{2}_{x}+\sigma^{2}_{y}+c_{2})}SSIM start_POSTSUBSCRIPT italic_x , italic_y end_POSTSUBSCRIPT = divide start_ARG ( 2 italic_μ start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT italic_μ start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT + italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ( 2 italic_σ start_POSTSUBSCRIPT italic_x italic_y end_POSTSUBSCRIPT + italic_c start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) end_ARG start_ARG ( italic_μ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT + italic_μ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT + italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ( italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT + italic_σ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT + italic_c start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) end_ARG (1)

    where μxsubscript𝜇𝑥\mu_{x}italic_μ start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT/σxsubscript𝜎𝑥\sigma_{x}italic_σ start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT, μysubscript𝜇𝑦\mu_{y}italic_μ start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT/σysubscript𝜎𝑦\sigma_{y}italic_σ start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT are the pixel sample mean/variance of x𝑥xitalic_x and y𝑦yitalic_y, respectively, and σxysubscript𝜎𝑥𝑦\sigma_{xy}italic_σ start_POSTSUBSCRIPT italic_x italic_y end_POSTSUBSCRIPT is their covariance. c1subscript𝑐1c_{1}italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and c2subscript𝑐2c_{2}italic_c start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT are constant values used to stabilize the ratio. SSIM index close to 1 indicates high similarity, while negative or close to zero indices denote a high discrepancy.

Overall, 12 (18) parameters are selected for classification analysis with the 5-band (7-band) dataset (see feature summary Table 5). In Fig. B.1 and B.5 we explored the degree of correlation among the extracted features, reporting the Pearson correlation coefficient r𝑟ritalic_r for each class in both the 5-band and 7-band datasets, respectively. In general, we observe a moderate correlation trend (r𝑟ritalic_r=0.5-0.7) for many variables in all classes. The strongest correlation (r𝑟ritalic_r>0.8) is found between radio-3.4  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m and radio-4.6  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m colors, but also among SSIM and IoU parameters computed for these infrared bands. For galactic classes, the correlation becomes more important also among 12  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m and 22  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m parameters. Given the computed 2-tailed p-values, we conclude that these correlations are significant at the 1% confidence level.
In Table 6 we report the fraction of sources detected in each infrared band (according to the above criteria) having a minimum overlap (IoU>0) with the radio source. These counts include possible spurious detections. On the other hand, missed counts may include IR sources failing to pass the applied detection criteria. In Fig. 2 we report scatter plots of (cradio,3.4µmsubscript𝑐radio3.4micrometerc_{\text{radio},3.4\leavevmode\nobreak\ $\mathrm{\SIUnitSymbolMicro m}$}italic_c start_POSTSUBSCRIPT radio , 3.4 start_ID roman_µ roman_m end_ID end_POSTSUBSCRIPT, cradio,22µmsubscript𝑐radio22micrometerc_{\text{radio},22\leavevmode\nobreak\ $\mathrm{\SIUnitSymbolMicro m}$}italic_c start_POSTSUBSCRIPT radio , 22 start_ID roman_µ roman_m end_ID end_POSTSUBSCRIPT), (cradio,8µmsubscript𝑐radio8micrometerc_{\text{radio},8\leavevmode\nobreak\ $\mathrm{\SIUnitSymbolMicro m}$}italic_c start_POSTSUBSCRIPT radio , 8 start_ID roman_µ roman_m end_ID end_POSTSUBSCRIPT, cradio,70µmsubscript𝑐radio70micrometerc_{\text{radio},70\leavevmode\nobreak\ $\mathrm{\SIUnitSymbolMicro m}$}italic_c start_POSTSUBSCRIPT radio , 70 start_ID roman_µ roman_m end_ID end_POSTSUBSCRIPT) colour indices obtained for sources simultaneously detected (IoU>0) in both bands over the entire dataset. As can be seen, extragalactic objects tend to cluster on the bottom left region of near- and mid-infrared colour space. Unfortunately, no data for extragalactic sources are available at 8  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m and 70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m in our dataset, where a larger separation is found among classes of Galactic sources, compared to other colour parameters.

Refer to caption
Refer to caption
Figure 3: Radio spectral indices measured for different source classes with the T-T plot method. Spectral indices for RG and QSO sources were computed using RACS-FIRST radio frequencies (887.5--1400 MHz). Indices for the remaining Galactic classes were computed from survey selected sub-bands (when available), i.e. 871--1480 MHz (ASKAP Scorpio), 1060--1440 MHz (THOR), 4240--4670 MHz (GLOSTAR).

4.2 Radio spectral indices

We computed the radio spectral index α𝛼\alphaitalic_α (Fναproportional-to𝐹superscript𝜈𝛼F\propto\nu^{\alpha}italic_F ∝ italic_ν start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT) of sources in our dataset using the T-T plot method (Turtle et al, 1962), e.g. taking the slope of a linear regression of pixel flux densities for source images at two different radio frequencies. This method enables a measurement of the spectral index that is less dependent on the zero level of each image, under the hypothesis of background isotropy and constant α𝛼\alphaitalic_α. These conditions are holding since we are considering compact sources and regions of size comparable with the synthesised beam of the instrument.
A subset of our survey data (THOR, ASKAP pilot, GLOSTAR) provide sub-band data that can be used for T-T spectral fit. For VLA FIRST data, instead, we resorted to use data from the ASKAP RACS survey to obtain an estimate of the radio spectral index. It is worth to note that such two-point spectral index estimate is not accurate for sources having a curved spectrum, not well described by a power-law model. Indeed, some classes of sources, such as PN (Hajduk et al., 2018) or UC Hii regions (Yang et al., 2021), could present a turnover frequency in the frequency range (0.8-5 GHz). The frequency coverage of our in-band survey data is, however, rather limited (e.g. 0.87-1.6 GHz for ASKAP) to expect a reliable measurement of any spectral turnovers. Nevertheless, we inspected the ASKAP dataset to search for possible departures from the power-law assumption, by fitting ASKAP source SEDs with different curved spectrum models (e.g. free-free, synchrotron with free-free absorption, see Tingay & de Kool 2003). We found only 5 sources (out of 190 sources with flux measurement available in all five ASKAP sub-bands) that can be fitted (χ~2superscript~𝜒2\tilde{\chi}^{2}over~ start_ARG italic_χ end_ARG start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT<5) with a curved model.
In Fig. 3 we report the obtained spectral indices for different source classes in our dataset. To select more reliable measurements, we selected sources for which the spectral regression correlation coefficient was larger than 0.9. The number of sources per class with measured spectral index (and infrared data) have been reported in Tables 3 and 4 (columns 7-9). The obtained values follow expectations (e.g. see A) or previous measurements for some source classes. For example, pulsars have the steepest radio spectrum, while Hii regions and PNe have predominantly flatter radio spectra (αsimilar-to𝛼absent\alpha\simitalic_α ∼0), with a significantly smaller fraction peaking around α𝛼\alphaitalic_α=1. The observed spectral indices of radio galaxies and quasars peak around --0.9, in general agreement with the --0.95 value reported by Randall et al. (2012) (Fig. 8) in the frequency range 0.843-2.3 GHz, but slightly steeper than conventional value αdelimited-⟨⟩𝛼\langle\alpha\rangle⟨ italic_α ⟩=--0.7 (Condon et al., 2002; Best et al., 2005) or measured averages reported at different frequency ranges, e.g. αdelimited-⟨⟩𝛼\langle\alpha\rangle⟨ italic_α ⟩=--0.79 (0.147--1.4 GHz) (de Gasperin et al., 2018) or αdelimited-⟨⟩𝛼\langle\alpha\rangle⟨ italic_α ⟩=--0.71 (1.4--3.0 GHz) (Gordon et al., 2021). This comparison is only indicative as the measured average spectral indices are known to steepen (from --0.7 to --1) with increasing flux densities, and vary with other parameters such as the size of the source or the flux density threshold (e.g. see de Gasperin et al. 2018 and references therein).
Considering the large synthesized beams, it is worth to note that for some radio star types (e.g. LBVs) we could be actually measuring a composite spectral index of the point-source (typically αsimilar-to𝛼absent\alpha\simitalic_α ∼0.6) and the surrounding nebula (which could be α𝛼\alphaitalic_α<0). This may represent a potential source of misclassification of radio stars when incorporating the spectral index information in the classification analysis (Section 5.1.4). We also note the absence of radio stars with spectral indices in the range [0.2--0.3], where we would expect about 4 counts. This is not understood at present and should be investigated in the future with an extended source sample.

Refer to caption
Figure 4: Average F1-score metric achieved by the LightGBM trained classifier for binary classification of Galactic and Extragalactic source groups and for multiclass classification, computed over five "mixed" survey test sets (labelled as "mixed" and shown with filled markers) and pure ASKAP test sets (labelled as "askap" and shown with open markers). The error bars are the F1-score standard deviations obtained over the five test sets. Results obtained over the 5-band (radio+MIR) datasets without and with the spectral index (α𝛼\alphaitalic_α) information are respectively shown with black dots and green triangles, while results obtained over the 7-band (radio+MIR+FIR) datasets are respectively shown with red squares and blue inverted triangles.
Table 7: Average F1-score metrics achieved by the LightGBM trained classifier for binary classification of Galactic and Extragalactic source groups and for multiclass classification, computed over five "mixed" survey test sets (labelled as "mixed") and pure ASKAP test sets (labelled as "askap"). Metrics were not reported if less than 10 sources are available in the test set. Column groups (2-3) and (6-7) report the results obtained over the 5-band (radio+MIR) datasets without and with the spectral index (α𝛼\alphaitalic_α) information, respectively. Results in column groups (4-5) and (8-9) are relative to the 7-band (radio+MIR+FIR) dataset. Parameters for binary (multiclass) models were set to: num_leaves=2 (32), min_data_in_leaf=20 (20), max_depth=1 (5).
F1-score (%)
MIR MIR+FIR MIR+α𝛼\alphaitalic_α MIR+FIR+α𝛼\alphaitalic_α
mixed askap mixed askap mixed askap mixed askap
(2) (3) (4) (5) (6) (7) (8) (9)
gal 95.2±plus-or-minus\pm±0.1 95.8±plus-or-minus\pm±0.1 97.9±plus-or-minus\pm±0.1 -- 96.7±plus-or-minus\pm±0.1 -- 98.1±plus-or-minus\pm±0.2 --
egal 98.1±plus-or-minus\pm±0.1 98.9±plus-or-minus\pm±0.1 40.4±plus-or-minus\pm±4.1 -- 96.2±plus-or-minus\pm±0.1 -- 52.8±plus-or-minus\pm±2.8 --
all 97.2±plus-or-minus\pm±0.1 98.3±plus-or-minus\pm±0.1 95.3±plus-or-minus\pm±0.4 -- 96.4±plus-or-minus\pm±0.1 -- 95.9±plus-or-minus\pm±0.4 --
pn 64.1±plus-or-minus\pm±0.5 56.3±plus-or-minus\pm±0.5 70.9±plus-or-minus\pm±0.6 66.9±plus-or-minus\pm±0.6 73.2±plus-or-minus\pm±0.8 77.8±plus-or-minus\pm±1.3 78.9±plus-or-minus\pm±0.5 70.8±plus-or-minus\pm±2.5
Hii 81.6±plus-or-minus\pm±0.3 80.0±plus-or-minus\pm±0.4 86.9±plus-or-minus\pm±0.2 91.3±plus-or-minus\pm±0.3 86.5±plus-or-minus\pm±0.5 86.7±plus-or-minus\pm±0.7 91.7±plus-or-minus\pm±0.3 93.0±plus-or-minus\pm±0.6
pulsar 64.2±plus-or-minus\pm±0.3 70.2±plus-or-minus\pm±1.0 63.7±plus-or-minus\pm±1.0 76.0±plus-or-minus\pm±0.9 81.9±plus-or-minus\pm±1.4 81.5±plus-or-minus\pm±2.5 81.9±plus-or-minus\pm±0.8 80.6±plus-or-minus\pm±4.2
yso 22.4±plus-or-minus\pm±0.7 17.8±plus-or-minus\pm±0.6 19.4±plus-or-minus\pm±2.1 28.5±plus-or-minus\pm±2.9 23.9±plus-or-minus\pm±1.7 32.0±plus-or-minus\pm±4.9 30.9±plus-or-minus\pm±2.7 23.0±plus-or-minus\pm±5.3
star 22.8±plus-or-minus\pm±1.4 32.4±plus-or-minus\pm±2.8 32.8±plus-or-minus\pm±3.1 -- 19.8±plus-or-minus\pm±2.2 -- 28.3±plus-or-minus\pm±5.8 --
rg 87.1±plus-or-minus\pm±0.3 87.7±plus-or-minus\pm±0.2 53.9±plus-or-minus\pm±2.1 -- 81.9±plus-or-minus\pm±0.6 -- 56.5±plus-or-minus\pm±2.5 --
qso 85.7±plus-or-minus\pm±0.4 84.1±plus-or-minus\pm±0.2 -- -- 83.0±plus-or-minus\pm±0.5 -- -- --
all 80.3±plus-or-minus\pm±0.3 82.7±plus-or-minus\pm±0.1 69.9±plus-or-minus\pm±0.5 82.2±plus-or-minus\pm±0.4 78.1±plus-or-minus\pm±0.4 76.1±plus-or-minus\pm±1.3 78.4±plus-or-minus\pm±0.7 78.1±plus-or-minus\pm±0.5

5 Source classification analysis

We used the dataset described in Section 3 to perform classification studies with supervised learning algorithms. We carried out two different analysis. The first one, reported in Section 5.1, uses the set of conventional features (color indices, spectral indices) extracted from the dataset, as described in Section 4, and gradient-boosted decision trees as classifier method. A second analysis, reported in Section 5.2, employs convolutional neural networks for automated feature extraction and source image classification.
The entire dataset, including data from all radio surveys, was split into three "mixed" survey subsets (train, validation, test sets), containing 55%/15%/30% of the original data, respectively. Five train/validation/test splits were randomly generated to estimate the model performance uncertainties. We also produced additional data splits with exclusively ASKAP data in the test set, and previous radio surveys in train and validation sets (with a 70%/30% data proportion). These samples were used to estimate how the classifier performs on a specific survey, when trained on a mixture of completely different surveys.
In both analysis, we made use of the following metrics373737More details at https://scikit-learn.org/stable/modules/model_evaluation.html., widely adopted in multi-class classification problems, to estimate the achieved classification performances:

  • 1.

    Recall (\mathcal{R}caligraphic_R): Fraction of sources of a given class that were correctly classified by the model out of all sources labelled in that class, computed as:

    =TPTP+FN𝑇𝑃𝑇𝑃𝐹𝑁\mathcal{R}=\frac{TP}{TP+FN}caligraphic_R = divide start_ARG italic_T italic_P end_ARG start_ARG italic_T italic_P + italic_F italic_N end_ARG
  • 2.

    Precision (𝒫𝒫\mathcal{P}caligraphic_P): Fraction of sources correctly classified as belonging to a specific class, out of all sources the model predicted to belong to that class, computed as:

    𝒫=TPTP+FP𝒫𝑇𝑃𝑇𝑃𝐹𝑃\mathcal{P}=\frac{TP}{TP+FP}caligraphic_P = divide start_ARG italic_T italic_P end_ARG start_ARG italic_T italic_P + italic_F italic_P end_ARG
  • 3.

    F1-score: the harmonic mean of precision and recall:

    F1-score=2×𝒫×𝒫+F1-score2𝒫𝒫\text{F1-score}=2\times\frac{\mathcal{P}\times\mathcal{R}}{\mathcal{P}+% \mathcal{R}}F1-score = 2 × divide start_ARG caligraphic_P × caligraphic_R end_ARG start_ARG caligraphic_P + caligraphic_R end_ARG (2)

where TP𝑇𝑃TPitalic_T italic_P, FP𝐹𝑃FPitalic_F italic_P, FN𝐹𝑁FNitalic_F italic_N are the number of true positives, false positives, and false negatives, respectively. These metrics were computed for each source class individually, and cumulatively over all dataset. In the latter case, individual class metrics were first weighted by the number of sources present for each class to account for class unbalance, and then averaged.

Refer to caption
Figure 5: Confusion matrix of the trained LightGBM classifier obtained over the 5-band (radio+MIR) pure ASKAP test datasets.

5.1 LightGBM Classification

5.1.1 Model training

We trained a LightGBM383838https://lightgbm.readthedocs.io/en/latest/index.html (Ke et al, 2017) classifier over the produced 5-band and 7-band dataset splits ("mixed" surveys sets and non-ASKAP survey sets), using the set of feature parameters described in Section 4 as inputs. LightGBM is a distributed and high-performance gradient boosting framework based on decision tree algorithm, widely adopted for classification tasks as known to reach comparable (or even better) performances on tabular data with considerably lower training times and memory usage with respect to other popular libraries (e.g. XGBoost). The most important algorithm hyperparameters controlling the model accuracy and overfitting are: max_depth, num_leaves, min_data_in_leaf, num_iterations393939 max_depth is the maximum depth of each decision tree, typically chosen in the range [2,12], as very deep/shallow trees are more likely to overfit/underfit the training data. max_depth has to be optimized in combination with the num_leaves parameter, controlling the number of decision leaves in a single tree, with optimal num_leaves values lying below the limit 2max_depthmax_depth{}^{\texttt{max\_depth}}start_FLOATSUPERSCRIPT max_depth end_FLOATSUPERSCRIPT. min_data_in_leaf specifies the minimum number of sources that fit the decision criteria in a leaf, allowing to control the model overfitting. Suitable values are typically assumed on the basis of the training sample size. Finally, the num_iterations is the number of boosting iterations performed, often interpreted as the “number of trees” used..
To select suitable hyperparameter values, we performed several training runs in which we varied max_depth values in the [2,12] range, and num_leaves\leq2max_depthmax_depth{}^{\texttt{max\_depth}}start_FLOATSUPERSCRIPT max_depth end_FLOATSUPERSCRIPT, observing the resulting model F1-score on the test set. For each training run, we used early stopping on validation data to select the optimal num_iterations parameter (typically found <100 in all performed runs). For a given tree depth choice, we also scanned different values of min_data_in_leaf from 5 to 100.
Classification results achieved over the available feature subsets and dataset splits are summarized in Fig. 4 and Table 7, and discussed with more details in the following paragraphs.
In Figures B.8, B.9, B.10 and B.11, we inspected the relative importance of each feature provided to trained LightGBM classifiers, finding that radio-infrared colour indices are always ranked among the top most sensitive features, along with the radio spectral index, while morphological parameters (radio-infrared IoUs) are ranked last.

Refer to caption
Refer to caption
Figure 6: Top: F1-score of the LightGBM classifier as a function of radio source signal-to-noise (SNR) obtained over the 5-band (radio+MIR) dataset. Bottom: Fraction of source images available in the 5-band dataset as a function of SNR.

5.1.2 Results on radio+MIR data

In Table 7 (rows 1-3, columns 2, 3), we report the F1-score metric of the trained LightGBM model, obtained on the 5-band "mixed" and "askap" test datasets, for classifying sources into two groups: Galactic (i.e. including target object classes of interest for Galactic science studies, such as PNe, Hii regions, pulsars, YSO, and stars), and Extragalactic (i.e. including radio galaxies and quasars). The model is able to identify sources belonging to the two groups with very high accuracy (above 90%), with a relatively shallow tree configuration (max_depth=1 or 2), even when presented with data observed with a completely different survey (ASKAP) with respect to those used in the training sample. As the Galactic-Extragalactic discrimination analysis can only be done using this dataset, due to the existing survey coverage and catalogue availability, this is a remarkable and encouraging result (e.g. there is no strong need for additional multi-wavelength data).
Discrimination of individual source classes was also studied. A deeper model (max_depth=5) was found to provide the best performances in the parameter scan. Classification metrics obtained over both "mixed" and "askap" test set are reported in Table 7 (rows 4-11, columns 2, 3), while the source confusion matrix obtained over the "mixed" survey test sets is plotted in Fig. 5. In this case, extragalactic sources (radio galaxies, QSO) can be identified with similar-to\sim85% accuracy, with a rate of misclassified sources of the order of 15%, almost entirely in the direction of the other extragalactic source category (e.g. QSO\rightarrowgalaxy, and viceversa). PNe, Hii regions and pulsars are the best classified sources within the Galactic group. Lowest misclassification rates towards other classes are obtained for Hii regions, found below 15%. PNe are more likely (38%) to be misclassified as Hii regions. As reported in previous studies (Anderson et al., 2012), we expect that a better discrimination power between the two types can be achieved by employing far-infrared and 8  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m data (see next paragraph). Poor classification results are obtained on the radio stars and YSO samples, with F1-scores ranging from 20% to 30%. YSOs are largely (similar-to\sim66%) misclassified as Hii regions or PNe. This is somewhat expected, as a fraction of SIMBAD objects classified as YSO (used as a reference for building the training sample) were also found listed in the WISE Hii region and HASH PN catalogues. Future data releases shall therefore focus on assessing the reliability of our YSO candidates, removing the identification ambiguities before repeating the classification analysis. The same labelling issue is also potentially affecting the radio star classification. Poor results on some Galactic class may be therefore not only due to the limited training sample, but also ascribed to the reliability of original source classification present in the literature.
In Fig. 6 (top panel) we reported the F1 classification score for all classes in the 5-band ASKAP test dataset as a function of the computed radio source signal-to-noise ratio (SNR). The overall classification performance is mostly flat over the SNR range, while individual classes do show some dependency on the SNR, e.g. F1-score is increasing with SNR for PNe/QSOs and decreasing for pulsars/radio galaxies. As shown in Fig. 6 (bottom panel), the observed trends for each class seem to correlate with the number of corresponding images available in each SNR bin.

Refer to caption
Figure 7: Confusion matrix of the trained LightGBM classifier obtained over 7-band (radio+MIR+FIR) pure ASKAP test datasets.

5.1.3 Results on radio+MIR+FIR data

In Table 7 (columns 4, 5) we report the F1-score metric of the trained LightGBM model, obtained on the 7-band "mixed" and "askap" test datasets. Only 5 Galactic classes are available in the latter case, but we did not report classification metrics for the "STAR" class, as less than 10 sources are available in the test set. Inclusion of 8  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m and 70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m data lead to a slight improvement (5-10%) in classification for most classes, except for pulsars that are infrared-quiet at these bands. Misclassification rates, shown in Fig. 7, also improved considerably for Hii regions and PNe, e.g. the fraction of PNe misclassified as Hii regions decreased by similar-to\sim20% compared to the 5-band analysis, highlighting how the far-infrared information is crucial for separation of certain Galactic classes. Although a slight improvement is also seen on radio star and YSO identification, the limitations highlighted in the previous paragraph prevent to eventually obtain an effective classification of both types.

5.1.4 Results with radio spectral index information

In Table 7 (columns 6-9) we reported the classification results obtained on the 5-band and 7-band "mixed survey and pure ASKAP test datasets, after including the radio spectral index α𝛼\alphaitalic_α as an additional input feature. A clear increase in performance was obtained for PNe, Hii regions, and pulsars, while no sensible improvements were observed on the remaining classes. Unfortunately, the training and test samples are very limited in size for some classes, e.g. less than 70 (40) radio stars in the 5-band (7-band) datasets, and therefore their corresponding metrics may not be precisely estimated.

5.2 CNN Classification

In this section we explored the capabilities of supervised classification models, such as CNNs, that automatically extract features directly from images, e.g. they do not require the extra image processing applied in Section 4.1. More importantly, contrarily to the previous analysis, a CNN classifier is less tied to the source compact morphology assumption, and would be thus also potentially suited for extended source classification.

5.2.1 Model training

We considered two alternative CNN architectures: a custom shallow network with only two convolutional layer blocks, and a standard deep ResNet18 architecture. Network configurations are reported in Table 8. We trained six custom model configurations (denoted as custom_v1, custom_v2, …, custom_v6) on our data, varying the convolutional or dense layer structure (e.g. number of filters, kernel or stride size, etc). Columns (2) and (3) report the network backbone and classification head structure, following this notation:

  • 1.

    16C3BnP2-32C3BnP2-32-16: indicate a network with two convolutional layer blocks and two dense layers with 32 and 16 neurons, respectively. Convolution blocks (C) have 16 and 32 3×\times×3 filters respectively, each followed by batch normalization (Bn) 404040Batch normalization layers normalize their inputs by subtracting the batch mean and dividing by the batch standard deviation. They are often inserted in CNN architectures to reduce the internal covariate shift and improve network stability during training. and max pooling (P) layers414141Pooling layers reduce the spatial dimension of the inputs, by applying a pooling operation (e.g. maximum or average) to a set of values in a small region of the input volume. They are commonly used to increase the receptive field of the network, reduce its computational cost, and improve its performance. using 2×\times×2 filter and stride 2;

  • 2.

    16C3-32C5S2-16: indicate a network with two convolutional layer blocks and a single dense layer with 16 neurons. The first convolution block has 16 3×\times×3 filters (no max pooling layer), while the second one has 32 5×\times×5 filters using stride 2.

All configurations were trained (Adam optimizer, learning rate η𝜂\etaitalic_η=5×\times×1044{}^{-4}start_FLOATSUPERSCRIPT - 4 end_FLOATSUPERSCRIPT, batch_size=64) over five multiple train/validation/test dataset splits until overfitting is detected on the validation set (typically after 300 epochs). Classification metrics are finally computed over the test sets. In Fig. B.12 we report the classification F1-score obtained as a function of the training epoch with a representative model (custom_v1) over train (blue graph) and validation (red (graph) 5-band datasets. Shaded areas correspond to the minimum and maximum F1-scores found in different training runs.
To avoid learning features from other nearby sources, we masked pixels not belonging to the source in all input images. Masks for each source were obtained from the radio channel in an automated way using caesar source finder (Riggi et al., 2016, 2019), refined manually (if not accurate enough), and eventually enlarged using a morphological dilation transform424242As we expect the environment surrounding the source can provide valuable information for classification purposes, the source mask was enlarged using a morphological dilation transform with configurable kernel size (21 pixels by default).. The resulting masks were finally applied to radio and infrared channels to produce masked image data434343Masked input data are included in the dataset under version control along with unmasked images. that are provided as CNN inputs.
Different image pre-processing stages were applied to the masked input data during the training and inference stages:

  1. 1.

    Channel max scaling: For each source, we scaled each channel by the maximum pixel value among all channels for that source. This step is introduced to preserve the original radio/infrared flux ratios (very sensitive to the source type) and remove the flux density degeneracy, e.g. two identical sources (e.g. same class and radio/infrared ratios) with just an absolute flux density offset will be treated as the same input by the classifier.

  2. 2.

    Augmentation: we randomly applied a series of transformations to input cutout channels, including horizontal and vertical flipping, and [-90{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT,90{}^{\circ}start_FLOATSUPERSCRIPT ∘ end_FLOATSUPERSCRIPT] rotation. This step is only applied during training to improve the model generalization capabilities;

  3. 3.

    Resizing: Finally, we resized all image cutouts in the dataset to the same size in pixels (64×\times×64 pixels by default), as the first convolutional layer of the network requires tensor of the same shape in input;

Results are reported in the following paragraphs only for the 5-band dataset, as the 7-band and spectral index datasets are too limited in size for training a deep network.

Table 8: Average F1-score metrics achieved by trained CNN models for multiclass classification, computed over five "mixed" survey 5-band (radio+MIR) test sets.
Model Backbone Head F1-score (%)
custom_v1 16C3P2-32C3P2 16 80.7±plus-or-minus\pm0.3
custom_v2 16C3-32C5S2 16 80.5plus-or-minus\pm0.8
custom_v3 16C5P2-32C5P2 16 81.3plus-or-minus\pm0.2
custom_v4 32C3P2-64C3P2 32 80.2plus-or-minus\pm0.5
custom_v5 16C3BnP2-32C3BnP2 16 78.8plus-or-minus\pm0.1
custom_v6 16C3P2-32C3P2 32-16 81.3plus-or-minus\pm0.4
resnet18 ResNet18 16 80.1plus-or-minus\pm1.1
Table 9: Average F1-score metrics achieved by trained shallow and deep CNN models for source multiclass classification, computed over five "mixed" survey 5-band (radio+MIR) test sets (labelled as "mixed") and pure ASKAP 5-band (radio+MIR) test sets (labelled as "askap").
F1-score (%)
custom_v1 resnet18
mixed askap mixed askap
(2) (3) (4) (5)
pn 68.3plus-or-minus\pm2.3 67.2plus-or-minus\pm2.0 74.6plus-or-minus\pm0.8 66.9plus-or-minus\pm0.1
Hii 86.2plus-or-minus\pm0.9 90.5plus-or-minus\pm0.8 88.8plus-or-minus\pm0.6 92.6plus-or-minus\pm1.1
pulsar 64.0plus-or-minus\pm1.4 63.1plus-or-minus\pm1.2 71.6plus-or-minus\pm0.6 52.4plus-or-minus\pm2.4
yso 41.0plus-or-minus\pm2.2 40.6plus-or-minus\pm2.4 52.1plus-or-minus\pm1.3 49.0plus-or-minus\pm2.1
star 38.8plus-or-minus\pm2.7 52.4plus-or-minus\pm1.4 41.0plus-or-minus\pm1.3 37.5plus-or-minus\pm3.2
rg 85.6plus-or-minus\pm0.3 84.1plus-or-minus\pm1.2 84.2plus-or-minus\pm0.4 85.3plus-or-minus\pm1.2
qso 83.3plus-or-minus\pm0.8 77.8plus-or-minus\pm1.8 78.6±plus-or-minus\pm±2.7 81.9±plus-or-minus\pm±0.9
all 80.7±plus-or-minus\pm±0.3 80.4±plus-or-minus\pm±1.0 80.1±plus-or-minus\pm±1.1 82.0±plus-or-minus\pm±0.8

5.2.2 Results on radio+MIR data

Classifications scores obtained by trained CNN classifiers on "mixed" survey test datasets, reported in Table 8 (column 4), are rather comparable (within 1%) across shallow and deep model configurations and training runs. A larger kernel size (5×\times×5 pixels, custom_v3 model) slightly improved the results, while batch normalization layers (custom_v5 model) produce a similar-to\sim2% decrease in performance. In Table 9 we report the classification scores obtained with the resnet18 model and a representative shallow model (custom_v1) trained on "mixed" survey data over both "mixed" survey and pure ASKAP test sets. Misclassification rates obtained on pure ASKAP test sets are reported in Fig. 8 for the custom_v1 model. Overall, we conclude that the achieved metrics are comparable to those found with the LightGBM classifier (Table 7, columns 2, 3). We also observe that, with regard to the individual classes, the CNN classifiers tend to better classify Galactic sources (similar-to\sim10% improvement in scores and misclassification rates for some classes) with a corresponding performance drop on the extragalactic source group. Despite the already noted dataset limitations, we believe that this analysis represent a first valuable baseline for future studies aiming to explore other image-based classifiers and optimized normalization strategies for multi-wavelength data.

Refer to caption
Figure 8: Confusion matrix of the trained CNN custom_v1 classifier obtained over 5-band (radio+MIR) pure ASKAP test datasets.

6 sclassifier: a radio source classifier tool

We developed a tool, dubbed sclassifier444444https://github.com/SKA-INAF/sclassifier, for performing radio source classification using the dataset and the methods adopted in this work. An end-to-end pipeline was implemented, allowing users to obtain source classification information (e.g. predicted class labels and probabilities) and supplementary products (source image cutouts, feature data tables) from a radio continuum 2D map (FITS format) and a source catalogue (DS9 polygon regions) supplied as inputs. Additional algorithms and models (e.g. convolutional autoencoders, outlier finder, clustering, etc) were also implemented and will be presented in a future work focusing on an unsupervised analysis of the dataset.
sclassifier is developed in python (3.x), and based on several libraries for astronomical data analysis and image processing - Astropy (Astropy Collaboration et al., 2013, 2018, 2022), Montage (Jacob et al., 2010), OpenCV (Bradski, 2000) - and machine learning - TensorFlow (Abadi et al., 2016), Keras (Chollet et al., 2015), scikit-learn (Pedregosa et al., 2011). As some stages, e.g. source cutout provision, regridding/reprojection, are quite computationally intensive for large catalogues, we parallelized them using the mpi4py library (Dalcin et al., 2005), splitting the computation for all sources across multiple computing nodes.

7 Summary

In this work, we carried out a supervised classification analysis of compact radio sources over a large annotated dataset of similar-to\sim20,000 Galactic and Extragalactic objects, extracted from novel ASKAP radio observations and previous radio and infrared surveys. We trained two different classifiers on the produced data. The first uses the LightGBM gradient-boosting framework and is trained on a set of pre-computed features derived from the multi-wavelength data, including the radio-infrared colour indices and the radio spectral index. The second model uses convolutional neural networks and is trained directly on multi-channel images.
The LightGBM classifier achieved very high performances (above 90%) for the identification of Galactic objects against sources belonging to the extragalactic group, using only radio and mid-infrared data. Classification metrics largely vary among individual source classes. Extragalactic objects (radio galaxies, QSO) are best classified, with F1-scores exceeding 85%. PNe, Hii regions, and pulsars are the second group of best classified objects, with F1-scores ranging from 60% to 75%. Poor performances are obtained on radio star group and YSOs, due to the limited sample size, object spectral type heterogeneity, and unreliable classification information reported in the reference catalogues. We also tested how the classification performances changed for Galactic objects when including additional infrared band data (8  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, 70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m) and the radio spectral index information in the analysis. We obtained a significant boost in performance (similar-to\sim10%) for PNe, Hii regions, and pulsars.
CNN classifier was only trained on 5-band (radio+MIR) data due to the limited number of images available in the FIR band. The classification metrics achieved by trained shallow and deep network architectures are overall comparable to LightGBM, with better classifications observed on the Galactic source group at the expense of the extragalactic source group.
The obtained results motivate further analysis to be done to improve overall source classification results and tackle some reported limitations, before applying the method on unclassified ASKAP sources. Firstly, test data sample size can be slightly increased once new ASKAP observations towards the Galactic plane will be completed. This would increase the reliability of the reported classification metrics for some classes. Secondly, analysis should be repeated with a revised YSO and star reference catalogue, as the ones used in this work may contain spurious association to Hii regions or extragalactic objects, that could partly explain the misclassification rates obtained. In this context, we foresee to carry out a completely unsupervised analysis of the dataset to detect possible label anomalies and perform new classification studies.
As commented in Section 3.1, our training set does not contain star-forming galaxies, expected to contribute with a non-negligible fraction (similar-to\sim25%) in EMU survey data, and therefore our current classifier could potentially misclassify them as Galactic sources, if their radio-infrared colors are similar. Luckily, there are ongoing studies within EMU, aiming to produce a curated sample of SFGs from EMU pilot observations, that would allow us to extend the training dataset, study SFG color parameter distribution and re-train our classifiers.
On a longer term, we also would like to extend our dataset with additional complementary data (e.g. optical, Hα𝛼\alphaitalic_α or radio polarization information), that could potentially lead to improved classification results.
In this work, we produced a python-based tool, enabling users to run source classification on their new data. The implemented methods are rather general-purpose, allowing for the future to include additional image wavelength data, or to perform a similar analysis on extended sources. We plan to integrate it in the list of source analysis applications supported in the caesar-rest service454545https://github.com/SKA-INAF/caesar-rest, developed within the CIRASA (Collaborative and Integrated platform for Radio Astronomical Source Analysis) project (Riggi et al., 2021b) to enable SKA Galactic science teams or high-level service API to run source analysis tasks (source extraction, classification, cross-matching, etc) over an http interface. This service is currently deployed on the European Open Science Cloud (EOSC) prototype, setup for the H2020 NEANIAS (Novel EOSC Services for Emerging Atmosphere, Underwater & Space Challenges) project464646https://www.neanias.eu/ (Sciacca et al, 2021).

Acknowledgements

The Australian SKA Pathfinder is part of the Australia Telescope National Facility which is managed by CSIRO. Operation of ASKAP is funded by the Australian Government with support from the National Collaborative Research Infrastructure Strategy. Establishment of the Murchison Radio-astronomy Observatory was funded by the Australian Government and the Government of Western Australia. This work was supported by resources provided by the Pawsey Supercomputing Centre with funding from the Australian Government and the Government of Western Australia. We acknowledge the Wajarri Yamatji people as the traditional owners of the Observatory site.

This research has made use of the HASH PN database at hashpn.space. This publication makes use of data products from the Wide-field Infrared Survey Explorer, which is a joint project of the University of California, Los Angeles, and the Jet Propulsion Laboratory/California Institute of Technology, funded by the National Aeronautics and Space Administration. Additionally, this research has made use of the SIMBAD database, operated at CDS, Strasbourg, France (Wenger et al., 2000).

Funding Statement

This research was supported by the INAF CIRASA & SCIARADA grants. C.B. acknowledges support from European Commission Horizon 2020 research and innovation programme under the grant agreement No. 863448 (NEANIAS).

Data Availability

The software code used in this work is publicly available under the GNU General Public License v3.0474747https://www.gnu.org/licenses/gpl-3.0.html on the GitHub repository https://github.com/SKA-INAF/sclassifier/. The trained model weights have been made available on Zenodo repository at https://doi.org/10.5281/zenodo.10477860.

References

  • Abadi et al. (2016) Abadi, M., et al., Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’16), November 2–4, 2016, Savannah, GA, USA
  • Akras et al. (2019) Akras, S., et al., 2019, MNRAS, 488, 3238
  • Alegre et al. (2022) Alegre, L., et el., 2022, MNRAS, 516, 4716
  • Ainsworth et al. (2012) Ainsworth, R. E., et al. (AMI Consortium), 2012, MNRAS, 423, 1089
  • Anderson et al. (2012) Anderson, L. D., et al., 2012, A&A, 537, A1
  • Anderson et al. (2014) Anderson, L. D., et al., 2014, APJSS, 212, 1
  • Anglada et al. (1998) Anglada, G., et al., 1998, AJ, 116, 2953
  • Aniyan & Thorat (2017) Aniyan, A. K., Thorat, K., 2017, ApJS, 230, 20
  • Awang Iskandar et al. (2020) Awang Iskandar, D. N. F., et al., 2020, Galaxies, 8, 88
  • Banfield et al. (2015) Banfield, J. K., et al., 2015, MNRAS, 453, 2326
  • Bates et al. (2012) S D Bates, S. D., et al., 2012, MNRAS, 427, 1052
  • Becker et al. (1995) Becker, R. H., et al., 1995, ApJ, 450, 559
  • Benaglia (2010) Benaglia, P., 2010, ASPC, 422, 111
  • Beskin et al. (2015) Beskin, V. S., et al., 2015, Space Sci Rev 191, 207
  • Beskin (2018) Beskin, V. S., Physics–Uspekhi, 2018, 188, 377
  • Best et al. (2005) Best, P. N., et al., 2005, MNRAS, 362, 9
  • Bolton et al. (2012) Bolton, A. S., et al., 2012, AJ, 144, 144
  • Bradski (2000) Bradski, G., 2000, The OpenCV Library, Dr. Dobb’s Journal of Software Tools, 120, 122
  • Brunthaler et al. (2021) Brunthaler, A., et al., 2021, A&A, 651, A85
  • Ching et al. (2017) Ching, J. H. Y., et al., 2017, MNRAS, 464, 1306
  • Chollet et al. (2015) Chollet, F., et al., 2015, https://keras.io
  • Churchwell et al. (2009) Churchwell, E., et al., 2009, PASP, 121, 877
  • Condon et al. (1998) Condon, J. J., et al., 1998, AJ, 115, 1693
  • Condon et al. (2002) Condon, J. J., et al., 2002, AJ, 124, 675
  • Cortes & Vapnik (1995) Cortes, C. & Vapnik, V., 1995, Machine learning, 20(3), 273
  • Cutri et al. (2013) Cutri, R. M., Wright, E. L., Conrow, T., et al. 2013, Explanatory Supplement to the AllWISE Data Release Products, Tech. rep.
  • Dalcin et al. (2005) Dalcin, L., et al., 2005, Journal of Parallel and Distributed Computing, 65, 1108
  • de Gasperin et al. (2018) de Gasperin, F., et al., 2018, MNRAS, 474, 5008
  • Dewdney et al. (2016) Dewdney, P., et al., 2016, SKA1 SYSTEM BASELINE DESIGN V2, SKA-TEL-SKO-0000002
  • Drew et al. (2005) Drew, J. E., et al., 2005, MNRAS, 362, 753
  • Drew et al. (2014) Drew, J. E., et al., 2014, MNRAS, 440, 2036
  • Fanaroff & Riley (1974) Fanaroff, B. L., & Riley, J. M., 1974, MNRAS, 167, 31
  • Fazio et al. (2004) Fazio, G. G., et al., 2004, ApJS, 154, 10
  • Flewelling et al (2010) Flewelling, H. A., et al, 2020, ApJS, 251, 7
  • Frew & Parker (2010) Frew, D., & Parker, Q., 2010, PASA, 27, 129
  • Galvin et al. (2020) Galvin, T. J., et al., 2020, MNRAS, 497, 2730
  • Gómez de Castro (2013) Gómez de Castro, A. I., 2013, pss4.book, 279. doi:10.1007/978-94-007-5615-1_6
  • Gordon et al. (2021) Gordon, Y. A., et al., 2021, ApJS, 255, 30
  • Güdel (2002) Güdel, M., 2002, Annu. Rev. Astron. Astrophys., 40, 217
  • Gupta et al (2022) Gupta, N., et al., 2022, PASA, 39, E051
  • Hajduk et al. (2018) Hajduk, M., et al., 2018, MNRAS, 479, 5657
  • Hale et al. (2021) Hale, C., et al., 2021, PASA, 38, E058
  • Hambly et al. (2001) Hambly, N. C., et al., 2001, MNRAS, 326, 1279
  • Helfand et al. (1999) Helfand, D. J., et al., 1999, AJ, 117, 1568
  • Helfand et al. (2006) Helfand, D. J., et al., 2006, AJ, 131, 2525
  • Helfand, White, & Becker (2015) Helfand D. J., White R. L., Becker R. H., 2015, ApJ, 801, 26
  • Hoare et al. (2012) Hoare, M. G., et al., 2012, PASP, 124, 939
  • Hotan et al. (2021) Hotan, A., et al., 2021, PASA, 38, E009
  • Jacob et al. (2010) Jacob, J. C., et al., 2010, Astrophysics Source Code Library, record ascl:1010.036
  • Johnston et al. (2008) Johnston, S., et al., 2008, Experimental Astronomy, 22, 151
  • Ke et al (2017) Ke, G., et al., 2017, Advances in neural information processing systems, 30, 3146.
  • Kimball and Ivezić (2008) Kimball, A. E. and Ivezić, Z̆., 2008, AJ, 136, 684
  • Kimball et al. (2009) Kimball, A. E., et al., 2009, ApJ, 701, 535
  • Kurtz et al. (2005) Kurtz, S., 2005, in Cesaroni R., Felli M., Churchwell E., Walmsley M., eds, IAU Symposium Vol. 227, Massive Star Birth: A Crossroads of Astrophysics, Cambridge University Press, University Printing House, Shaftesbury Road, Cambridge, UK, p. 111
  • Kwok (2000) Kwok, S., 2000, The Origin and Evolution of Planetary Nebulae. Cambridge University Press
  • Ingallinera et al. (2022) Ingallinera, A., et al., 2022, Monthly Notices of the Royal Astronomical Society: Letters, 512, L21
  • Leto et al. (2020) Leto, P., et al., 2020, MNRAS, 493, 4657
  • Leto et al. (2021) Leto, P., et al., 2021, MNRAS, 507, 1979
  • Liu et al. (2006) Liu, Q. Z., van Paradijs, J., van den Heuvel, E. P.J., 2006, Astron. Astrophys. 455, 1165
  • Liu et al. (2007) Liu, Q. Z., van Paradijs J., van den Heuvel, E. P.J., 2007, Astron. Astrophys., 469, 807
  • Liu et al. (2019) Liu, W., et al., 2019, Res. Astron. Astrophys., 19, 042
  • Lukic et al. (2018) Lukic, V., et al., 2018, MNRAS, 476, 246
  • Lukic et al. (2019) Lukic, V., et al., 2019, MNRAS, 487, 1729
  • Lyon et al. (2016) R. J. Lyon, R. J., et al., 2016, MNRAS, 459, 1104
  • Makai (2017) Makai, Z., et al., 2017, ApJ, 846, 64
  • Manchester et al. (2005) Manchester, R. N., Hobbs, G. B., Teoh, A., & Hobbs, M., 2005, AJ, 129, 1993
  • Mancuso et al. (2017) Mancuso, C., et al., 2017, ApJ, 842, 95
  • Maron (2000) Maron, O., et al., 2000, A&A, 147, 195
  • Maslej-Krešňáková et al. (2021) Maslej-Krešňáková V., et al., 2021, MNRAS, 505, 1464
  • Matthews (2013) Matthews, L. D., 2013, PASP, 125, 313
  • Mauch & Sadler (2007) Mauch, T., Sadler, E. M., 2007, MNRAS, 375, 931
  • McConnell et al. (2020) McConnell, D., et al., PASA, 37, E048
  • Medina et al. (2019) Medina S.-N. X., et al., 2019, A&A, 627, A175
  • Melrose et al. (2021) Melrose, D. B., Rafat, M Z, Mastrano, A, 2021, MNRAS, 500, 4530
  • Molinari et al. (2016) Molinari, S., et al., 2016, A&A, 591, A149
  • Morello et al. (2018) Morello, G., et al., 2018, MNRAS, 473, 2565
  • Murphy et al. (2007) Murphy, T., et al., 2007, MNRAS, 382, 382
  • Nikutta et al. (2014) Nikutta, R., et al., 2014, MNRAS, 442, 3361
  • Norris et al. (2011) Norris, R. P., et al., 2011, PASA, 28, 215
  • Norris et al. (2021) Norris, R., et al., 2021, PASA, 38, E046
  • O’Dea (1998) O’Dea, C. P., 1998, PASP, 110, 493
  • O’Dea & Saikia (2021) O’Dea, C. P., Saikia D. J., 2021, A&ARv, 29, 3
  • Panagia & Felli (1975) Panagia, N., & Felli, M., 1975, A&A, 39, 1
  • Parker et al. (2005) Parker, Q. A., et al., 2005, MNRAS, 362, 689
  • Parker et al. (2016) Parker, Q. A. et al, 2016, J. Phys. Conf. Ser., 728, 032008
  • Pedregosa et al. (2011) Pedregosa, F., et al., 2011, JMLR, 12, 2825
  • Pilbratt et al. (2010) Pilbratt, G. L., et al., 2010, A&A, 518, L1
  • Polsterer (2016) Polsterer, K. L., et al., 2016, in European Symposium on Artificial Neural Networks
  • Pottasch (1984) Pottasch, S. R., Planetary nebulae - A study of late stages of stellar evolution, by S.R. Pottasch. Dordrecht, D. Reidel Publishing Co. (Astrophysics and Space Science Library. Volume 107), 1984, 335 p.
  • Astropy Collaboration et al. (2018) Price-Whelan, A.�M., et al. [Astropy Collaboration], 2018, AJ, 156, 123
  • Astropy Collaboration et al. (2022) Price-Whelan, A.�M., et al. [Astropy Collaboration], 2022, ApJ, 935, 167
  • Purcell (2013) Purcell, C.�R., et al., 2013, ApJS, 205, 1
  • Ralph et al. (2019) Ralph, N.�O., et al., 2019, PASP, 131, 108011
  • Randall et al. (2012) Randall, K.�E., et al., 2012, MNRAS, 421, 1644
  • Reynolds (1986) Reynolds, S.�P., 1986, ApJ, 304, 713
  • Robitaille et al. (2012) Robitaille, T.�P., et al., 2012, A&A, 545, A39
  • Richardson & Mehner (2018) Richardson, N.�D. and Mehner, A., 2018, Res. Notes AAS, 2, 121
  • Riggi et al. (2016) Riggi, S., et al., 2016, MNRAS, 460, 1486
  • Riggi et al. (2019) Riggi, S., et al., 2019, PASA, 36, E037
  • Riggi et al. (2021a) Riggi, S., et al., 2021, MNRAS, 502, 60
  • Riggi et al. (2021b) Riggi, S., et al., 2021, Astronomy and Computing, 37, 100506
  • Riggi et al. (2023) Riggi, S., et al., 2023, Astronomy and Computing, 42, 100682
  • Astropy Collaboration et al. (2013) Robitaille, T.�P., et al. [Astropy Collaboration], 2013, A&A, 558, A33
  • Rosslowe & Crowther (2015) Rosslowe, C.�K., Crowther, P.�A., 2015, MNRAS, 447, 2322
  • Rustige et al. (2022) Rustige, L., et al., 2022, arXiv:2212.08504
  • Sadeghi et al. (2021) Sadeghi, M., et al., 2021, AJ, 161, 94
  • Sadler (2016) Sadler, E.�M., 2016, Astronomische Nachrichten 337, 105
  • Scaife (2012) Scaife, A.M.�M., 2012, Astronomical Review, 7, 26
  • Sciacca et al (2021) Sciacca, E., et al., 2021, arXiv:2101.07639
  • Shultz et al (2022) Shultz, M.�E., et al., 2022, MNRAS, 513, 1429
  • Slijepcevic et al (2022) Slijepcevic, I.�V., el al., 2022, MNRAS, 514, 2599
  • Skrutskie et al. (2006) Skrutskie, M.�F., et al., 2006, AJ, 131, 1163
  • Tan et al. (2018) Tan, C.�M., 2018, MNRAS, 474, 4571
  • Taylor et al. (2003) Taylor, A.�R., et al., 2003, AJ, 125, 3145
  • Tingay & de Kool (2003) Tingay, S.�J., de Kool, M., 2003, AJ, 126, 723
  • Trigilio et al. (2000) Trigilio, C., et al., 2000, A&A, 362, 281
  • Turtle et al (1962) Turtle, A.�J., et al., 1962, MNRAS, 124, 297
  • Umana et al. (2015a) Umana, G., et al., 2015, MNRAS, 454, 902
  • Umana et al. (2015b) Umana, G., et al., 2015, Proc. Sci., Advancing Astrophysics with the Square Kilometre Array (AASKA14), SISSA, Trieste, p. 118
  • Umana et al. (2021) Umana, G., et al., 2021, MNRAS, 506, 2232
  • Urry & Padovani (1995) Urry C.�M., and Padovani, P., 1995, PASP, 107, 803
  • Van der Walt (2014) Van der Walt, S., et al., 2014, scikit-image: image processing in Python, PeerJ, 2, p.e453
  • Yang et al. (2019) Yang, Y., et al., 2019, MNRAS, 482, 2681
  • Yang et al. (2021) Yang, Y., et al., 2021, A&A, 645, A110
  • York et al. (2000) York, D.�G., et al., 2000, AJ, 120, 1579
  • Wachter et al. (2010) Wachter, S., et al., 2010, AJ, 139, 2330
  • Wang et al. (2004) Wang, Z., et al., 2004, IEEE Transactions on Image Processing, 13, 600
  • Wang et al. (2018) Y. Wang, Y., et al., 2018, A&A, 619, A124
  • Wendker (1995) Wendker, H. J., 1995, A&AS, 109, 177; March 2001 update of the catalogue, CDS VIII/99
  • Wenger et al. (2000) Wenger, M., et al., 2000, A&AS, 143, 9
  • Werner et al. (2004) Werner, M. W., et al., 2004, ApJS, 154, 1
  • Wright et al. (2010) Wright, E. L., et al., 2010, AJ, 140, 1868
  • Wu et al. (2019) Wu, C., et al., 2019, MNRAS, 482, 1211

Appendix A Source class physical properties

A.1 Young Stellar Objects (YSOs)

Young Stellar Objects (YSOs) denote the early stages of star development, e.g. protostars and pre-main sequence stars. They have been classified into different classes (0, I, II, III), depending on their evolution phase (Gómez de Castro, 2013). Class 0 objects are characterized by an embedded central core, surrounded by a larger (not visible yet) accreting envelope, typically observed through far-infrared and millimetre wavelength emission from the dust. Class I objects denote the late mass accretion phase, in which the central core grows, a flattened circumstellar accretion disk develops, and the protostar also expels matter via bipolar jets and outflows. Typically, their SEDs rise in the far- and mid-infrared range (αIR>subscript𝛼𝐼𝑅absent\alpha_{IR}>italic_α start_POSTSUBSCRIPT italic_I italic_R end_POSTSUBSCRIPT >0.3). In Class II objects, the majority of the circumstellar material is found in a disk of gas and dust. Flatter infrared spectral indices are typically observed in this stage. In Class III objects, the gas has been cleared out, the young planetary disk is formed, and the stellar atmosphere is recognizable. The emission from the disk becomes now negligible, and the SED is dominated by the pure stellar photosphere contribution (αIR<subscript𝛼𝐼𝑅\alpha_{IR}<-italic_α start_POSTSUBSCRIPT italic_I italic_R end_POSTSUBSCRIPT < -1.6).
YSOs are observed in the radio continuum, mainly through thermal free-free emission from ionized regions in their components (disks, winds, coronae, and jets), specially in massive YSOs. For low mass YSOs, however, the emission is thought to be driven by outflow processes that shock the surrounding material causing the required gas ionization (Anglada et al., 1998). The radio spectral indices α𝛼\alphaitalic_α are expected in the range --0.1<α𝛼\alphaitalic_α<1.1, depending on the evolution phase and emission mechanism (Scaife, 2012). For example, in collimated outflows, typical of early protostar stages (Class 0, I), a radio spectral index αsimilar-to𝛼absent\alpha\simitalic_α ∼0.25 is favoured, while, for standard conical jets, spectral indices around 0.6 are expected (Reynolds, 1986; Anglada et al., 1998). Non-thermal emission can be found around more developed pre-main sequence stages (Class II, III), like T Tauri stars, and may partially contribute to the very negative indices observed in some YSOs (Ainsworth et al., 2012).

A.2 Radio stars

Thermal and non-thermal radio emission has been detected so far from stars of different types and evolution stages across the entire Hertzsprung-Russell (H-R) diagram, among them magnetic stars, early-type stars (e.g. O-B) with winds and strong mass loss, later stages like Wolf-Rayet (WR) stars and LBVs, binaries (also bright in X-rays), and ultra cool-dwarfs (Güdel, 2002; Matthews, 2013). Thermal free-free emission, as in the stellar wind emission, originates from stellar outflows and chromospheres. Stars with spherically symmetric, isothermal, and stationary outflows are expected to have a radio spectrum Sννα{}_{\nu}\propto\nu^{\alpha}start_FLOATSUBSCRIPT italic_ν end_FLOATSUBSCRIPT ∝ italic_ν start_POSTSUPERSCRIPT italic_α end_POSTSUPERSCRIPT (α𝛼\alphaitalic_α=0.6) (Panagia & Felli, 1975), although spectral indices deviating from the canonic value may be obtained when variating the wind parameters (e.g. electron densities, velocity gradients, mass loss rate). Non-thermal gyrosynchrotron and synchrotron emission is generated in flares and also found in stars with magnetic activity and colliding winds in binaries. Negative spectral indices (α<𝛼absent\alpha<italic_α <0) are expected in this case (Umana et al., 2015b). The stable non-thermal radio emission from the MCP stars is instead expected with an almost flat spectrum (Leto et al., 2021) and partially polarized, with the circularly polarized emission increasing as the radio frequency increases (Leto et al., 2020). In stars with high radio brightness temperatures, coherent emission mechanisms, like plasma radiation or electron cyclotron maser emission (Trigilio et al., 2000), are also operating.

A.3 Hii regions

Hii regions are discrete ionized clouds surrounding young and massive hot stars (type O-B), thus enabling to trace massive star formation across the entire Galaxy, particularly in their youngest and compact stages, known as hyper-compact Hii (HCHii, diameter\leq0.05 pc) and ultra-compact Hii (UCHii, diameter\leq0.1 pc) regions (Kurtz et al., 2005). They are detected through their bright radio and infrared emission. The observed radio continuum spectrum can be described by a standard thermal free-free model assuming an optically thin regime above similar-to\sim1 GHz, with spectral indices αsimilar-to𝛼\alpha\sim-italic_α ∼ -0.1, and an optically thick scenario at lower frequencies (e.g. below a turnover frequency), leading to increased spectral indices (αsimilar-to𝛼absent\alpha\simitalic_α ∼2) due to self-absorption mechanisms. Younger Hii regions with higher density typically remain optically thick at higher frequencies, with a positive spectral index and turnover at νsimilar-to𝜈absent\nu\simitalic_ν ∼5 GHz for UCHii and ν10÷100similar-to𝜈10100\nu\sim 10\div 100italic_ν ∼ 10 ÷ 100 GHz for HCHii (Yang et al., 2019, 2021).
The infrared emission comes from different dust populations (Robitaille et al., 2012), mostly located in the photodissociation regions, e.g. polycyclic aromatic hydrocarbons (PAHs) at 8  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m and 12  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, small and large grains at 22-24  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m and above 70  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m, respectively.

A.4 Planetary Nebulae

Planetary nebulae are shells of ionized gas, ejected from central hot stars of low to intermediate mass (similar-to\sim1--8 MsubscriptMdirect-product\textup{M}_{\odot}M start_POSTSUBSCRIPT ⊙ end_POSTSUBSCRIPT) at the end of their asymptotic giant branch (AGB) phase. A more precise definition, and a summary of their observational characteristics, were presented in Frew & Parker (2010).
Radio continuum radiation (thermal free–free) is observed from the nebula shell, due to the gas ionized by the ultra-violet radiation produced by the central star (Kwok, 2000). Typical radio spectral indices observed range from similar-toabsent\sim-∼ -0.1 in an optically thin regime, to positive indices, up to similar-to\sim2 (Pottasch, 1984), for optically thick PNe. Infrared emission is due to cool dust (Tsimilar-to\sim100--200 K) material surrounding the ionized region, peaking at similar-to\sim20  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m for most PNe. Polycyclic aromatic hydrocarbon (PAH) emission at 8  µmmicrometer\mathrm{\SIUnitSymbolMicro m}roman_µ roman_m from the photodissociation region (PDR) surrounding the ionized gas, can also be present in more compact objects.

A.5 Pulsars

Pulsars are highly magnetized rotating neutron stars, emitting beams of radiation from their magnetic poles, observed as they point towards Earth, with periods ranging from milliseconds to seconds. The radio emission shows a high degree of linear polarization, and a small fraction of circular polarization. Pulsars are known to have steep flux-density spectra, with observed average spectral indices around --1.8±plus-or-minus\pm±0.2 (Maron, 2000), in some cases (similar-to\sim10%) described by double power-laws with spectral breaks around 1 GHz. The emission is thought to be due to coherent processes, but its origin and generation mechanism is still debated (Beskin et al., 2015; Beskin, 2018; Melrose et al., 2021).

A.6 Active Galactic Nuclei

Active Galactic Nuclei (AGN) of different radio-loud (RL) and radio-quiet (RQ) types (e.g. RL/RQ quasars, FR I/FR II/Seyfert radio galaxies, blazars) dominate the observed counts of continuum radio sources above the mJy level, the latter type representing similar-to\sim90% of all AGNs. In the unification schema (Urry & Padovani, 1995), much of their observational properties (e.g. radio components, multi-wavelength spectral features) arise from the orientation of the accretion disk and the observer’s line-of-sight.
The radio emission at cm wavelengths, explained as synchrotron radiation from GeV electrons, has a relatively steep spectrum for extended regions (jets, lobes) with αsimilar-to𝛼\alpha\sim-italic_α ∼ -0.7, while compact regions have flatter or inverted spectra (α𝛼\alpha\geq-italic_α ≥ -0.5), resulting from the superposition of multiple self-absorbed components. The latter scenario is observed in core-dominated sources, such as blazars or FSRQs. On the other hand, Gigahertz-Peaked Spectrum (GPS) and Compact Steep Spectrum (CSS) compact sources, with their observed steep spectra and well-defined spectral turnovers (around 1 GHz for GPS sources, and 100 MHz for CSS), are a notable exception. They are believed to be young objects eventually evolving into more extended radio objects of type FR I/II, and overall they represent a considerable fraction (around 10% for GPS, and 30% for CSS) of the bright compact radio source population (O’Dea, 1998; O’Dea & Saikia, 2021; Sadler, 2016). A high degree of linear polarization (up to 30%) is also observed, particularly in extended components.

Appendix B Supplementary plots

Refer to caption
(a) RG
Figure B.1: Pearson correlation coefficient matrix computed over for the 5-band color feature sets (see Table 5) (continued on next page)
Refer to caption
(b) QSO
Refer to caption
(c) PN
Figure B.2: Pearson correlation coefficient matrix computed over for the 5-band color feature sets (see Table 5) (continued on next page)
Refer to caption
(d) Hii
Refer to caption
(e) STAR
Figure B.3: Pearson correlation coefficient matrix computed over for the 5-band color feature sets (see Table 5) (continued on next page)
Refer to caption
(f) PULSAR
Refer to caption
(g) YSO
Figure B.4: Pearson correlation coefficient matrix computed over for the 5-band color feature set (see Table 5).
Refer to caption
(h) PN
Refer to caption
(i) Hii
Figure B.5: Pearson correlation coefficient matrix computed over for the 7-band color feature set (see Table 5) (continued on next page)
Refer to caption
(j) STAR
Refer to caption
(k) PULSAR
Figure B.6: Pearson correlation coefficient matrix computed over for the 7-band color feature set (see Table 5) (continued on next page)
Refer to caption
(l) YSO
Figure B.7: Pearson correlation coefficient matrix computed over for the 7-band color feature set (see Table 5)
Refer to caption
Figure B.8: Feature importance for LightGBM classifier trained on 5 bands (MIR+FIR) data.
Refer to caption
Figure B.9: Feature importance for LightGBM classifier trained on 5 bands (MIR+FIR) + α𝛼\alphaitalic_α data.
Refer to caption
Figure B.10: Feature importance for LightGBM classifier trained on 7 bands (MIR+FIR) data.
Refer to caption
Figure B.11: Feature importance for LightGBM classifier trained on 7 bands (MIR+FIR) + α𝛼\alphaitalic_α data.
Refer to caption
Figure B.12: F1-score of CNN classifier (custom_v1 model) computed as a function of the training epoch over five "mixed" survey 5-bands (MIR+FIR) train (blue graph) and validation (red graph) datasets. Shaded areas correspond to the range of minimum and maximum F1-scores obtained in different training runs, each with different train/validation/test data splits.