DOI:
10.1145/3603719
SSDBM '23: Proceedings of the 35th International Conference on Scientific and Statistical Database Management
ACM 2023 Proceeding
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
SSDBM 2023: 35th International Conference on Scientific and Statistical Database Management, Los Angeles, CA, USA, July 10-12, 2023
ISBN:
979-8-4007-0746-9
Published:
27 August 2023


SESSION: Research Full Papers
research-article
Open Access
Evaluating Autoencoders for Dimensionality Reduction of MRI-derived Radiomics and Classification of Malignant Brain Tumors
Article No.: 1, Pages 1–11, https://doi.org/10.1145/3603719.3603737

Malignant brain tumors including parenchymal metastatic (MET) lesions, glioblastomas (GBM), and lymphomas (LYM) account for 29.7% of brain cancers. However, the characterization of these tumors from MRI imaging is difficult due to the similarity of ...

research-article
LearnedSort as a learning-augmented SampleSort: Analysis and Parallelization
Article No.: 2, Pages 1–9, https://doi.org/10.1145/3603719.3603731

This work analyzes and parallelizes LearnedSort, the novel algorithm that sorts using machine learning models based on the cumulative distribution function. LearnedSort is analyzed under the lens of algorithms with predictions, and it is argued that ...

research-article
Open Access
Best Paper
Indexing Temporal Relations for Range-Duration Queries
Article No.: 3, Pages 1–12, https://doi.org/10.1145/3603719.3603732

Temporal information plays a crucial role in many database applications; however, support for queries on such data is limited. We present an index structure, termed RD-index, to support range-duration queries over interval timestamped relations, which ...

research-article
Open Access
SciDG: Benchmarking Scientific Dynamic Graph Queries
Article No.: 4, Pages 1–12, https://doi.org/10.1145/3603719.3603724

Dynamic graphs are increasingly being utilized in domain knowledge modeling and large-scale scientific data management. Managing dynamic graph data requires a graph database system that can handle constantly changing volumes and data versions, while ...

research-article
Open Access
Data Driven Dimensionality Reduction to Improve Modeling Performance
Article No.: 5, Pages 1–16, https://doi.org/10.1145/3603719.3603744

In a number of applications, data may be anonymized, obfuscated, or highly noisy. In such cases, it is difficult to use domain knowledge or low-dimensional visualizations to engineer the features for tasks such as machine learning. Instead, we explore ...

research-article
Open Access
Privacy-Preserving OLAP via Modeling and Analysis of Query Workloads: Innovative Theories and Theorems
Article No.: 6, Pages 1–12, https://doi.org/10.1145/3603719.3603735

This paper proposes innovative theories and theorems in the context of a state-of-the-art paper that computes privacy-preserving OLAP cubes via modeling and analyzing query workloads. The work contributes to actual literature by devising a solid ...

research-article
ESM2-Tree: An maintenance efficient authentication data structure in blockchain
Article No.: 7, Pages 1–12, https://doi.org/10.1145/3603719.3603721

Blockchain technology is gaining broader attention. Owing to its immutability property and byzantine fault-tolerance consensus protocol, blockchain offers a brand new trusted data-sharing solution. Some researchers use blockchain to drive autonomous ...

research-article
ST-CopulaGNN: A Multi-View Spatio-Temporal Graph Neural Network for Traffic Forecasting
Article No.: 8, Pages 1–12, https://doi.org/10.1145/3603719.3603740

Modern cities heavily rely on complex transportation, making accurate traffic speed prediction crucial for traffic management authorities. Classical methods, including statistical techniques and traditional machine learning techniques, fail to capture ...

research-article
Towards Efficient Discovery of Spatially Interesting Patterns in Geo-referenced Sequential Databases
Article No.: 9, Pages 1–11, https://doi.org/10.1145/3603719.3603743

A geo-referenced time series is a crucial form of spatiotemporal data. Useful information that can empower the users to achieve economic development is hidden in this series. When confronted with this problem, researchers modeled this series as a ...

research-article
Multi-representations Space Separation based Graph-level Anomaly-aware Detection
Article No.: 10, Pages 1–11, https://doi.org/10.1145/3603719.3603739

Graph structure patterns are widely used to model different area data recently. How to detect anomalous graph information on these graph data has become a popular research problem. The objective of this research is centered on the particular issue that ...

research-article
Federated Learning on Personal Data Management Systems: Decentralized and Reliable Secure Aggregation Protocols
Article No.: 11, Pages 1–12, https://doi.org/10.1145/3603719.3603730

The development and adoption of personal data management systems (PDMS) has been fueled by legal and technical means such as smart disclosure, data portability and data altruism. By using a PDMS, individuals can effortlessly gather and share data, ...

research-article
Open Access
A Computer Vision Approach for Detecting Discrepancies in Map Textual Labels
Article No.: 12, Pages 1–9, https://doi.org/10.1145/3603719.3603722

Maps provide various sources of information. An important example of such information is textual labels such as cities, neighborhoods, and street names. Although we treat this information as facts, and despite the massive effort done by providers to ...

research-article
Open Access
Accelerating Machine Learning Queries with Linear Algebra Query Processing
Article No.: 13, Pages 1–12, https://doi.org/10.1145/3603719.3603726

The rapid growth of large-scale machine learning (ML) models has led numerous commercial companies to utilize ML models for generating predictive results to help business decision-making. As two primary components in traditional predictive pipelines, ...

research-article
A Long-term Time Series Forecasting method with Multiple Decomposition
Article No.: 14, Pages 1–9, https://doi.org/10.1145/3603719.3603738

In various real-world applications such as weather forecasting, energy consumption planning, and traffic flow prediction, time serves as a critical variable. These applications can be collectively referred to as time-series prediction problems. Despite ...

research-article
Heterogeneous Graph Neural Network via Knowledge Relations for Fake News Detection
Article No.: 15, Pages 1–11, https://doi.org/10.1145/3603719.3603736

The proliferation of fake news in social media has been recognized as a severe problem for society, and substantial attempts have been devoted to fake news detection to alleviate the detrimental impacts. Knowledge graphs (KGs) comprise rich factual ...

research-article
Open Access
Less is More: How Fewer Results Improve Progressive Join Query Processing
Article No.: 16, Pages 1–12, https://doi.org/10.1145/3603719.3603728

With the requirements to enable data analytics and exploration interactively and efficiently, progressive data processing, especially progressive join, became essential to data science. Join queries are particularly challenging due to the correlation ...

SESSION: Short Papers
short-paper
Open Access
Fast Algorithm for Embedded Order Dependency Validation
Article No.: 17, Pages 1–4, https://doi.org/10.1145/3603719.3603720

Order Dependencies (ODs) have many applications, such as query optimization, data integration, and data cleaning. Although many works addressed the problem of discovering OD (and its variants), they do not consider datasets with missing values, a ...

short-paper
MSLS: Meta-graph Search with Learnable Supernet for Heterogeneous Graph Neural Networks
Article No.: 18, Pages 1–4, https://doi.org/10.1145/3603719.3603727

In recent years, heterogeneous graph neural networks (HGNNs) have achieved excellent performance. The efficient HGNNs consist of meta-graphs and aggregation operations. Since manually designing meta-graph is an expert-dependent and time-consuming ...

short-paper
Best Short Paper
InfoMoD: Information-theoretic Model Diagnostics
Article No.: 19, Pages 1–4, https://doi.org/10.1145/3603719.3603725

Validating and debugging machine learning models is done by testing them on unseen data. Analyzing model performance on various subsets of the data is critical for fairness, trust, bias detection and explainability. In this paper, we describe a new way ...

short-paper
Decoupled Graph Neural Architecture Search with Variable Propagation Operation and Appropriate Depth
Article No.: 20, Pages 1–4, https://doi.org/10.1145/3603719.3603729

To alleviate the over-smoothing problem caused by deep graph neural networks, decoupled graph neural networks (DGNNs) are proposed. DGNNs decouple the graph neural network into two atomic operations, the propagation (P) operation and the transformation ...

short-paper
Early ICU Mortality Prediction with Deep Federated Learning: A Real-World Scenario
Article No.: 21, Pages 1–4, https://doi.org/10.1145/3603719.3603723

The generation of large amounts of healthcare data has motivated the use of Machine Learning (ML) to train robust models for clinical tasks. However, limitations of local datasets and restrictions on sharing patient data impede the use of traditional ML ...

short-paper
Privacy-Preserving Redaction of Diagnosis Data through Source Code Analysis
Article No.: 22, Pages 1–4, https://doi.org/10.1145/3603719.3603734

Protecting sensitive information in diagnostic data, such as logs, is a critical concern in the industrial software diagnosis and debugging process. While there are many tools developed to automatically redact the logs for identifying and removing ...

short-paper
TGSLN: Time-aware Graph Structure Learning Network for Multi-variates Stock Sector Ranking Recommendation
Article No.: 23, Pages 1–4, https://doi.org/10.1145/3603719.3603741

In the field of financial prediction, most studies focus on individual stocks or stock indices. Stock sectors are collections of stocks with similar characteristics and the indices of sectors have more stable trends and predictability compared to ...

short-paper
Selecting Efficient Cluster Resources for Data Analytics: When and How to Allocate for In-Memory Processing?
Article No.: 24, Pages 1–4, https://doi.org/10.1145/3603719.3603733

Distributed dataflow systems such as Apache Spark or Apache Flink enable parallel, in-memory data processing on large clusters of commodity hardware. Consequently, the appropriate amount of memory to allocate to the cluster is a crucial consideration. ...

SESSION: Demos
demonstration
Interactive Data Mashups for User-Centric Data Analysis
Article No.: 25, Pages 1–4, https://doi.org/10.1145/3603719.3603742

Nowadays, the amount of data is growing rapidly. Through data mining and analysis, information and knowledge can be derived based on this growing volume of data. Different tools have been introduced in the past to specify data analysis scenarios in a ...

SESSION: Posters
poster
Four Factors Affecting Missing Data Imputation
Article No.: 26, Pages 1–2, https://doi.org/10.1145/3603719.3604285

Missing data is a common problem in datasets and impacts the reliability of data analysis. Numerous methods to impute (i.e., predict and replace) missing values have been proposed. The quality of these imputed values depends on factors like correlation, ...

Contributors
  • University of Southern California
  • University of Southern California
  • The University of Chicago

Acceptance Rates

Overall Acceptance Rate: 56 of 146 submissions, 38%

Year        Submitted  Accepted  Rate
SSDBM '18   75         30        40%
SSDBM '14   71         26        37%
Overall     146        56        38%