DOI:10.1145/3148055.3149208
poster

Case Study: Clustering Big Stellar Data with EM*

Published: 05 December 2017

Abstract

Astronomy is unquestionably a Big Data discipline, and clustering is a very common task in the astronomy domain. The expectation-maximization algorithm is among the top 10 data mining algorithms used in scientific and industrial applications; however, we observe that the astronomical community does not make use of it as a clustering algorithm. In this work, we cluster $\sim$1M stellar objects (simulated Galactic spectral data) via the traditional expectation-maximization algorithm for clustering (EM-T) and via our extension of EM-T, which we call EM*, and we present the experimental results.

Cited By

  • (2022) DCEM: An R package for clustering big data via data-centric modification of Expectation Maximization. SoftwareX 17 (100944). DOI:10.1016/j.softx.2021.100944. Online publication date: Jan-2022

Published In

BDCAT '17: Proceedings of the Fourth IEEE/ACM International Conference on Big Data Computing, Applications and Technologies
December 2017
288 pages
ISBN:9781450355490
DOI:10.1145/3148055
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. astronomy
  2. big data
  3. clustering
  4. expectation maximization
  5. heap

Qualifiers

  • Poster

Conference

UCC '17

Acceptance Rates

BDCAT '17 Paper Acceptance Rate 27 of 93 submissions, 29%;
Overall Acceptance Rate 27 of 93 submissions, 29%
