DOI: 10.5555/1251086.1251093
Article

Joint data streaming and sampling techniques for detection of super sources and destinations

Published: 19 October 2005

Abstract

Detecting the sources or destinations that have communicated with a large number of distinct destinations or sources during a small time interval is an important problem in network measurement and security. Previous detection approaches are not able to deliver the desired accuracy at high link speeds (10 to 40 Gbps). In this work, we propose two novel algorithms that provide accurate and efficient solutions to this problem. Their designs are based on the insight that sampling and data streaming are often suitable for capturing different and complementary regions of the information spectrum, and that a close collaboration between them is an excellent way to recover the complete information. Our first solution builds on the standard hash-based flow sampling algorithm. Its main innovation is that the sampled traffic is further filtered by a data streaming module, which allows for a much higher sampling rate and hence much higher accuracy. Our second solution is more sophisticated but offers higher accuracy. It combines the power of data streaming in efficiently estimating quantities associated with a given identity with the power of sampling in collecting a list of candidate identities. The performance of both solutions is evaluated using both mathematical analysis and trace-driven experiments on real-world Internet traffic.
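
The idea behind the first solution can be illustrated with a minimal Python sketch. It is one plausible reading of the abstract, not the authors' implementation: the class name `SuperSourceDetector`, the parameters `sampling_rate` and `filter_bits`, the single-hash bit filter, and the use of BLAKE2b as the hash function are all assumptions made for illustration. A flow (source, destination pair) is sampled by hashing its identifier, so every packet of a flow gets the same decision. A small filter then suppresses repeat packets of flows already recorded, so each sampled flow is counted once per source.

```python
import hashlib
from collections import Counter


def _hash(data: bytes, salt: bytes) -> int:
    """64-bit hash of `data` under `salt` (stand-in for a fast hardware hash)."""
    digest = hashlib.blake2b(data, digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "big")


class SuperSourceDetector:
    """Hypothetical sketch: hash-based flow sampling plus a duplicate filter."""

    def __init__(self, sampling_rate: float = 0.25, filter_bits: int = 1 << 20):
        self.sampling_rate = sampling_rate
        # A flow is sampled when its 64-bit hash falls below this threshold.
        self.threshold = int(sampling_rate * 2**64)
        self.filter_bits = filter_bits
        # One byte per filter slot for readability; real designs pack bits.
        self.seen = bytearray(filter_bits)
        self.fanout = Counter()  # source -> number of sampled distinct flows

    def observe(self, src: str, dst: str) -> None:
        flow = f"{src}>{dst}".encode()
        # Sampling stage: the decision depends only on the flow identifier.
        if _hash(flow, b"sample") >= self.threshold:
            return
        # Filtering stage: skip packets of flows that were already recorded.
        # Hash collisions can hide a few new flows (slight undercount).
        slot = _hash(flow, b"filter") % self.filter_bits
        if self.seen[slot]:
            return
        self.seen[slot] = 1
        self.fanout[src] += 1

    def estimate(self, src: str) -> float:
        """Scale the sampled count back up by the sampling rate."""
        return self.fanout[src] / self.sampling_rate

    def super_sources(self, min_fanout: float) -> list:
        return sorted(s for s in self.fanout if self.estimate(s) >= min_fanout)


if __name__ == "__main__":
    detector = SuperSourceDetector(sampling_rate=0.5)
    for i in range(1000):
        detector.observe("10.0.0.1", f"192.168.{i // 256}.{i % 256}")
    print(detector.estimate("10.0.0.1"), detector.super_sources(500))
```

Because repeat packets are dropped by the filter before they reach the per-source table, the table only does work once per sampled flow, which is the kind of load reduction that would permit a higher sampling rate. The second solution, which estimates fan-out with a streaming structure and uses sampling only to collect candidate identities, is not sketched here.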

Published In

IMC '05: Proceedings of the 5th ACM SIGCOMM conference on Internet measurement
October 2005
389 pages

Publisher

USENIX Association

United States

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 277 of 1,083 submissions, 26%

Cited By

  • (2020) Efficient Online Classification and Tracking on Resource-constrained IoT Devices. ACM Transactions on Internet of Things, 1:3 (1-29). DOI: 10.1145/3392051. Online publication date: 13-Jul-2020
  • (2019) Continuously Distinct Sampling over Centralized and Distributed High Speed Data Streams. IEEE Transactions on Parallel and Distributed Systems, 30:2 (300-314). DOI: 10.1109/TPDS.2018.2865452. Online publication date: 1-Feb-2019
  • (2019) Efficient Identification of TOP-K Heavy Hitters over Sliding Windows. Mobile Networks and Applications, 24:5 (1732-1741). DOI: 10.1007/s11036-018-1051-x. Online publication date: 1-Oct-2019
  • (2017) A high-performance algorithm for identifying frequent items in data streams. Proceedings of the 2017 Internet Measurement Conference (268-282). DOI: 10.1145/3131365.3131407. Online publication date: 1-Nov-2017
  • (2016) Identifying High-Cardinality Hosts from Network-Wide Traffic Measurements. IEEE Transactions on Dependable and Secure Computing, 13:5 (547-558). DOI: 10.1109/TDSC.2015.2423675. Online publication date: 1-Sep-2016
  • (2015) A Software Defined Sketch System for Traffic Monitoring. Proceedings of the Eleventh ACM/IEEE Symposium on Architectures for networking and communications systems (197-198). DOI: 10.5555/2772722.2772755. Online publication date: 7-May-2015
  • (2014) Superspreader detection system on NetFPGA platform. Proceedings of the tenth ACM/IEEE symposium on Architectures for networking and communications systems (247-248). DOI: 10.1145/2658260.2661766. Online publication date: 20-Oct-2014
  • (2013) Line speed accurate superspreader identification using dynamic error compensation. Computer Communications, 36:13 (1460-1470). DOI: 10.1016/j.comcom.2013.05.006. Online publication date: 1-Jul-2013
  • (2011) Finding heavy distinct hitters in data streams. Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures (299-308). DOI: 10.1145/1989493.1989541. Online publication date: 4-Jun-2011
  • (2009) A new data streaming method for locating hosts with large connection degree. Proceedings of the 28th IEEE conference on Global telecommunications (6421-6426). DOI: 10.5555/1811982.1812446. Online publication date: 30-Nov-2009
