AdGraph: A Graph-Based Approach to Ad and Tracker Blocking(I. INTRODUCTION)

나는 푸딩 2025. 6. 10. 15:16

I. INTRODUCTION

Abstract

The need for online content blocking continues to grow.

  • Studies show that blocking ads and tracking resources can lead to:
    • Improved performance
    • Enhanced privacy
    • Stronger security
    • Better user experience
  • Browser vendors are increasingly integrating content blocking features directly into browsers.

Limitations of existing content blocking tools

  • Traditional tools rely on URL patterns or JavaScript behavior/code structure to block suspicious content.
    • Limitations:
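      • URL-pattern rules can be bypassed by rotating domains, e.g., with domain generation algorithms (DGAs)
      • URL-pattern rules can be evaded by serving content from trusted domains (e.g., first-party or CDN proxies)
      • Code-based detection can be defeated by obfuscating or restructuring the JavaScript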
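
To make the URL-based evasions concrete, here is a minimal sketch of pattern-based blocking. The patterns, the domains, and the `is_blocked` helper are all hypothetical, and the glob syntax is much simpler than real filter-list syntax.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Hypothetical rules in a simplified glob syntax (not real filter-list syntax).
BLOCK_PATTERNS = ["*.ads.example/*", "*/tracker.js"]


def is_blocked(url: str) -> bool:
    """Return True if the host + path of `url` matches any block pattern."""
    parsed = urlparse(url)
    target = parsed.netloc + parsed.path
    return any(fnmatchcase(target, pattern) for pattern in BLOCK_PATTERNS)


# Known ad domain: matched by the first pattern.
known = is_blocked("https://cdn.ads.example/banner.png")
# Same resource served from a freshly generated domain: no pattern matches.
rotated = is_blocked("https://x7qk2p.example/banner.png")
# Same resource proxied through the first-party site: no pattern matches.
proxied = is_blocked("https://news.example/static/banner.png")

print(known, rotated, proxied)  # True False False
```

The rule only sees the URL string, so changing where the resource is served from is enough to slip past it, even though the resource itself is unchanged.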

Limitations of alternative approaches

  • Prior approaches rely on manually curated filter lists, predefined heuristics, or machine learning over network or code features in isolation; they are often incomplete or vulnerable to simple evasion techniques.

ADGRAPH Contributions

What is ADGRAPH?

  • ADGRAPH is a novel machine learning-based approach that detects and blocks advertising and tracking resources by constructing a graph based on HTML structure, JavaScript behavior, and network request interactions.
  • It provides a context-rich blocking strategy:
    • Considers both the past and current context of each network request, which lets it detect resources that conventional methods miss
    • More robust against evasion techniques that manipulate a single feature (e.g., only the URL or only the code)
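
The idea of a cross-layer graph can be sketched roughly as follows. This is a toy illustration, not ADGRAPH's actual implementation: the `PageGraph` class, the node kinds, and the three features are assumptions chosen only to show how a request's context can be read off the graph.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PageGraph:
    """Toy cross-layer graph: nodes are HTML elements, scripts, or requests."""
    node_kind: dict = field(default_factory=dict)  # node id -> kind
    parents: dict = field(default_factory=lambda: defaultdict(set))  # child -> parents

    def add_node(self, node_id: str, kind: str) -> None:
        self.node_kind[node_id] = kind

    def add_edge(self, parent: str, child: str) -> None:
        """Record that `parent` contains, created, or triggered `child`."""
        self.parents[child].add(parent)

    def ancestors(self, node_id: str) -> set:
        """All nodes that directly or transitively led to `node_id`."""
        seen, stack = set(), [node_id]
        while stack:
            for parent in self.parents.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def features(self, request_id: str) -> dict:
        """Hypothetical context features for one network request node."""
        chain = self.ancestors(request_id)
        return {
            "num_ancestors": len(chain),
            "script_in_chain": any(self.node_kind[n] == "script" for n in chain),
            "iframe_in_chain": any(self.node_kind[n] == "iframe" for n in chain),
        }


g = PageGraph()
for node_id, kind in [("html", "element"), ("script#1", "script"),
                      ("iframe#1", "iframe"), ("req:pixel.gif", "request"),
                      ("req:logo.png", "request")]:
    g.add_node(node_id, kind)

g.add_edge("html", "script#1")           # HTML layer: the page includes a script
g.add_edge("script#1", "iframe#1")       # JavaScript layer: the script injects an iframe
g.add_edge("iframe#1", "req:pixel.gif")  # network layer: the iframe issues a request
g.add_edge("html", "req:logo.png")       # an image requested directly by the page

print(g.features("req:pixel.gif"))
print(g.features("req:logo.png"))
```

The two requests get different feature vectors even if their URLs look equally harmless, because the features come from how each request came to exist. In a full system, vectors like these would be passed to a trained classifier.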

Key Features

  1. Supports both online (real-time) and offline usage
    • Online: Blocks ads and trackers in real time within the browser
    • Offline: Assists in generating and maintaining filter lists
  2. High performance
    • Demonstrates faster page load times compared to Adblock Plus
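
For the offline mode, one plausible workflow is to turn URLs flagged by the classifier into candidate rules for a human maintainer to review. The sketch below is an assumption about how that could look, not a description of ADGRAPH's tooling: `candidate_rules` and the `min_hits` threshold are hypothetical, and the only real-world detail used is the Adblock Plus style `||host^` domain rule.

```python
from collections import Counter
from urllib.parse import urlparse


def candidate_rules(flagged_urls, min_hits=2):
    """Turn classifier-flagged URLs into domain-level rule candidates.

    Emits Adblock Plus style `||host^` rules for hosts flagged at least
    `min_hits` times. The threshold is an arbitrary illustration.
    """
    hosts = Counter(urlparse(url).hostname for url in flagged_urls)
    return sorted(
        f"||{host}^" for host, hits in hosts.items() if host and hits >= min_hits
    )


flagged = [
    "https://t.example/a.js",
    "https://t.example/b.gif",
    "https://once.example/p",
]
rules = candidate_rules(flagged)
print(rules)  # ['||t.example^']
```

Requiring several hits before proposing a rule is one simple way to keep a single misclassification from becoming a domain-wide block.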

Summary of ADGRAPH’s Contributions

  1. Graph-based machine learning approach
    • Utilizes HTML structure, JavaScript behavior, and network requests to identify ad/tracking resources
  2. Large-scale evaluation
    • Reproduces filter list labels with 95.33% accuracy on the Alexa Top 10,000 websites
    • Identifies ad/tracking resources that filter lists miss, as well as legitimate resources that filter lists block by mistake
  3. Efficient implementation
    • Modified Chromium’s Blink and V8 engines to monitor document behavior
    • Loads pages faster than Adblock Plus, and faster than stock Chromium on 42% of tested websites
  4. Site breakage evaluation
    • Causes site breakage at a rate comparable to filter lists (15.0% of sites for ADGRAPH vs. 11.4% for filter lists)
    • Causes severe breakage on slightly fewer sites (5.9% for ADGRAPH vs. 6.4% for filter lists)

Paper Structure

  1. Section II: Reviews existing work on ad and tracker blocking and its limitations
  2. Section III: Describes the design and implementation of ADGRAPH
  3. Section IV: Evaluates ADGRAPH in terms of blocking accuracy, performance, and site breakage
  4. Section V: Discusses limitations, improvements, and potential for offline use
  5. Section VI: Conclusion