Provably Extending PageRank-based Local Clustering Algorithm to Weighted Directed Graphs with Self-Loops and to Hypergraphs
Li, Zihao, Fu, Dongqi, Liu, Hengyu, He, Jingrui
–arXiv.org Artificial Intelligence
Local clustering aims to find a compact cluster near the given starting instances. This work focuses on graph local clustering, which has broad applications beyond graphs because of the internal connectivities within various modalities. While most existing studies on local graph clustering adopt the discrete graph setting (i.e., unweighted graphs without self-loops), real-world graphs can be more complex. In this paper, we extend the non-approximating Andersen-Chung-Lang ("ACL") algorithm beyond discrete graphs and generalize its quadratic optimality to a wider range of graphs, including weighted, directed, and self-looped graphs and hypergraphs. Specifically, leveraging PageRank, we propose two algorithms: GeneralACL for graphs and HyperACL for hypergraphs. We theoretically prove that, under two mild conditions, both algorithms can identify a quadratically optimal local cluster in terms of conductance with at least 1/2 probability. On the property of hypergraphs, we address a fundamental gap in the literature by defining conductance for hypergraphs from the perspective of hypergraph random walks. Additionally, we provide experiments to validate our theoretical findings.
arXiv.org Artificial Intelligence
Dec-3-2024
- Country:
- Africa > Middle East
- Tunisia > Monastir Governorate > Monastir (0.04)
- Asia
- China > Heilongjiang Province
- Harbin (0.04)
- Myanmar > Tanintharyi Region
- Dawei (0.04)
- Singapore (0.04)
- Vietnam (0.04)
- China > Heilongjiang Province
- Europe
- Austria > Vienna (0.14)
- Czechia > South Moravian Region
- Brno (0.04)
- France > Île-de-France
- Italy > Veneto
- Venice (0.04)
- Middle East > Malta
- Eastern Region > Northern Harbour District > St. Julian's (0.04)
- Slovenia > Central Slovenia
- Municipality of Ljubljana > Ljubljana (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- United Kingdom > England
- West Midlands > Birmingham (0.04)
- North America
- Canada
- Mexico > Mexico City
- Mexico City (0.04)
- United States
- California
- Alameda County > Berkeley (0.04)
- Los Angeles County > Long Beach (0.04)
- San Francisco County > San Francisco (0.14)
- Santa Cruz County > Santa Cruz (0.04)
- Massachusetts (0.04)
- District of Columbia > Washington (0.04)
- Washington > King County
- Seattle (0.04)
- Virginia > Alexandria County
- Alexandria (0.04)
- New York
- Bronx County > New York City (0.04)
- Kings County > New York City (0.04)
- New York County > New York City (0.14)
- Queens County > New York City (0.04)
- Richmond County > New York City (0.04)
- Hawaii > Honolulu County
- Honolulu (0.04)
- Texas
- Harris County > Houston (0.04)
- Travis County > Austin (0.04)
- Maryland > Baltimore (0.04)
- Illinois > Champaign County
- Urbana (0.04)
- California
- Oceania > Australia
- Queensland (0.04)
- Africa > Middle East
- Genre:
- Research Report (0.50)
- Industry:
- Information Technology (0.46)
- Technology:
- Information Technology
- Artificial Intelligence
- Machine Learning > Statistical Learning
- Clustering (1.00)
- Natural Language (1.00)
- Representation & Reasoning (1.00)
- Machine Learning > Statistical Learning
- Communications (1.00)
- Data Science > Data Mining (1.00)
- Information Management (1.00)
- Artificial Intelligence
- Information Technology