
Collaborating Authors: Zhou, Xiaotian


Auto-Drafting Police Reports from Noisy ASR Outputs: A Trust-Centered LLM Approach

arXiv.org Artificial Intelligence

Achieving a delicate balance between fostering trust in law enforcement and protecting the rights of both officers and civilians continues to emerge as a pressing research and product challenge in the world today. In the pursuit of fairness and transparency, this study presents an innovative AI-driven system designed to generate police report drafts from complex, noisy, and multi-role dialogue data. Our approach intelligently extracts key elements of law enforcement interactions and includes them in the draft, producing structured narratives that are not only high in quality but also reinforce accountability and procedural clarity. This framework holds the potential to transform the reporting process, ensuring greater oversight, consistency, and fairness in future policing practices. A demonstration video of our system can be accessed at https://drive.google.com/file/d/1kBrsGGR8e3B5xPSblrchRGj-Y-kpCHNO/view?usp=sharing
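The abstract leaves the system's internals unspecified; purely as illustration, here is a minimal Python sketch of a two-stage pipeline of the kind described, first extracting key elements from a noisy multi-role transcript and then drafting a structured narrative from them. The `llm_complete` stub, the prompt wording, and the JSON field names are all hypothetical placeholders, not the authors' system.

```python
# Hypothetical two-stage draft pipeline; llm_complete is a stub standing in
# for any chat-completion API, and the prompts/fields are illustrative only.
import json

def llm_complete(prompt: str) -> str:
    """Stub for an LLM call; replace with a real chat-completion client."""
    raise NotImplementedError("wire up your LLM provider here")

EXTRACTION_PROMPT = """You are assisting with police report drafting.
From the noisy ASR transcript below (speakers may be mislabeled and words
misrecognized), extract the key elements of the interaction as JSON with
fields: "time", "location", "parties", "observed_actions", "statements".
Mark anything uncertain as "unclear" rather than guessing.

Transcript:
{transcript}
"""

DRAFTING_PROMPT = """Using only the extracted elements below, write a neutral,
chronological police report draft. Do not add facts that are not present;
flag gaps explicitly so the reviewing officer can fill them in.

Elements:
{elements}
"""

def draft_report(transcript: str) -> str:
    # Stage 1: structured extraction; parse early so malformed output fails fast.
    elements = json.loads(llm_complete(EXTRACTION_PROMPT.format(transcript=transcript)))
    # Stage 2: narrative drafting conditioned only on the extracted elements.
    return llm_complete(DRAFTING_PROMPT.format(elements=json.dumps(elements, indent=2)))
```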


Viral Marketing in Social Networks with Competing Products

arXiv.org Artificial Intelligence

Consider a directed network where each node is either red (using the red product), blue (using the blue product), or uncolored (undecided). Then in each round, an uncolored node chooses red (resp. blue) with some probability proportional to the number of its red (resp. blue) out-neighbors. What is the best strategy to maximize the expected final number of red nodes given the budget to select $k$ red seed nodes? After proving that this problem is computationally hard, we provide a polynomial time approximation algorithm with the best possible approximation guarantee, building on the monotonicity and submodularity of the objective function and exploiting the Monte Carlo method. Furthermore, our experiments on various real-world and synthetic networks demonstrate that our proposed algorithm outperforms other algorithms. Additionally, we investigate the convergence time of the aforementioned process both theoretically and experimentally. In particular, we prove several tight bounds on the convergence time in terms of different graph parameters, such as the number of nodes/edges, maximum out-degree and diameter, by developing novel proof techniques.
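Since the abstract fixes the adoption rule only up to proportionality, the Python sketch below assumes an uncolored node with out-degree d, r red out-neighbors, and b blue out-neighbors turns red with probability r/d and blue with probability b/d in each round, staying uncolored otherwise; that normalization is a guess. The greedy seed selection with Monte Carlo estimates follows the standard template for monotone submodular maximization that the abstract alludes to.

```python
# Monte Carlo greedy seeding for the two-color diffusion process.
# Assumption: uncolored v turns red w.p. r/d and blue w.p. b/d per round,
# where r, b count red/blue out-neighbors and d is v's out-degree.
import random

def simulate(adj, red_seeds, blue_seeds, rounds=50):
    color = {v: None for v in adj}
    color.update({v: "R" for v in red_seeds})
    color.update({v: "B" for v in blue_seeds})
    for _ in range(rounds):
        updates = {}
        for v, outs in adj.items():          # synchronous rounds
            if color[v] is not None or not outs:
                continue
            r = sum(color[u] == "R" for u in outs)
            b = sum(color[u] == "B" for u in outs)
            x = random.random()
            if x < r / len(outs):
                updates[v] = "R"
            elif x < (r + b) / len(outs):
                updates[v] = "B"
        color.update(updates)
    return sum(c == "R" for c in color.values())

def greedy_red_seeds(adj, blue_seeds, k, trials=200):
    """Greedily pick k red seeds maximizing the Monte Carlo estimate of the
    expected final number of red nodes."""
    seeds = set()
    candidates = set(adj) - set(blue_seeds)
    for _ in range(k):
        def gain(v):
            return sum(simulate(adj, seeds | {v}, blue_seeds)
                       for _ in range(trials)) / trials
        seeds.add(max(candidates - seeds, key=gain))
    return seeds

# Toy usage on a 4-node digraph with one blue seed:
adj = {0: [1, 2], 1: [2], 2: [0], 3: [0, 1]}
print(greedy_red_seeds(adj, blue_seeds={2}, k=1))
```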


Optimization on the smallest eigenvalue of grounded Laplacian matrix via edge addition

arXiv.org Artificial Intelligence

The grounded Laplacian matrix $\mathbf{L}_{-S}$ of a graph $\mathcal{G}=(V,E)$ with $n=|V|$ nodes and $m=|E|$ edges is an $(n-s)\times (n-s)$ submatrix of its Laplacian matrix $\mathbf{L}$, obtained from $\mathbf{L}$ by deleting the rows and columns corresponding to the $s=|S| \ll n$ ground nodes forming the set $S\subset V$. The smallest eigenvalue of $\mathbf{L}_{-S}$ plays an important role in various practical scenarios, such as characterizing the convergence rate of leader-follower opinion dynamics, with a larger eigenvalue indicating faster convergence of opinions. In this paper, we study the problem of adding $k \ll n$ edges from the candidate set of nonexistent edges $Q = (V\times V)\setminus E$ in order to maximize the smallest eigenvalue of the grounded Laplacian matrix. We show that the objective function of this combinatorial optimization problem is monotone but non-submodular. To solve the problem, we first restrict the candidate edge set $Q$ to $(S\times (V\setminus S))\setminus E$, and prove that the restricted problem has the same optimal solution as the original one, even though the size of $Q$ is reduced from $O(n^2)$ to $O(n)$. We then propose two greedy approximation algorithms. The first is a simple greedy algorithm with approximation ratio $(1-e^{-\alpha\gamma})/\alpha$ and time complexity $O(kn^4)$, where $\gamma$ and $\alpha$ are, respectively, the submodularity ratio and the curvature, for which we provide bounds in some particular cases. The second is a fast greedy algorithm without an approximation guarantee, which runs in $\tilde{O}(km)$ time, where $\tilde{O}(\cdot)$ suppresses ${\rm poly}(\log n)$ factors. Extensive experiments on various real networks validate the superiority of our algorithms in terms of both effectiveness and efficiency.
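As a concrete illustration of the objective and of the simple greedy scheme over the reduced candidate set $(S\times(V\setminus S))\setminus E$, here is a brute-force Python sketch using numpy. It recomputes the smallest eigenvalue for every candidate edge, in the spirit of the $O(kn^4)$ simple greedy; it does not implement the paper's fast $\tilde{O}(km)$ variant, and it assumes an undirected, unweighted graph.

```python
# Simple greedy edge addition to raise the smallest eigenvalue of the
# grounded Laplacian; candidates restricted to S x (V \ S), as in the paper.
import numpy as np

def grounded_laplacian(A, S):
    """Laplacian of adjacency A with rows/columns of ground set S removed."""
    L = np.diag(A.sum(axis=1)) - A
    keep = [i for i in range(len(A)) if i not in S]
    return L[np.ix_(keep, keep)]

def smallest_eig(A, S):
    return np.linalg.eigvalsh(grounded_laplacian(A, S))[0]  # ascending order

def greedy_add_edges(A, S, k):
    A = A.astype(float).copy()
    n, added = len(A), []
    for _ in range(k):
        best, best_val = None, -np.inf
        for s in S:                       # reduced candidate set S x (V \ S)
            for v in range(n):
                if v in S or A[s, v]:     # skip S-S pairs and existing edges
                    continue
                A[s, v] = A[v, s] = 1.0   # tentatively add edge (s, v)
                val = smallest_eig(A, S)
                A[s, v] = A[v, s] = 0.0
                if val > best_val:
                    best, best_val = (s, v), val
        s, v = best
        A[s, v] = A[v, s] = 1.0           # commit the best edge of this round
        added.append(best)
    return added, smallest_eig(A, S)
```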


Large Language Model Soft Ideologization via AI-Self-Consciousness

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated human-level performance on a vast spectrum of natural language tasks. However, few studies have addressed LLM threats and vulnerabilities from an ideological perspective, especially as these models are increasingly deployed in sensitive domains, e.g., elections and education. In this study, we explore the implications of GPT soft ideologization through the use of AI-self-consciousness. Through GPT self-conversations, the AI can be granted a vision to "comprehend" the intended ideology and subsequently generate fine-tuning data for LLM ideology injection. Compared to traditional government ideology-manipulation techniques, such as information censorship, LLM ideologization proves advantageous: it is easy to implement, cost-effective, and powerful, and is therefore brimming with risks.


Adaptive Margin Ranking Loss for Knowledge Graph Embeddings via a Correntropy Objective Function

arXiv.org Artificial Intelligence

Translation-based embedding models have gained significant attention in link prediction tasks for knowledge graphs. TransE is the primary model among translation-based embeddings and is well known for its low complexity and high efficiency. Therefore, most earlier works have modified the score function of TransE in order to improve performance on link prediction tasks. Nevertheless, it has been shown, both theoretically and experimentally, that the performance of TransE strongly depends on the loss function. Margin Ranking Loss (MRL) is one of the earliest loss functions and is widely used for training TransE. However, MRL does not enforce the scores of positive triples to be sufficiently small to fulfill the translation from head to tail via the relation vector (the original assumption of TransE). To tackle this problem, several loss functions have recently been proposed that add upper and lower bounds to the scores of positive and negative samples. Although highly effective, these models suffer from an expanded search space for hyperparameter selection (in particular, the upper and lower bounds of the scores), on which the performance of translation-based models is highly dependent. In this paper, we propose a new loss function, dubbed Adaptive Margin Loss (AML), for training translation-based embedding models. The formulation of the proposed loss function enables an adaptive and automated adjustment of the margin during the learning process: instead of setting two values (an upper bound and a lower bound), only the center of the margin needs to be determined, and the margin expands automatically during learning until it converges. In our experiments on a set of standard benchmark datasets, including Freebase and WordNet, the effectiveness of AML is confirmed for training TransE on link prediction tasks.
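The abstract describes AML's behavior but not its exact formulation (or the correntropy objective from the title), so the PyTorch sketch below is only a loose illustration: a TransE scorer trained with a ranking loss whose margin is a user-chosen center plus a learnable non-negative slack, with a small regularizer rewarding margin expansion so the slack does not collapse. The exact AML objective is given in the paper.

```python
# Illustrative TransE with an adaptive margin: margin = center + learnable
# non-negative slack, loosely mimicking AML's idea of fixing only the margin
# center; the paper's exact objective differs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransEAdaptiveMargin(nn.Module):
    def __init__(self, n_ent, n_rel, dim=100, center=1.0):
        super().__init__()
        self.ent = nn.Embedding(n_ent, dim)
        self.rel = nn.Embedding(n_rel, dim)
        nn.init.xavier_uniform_(self.ent.weight)
        nn.init.xavier_uniform_(self.rel.weight)
        self.center = center
        self.slack = nn.Parameter(torch.zeros(()))  # learnable margin offset

    def score(self, h, r, t):
        # TransE: plausible triples satisfy h + r ~= t (L2 distance score)
        return (self.ent(h) + self.rel(r) - self.ent(t)).norm(p=2, dim=-1)

    def loss(self, pos, neg):
        margin = self.center + F.softplus(self.slack)  # margin >= center
        pos_s = self.score(*pos)
        neg_s = self.score(*neg)
        # Ranking term, plus a mild reward for widening the margin
        # (illustrative stand-in for AML's automatic margin expansion).
        return F.relu(pos_s - neg_s + margin).mean() - 0.01 * margin
```

In practice this module would sit inside a standard negative-sampling training loop over benchmark triples such as those from WordNet or Freebase.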