
Collaborating Authors

 Joseph, Kenneth


A Recipe For Building a Compliant Real Estate Chatbot

arXiv.org Artificial Intelligence

In recent years, there has been significant effort to align large language models with human preferences. This work focuses on developing a chatbot specialized in the real estate domain, with an emphasis on incorporating compliant behavior to ensure it can be used without perpetuating discriminatory practices like steering and redlining, which have historically plagued the real estate industry in the United States. Building on prior work, we present a method for generating a synthetic general instruction-following dataset, along with safety data. Through extensive evaluations and benchmarks, we fine-tuned a llama-3-8B-instruct model and demonstrated that we can enhance its performance significantly to match huge closed-source models like GPT-4o while making it safer and more compliant. We open-source the model, data and code to support further development and research in the community.


Domain Specific Question Answering Over Knowledge Graphs Using Logical Programming and Large Language Models

arXiv.org Artificial Intelligence

Question Answering over Knowledge Graphs (KGQA) poses significant challenges in the field of Natural Language Processing (NLP). As structured knowledge graphs capturing rich semantic information become prevalent, there is a pressing need for intelligent systems that can reason effectively and provide accurate answers to intricate questions within specific domains. The primary focus of KGQA is to bridge the gap between human language and structured knowledge representations. When presented with a question in natural language, KGQA systems aim to traverse the ... We propose an approach that utilizes LLMs to represent questions within a specific domain, extracting their meanings, while employing logical programming techniques for reasoning and knowledge representation. Our objective is to demonstrate how this integration enables robust and adaptable KGQA systems that can navigate domain-specific knowledge graphs and provide accurate answers to complex questions. To evaluate the effectiveness of our proposed approach, we conduct experiments using the MetaQA dataset (Zhang et al., 2018), a widely adopted benchmark in KGQA research.


A Bayesian Graphical Model to Discover Latent Events from Twitter

AAAI Conferences

Online social networks like Twitter and Facebook produce an overwhelming amount of information every day. However, research suggests that much of this content focuses on a reasonably sized set of ongoing events or topics that are both temporally and geographically situated. These patterns are especially observable when the data that is generated contains geospatial information, usually generated by a location-enabled device such as a smartphone. In this paper, we consider a data set of 1.4 million geo-tagged tweets from a country during a large social movement, where social events and demonstrations occurred frequently. We use a probabilistic graphical model to discover these events within the data in a way that informs us of their spatial, temporal and topical focus. Quantitative analysis suggests that the streaming algorithm proposed in the paper uncovers both well-known events and lesser-known but important events that occurred within the timeframe of the dataset. In addition, the model can be used to predict the location and time of texts that do not have these pieces of information, which account for much of the data on the web.