AITopics | Sterbentz, Marko

Sterbentz, Marko

Satyrn: A Platform for Analytics Augmented Generation

Sterbentz, Marko, Barrie, Cameron, Shahi, Shubham, Dutta, Abhratanu, Hooshmand, Donna, Pack, Harper, Hammond, Kristian J.

arXiv.org Artificial Intelligence, Jun-17-2024

Large language models (LLMs) are capable of producing documents, and retrieval augmented generation (RAG) has shown itself to be a powerful method for improving accuracy without sacrificing fluency. However, not all information can be retrieved from text. We propose an approach that uses the analysis of structured data to generate fact sets that are used to guide generation in much the same way that retrieved documents are used in RAG. This analytics augmented generation (AAG) approach supports the ability to utilize standard analytic techniques to generate facts that are then converted to text and passed to an LLM. We present a neurosymbolic platform, Satyrn, that leverages AAG to produce accurate, fluent, and coherent reports grounded in large scale databases. In our experiments, we find that Satyrn generates reports in which over 86% of claims are accurate while maintaining high levels of fluency and coherence, even when using smaller language models such as Mistral-7B, as compared to GPT-4 Code Interpreter in which just 57% of claims are accurate.

large language model, machine learning, natural language, (18 more...)

2406.12069

Country:

North America > United States > Illinois > Lake County (0.15)
North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report (1.00)

Industry:

Education (1.00)
Government > Regional Government > North America Government > United States Government (0.93)
Law (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Lightweight Knowledge Representations for Automating Data Analysis

Sterbentz, Marko, Barrie, Cameron, Hooshmand, Donna, Shahi, Shubham, Dutta, Abhratanu, Pack, Harper, Zhao, Andong Li, Paley, Andrew, Einarsson, Alexander, Hammond, Kristian

arXiv.org Artificial Intelligence, Oct-15-2023

The principal goal of data science is to derive meaningful information from data. To do this, data scientists develop a space of analytic possibilities and from it reach their information goals by using their knowledge of the domain, the available data, the operations that can be performed on those data, the algorithms/models that are fed the data, and how all of these facets interweave. In this work, we take the first steps towards automating a key aspect of the data science pipeline: data analysis. We present an extensible taxonomy of data analytic operations that scopes across domains and data, as well as a method for codifying domain-specific knowledge that links this analytics taxonomy to actual data. We validate the functionality of our analytics taxonomy by implementing a system that leverages it, alongside domain labelings for 8 distinct domains, to automatically generate a space of answerable questions and associated analytic plans. In this way, we produce information spaces over data that enable complex analyses and search over this data and pave the way for fully automated data analysis.

artificial intelligence, data mining, natural language, (19 more...)

2311.12848

Country:

North America > United States > Illinois > Cook County (0.14)
North America > United States > Illinois > DuPage County (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.50)

Industry:

Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Education (1.00)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Summarization from Leaderboards to Practice: Choosing A Representation Backbone and Ensuring Robustness

Demeter, David, Agarwal, Oshin, Igeri, Simon Ben, Sterbentz, Marko, Molino, Neil, Conroy, John M., Nenkova, Ani

arXiv.org Artificial Intelligence, Jun-18-2023

Academic literature does not give much guidance on how to build the best possible customer-facing summarization system from existing research components. Here we present analyses to inform the selection of a system backbone from popular models; we find that in both automatic and human evaluation, BART performs better than PEGASUS and T5. We also find that when applied cross-domain, summarizers exhibit considerably worse performance. At the same time, a system fine-tuned on heterogeneous domains performs well on all domains and will be most suitable for a broad-domain summarizer. Our work highlights the need for heterogeneous domain summarization benchmarks. We find considerable variation in system output that can be captured only with human evaluation and is thus unlikely to be reflected in standard leaderboards with only automatic evaluation.

artificial intelligence, machine learning, natural language, (19 more...)

2306.10555

Country: North America > United States > Minnesota (0.14)

Genre: Research Report (0.82)

Industry:

Government (0.46)
Transportation > Air (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.70)

Requirements for Open Political Information: Transparency Beyond Open Data

Zhao, Andong Luis Li, Paley, Andrew, Adler, Rachel, Pack, Harper, Servantez, Sergio, Einarsson, Alexander, Barrie, Cameron, Sterbentz, Marko, Hammond, Kristian

arXiv.org Artificial Intelligence, Dec-6-2021

A politically informed citizenry is imperative for a well-developed democracy. While the US government has pursued policies for open data, these efforts have been insufficient in achieving an open government because only people with technical and domain knowledge can access information in the data. In this work, we conduct user interviews to identify wants and needs among stakeholders. We further use this information to sketch out the foundational requirements for a functional political information technical system.

artificial intelligence, information, north america government, (18 more...)

2112.03119

Country: North America > United States > Illinois (0.18)

Genre:

Questionnaire & Opinion Survey (0.72)
Research Report (0.64)

Industry:

Government > Regional Government > North America Government > United States Government (0.90)
Government > Voting & Elections (0.71)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Information Management (0.71)
Information Technology > Human Computer Interaction (0.68)
