AITopics | Honke, Garrett

Collaborating Authors

Honke, Garrett

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

LMD3: Language Model Data Density Dependence

Kirchenbauer, John, Honke, Garrett, Somepalli, Gowthami, Geiping, Jonas, Ippolito, Daphne, Lee, Katherine, Goldstein, Tom, Andre, David

arXiv.org Artificial IntelligenceMay-10-2024

We develop a methodology for analyzing language model task performance at the individual example level based on training data density estimation. Experiments with paraphrasing as a controlled intervention on finetuning data demonstrate that increasing the support in the training distribution for specific test queries results in a measurable increase in density, which is also a significant predictor of the performance increase caused by the intervention. Experiments with pretraining data demonstrate that we can explain a significant fraction of the variance in model perplexity via density measurements. We conclude that our framework can provide statistical evidence of the dependence of a target model's predictions on subsets of its training data, and can more generally be used to characterize the support (or lack thereof) in the training data for a given test task.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2405.06331

Country: Europe > Austria > Vienna (0.14)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)

Add feedback

FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning

Niklaus, Joel, Zheng, Lucia, McCarthy, Arya D., Hahn, Christopher, Rosen, Brian M., Henderson, Peter, Ho, Daniel E., Honke, Garrett, Liang, Percy, Manning, Christopher

arXiv.org Artificial IntelligenceApr-2-2024

Instruction tuning is an important step in making language models useful for direct user interaction. However, many legal tasks remain out of reach for most open LLMs and there do not yet exist any large scale instruction datasets for the domain. This critically limits research in this application area. In this work, we curate LawInstruct, a large legal instruction dataset, covering 17 jurisdictions, 24 languages and a total of 12M examples. We present evidence that domain-specific pretraining and instruction tuning improve performance on LegalBench, including improving Flan-T5 XL by 8 points or 16\% over the baseline. However, the effect does not generalize across all tasks, training regimes, model sizes, and other factors. LawInstruct is a resource for accelerating the development of models with stronger information processing and decision making capabilities in the legal domain.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2404.02127

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.45)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Law > Litigation (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Using Deep Networks and Transfer Learning to Address Disinformation

Dhamani, Numa, Azunre, Paul, Gleason, Jeffrey L., Corcoran, Craig, Honke, Garrett, Kramer, Steve, Morgan, Jonathon

arXiv.org Artificial IntelligenceMay-24-2019

We apply an ensemble pipeline composed of a character-level convolutional neural network (CNN) and a long short-term memory (LSTM) as a general tool for addressing a range of disinformation problems. We also demonstrate the ability to use this architecture to transfer knowledge from labeled data in one domain to related (supervised and unsupervised) tasks. Character-level neural networks and transfer learning are particularly valuable tools in the disinformation space because of the messy nature of social media, lack of labeled data, and the multi-channel tactics of influence campaigns. We demonstrate their effectiveness in several tasks relevant for detecting disinformation: spam emails, review bombing, political sentiment, and conversation clustering.

deep learning, neural network, transfer learning, (21 more...)

arXiv.org Artificial Intelligence

1905.10412

Country: North America > United States > New York (0.29)

Genre: Research Report (0.50)

Industry:

Media > News (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback