AITopics | Bala, Kavita

DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery

Mall, Utkarsh, Phoo, Cheng Perng, Chiquier, Mia, Hariharan, Bharath, Bala, Kavita, Vondrick, Carl

arXiv.org Artificial Intelligence | Feb-14-2025

Visual data is used in numerous scientific workflows ranging from remote sensing to ecology. As the amount of observation data increases, the challenge is not just to make accurate predictions but also to understand the underlying mechanisms for those predictions. Good interpretation is important in scientific workflows, as it allows for better decision-making by providing insights into the data. This paper introduces an automatic way of obtaining such interpretable-by-design models, by learning programs that interleave neural networks. We propose DiSciPLE (Discovering Scientific Programs using LLMs and Evolution), an evolutionary algorithm that leverages common sense and prior knowledge of large language models (LLMs) to create Python programs explaining visual data. Additionally, we propose two improvements, a program critic and a program simplifier, that help the method synthesize good programs. On three different real-world problems, DiSciPLE learns state-of-the-art programs on novel tasks with no prior literature. For example, we can learn programs with 35% lower error than the closest non-interpretable baseline for population density estimation.
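
The evolutionary search described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `evolve` function, the toy "programs" (pairs of coefficients), and the `perturb` step, which uses random noise where the abstract describes LLMs supplying candidate Python programs. It is not the DiSciPLE implementation.

```python
import random

def evolve(score, seed_programs, propose, generations=20, population_size=8, rng=None):
    """Elitist evolutionary search: keep the best candidates and ask `propose` for variants."""
    rng = rng or random.Random(0)
    population = list(seed_programs)
    for _ in range(generations):
        # Generate one variant per current candidate.
        children = [propose(parent, rng) for parent in population]
        # Keep the lowest-error candidates among parents and children (elitism).
        population = sorted(population + children, key=score)[:population_size]
    return population[0]

# Toy stand-ins (all hypothetical): a "program" is a pair (a, b) meaning y = a*x + b.
DATA = [(x, 3.0 * x + 1.0) for x in range(10)]

def mse(program):
    """Mean squared error of the toy program on DATA."""
    a, b = program
    return sum((a * x + b - y) ** 2 for x, y in DATA) / len(DATA)

def perturb(program, rng):
    """Random noise stands in for an LLM proposing a modified program."""
    a, b = program
    return (a + rng.gauss(0, 0.5), b + rng.gauss(0, 0.5))

seeds = [(0.0, 0.0)]
best = evolve(mse, seeds, perturb)
```

Because parents compete with their children at every generation, the best candidate's error can never get worse than that of the seed program.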

evolutionary algorithm, large language model, machine learning

2502.1006

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Industry:

Government > Regional Government > North America Government > United States Government (0.68)
Education (0.66)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.89)

AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery

Zhou, Hangyu, Kao, Chia-Hsiang, Phoo, Cheng Perng, Mall, Utkarsh, Hariharan, Bharath, Bala, Kavita

arXiv.org Artificial Intelligence | Oct-31-2024

Clouds in satellite imagery pose a significant challenge for downstream applications. A major challenge in current cloud removal research is the absence of a comprehensive benchmark and a sufficiently large and diverse training dataset. To address this problem, we introduce AllClear, the largest public dataset for cloud removal, featuring 23,742 globally distributed regions of interest (ROIs) with diverse land-use patterns and comprising 4 million images in total. Each ROI includes complete temporal captures from the year 2022, with (1) multi-spectral optical imagery from Sentinel-2 and Landsat 8/9, (2) synthetic aperture radar (SAR) imagery from Sentinel-1, and (3) auxiliary remote sensing products such as cloud masks and land cover maps. We validate the effectiveness of our dataset by benchmarking performance, demonstrating the scaling law (the PSNR rises from 28.47 to 33.87 with 30× more data), and conducting ablation studies on the temporal length and the importance of individual modalities. This dataset aims to provide comprehensive coverage of the Earth's surface and promote better cloud removal results.
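
To make the reported PSNR figures concrete, here is a minimal sketch of the standard PSNR definition, 10 · log10(MAX² / MSE). The function names and the flat pixel-list representation are assumptions for illustration; the abstract does not describe the benchmark's evaluation code.

```python
import math

def psnr(reference, prediction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(prediction) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - p) ** 2 for r, p in zip(reference, prediction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

def mse_ratio_from_psnr_gain(psnr_low, psnr_high):
    """Factor by which MSE shrinks when PSNR rises from psnr_low to psnr_high."""
    return 10.0 ** ((psnr_high - psnr_low) / 10.0)

# PSNR values quoted in the abstract.
ratio = mse_ratio_from_psnr_gain(28.47, 33.87)
```

Assuming the same peak value in both measurements, the 5.40 dB gain corresponds to a mean squared error roughly 3.5 times lower.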

artificial intelligence, dataset, machine learning

2410.23891

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Law (1.00)
Information Technology (1.00)
Government (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.93)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment

Mall, Utkarsh, Phoo, Cheng Perng, Liu, Meilin Kelsey, Vondrick, Carl, Hariharan, Bharath, Bala, Kavita

arXiv.org Artificial Intelligence | Dec-11-2023

We introduce a method to train vision-language models for remote-sensing images without using any textual annotations. Our key insight is to use co-located internet imagery taken on the ground as an intermediary for connecting remote-sensing images and language. Specifically, we train an image encoder for remote sensing images to align with the image encoder of CLIP using a large amount of paired internet and satellite images. Our unsupervised approach enables the training of a first-of-its-kind large-scale vision language model (VLM) for remote sensing images at two different resolutions. We show that these VLMs enable zero-shot, open-vocabulary image classification, retrieval, segmentation and visual question answering for satellite images. On each of these tasks, our VLM trained without textual annotations outperforms existing VLMs trained with supervision, with gains of up to 20% for classification and 80% for segmentation.

Our planet is constantly captured by an extensive array of remote sensors such as satellites or drones. These earth observation images enable the monitoring of various events on the earth such as deforestation, forest fires, and droughts so that rapid actions can be taken to protect our environment. While these images can shed light on various insights about our planet, the scale of such data is huge. This has prompted the development of automatic analysis models that could extract relevant information from a large amount of remotely sensed images. While useful, these models are often specialized and can only recognize a pre-defined set of concepts. Besides, they could be complex, decreasing their accessibility to experts outside of the domain of artificial intelligence. Researchers developing automatic analysis methods for internet imagery encountered a similar problem a few years ago. One promising solution is to leverage large-scale vision-language models (VLMs) that are trained on millions or even billions of text-image pairs collected on the internet (Radford et al., 2021; Li et al., 2023). These models have demonstrated remarkable abilities to perform open-vocabulary recognition (Gu et al., 2022; Kuo et al., 2023) and enhance accessibility to non-AI experts (Alayrac et al., 2022; Surís et al., 2023). It would be incredibly valuable for a range of applications to replicate the success of open-vocabulary recognition for satellite images as well, allowing an analyst to simply query, say, "Where are all the farmlands in the state of Massachusetts?" without requiring any new training or annotation for farms.
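
Zero-shot, open-vocabulary classification of the kind described above is commonly done by comparing an image embedding with text embeddings of candidate labels and picking the closest match. The sketch below illustrates that idea only. The function names and the three-dimensional toy vectors are hypothetical and are not outputs of CLIP or of the model in the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def zero_shot_label(image_embedding, text_embeddings):
    """Return the label whose text embedding is most similar to the image embedding."""
    return max(text_embeddings, key=lambda name: cosine(image_embedding, text_embeddings[name]))

# Hypothetical 3-d embeddings; real encoders produce much larger vectors.
text_embeddings = {
    "farmland": [0.9, 0.1, 0.0],
    "forest": [0.1, 0.9, 0.1],
    "water": [0.0, 0.1, 0.9],
}
satellite_embedding = [0.8, 0.3, 0.1]
label = zero_shot_label(satellite_embedding, text_embeddings)
```

Because no label-specific training is involved, adding a new concept only requires adding one more text embedding to the dictionary.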

large language model, machine learning, natural language

2312.0696

Country: North America > United States > Massachusetts (0.24)

Genre: Research Report > Promising Solution (0.34)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.50)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Interactive Consensus Agreement Games for Labeling Images

Upchurch, Paul (Cornell University) | Sedra, Daniel (Cornell University) | Mullen, Andrew (Cornell University) | Hirsh, Haym (Cornell University) | Bala, Kavita (Cornell University)

AAAI Conferences | Sep-24-2016

Scene understanding algorithms in computer vision are improving dramatically by training deep convolutional neural networks on millions of accurately annotated images. Collecting large-scale datasets for this kind of training is challenging, and the learning algorithms are only as good as the data they train on. Training annotations are often obtained by taking the majority label from independent crowdsourced workers using platforms such as Amazon Mechanical Turk. However, the accuracy of the resulting annotations can vary, with the hardest-to-annotate samples having prohibitively low accuracy. Our insight is that in cases where independent worker annotations are poor, more accurate results can be obtained by having workers collaborate. This paper introduces consensus agreement games, a novel method for assigning annotations to images by the agreement of multiple consensuses of small cliques of workers. We demonstrate that this approach reduces error by 37.8% on two different datasets at a cost of $0.10 or $0.17 per annotation. The higher cost is justified because our method does not need to be run on the entire dataset. Ultimately, our method enables us to more accurately annotate images and build more challenging training datasets for learning algorithms.
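
The two labeling strategies contrasted in the abstract can be sketched as follows. `majority_label` is the usual majority vote over independent workers. `agreed_label` is a hypothetical reading of "agreement of multiple consensuses": it accepts a label only when every clique reached the same consensus. The abstract does not specify the exact agreement rule, so the unanimity requirement is an assumption.

```python
from collections import Counter

def majority_label(labels):
    """Most common label among independent workers (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]

def agreed_label(clique_consensuses):
    """Accept a label only if every clique reached the same consensus, else return None.

    Assumption: unanimous agreement across cliques is required.
    """
    distinct = set(clique_consensuses)
    return clique_consensuses[0] if len(distinct) == 1 else None

baseline = majority_label(["wood", "wood", "metal"])
accepted = agreed_label(["fabric", "fabric"])
rejected = agreed_label(["fabric", "leather"])
```

Returning `None` on disagreement leaves the image unlabeled, so an uncertain annotation is not forced onto it.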

interactive consensus agreement game

Fourth AAAI Conference on Human Computation and Crowdsourcing

Technology:

Information Technology > Artificial Intelligence > Vision (0.87)
Information Technology > Communications > Social Media > Crowdsourcing (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.53)
