AITopics | Shah, Ankit

Shah, Ankit

Importance of negative sampling in weak label learning

Shah, Ankit, Tang, Fuyu, Ye, Zelin, Singh, Rita, Raj, Bhiksha

arXiv.org Artificial Intelligence, Sep-22-2023

Weak-label learning is a challenging task that requires learning from data "bags" containing positive and negative instances, where only the bag labels are known. The pool of negative instances is usually larger than the pool of positive instances, which makes selecting the most informative negative instances critical for performance. Such a selection strategy for negative instances from each bag is an open problem that has not been well studied for weak-label learning. In this paper, we study several sampling strategies that can measure the usefulness of negative instances for weak-label learning and select them accordingly. We test our method on the CIFAR-10 and AudioSet datasets and show that it improves weak-label classification performance and reduces the computational cost compared to random sampling methods. Our work reveals that negative instances are not all equally irrelevant, and selecting them wisely can benefit weak-label learning.
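To make the bag setting concrete, here is a minimal sketch, not the paper's method. It assumes the standard multiple-instance convention that a bag's score is the maximum of its instance scores, and it contrasts random selection with one common heuristic, picking the highest-scoring ("hardest") instances from a negative bag. The function names `bag_score` and `select_negatives` and the example scores are hypothetical; the abstract does not say which sampling strategies the authors study.

```python
import random


def bag_score(instance_scores):
    """Max pooling (assumed): a bag is as positive as its most positive instance."""
    return max(instance_scores)


def select_negatives(instance_scores, k, strategy="hardest", seed=0):
    """Return sorted indices of k instances chosen from a negative bag.

    "hardest" keeps the instances the model scores highest, i.e. the
    negatives it currently confuses with positives (an illustrative
    heuristic, not necessarily one used in the paper).
    "random" is the baseline the abstract compares against.
    """
    indices = list(range(len(instance_scores)))
    k = min(k, len(indices))
    if strategy == "random":
        return sorted(random.Random(seed).sample(indices, k))
    if strategy == "hardest":
        ranked = sorted(indices, key=lambda i: instance_scores[i], reverse=True)
        return sorted(ranked[:k])
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical model scores for the five instances of one negative bag.
negative_bag = [0.05, 0.62, 0.10, 0.48, 0.02]
print(bag_score(negative_bag))               # 0.62
print(select_negatives(negative_bag, k=2))   # [1, 3]
```

Training on only the selected indices rather than on every negative instance is what would reduce computation in a setup like this.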

importance, weak label

arXiv:2309.13227

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Deep PackGen: A Deep Reinforcement Learning Framework for Adversarial Network Packet Generation

Hore, Soumyadeep, Ghadermazi, Jalal, Paudel, Diwas, Shah, Ankit, Das, Tapas K., Bastian, Nathaniel D.

arXiv.org Artificial Intelligence, May-18-2023

Recent advancements in artificial intelligence (AI) and machine learning (ML) algorithms, coupled with the availability of faster computing infrastructure, have enhanced the security posture of cybersecurity operations centers (defenders) through the development of ML-aided network intrusion detection systems (NIDS). Concurrently, the abilities of adversaries to evade security have also increased with the support of AI/ML models. Therefore, defenders need to proactively prepare for evasion attacks that exploit the detection mechanisms of NIDS. Recent studies have found that the perturbation of flow-based and packet-based features can deceive ML models, but these approaches have limitations. Perturbations made to the flow-based features are difficult to reverse-engineer, while samples generated with perturbations to the packet-based features are not playable. Our methodological framework, Deep PackGen, employs deep reinforcement learning to generate adversarial packets and aims to overcome the limitations of approaches in the literature. By taking raw malicious network packets as inputs and systematically making perturbations on them, Deep PackGen camouflages them as benign packets while still maintaining their functionality. In our experiments, using publicly available data, Deep PackGen achieved an average adversarial success rate of 66.4% against various ML models and across different attack types. Our investigation also revealed that more than 45% of the successful adversarial samples were out-of-distribution packets that evaded the decision boundaries of the classifiers. The knowledge gained from our study on the adversary's ability to make specific evasive perturbations to different types of malicious packets can help defenders enhance the robustness of their NIDS against evolving adversarial attacks.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

arXiv:2305.11039

Country: North America > United States > Florida > Hillsborough County > Tampa (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Automated Audio Captioning and Language-Based Audio Retrieval

Gomes, Clive, Park, Hyejin, Kollman, Patrick, Song, Yi, Houndayi, Iffanice, Shah, Ankit

arXiv.org Artificial Intelligence, May-15-2023

This project involved participation in the DCASE 2022 Competition (Task 6) which had two subtasks: (1) Automated Audio Captioning and (2) Language-Based Audio Retrieval. The first subtask involved the generation of a textual description for audio samples, while the goal of the second was to find audio samples within a fixed dataset that match a given description. For both subtasks, the Clotho dataset was used. The models were evaluated on BLEU1, BLEU2, BLEU3, ROUGEL, METEOR, CIDEr, SPICE, and SPIDEr scores for audio captioning and R1, R5, R10 and mARP10 scores for audio retrieval. We have conducted a handful of experiments that modify the baseline models for these tasks. Our final architecture for Automated Audio Captioning is slightly better than the baseline performance, while our model for Language-Based Audio Retrieval has surpassed its counterpart.
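The retrieval scores R1, R5, and R10 are recall-at-k values. As a rough illustration, not the challenge's official scoring code, the sketch below computes recall at k under the assumption that each text query has exactly one relevant audio clip. The function name `recall_at_k` and the toy rankings are hypothetical.

```python
def recall_at_k(ranked_ids_per_query, relevant_id_per_query, k):
    """Fraction of queries whose single relevant item appears in the top k.

    ranked_ids_per_query: one ranked list of item ids per query, best first.
    relevant_id_per_query: the one relevant item id per query (assumed).
    """
    if len(ranked_ids_per_query) != len(relevant_id_per_query):
        raise ValueError("need exactly one relevant id per query")
    hits = sum(
        1
        for ranked, relevant in zip(ranked_ids_per_query, relevant_id_per_query)
        if relevant in ranked[:k]
    )
    return hits / len(relevant_id_per_query)


# Toy example: three queries, each of which should retrieve clip "a".
rankings = [["a", "b", "c"], ["c", "a", "b"], ["b", "c", "a"]]
relevant = ["a", "a", "a"]
for k in (1, 2, 3):
    print(k, recall_at_k(rankings, relevant, k))
```

In this toy case, "a" is ranked first, second, and third respectively, so recall rises from one third at k=1 to 1.0 at k=3.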

artificial intelligence, machine learning, natural language, (16 more...)

arXiv:2207.04156

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.15)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Improving Perceptual Quality, Intelligibility, and Acoustics on VoIP Platforms

Konan, Joseph, Bhargave, Ojas, Agnihotri, Shikhar, Lee, Hojeong, Shah, Ankit, Han, Shuo, Zeng, Yunyang, Shu, Amanda, Liu, Haohui, Chang, Xuankai, Khalid, Hamza, Gwak, Minseon, Lee, Kawon, Kim, Minjeong, Raj, Bhiksha

arXiv.org Artificial Intelligence, Mar-15-2023

In this paper, we present a method for fine-tuning models trained on the Deep Noise Suppression (DNS) 2020 Challenge to improve their performance on Voice over Internet Protocol (VoIP) applications. Our approach involves adapting the DNS 2020 models to the specific acoustic characteristics of VoIP communications, which include distortion and artifacts caused by compression, transmission, and platform-specific processing. To this end, we propose a multi-task learning framework for VoIP-DNS that jointly optimizes noise suppression and VoIP-specific acoustics for speech enhancement. We evaluate our approach on a diverse set of VoIP scenarios and show that it outperforms both industry performance and state-of-the-art methods for speech enhancement on VoIP applications. Our results demonstrate the potential of models trained on DNS-2020 to be improved and tailored to different VoIP platforms using VoIP-DNS, whose findings have important applications in areas such as speech recognition, voice assistants, and telecommunication.

application, artificial intelligence, machine learning, (18 more...)

arXiv:2303.09048

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Telecommunications (0.34)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Exploiting Contextual Structure to Generate Useful Auxiliary Tasks

Quartey, Benedict, Shah, Ankit, Konidaris, George

arXiv.org Artificial Intelligence, Mar-9-2023

Reinforcement learning requires interaction with an environment, which is expensive for robots. This constraint necessitates approaches that work with limited environmental interaction by maximizing the reuse of previous experiences. We propose an approach that maximizes experience reuse while learning to solve a given task by generating and simultaneously learning useful auxiliary tasks. To generate these tasks, we construct an abstract temporal logic representation of the given task and leverage large language models to generate context-aware object embeddings that facilitate object replacements. Counterfactual reasoning and off-policy methods allow us to simultaneously learn these auxiliary tasks while solving the given target task. We combine these insights into a novel framework for multitask reinforcement learning and experimentally show that our generated auxiliary tasks share underlying exploration requirements similar to those of the given task, thereby maximizing the utility of directed exploration. Our approach allows agents to automatically learn additional useful policies without extra environment interaction.

machine learning, natural language, reinforcement learning, (18 more...)

arXiv:2303.05038

Genre: Research Report (0.64)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Approach to Learning Generalized Audio Representation Through Batch Embedding Covariance Regularization and Constant-Q Transforms

Shah, Ankit, Chen, Shuyi, Zhou, Kejun, Chen, Yue, Raj, Bhiksha

arXiv.org Artificial Intelligence, Mar-6-2023

General-purpose embedding is highly desirable for few-shot and even zero-shot learning in many application scenarios, including audio tasks. To understand representations better, we conducted a thorough error analysis and visualization of the HEAR 2021 submission results. Inspired by the analysis, this work experiments with different front-end audio preprocessing methods, including the Constant-Q Transform (CQT) and the Short-Time Fourier Transform (STFT), and proposes a Batch Embedding Covariance Regularization (BECR) term to uncover a more holistic simulation of the frequency information received by the human auditory system. We tested the models on the suite of HEAR 2021 tasks, which encompass a broad category of tasks. Preliminary results show that (1) the proposed BECR can incur a more dispersed embedding on the test set, (2) BECR improves the PaSST model without extra computational complexity, and (3) STFT preprocessing outperforms CQT in all tasks we tested.

artificial intelligence, gini index, machine learning, (15 more...)

arXiv:2303.03591

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.16)
Asia (0.15)
Oceania > Australia (0.14)
Europe > Spain (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Natural Language (0.66)

Skill Transfer for Temporally-Extended Task Specifications

Liu, Jason Xinyu, Shah, Ankit, Rosen, Eric, Konidaris, George, Tellex, Stefanie

arXiv.org Artificial Intelligence, Mar-5-2023

Deploying robots in real-world domains, such as households and flexible manufacturing lines, requires the robots to be taskable on demand. Linear temporal logic (LTL) is a widely used specification language with a compositional grammar that naturally induces commonalities across tasks. However, the majority of prior research on reinforcement learning with LTL specifications treats every new formula independently. We propose LTL-Transfer, a novel algorithm that enables subpolicy reuse across tasks by segmenting policies for training tasks into portable transition-centric skills capable of satisfying a wide array of unseen LTL specifications while respecting safety-critical constraints. Experiments in a Minecraft-inspired domain show that LTL-Transfer can satisfy over 90% of 500 unseen tasks after training on only 50 task specifications, without ever violating a safety constraint. We also deployed LTL-Transfer on a quadruped mobile manipulator in an analog household environment to demonstrate its ability to transfer to many fetch and delivery tasks in a zero-shot fashion.

logic & formal reasoning, machine learning, specification, (20 more...)

arXiv:2206.05096

Genre: Research Report (0.40)

Industry:

Education (0.50)
Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.93)

Temporal Logic Imitation: Learning Plan-Satisficing Motion Policies from Demonstrations

Wang, Yanwei, Figueroa, Nadia, Li, Shen, Shah, Ankit, Shah, Julie

arXiv.org Artificial Intelligence, Dec-14-2022

In prior work, learning from demonstration (LfD) [1, 2] has successfully enabled robots to accomplish multi-step tasks by segmenting demonstrations (primarily of robot end-effector or tool trajectories) into sub-tasks/goals [3, 4, 5, 6, 7, 8], phases [9, 10], keyframes [11, 12], or skills/primitives/options [13, 14, 15, 16]. Most of these abstractions assume reaching subgoals sequentially will deliver the desired outcomes; however, successful imitation of many manipulation tasks with spatial/temporal constraints cannot be reduced to imitation at the motion level unless the learned motion policy also satisfies these constraints. This becomes highly relevant if we want robots to not only imitate but also generalize, adapt and be robust to perturbations imposed by humans, who are in the loop of task learning and execution. LfD techniques that learn stable motion policies with convergence guarantees (e.g., Dynamic Movement Primitives (DMP) [17], Dynamical Systems (DS) [18]) are capable of providing such desired properties but only at the motion level. As shown in Figure 1 (a-b) a robot can successfully replay a soup-scooping task while being robust to physical perturbations with a learned DS. Nevertheless, if the spoon orientation is perturbed to a state where all material is dropped, as seen in Figure 1 (c), the motion policy will still lead the robot to the target, unaware of the task-level failure or how to recover from it.

artificial intelligence, demonstration, machine learning, (19 more...)

arXiv:2206.04632

Country: Europe (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.34)

On the pragmatism of using binary classifiers over data intensive neural network classifiers for detection of COVID-19 from voice

Shah, Ankit, Dhamyal, Hira, Gao, Yang, Arancibia, Daniel, Arancibia, Mario, Raj, Bhiksha, Singh, Rita

arXiv.org Artificial Intelligence, Oct-25-2022

Lately, there has been a global effort by multiple research groups to detect COVID-19 from voice. Different researchers use different kinds of information from the voice signal to achieve this. Various types of phonated sounds and the sound of cough and breath have all been used with varying degrees of success in automated voice-based COVID-19 detection apps. In this paper, we show that detecting COVID-19 from voice does not require custom-made nonstandard features or complicated neural network classifiers; rather, it can be successfully done with just standard features and simple binary …

In a self-assessment study, COVID patients reported difficulty producing certain voiced sounds and noticed changes in their voice [8]. Consequently, a number of research groups around the world have initiated efforts on attempting to diagnose potential Covid infections from recordings of vocalizations [9, 5]. While most groups have focused on cough sounds [10, 11, 12] as they are a frequent symptom of Covid-19, several groups have also considered other vocalizations, such as breathing sounds [10, 13], extended vowels [14, 15, 16], and counts. Yet other teams have analyzed free-form speech such as those obtainable from YouTube recordings [17].

artificial intelligence, classifier, machine learning, (15 more...)

arXiv:2204.04802

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

An Overview of Techniques for Biomarker Discovery in Voice Signal

Singh, Rita, Shah, Ankit, Dhamyal, Hira

arXiv.org Artificial Intelligence, Oct-9-2021

This paper reflects on the effect of several categories of medical conditions on human voice, focusing on those that may be hypothesized to have effects on voice, but for which the changes themselves may be subtle enough to have eluded observation in standard analytical examinations of the voice signal. It presents three categories of techniques that can potentially uncover such elusive biomarkers and allow them to be measured and used for predictive and diagnostic purposes. These approaches include proxy techniques, model-based analytical techniques and data-driven AI techniques.

artificial intelligence, health & medicine, machine learning, (20 more...)

arXiv:2110.04678

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.98)
Health & Medicine > Therapeutic Area > Musculoskeletal (0.95)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
