AITopics | ramakrishnan

Collaborating Authors

ramakrishnan

ORIENT: Submodular Mutual Information Measures for Data Subset Selection under Distribution Shift

Neural Information Processing Systems - Feb-11-2026, 23:56:01 GMT

The recent success of deep learning frameworks in applications such as image classification [9], speech recognition [20], and object detection [13] stems primarily from the availability of large amounts of labeled data.

artificial intelligence, machine learning, orient, (19 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Automatic Speech Recognition for Sanskrit with Transfer Learning

Sadhukhan, Bidit, Punyeshwarananda, Swami

arXiv.org Artificial Intelligence - Jan-17-2025

Sanskrit, one of humanity's most ancient languages, has a vast collection of books and manuscripts on diverse topics that have been accumulated over millennia. However, its digital content (audio and text), which is vital for the training of AI systems, is profoundly limited. Furthermore, its intricate linguistics make it hard to develop robust NLP tools for wider accessibility. Given these constraints, we have developed an automatic speech recognition model for Sanskrit by applying transfer learning to OpenAI's Whisper model. After carefully optimising the hyper-parameters, we obtained promising results, with our transfer-learned model achieving a word error rate of 15.42% on the Vaksancayah dataset. An online demo of our model is publicly available so that users can evaluate its performance firsthand, paving the way for improved accessibility and technological support for Sanskrit learning in the modern era.
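For readers unfamiliar with the metric quoted above, word error rate is conventionally the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name); tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a rate of 15.42% corresponds to roughly 15 word-level errors per 100 reference words. Note that the value can exceed 1.0 when the hypothesis contains many insertions.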

machine learning, natural language, sanskrit, (17 more...)

doi: 10.1109/C3IT60531.2024.10829416

arXiv: 2501.10024

Country:

Asia > India (0.05)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Southeast Asia (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Automatic Speech Recognition for Hindi

Saha, Anish, Ramakrishnan, A. G.

arXiv.org Artificial Intelligence - Jun-26-2024

Automatic speech recognition (ASR) is a key area in computational linguistics, focusing on developing technologies that enable computers to convert spoken language into text. This field combines linguistics and machine learning. ASR models, which map speech audio to transcripts through supervised learning, require handling real and unrestricted text. Text-to-speech systems directly work with real text, while ASR systems rely on language models trained on large text corpora. High-quality transcribed data is essential for training predictive models. The research involved two main components: developing a web application and designing a web interface for speech recognition. The web application, created with JavaScript and Node.js, manages large volumes of audio files and their transcriptions, facilitating collaborative human correction of ASR transcripts. It operates in real-time using a client-server architecture. The web interface for speech recognition records 16 kHz mono audio from any device running the web app, performs voice activity detection (VAD), and sends the audio to the recognition engine. VAD detects human speech presence, aiding efficient speech processing and reducing unnecessary processing during non-speech intervals, thus saving computation and network bandwidth in VoIP applications. The final phase of the research tested a neural network for accurately aligning the speech signal to hidden Markov model (HMM) states. This included implementing a novel backpropagation method that utilizes prior statistics of node co-activations.
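The abstract does not say which voice activity detection method the web interface uses. As a rough illustration of the idea only, the sketch below flags frames whose short-term energy exceeds a fixed threshold. The energy-threshold technique, the 20 ms frame length, the threshold value, and the function names are all assumptions; only the 16 kHz mono input comes from the abstract.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_speech(samples, sample_rate=16000, frame_ms=20, threshold=0.01):
    """Return one boolean per frame: True where energy exceeds the threshold.

    samples: mono audio as floats in [-1.0, 1.0].
    threshold and frame_ms are illustrative defaults, not tuned values.
    """
    frame_len = sample_rate * frame_ms // 1000
    return [energy > threshold for energy in frame_energies(samples, frame_len)]


# Usage: 20 ms of silence followed by 20 ms of a 440 Hz tone.
silence = [0.0] * 320
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(320)]
flags = detect_speech(silence + tone)
```

Only frames flagged as active would need to be sent to the recognition engine, which is how a detector of this kind saves computation and bandwidth during non-speech intervals.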

activity detection, detection, transcript, (14 more...)

arXiv: 2406.18135

Country:

Europe > Austria > Vienna (0.14)
Asia > India > Karnataka > Bengaluru (0.14)
Asia > Indonesia > Bali (0.05)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.89)

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Grauman, Kristen, Westbury, Andrew, Torresani, Lorenzo, Kitani, Kris, Malik, Jitendra, Afouras, Triantafyllos, Ashutosh, Kumar, Baiyya, Vijay, Bansal, Siddhant, Boote, Bikram, Byrne, Eugene, Chavis, Zach, Chen, Joya, Cheng, Feng, Chu, Fu-Jen, Crane, Sean, Dasgupta, Avijit, Dong, Jing, Escobar, Maria, Forigua, Cristhian, Gebreselasie, Abrham, Haresh, Sanjay, Huang, Jing, Islam, Md Mohaiminul, Jain, Suyog, Khirodkar, Rawal, Kukreja, Devansh, Liang, Kevin J, Liu, Jia-Wei, Majumder, Sagnik, Mao, Yongsen, Martin, Miguel, Mavroudi, Effrosyni, Nagarajan, Tushar, Ragusa, Francesco, Ramakrishnan, Santhosh Kumar, Seminara, Luigi, Somayazulu, Arjun, Song, Yale, Su, Shan, Xue, Zihui, Zhang, Edward, Zhang, Jinxu, Castillo, Angela, Chen, Changan, Fu, Xinzhu, Furuta, Ryosuke, Gonzalez, Cristina, Gupta, Prince, Hu, Jiabo, Huang, Yifei, Huang, Yiming, Khoo, Weslie, Kumar, Anush, Kuo, Robert, Lakhavani, Sach, Liu, Miao, Luo, Mi, Luo, Zhengyi, Meredith, Brighid, Miller, Austin, Oguntola, Oluwatumininu, Pan, Xiaqing, Peng, Penny, Pramanick, Shraman, Ramazanova, Merey, Ryan, Fiona, Shan, Wei, Somasundaram, Kiran, Song, Chenan, Southerland, Audrey, Tateno, Masatoshi, Wang, Huiyu, Wang, Yuchen, Yagi, Takuma, Yan, Mingfei, Yang, Xitong, Yu, Zecheng, Zha, Shengxin Cindy, Zhao, Chen, Zhao, Ziwei, Zhu, Zhifan, Zhuo, Jeff, Arbelaez, Pablo, Bertasius, Gedas, Crandall, David, Damen, Dima, Engel, Jakob, Farinella, Giovanni Maria, Furnari, Antonino, Ghanem, Bernard, Hoffman, Judy, Jawahar, C. V., Newcombe, Richard, Park, Hyun Soo, Rehg, James M., Sato, Yoichi, Savva, Manolis, Shi, Jianbo, Shou, Mike Zheng, Wray, Michael

arXiv.org Artificial Intelligence - Dec-14-2023

We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric and exocentric video of skilled human activities (e.g., sports, music, dance, bike repair). More than 800 participants from 13 cities worldwide performed these activities in 131 different natural scene contexts, yielding long-form captures from 1 to 42 minutes each and 1,422 hours of video combined. The multimodal nature of the dataset is unprecedented: the video is accompanied by multichannel audio, eye gaze, 3D point clouds, camera poses, IMU, and multiple paired language descriptions -- including a novel "expert commentary" done by coaches and teachers and tailored to the skilled-activity domain. To push the frontier of first-person video understanding of skilled human activity, we also present a suite of benchmark tasks and their annotations, including fine-grained activity understanding, proficiency estimation, cross-view translation, and 3D hand/body pose. All resources will be open sourced to fuel new research in the community.

atomic action description, demonstration proficiency estimation, skill demonstration, (16 more...)

arXiv: 2311.18259

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.13)
(23 more...)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry:

Leisure & Entertainment > Sports > Soccer (1.00)
Education > Educational Setting (0.92)
Leisure & Entertainment > Sports > Basketball (0.92)
(4 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Communications (1.00)
(9 more...)

Learning-Augmented Model-Based Planning for Visual Exploration

Li, Yimeng, Debnath, Arnab, Stein, Gregory, Kosecka, Jana

arXiv.org Artificial Intelligence - Aug-9-2023

We consider the problem of robotic exploration in previously unseen environments where exploration is limited by a predefined amount of time. We propose a novel exploration approach using learning-augmented model-based planning. We generate a set of subgoals associated with frontiers on the current map and derive a Bellman Equation for exploration with these subgoals. Visual sensing and advances in semantic mapping of indoor scenes are exploited for training a deep convolutional neural network to estimate properties associated with each frontier: the expected unobserved area beyond the frontier and the expected timesteps (discretized actions) required to explore it. The proposed model-based planner is guaranteed to explore the whole scene if time permits. We thoroughly evaluate our approach on a large-scale pseudo-realistic indoor dataset (Matterport3D) with the Habitat simulator. We compare our approach with classical and more recent RL-based exploration methods. Our approach surpasses the greedy strategies by 2.1% and the RL-based exploration methods by 8.4% in terms of coverage.
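The abstract does not reproduce the paper's Bellman equation, so the sketch below is a simplified stand-in rather than the authors' formulation. It assumes each frontier comes with two estimates of the kind described above (expected unobserved area and expected timesteps to explore it), plus a hypothetical table of travel times, and recursively picks the visiting order that maximizes total expected area within the time budget. All names and the deterministic cost model are assumptions.

```python
from functools import lru_cache


def plan_value(frontiers, travel_time, budget):
    """Best total expected area reachable within the time budget.

    frontiers: list of (expected_area, explore_steps), one per frontier.
    travel_time[i][j]: steps to move between locations, where location 0
        is the robot's start and frontier k sits at location k + 1.
    budget: total timesteps available.
    """
    n = len(frontiers)

    @lru_cache(maxsize=None)
    def value(current, visited, remaining):
        # Bellman-style recursion: value of a state is the best immediate
        # gain plus the value of the state it leads to.
        best = 0.0
        for k in range(n):
            if visited & (1 << k):
                continue  # frontier k was already explored
            area, steps = frontiers[k]
            cost = travel_time[current][k + 1] + steps
            if cost <= remaining:
                gain = area + value(k + 1, visited | (1 << k), remaining - cost)
                best = max(best, gain)
        return best

    return value(0, 0, budget)


# Usage with made-up estimates for two frontiers.
frontiers = [(10.0, 2), (6.0, 1)]
travel_time = [[0, 3, 1], [3, 0, 2], [1, 2, 0]]
```

With a budget of 6 steps, visiting the nearer, smaller frontier first leaves enough time for both, whereas a greedy choice of the largest frontier does not. The search is exponential in the number of frontiers, so it is suitable only for small illustrative cases.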

exploration, frontier, navigation, (13 more...)

arXiv: 2211.07898

Country:

North America > United States > Virginia > Fairfax County > Fairfax (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Low-Resource End-to-end Sanskrit TTS using Tacotron2, WaveGlow and Transfer Learning

Debnath, Ankur, Patil, Shridevi S, Nadiger, Gangotri, Ganesan, Ramakrishnan Angarai

arXiv.org Artificial Intelligence - Dec-7-2022

End-to-end text-to-speech (TTS) systems have been developed for European languages like English and Spanish with state-of-the-art speech quality, prosody, and naturalness. However, development of end-to-end TTS for Indian languages is lagging behind in terms of quality. The challenges involved in such a task are: 1) scarcity of quality training data; 2) low efficiency during training and inference; 3) slow convergence in the case of large vocabulary size. In this paper, we investigate fine-tuning the English-pretrained Tacotron2 model with limited Sanskrit data to synthesize natural-sounding speech in Sanskrit in low-resource settings. Our experiments show encouraging results, achieving an overall MOS of 3.38 from 37 evaluators with good spoken knowledge of Sanskrit. This is a strong result, considering that only 2.5 hours of speech data were used.
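A mean opinion score (MOS) is conventionally the arithmetic mean of listener ratings on a 1 to 5 scale. The sketch below shows that calculation with an approximate 95% confidence interval. The ratings are made up for illustration and are not the paper's data; the normal-approximation interval and the function name are assumptions.

```python
import math
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    ratings: listener scores on the conventional 1-5 scale.
    The interval uses a normal approximation (1.96 standard errors).
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    score = mean(ratings)
    if len(ratings) < 2:
        return score, 0.0
    half_width = 1.96 * stdev(ratings) / math.sqrt(len(ratings))
    return score, half_width


# Usage with hypothetical ratings from five listeners.
score, half_width = mean_opinion_score([3, 4, 4, 3, 5])
```

Reporting the interval alongside the mean indicates how much the score might vary with a different panel of evaluators.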

artificial intelligence, deep learning, machine learning, (14 more...)

arXiv: 2212.03558

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.05)
Asia > India > Karnataka > Bengaluru (0.05)
Asia > Malaysia (0.04)
Asia > Indonesia > Bali (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.45)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.38)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

IT enters the era of intelligent automation

#artificialintelligence - Sep-29-2022, 12:54:02 GMT

Since the outset of the pandemic, organizations have been increasingly launching initiatives aimed at automating business processes, turning to technologies such as robotic process automation (RPA) in efforts to reduce costs, speed up tasks, and improve accuracy of core business operations. Some leading organizations, however, are not stopping there. Seeking to push their automation agendas forward, they are embracing a move toward broader "intelligent automation" (IA), a strategy that weaves capabilities such as artificial intelligence (AI) and machine learning (ML) into standard RPA to enhance its functionality. In addition to RPA, AI, and ML, intelligent automation strategies can also incorporate a mix of technologies such as natural language processing, chatbots, and others that complement each other, says Lakshmanan Chidambaram, president of Americas strategic verticals at global IT consulting firm Tech Mahindra. "These technologies together allow us to automate business processes to a larger extent, when compared to simple RPA automations," Chidambaram says.

automation, intelligent automation, nallapati, (16 more...)

Industry: Information Technology > Software (0.30)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Towards Improving Adversarial Training of NLP Models

Yoo, Jin Yong, Qi, Yanjun

arXiv.org Artificial Intelligence - Sep-11-2021

Adversarial training, a method for learning robust deep neural networks, constructs adversarial examples during training. However, recent methods for generating NLP adversarial examples involve combinatorial search and expensive sentence encoders for constraining the generated instances. As a result, it remains challenging to use vanilla adversarial training to improve NLP models' performance, and the benefits are mainly uninvestigated. This paper proposes a simple and improved vanilla adversarial training process for NLP models, which we name Attacking to Training (A2T). The core part of A2T is a new and cheaper word substitution attack optimized for vanilla adversarial training. We use A2T to train BERT and RoBERTa models on IMDB, Rotten Tomatoes, Yelp, and SNLI datasets. Our results empirically show that it is possible to train robust NLP models using a much cheaper adversary. We demonstrate that vanilla adversarial training with A2T can improve an NLP model's robustness to the attack it was originally trained with and also defend the model against other types of word substitution attacks. Furthermore, we show that A2T can improve NLP models' standard accuracy, cross-domain generalization, and interpretability. Code is available at https://github.com/QData/Textattack-A2T .

adversarial example, adversarial training, computational linguistic, (14 more...)

arXiv: 2109.00544

Country:

North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
North America > United States > Oregon (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

AI's human protein database a 'great leap' for research - Tech Wire Asia

#artificialintelligence - Aug-2-2021, 08:25:51 GMT

Scientists last month unveiled the most exhaustive database yet of the proteins that form the building blocks of life, in a breakthrough that observers said would "fundamentally change biological research". Every cell in every living organism is triggered to perform its function by proteins that deliver constant instructions to maintain health and ward off infection. Unlike the genome -- the complete sequence of human genes that encode cellular life -- the human proteome is constantly changing in response to genetic instructions and environmental stimuli. Understanding how proteins operate -- the shape in which they end up, or "fold" into -- within cells has fascinated scientists for decades. But determining each protein's precise function through direct experimentation is painstaking.

amino acid sequence, great leap, protein, (13 more...)

Country:

Asia (0.40)
Europe > France (0.05)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.32)

Technology: Information Technology > Artificial Intelligence (1.00)

AI's human protein database a 'great leap' for research

#artificialintelligence - Jul-24-2021, 04:40:25 GMT

Scientists on Thursday unveiled the most exhaustive database yet of the proteins that form the building blocks of life, in a breakthrough observers said would "fundamentally change biological research". Every cell in every living organism is triggered to perform its function by proteins that deliver constant instructions to maintain health and ward off infection. Unlike the genome -- the complete sequence of human genes that encode cellular life -- the human proteome is constantly changing in response to genetic instructions and environmental stimuli. Understanding how proteins operate -- the shape in which they end up, or "fold" into -- within cells has fascinated scientists for decades. But determining each protein's precise function through direct experimentation is painstaking.

amino acid sequence, great leap, protein, (14 more...)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.32)

Technology: Information Technology > Artificial Intelligence (1.00)
