Sikkim
Intelligent Systems and Robotics: Revolutionizing Engineering Industries
Anumula, Sathish Krishna, Ponnarangan, Sivaramkumar, Nujumudeen, Faizal, Deka, Ms. Nilakshi, Balamuralitharan, S., Venkatesh, M
A mix of intelligent systems and robotics is making engineering industries much more efficient, precise and able to adapt. How artificial intelligence (AI), machine learning (ML) and autonomous robotic technologies are changing manufacturing, civil, electrical and mechanical engineering is discussed in this paper. Based on recent findings and a suggested way to evaluate intelligent robotic systems in industry, we give an overview of how their use impacts productivity, safety and operational costs. Experience and case studies confirm the benefits this area brings and the problems that have yet to be solved. The findings indicate that intelligent robotics involves more than a technology change; it introduces important new methods in engineering. I. INTRODUCTION Because of rapid advancements in technology, engineering industries have changed a lot.
- Overview (1.00)
- Research Report (0.82)
- Construction & Engineering (1.00)
- Energy > Power Industry (0.47)
SimpleQA Verified: A Reliable Factuality Benchmark to Measure Parametric Knowledge
Haas, Lukas, Yona, Gal, D'Antonio, Giovanni, Goldshtein, Sasha, Das, Dipanjan
We introduce SimpleQA Verified, a 1,000-prompt benchmark for evaluating Large Language Model (LLM) short-form factuality based on OpenAI's SimpleQA. It addresses critical limitations in OpenAI's benchmark, including noisy and incorrect labels, topical biases, and question redundancy. SimpleQA Verified was created through a rigorous multi-stage filtering process involving de-duplication, topic balancing, and source reconciliation to produce a more reliable and challenging evaluation set, alongside improvements in the autorater prompt. On this new benchmark, Gemini 2.5 Pro achieves a state-of-the-art F1-score of 55.6, outperforming other frontier models, including GPT-5. This work provides the research community with a higher-fidelity tool to track genuine progress in parametric model factuality and to mitigate hallucinations. The benchmark dataset, evaluation code, and leaderboard are available at: https://www.kaggle.com/benchmarks/deepmind/simpleqa-verified.
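As a rough illustration of the headline metric, the sketch below assumes the F1-score is the harmonic mean of overall accuracy (correct answers over all questions) and accuracy given attempted (correct answers over attempted questions), as commonly described for SimpleQA-style grading. The function name and the grade counts are hypothetical; this is not the benchmark's evaluation code.

```python
def simpleqa_f1(correct: int, incorrect: int, not_attempted: int) -> float:
    """Harmonic mean of overall accuracy and accuracy on attempted questions.

    Assumed definition; the official autorater and scoring script may differ.
    """
    total = correct + incorrect + not_attempted
    attempted = correct + incorrect
    if total == 0 or attempted == 0 or correct == 0:
        return 0.0
    overall_correct = correct / total
    correct_given_attempted = correct / attempted
    return (
        2 * overall_correct * correct_given_attempted
        / (overall_correct + correct_given_attempted)
    )


# Hypothetical grade counts for a 1,000-prompt run (not real results).
score = simpleqa_f1(correct=400, incorrect=400, not_attempted=200)
print(f"F1 = {100 * score:.1f}")
```

Under this definition a model that declines to answer is penalized less than one that answers incorrectly, because abstentions lower only the first term of the mean.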
- North America > United States > California > San Francisco County > San Francisco (0.14)
- South America > Colombia (0.04)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- Leisure & Entertainment (1.00)
- Government (0.69)
- Media > Television (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)
Zero-Shot KWS for Children's Speech using Layer-Wise Features from SSL Models
Kutum, Subham, Sinha, Abhijit, Kathania, Hemant Kumar, Kadiri, Sudarsana Reddy, Govil, Mahesh Chandra
Numerous methods have been proposed to enhance Keyword Spotting (KWS) in adult speech, but children's speech presents unique challenges for KWS systems due to its distinct acoustic and linguistic characteristics. This paper introduces a zero-shot KWS approach that leverages state-of-the-art self-supervised learning (SSL) models, including Wav2Vec2, HuBERT and Data2Vec. Features are extracted layer-wise from these SSL models and used to train a Kaldi-based DNN KWS system. The WSJCAM0 adult speech dataset was used for training, while the PFSTAR children's speech dataset was used for testing, demonstrating the zero-shot capability of our method. Our approach achieved state-of-the-art results across all keyword sets for children's speech. Notably, the Wav2Vec2 model, particularly layer 22, performed the best, delivering an ATWV score of 0.691, an MTWV score of 0.7003, and a probability of false alarm and probability of miss of 0.0164 and 0.0547 respectively, for a set of 30 keywords. Furthermore, age-specific performance evaluation confirmed the system's effectiveness across different age groups of children. To assess the system's robustness against noise, additional experiments were conducted using the best-performing layer of the best-performing Wav2Vec2 model. The results demonstrated a significant improvement over the traditional MFCC-based baseline, emphasizing the potential of SSL embeddings even in noisy conditions. To further generalize the KWS framework, the experiments were repeated on an additional CMU dataset. Overall, the results highlight the significant contribution of SSL features in enhancing zero-shot KWS performance for children's speech, effectively addressing the challenges associated with the distinct characteristics of child speakers.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Asia > India > Sikkim (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)
Can Layer-wise SSL Features Improve Zero-Shot ASR Performance for Children's Speech?
Sinha, Abhijit, Kathania, Hemant Kumar, Kadiri, Sudarsana Reddy, Narayanan, Shrikanth
Automatic Speech Recognition (ASR) systems often struggle to accurately process children's speech due to its distinct and highly variable acoustic and linguistic characteristics. While recent advancements in self-supervised learning (SSL) models have greatly enhanced the transcription of adult speech, accurately transcribing children's speech remains a significant challenge. This study investigates the effectiveness of layer-wise features extracted from state-of-the-art SSL pre-trained models (specifically Wav2Vec2, HuBERT, Data2Vec, and WavLM) in improving the performance of ASR for children's speech in zero-shot scenarios. A detailed analysis of features extracted from these models was conducted, integrating them into a simplified DNN-based ASR system using the Kaldi toolkit. The analysis identified the most effective layers for enhancing ASR performance on children's speech in a zero-shot scenario, where WSJCAM0 adult speech was used for training and PFSTAR children's speech for testing. Experimental results indicated that Layer 22 of the Wav2Vec2 model achieved the lowest Word Error Rate (WER) of 5.15%, representing a 51.64% relative improvement over the direct zero-shot decoding using Wav2Vec2 (WER of 10.65%). Additionally, age group-wise analysis demonstrated consistent performance improvements with increasing age, along with significant gains observed even in younger age groups using the SSL features. Further experiments on the CMU Kids dataset confirmed similar trends, highlighting the generalizability of the proposed approach.
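The relative improvement quoted above can be reproduced from the two WER figures in the abstract. The short sketch below assumes the usual definition, (baseline - new) / baseline; the function name is hypothetical.

```python
def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction in percent: 100 * (baseline - new) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# WER values taken from the abstract: direct zero-shot decoding vs. layer-22 features.
improvement = relative_wer_improvement(baseline_wer=10.65, new_wer=5.15)
print(f"{improvement:.2f}%")  # prints "51.64%"
```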
- North America > United States > California (0.14)
- Europe > Russia > Northwestern Federal District > Leningrad Oblast > Saint Petersburg (0.04)
- Asia > Russia (0.04)
- Asia > India > Sikkim (0.04)
Layer-Wise Analysis of Self-Supervised Representations for Age and Gender Classification in Children's Speech
Sinha, Abhijit, Kumar, Harishankar, Joshi, Mohit, Kathania, Hemant Kumar, Narayanan, Shrikanth, Kadiri, Sudarsana Reddy
Children's speech presents challenges for age and gender classification due to high variability in pitch, articulation, and developmental traits. While self-supervised learning (SSL) models perform well on adult speech tasks, their ability to encode speaker traits in children remains underexplored. This paper presents a detailed layer-wise analysis of four Wav2Vec2 variants using the PFSTAR and CMU Kids datasets. Results show that early layers (1-7) capture speaker-specific cues more effectively than deeper layers, which increasingly focus on linguistic information. Applying PCA further improves classification, reducing redundancy and highlighting the most informative components. The Wav2Vec2-large-lv60 model achieves 97.14% (age) and 98.20% (gender) on CMU Kids; base-100h and large-lv60 models reach 86.05% and 95.00% on PFSTAR. These results reveal how speaker traits are structured across SSL model depth and support more targeted, adaptive strategies for child-aware speech interfaces.
- North America > United States > California (0.14)
- Europe > Spain (0.04)
- Asia > India > Sikkim (0.04)
- Asia > China > Liaoning Province > Dalian (0.04)
A Novel Graph Transformer Framework for Gene Regulatory Network Inference
The inference of gene regulatory networks (GRNs) is a foundational step towards deciphering the fundamentals of complex biological systems. Inferring a possible regulatory link between two genes can be formulated as a link prediction problem. GRNs inferred from gene coexpression profiling data may not always reflect true biological interactions, because such data are susceptible to noise and can misrepresent true regulatory relationships. Most GRN inference methods face several challenges in the network reconstruction phase. Therefore, it is important to encode gene expression values and to leverage both the prior knowledge gained from available inferred network structures and the positional information of the input network nodes in order to infer a better and more confident GRN reconstruction. In this paper, we explore the integration of multiple inferred networks to enhance GRN inference. First, we employ autoencoder embeddings to capture gene expression patterns directly from raw data, preserving intricate biological signals. Then, we embed the prior knowledge from GRN structures by transforming them into a text-like representation using random walks, which are then encoded with a masked language model, BERT, to generate global embeddings for each gene across all networks. Additionally, we embed the positional encodings of the input gene networks to better identify the position of each unique gene within the graph. These embeddings are integrated into a graph transformer-based model, termed GT-GRN, for GRN inference. The GT-GRN model effectively utilizes the topological structure of the ground truth network while incorporating the enriched encoded information. Experimental results demonstrate that GT-GRN significantly outperforms existing GRN inference methods, achieving superior accuracy and highlighting the robustness of our approach.
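To make the random-walk step more concrete, the sketch below shows one plausible way to turn a network into text-like "sentences" of gene names that a language model could then tokenize. It is a minimal illustration using uniform random walks on a toy adjacency list; the gene names, walk length, and function name are hypothetical, and the paper's actual walk strategy may differ.

```python
import random


def random_walk_sentences(adjacency, walk_length, walks_per_node, seed=0):
    """Generate space-separated node sequences via uniform random walks.

    adjacency maps each node to a list of its successors (regulator -> targets).
    """
    rng = random.Random(seed)  # fixed seed so the walks are reproducible
    sentences = []
    for start in adjacency:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            sentences.append(" ".join(walk))
    return sentences


# Toy inferred network with hypothetical gene names.
toy_network = {
    "geneA": ["geneB", "geneC"],
    "geneB": ["geneC"],
    "geneC": ["geneA"],
    "geneD": [],
}
sentences = random_walk_sentences(toy_network, walk_length=4, walks_per_node=2)
for sentence in sentences:
    print(sentence)
```

Each output line plays the role of a sentence whose "words" are genes, so genes that are close in the network tend to co-occur in the same context window.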
- North America > United States (0.14)
- Europe > Netherlands > South Holland > Leiden (0.04)
- Asia > India > Sikkim (0.04)
Indian Voters Are Being Bombarded With Millions of Deepfakes. Political Candidates Approve
On a stifling April afternoon in Ajmer, in the Indian state of Rajasthan, local politician Shakti Singh Rathore sat down in front of a greenscreen to shoot a short video. It was his first time being cloned. Wearing a crisp white shirt and a ceremonial saffron scarf bearing a lotus flower--the logo of the BJP, the country's ruling party--Rathore pressed his palms together and greeted his audience in Hindi. Before he could continue, the director of the shoot walked into the frame. Divyendra Singh Jadoun, a 31-year-old with a bald head and a thick black beard, told Rathore he was moving around too much on camera.
- Government > Voting & Elections (1.00)
- Government > Regional Government > Asia Government > India Government (0.84)
Comparing skill of historical rainfall data based monsoon rainfall prediction in India with NCEP-NWP forecasts
Narula, Apoorva, Jain, Aastha, Batra, Jatin, Juneja, Sandeep
In this draft we consider the problem of forecasting rainfall across India during the four monsoon months, one day as well as three days in advance. We train neural networks using historical daily gridded precipitation data for India obtained from IMD for the period 1901-2022, at a spatial resolution of $1^{\circ} \times 1^{\circ}$. These forecasts are compared with the numerical weather prediction (NWP) forecasts obtained from NCEP (National Centre for Environmental Prediction), available for the period 2011-2022. We conduct a detailed countrywide analysis and separately analyze some of the most populated cities in India. Our conclusion is that forecasts obtained by applying deep learning to historical rainfall data are more accurate than NWP forecasts as well as predictions based on persistence. On average, compared to our predictions, forecasts from the NCEP-NWP model have about 34% higher error for a single-day prediction, and over 68% higher error for a three-day prediction. Similarly, persistence estimates report a 29% higher error in a single-day forecast, and over 54% higher error in a three-day forecast. We further observe that data up to 20 days in the past is useful in reducing errors of one- and three-day forecasts when a transformer-based learning architecture is used and, to a lesser extent, when an LSTM is used. A key conclusion suggested by our preliminary analysis is that NWP forecasts can be substantially improved upon through more and diverse data relevant to monsoon prediction combined with a carefully selected neural network architecture.
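The persistence baseline mentioned above simply reuses the most recent observation as the forecast. The sketch below illustrates that idea and how a "percent higher error" figure can be computed relative to a model; it assumes mean absolute error as the metric (the abstract does not name one), and the rainfall values and model error are made-up toy numbers.

```python
def persistence_forecast(series, lead_days):
    """Forecast for day t + lead_days is the value observed on day t."""
    return series[:-lead_days]


def mean_absolute_error(predicted, observed):
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def percent_higher_error(baseline_error, model_error):
    """How much larger the baseline error is than the model error, in percent."""
    return 100.0 * (baseline_error - model_error) / model_error


# Toy daily rainfall at one grid cell, in mm (illustrative values only).
rainfall = [0.0, 2.0, 5.0, 1.0, 0.0, 3.0]
lead = 1
predicted = persistence_forecast(rainfall, lead)
observed = rainfall[lead:]
persistence_mae = mean_absolute_error(predicted, observed)

hypothetical_model_mae = 2.0  # assumed error of some learned model
gap = percent_higher_error(persistence_mae, hypothetical_model_mae)
print(f"persistence MAE = {persistence_mae:.2f} mm, {gap:.0f}% higher than the model")
```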
- Asia > India > Maharashtra > Mumbai (0.05)
- Asia > India > Tamil Nadu > Chennai (0.05)
- Asia > India > West Bengal > Kolkata (0.05)
Constrained Twin Variational Auto-Encoder for Intrusion Detection in IoT Systems
Dinh, Phai Vu, Nguyen, Quang Uy, Hoang, Dinh Thai, Nguyen, Diep N., Bao, Son Pham, Dutkiewicz, Eryk
Intrusion detection systems (IDSs) play a critical role in protecting billions of IoT devices from malicious attacks. However, IDSs for IoT devices face inherent challenges of IoT systems, including the heterogeneity of IoT data/devices, the high dimensionality of training data, and imbalanced data. Moreover, the deployment of IDSs on IoT systems is challenging, and sometimes impossible, due to the limited resources, such as memory/storage and computing capability, of typical IoT devices. To tackle these challenges, this article proposes a novel deep neural network architecture called Constrained Twin Variational Auto-Encoder (CTVAE) that can feed classifiers of IDSs with more separable/distinguishable and lower-dimensional representation data. Additionally, in comparison to the state-of-the-art neural networks used in IDSs, CTVAE requires less memory/storage and computing power, making it more suitable for IoT IDS systems. Extensive experiments with the 11 most popular IoT botnet datasets show that CTVAE can improve accuracy and F-score in attack detection by around 1% compared to the state-of-the-art machine learning and representation learning methods, whilst the running time for attack detection is lower than 2E-6 seconds and the model size is lower than 1 MB. We also further investigate various characteristics of CTVAE in the latent space and in the reconstruction representation to demonstrate its efficacy compared with current well-known methods.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > Vietnam > Hanoi > Hanoi (0.04)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
Machine Learning, Deep Learning and Data Preprocessing Techniques for Detection, Prediction, and Monitoring of Stress and Stress-related Mental Disorders: A Scoping Review
Razavi, Moein, Ziyadidegan, Samira, Jahromi, Reza, Kazeminasab, Saber, Janfaza, Vahid, Mahmoudzadeh, Ahmadreza, Baharlouei, Elaheh, Sasangohar, Farzan
This comprehensive review systematically evaluates Machine Learning (ML) methodologies employed in the detection, prediction, and analysis of mental stress and its consequent mental disorders (MDs). Utilizing a rigorous scoping review process, the investigation delves into the latest ML algorithms, preprocessing techniques, and data types employed in the context of stress and stress-related MDs. The findings highlight that Support Vector Machine (SVM), Neural Network (NN), and Random Forest (RF) models consistently exhibit superior accuracy and robustness among all machine learning algorithms examined. Furthermore, the review underscores that physiological parameters, such as heart rate measurements and skin response, are prevalently used as stress predictors in ML algorithms. This is attributed to their rich explanatory information concerning stress and stress-related MDs, as well as the relative ease of data acquisition. Additionally, the application of dimensionality reduction techniques, including mappings, feature selection, filtering, and noise reduction, is frequently observed as a crucial step preceding the training of ML algorithms. The synthesis of this review identifies significant research gaps and outlines future directions for the field. These encompass areas such as model interpretability, model personalization, the incorporation of naturalistic settings, and real-time processing capabilities for detection and prediction of stress and stress-related MDs.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Singapore (0.04)
- North America > United States > Texas (0.04)
- Overview (1.00)
- Research Report > New Finding (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)