Manandhar, Suresh
Domain-adaptative Continual Learning for Low-resource Tasks: Evaluation on Nepali
Duwal, Sharad, Prasai, Suraj, Manandhar, Suresh
Continual learning has emerged as an important research direction because retraining large language models (LLMs) from scratch whenever new data becomes available is infeasible. Of particular interest is the domain-adaptive pre-training (DAPT) paradigm, which continually trains a pre-trained language model to adapt it to a domain it was not originally trained on. In this work, we evaluate the feasibility of DAPT in a low-resource setting, namely the Nepali language. We use synthetic data to continue training Llama 3 8B in a 4-bit QLoRA setting to adapt it to Nepali, and we evaluate the adapted model on performance, forgetting, and knowledge acquisition. We compare the base and final models on their Nepali generation abilities and their performance on popular benchmarks, and we conduct case studies to probe their linguistic knowledge in Nepali. We observe some unsurprising forgetting in the final model, but, surprisingly, we also find that increasing the number of shots during evaluation yields larger relative gains for the final model (up to 19.29%) than for the base model (4.98%), suggesting latent retention. We also examine layer-head self-attention heatmaps to establish the final model's dependency-resolution abilities in Nepali.
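A minimal sketch of the 4-bit QLoRA continual-training setup described above, assuming the Hugging Face transformers, peft and bitsandbytes libraries; the LoRA hyperparameters and the toy Nepali batch are illustrative placeholders rather than the paper's exact configuration:

```python
# A minimal sketch of 4-bit QLoRA continual pre-training, assuming the
# Hugging Face transformers, peft and bitsandbytes libraries. The LoRA
# hyperparameters and the toy Nepali batch are illustrative placeholders,
# not the paper's exact configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Continue causal-LM training on Nepali text (a single toy batch here;
# in practice this loops over the synthetic Nepali corpus).
batch = tokenizer("नेपाल हिमालयको देश हो।", return_tensors="pt").to(model.device)
outputs = model(**batch, labels=batch["input_ids"])
outputs.loss.backward()
```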
Cross-modal Contrastive Learning with Asymmetric Co-attention Network for Video Moment Retrieval
Panta, Love, Shrestha, Prashant, Sapkota, Brabeem, Bhattarai, Amrita, Manandhar, Suresh, Sah, Anand Kumar
Video moment retrieval is a challenging task requiring fine-grained interactions between the video and text modalities. Recent work in image-text pre-training has shown that many existing pre-trained models suffer from information asymmetry due to the difference in length between visual and textual sequences. We ask whether the same problem also arises in the video-text domain, where there is the additional need to preserve both spatial and temporal information. We therefore evaluate a recently proposed solution that adds an asymmetric co-attention network for video grounding tasks. In addition, we incorporate a momentum contrastive loss for robust, discriminative representation learning in both modalities. We find that integrating these supplementary modules yields better performance than state-of-the-art models on the TACoS dataset and comparable results on ActivityNet Captions, while using significantly fewer parameters than the baseline.
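A hedged sketch of the momentum contrastive objective mentioned above, written as a MoCo-style InfoNCE loss between pooled video and text features; the momentum update, feature dimensions, temperature and queue size are illustrative assumptions, not the paper's exact formulation:

```python
# A hedged sketch of a MoCo-style momentum contrastive (InfoNCE) objective
# between pooled video and text features. The momentum update, feature
# dimensions, temperature and queue size are illustrative assumptions,
# not the paper's exact formulation.
import torch
import torch.nn.functional as F

def momentum_update(query_enc, key_enc, m=0.999):
    # The key encoder tracks the query encoder by exponential moving average.
    for q, k in zip(query_enc.parameters(), key_enc.parameters()):
        k.data.mul_(m).add_(q.data, alpha=1.0 - m)

def infonce(video_q, text_k, queue, temperature=0.07):
    # video_q: (B, D) query features, text_k: (B, D) momentum-encoded keys,
    # queue: (K, D) negatives from previous batches; all rows L2-normalised.
    pos = (video_q * text_k).sum(dim=1, keepdim=True)       # (B, 1) positive logits
    neg = video_q @ queue.t()                                # (B, K) negative logits
    logits = torch.cat([pos, neg], dim=1) / temperature
    labels = torch.zeros(logits.size(0), dtype=torch.long)  # positives sit at index 0
    return F.cross_entropy(logits, labels)

# Toy usage with random, normalised features.
B, D, K = 8, 256, 1024
loss = infonce(F.normalize(torch.randn(B, D), dim=1),
               F.normalize(torch.randn(B, D), dim=1),
               F.normalize(torch.randn(K, D), dim=1))
print(loss.item())
```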
NEREL-BIO: A Dataset of Biomedical Abstracts Annotated with Nested Named Entities
Loukachevitch, Natalia, Manandhar, Suresh, Baral, Elina, Rozhkov, Igor, Braslavski, Pavel, Ivanov, Vladimir, Batura, Tatiana, Tutubalina, Elena
This paper describes NEREL-BIO -- an annotation scheme and corpus of PubMed abstracts in Russian, together with a smaller number of abstracts in English. NEREL-BIO extends the general-domain dataset NEREL by introducing domain-specific entity types. The NEREL-BIO annotation scheme covers both general and biomedical domains, making it suitable for domain-transfer experiments. NEREL-BIO provides annotation for nested named entities as an extension of the scheme employed for NEREL. Nested named entities may cross entity boundaries to connect to shorter entities nested within longer entities, which makes them harder to detect. NEREL-BIO contains annotations for 700+ Russian and 100+ English abstracts, and all English PubMed annotations have corresponding Russian counterparts. NEREL-BIO thus has two distinguishing features: annotation of nested named entities, and suitability as a benchmark for cross-domain (NEREL -> NEREL-BIO) and cross-language (English -> Russian) transfer. We experiment with both transformer-based sequence models and machine reading comprehension (MRC) models and report their results. The dataset is freely available at https://github.com/nerel-ds/NEREL-BIO.
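For illustration, a minimal sketch of how the machine reading comprehension (MRC) framing handles nested entities: each entity type becomes a separate natural-language query over the same abstract, so spans of different types can nest or overlap. The type names and query templates below are hypothetical, not NEREL-BIO's actual ones:

```python
# An illustrative sketch of the MRC framing for nested NER: each entity type
# becomes a separate natural-language query over the same abstract, so spans
# of different types can nest or overlap. The type names and query templates
# are hypothetical, not NEREL-BIO's actual ones.
from dataclasses import dataclass

@dataclass
class MRCExample:
    query: str    # natural-language description of one entity type
    context: str  # the abstract text
    spans: list   # gold (start, end) character spans for this type

TYPE_QUERIES = {
    "CHEM": "Find mentions of chemicals and drugs.",
    "ANATOMY": "Find mentions of anatomical structures.",
    "DISO": "Find mentions of diseases and disorders.",
}

def build_mrc_examples(abstract, gold):
    # gold maps entity type -> list of (start, end) spans; nested spans of
    # different types simply appear under different queries.
    return [MRCExample(q, abstract, gold.get(t, [])) for t, q in TYPE_QUERIES.items()]

examples = build_mrc_examples(
    "Aspirin reduced inflammation of the gastric mucosa.",
    {"CHEM": [(0, 7)], "ANATOMY": [(36, 50)]},
)
print(examples[0])
```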
Visualising Argumentation Graphs with Graph Embeddings and t-SNE
Malmqvist, Lars, Yuan, Tommy, Manandhar, Suresh
This paper applies t-SNE, a visualisation technique familiar from deep neural network research, to argumentation graphs by running it on graph embeddings generated using several different methods. It shows that this visualisation approach works for argumentation and reveals interesting structural properties of argumentation graphs, opening up paths for further research in the area.
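A small sketch of the visualisation pipeline, assuming networkx, scikit-learn and matplotlib; a spectral embedding of a toy graph stands in for the several embedding methods and the argumentation graphs used in the paper:

```python
# A small sketch of the visualisation pipeline, assuming networkx,
# scikit-learn and matplotlib. A spectral embedding of a toy graph stands in
# for the several embedding methods and the argumentation graphs in the paper.
import matplotlib.pyplot as plt
import networkx as nx
from sklearn.manifold import SpectralEmbedding, TSNE

# Toy graph in place of an argumentation (attack) graph.
G = nx.karate_club_graph()
A = nx.to_numpy_array(G)

# Node embeddings (spectral here; any graph-embedding method would do).
emb = SpectralEmbedding(n_components=16, affinity="precomputed").fit_transform(A)

# Project the embeddings to 2D with t-SNE and plot.
xy = TSNE(n_components=2, perplexity=10, init="pca", random_state=0).fit_transform(emb)
plt.scatter(xy[:, 0], xy[:, 1], s=20)
plt.title("t-SNE of graph embeddings")
plt.show()
```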
Multimodal deep learning for short-term stock volatility prediction
Sardelich, Marcelo, Manandhar, Suresh
Stock market volatility forecasting is a task relevant to assessing market risk. We investigate the interaction between news and prices for one-day-ahead volatility prediction using state-of-the-art deep learning approaches. The proposed models are trained either end-to-end or using sentence encoders transferred from other tasks. We evaluate a broad range of stock market sectors, namely Consumer Staples, Energy, Utilities, Healthcare, and Financials. Our experimental results show that adding news improves volatility forecasting compared to mainstream models that rely only on price data. In particular, our model outperforms the widely recognised GARCH(1,1) model for all sectors in terms of the coefficient of determination $R^2$, $MSE$ and $MAE$, achieving the best performance when training on both news and price data.
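As a rough illustration of the GARCH(1,1) baseline and the reported metrics, a sketch using the arch and scikit-learn packages on synthetic returns; the data and the realised-volatility proxy are placeholders, not the paper's sector-level setup:

```python
# A rough illustration of the GARCH(1,1) baseline and the reported metrics,
# assuming the arch and scikit-learn packages. The synthetic returns and the
# realised-volatility proxy are placeholders, not the paper's sector data.
import numpy as np
from arch import arch_model
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

rng = np.random.default_rng(0)
returns = rng.standard_normal(1000)  # toy daily returns (percent scale)

# Fit GARCH(1,1) and produce a one-day-ahead volatility forecast.
res = arch_model(returns, vol="GARCH", p=1, q=1).fit(disp="off")
sigma_next = float(np.sqrt(res.forecast(horizon=1).variance.iloc[-1, 0]))

# Score the in-sample conditional volatility against a realised-volatility
# proxy (absolute returns) with the metrics used in the paper.
y_true = np.abs(returns[-250:])
y_pred = res.conditional_volatility[-250:]
print("R2:", r2_score(y_true, y_pred),
      "MSE:", mean_squared_error(y_true, y_pred),
      "MAE:", mean_absolute_error(y_true, y_pred),
      "next-day sigma:", sigma_next)
```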
Evaluation of Complex-Valued Neural Networks on Real-Valued Classification Tasks
Mönning, Nils, Manandhar, Suresh
Complex-valued neural networks are not a new concept; however, real-valued models have often been favoured over complex-valued ones due to difficulties in training and weaker performance. When comparing real-valued and complex-valued neural networks, the existing literature often ignores the number of parameters, resulting in comparisons of networks with vastly different capacities. We find that when real and complex networks of similar capacity are compared, complex models perform equal to or slightly worse than real-valued models on a range of real-valued classification tasks. The use of complex numbers allows neural networks to handle noise on the complex plane. When classifying real-valued data with a complex-valued neural network, the imaginary parts of the weights track their real parts; this behaviour is indicative of a task that does not require a complex-valued model, and we investigate it further in a synthetic classification task. Many activation functions can be transferred from the real to the complex domain using different strategies. The weight initialisation of complex neural networks, however, remains a significant problem.
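A minimal sketch of the capacity-matched comparison: a complex-valued linear layer built from real and imaginary weight parts holds twice the parameters of a real layer of the same shape, so a fair real-valued baseline must be widened (or deepened) accordingly. This illustrates the comparison setup, not the paper's exact models:

```python
# A minimal sketch of a complex-valued linear layer built from real and
# imaginary weight parts, plus a parameter count showing why capacity-matched
# comparisons matter. Illustrative only; biases are omitted for clarity.
import torch
import torch.nn as nn

class ComplexLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.re = nn.Linear(in_features, out_features, bias=False)  # real part of W
        self.im = nn.Linear(in_features, out_features, bias=False)  # imaginary part of W

    def forward(self, x_re, x_im):
        # (W_re + i W_im)(x_re + i x_im)
        #   = (W_re x_re - W_im x_im) + i (W_re x_im + W_im x_re)
        return self.re(x_re) - self.im(x_im), self.re(x_im) + self.im(x_re)

def n_params(module):
    return sum(p.numel() for p in module.parameters())

# The complex layer holds twice the parameters of the real layer of equal shape.
print(n_params(ComplexLinear(128, 64)), "vs", n_params(nn.Linear(128, 64, bias=False)))
```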
Vehicle classification using ResNets, localisation and spatially-weighted pooling
Watkins, Rohan, Pears, Nick, Manandhar, Suresh
We investigate whether ResNet architectures can outperform more traditional convolutional neural networks on the task of fine-grained vehicle classification. We train and test ResNet-18, ResNet-34 and ResNet-50 on the Comprehensive Cars dataset without pre-training on other datasets. We then modify the networks to use spatially weighted pooling. Finally, we add a localisation step before the classification process, using a network based on ResNet-50. We find that spatially weighted pooling and localisation both improve the classification accuracy of ResNet-50. Our method achieves higher accuracy than a range of methods, including those that use traditional CNNs, although it does not perform quite as well as pre-trained networks that use spatially weighted pooling.
Keywords: Vehicle recognition, Intelligent surveillance, ResNets
1. Introduction
In the fine-grained vehicle classification problem, a class consists of both make and model attributes, optionally extended with the year in which a particular model version was released. If such a 'year' attribute is required, the difficulty of the problem increases significantly due to the similarity of updated models. This problem differs from coarser recognition, which may categorise by vehicle type (car, van, bus, etc.) and has far fewer classes. Several methods have been used to try to solve fine-grained vehicle classification; the main limitation of these approaches is their inability to differentiate between similar car models.
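A hedged sketch of a spatially weighted pooling head of the kind referred to above: learned spatial weight masks over the final feature map replace global average pooling. The mask count and feature-map size are illustrative assumptions:

```python
# A hedged sketch of a spatially weighted pooling (SWP) head: learned spatial
# weight masks over the final feature map replace global average pooling.
# The mask count and feature-map size are illustrative assumptions.
import torch
import torch.nn as nn

class SpatiallyWeightedPooling(nn.Module):
    def __init__(self, channels, num_masks=4):
        super().__init__()
        # A 1x1 convolution produces num_masks spatial weight maps.
        self.mask_conv = nn.Conv2d(channels, num_masks, kernel_size=1)
        self.num_masks = num_masks

    def forward(self, feat):  # feat: (B, C, H, W), e.g. the last ResNet block
        b, c, h, w = feat.shape
        masks = torch.softmax(self.mask_conv(feat).flatten(2), dim=-1)  # normalise over H*W
        masks = masks.view(b, self.num_masks, h, w)
        # Weighted sum of the features under each mask -> (B, num_masks * C).
        pooled = torch.einsum("bmhw,bchw->bmc", masks, feat)
        return pooled.flatten(1)

# Example: pool a ResNet-50-style feature map (2048 channels on a 7x7 grid).
x = torch.randn(2, 2048, 7, 7)
print(SpatiallyWeightedPooling(2048)(x).shape)  # torch.Size([2, 8192])
```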
Stochastic Constraint Programming: A Scenario-Based Approach
Tarim, S. Armagan, Manandhar, Suresh, Walsh, Toby
To model combinatorial decision problems involving uncertainty and probability, we introduce scenario-based stochastic constraint programming. Stochastic constraint programs contain both decision variables, which we can set, and stochastic variables, which follow a discrete probability distribution. We provide a semantics for stochastic constraint programs based on scenario trees. Using this semantics, we can compile stochastic constraint programs down into conventional (non-stochastic) constraint programs, which allows us to exploit the full power of existing constraint solvers. We have implemented this framework for decision making under uncertainty in stochastic OPL, a language based on the OPL constraint modelling language [Hentenryck et al., 1999]. To illustrate the potential of the framework, we model a wide range of problems in areas as diverse as portfolio diversification, agricultural planning and production/inventory management.
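To make the compilation idea concrete, a toy sketch in which the stochastic variable is expanded into one copy per scenario, the decision variable is shared across all scenarios, and a chance constraint becomes a probability-weighted count; the inventory numbers are invented for illustration:

```python
# A toy sketch of the scenario-based compilation idea: the stochastic variable
# is expanded into one copy per scenario, the decision variable is shared
# across scenarios, and a chance constraint becomes a probability-weighted
# count over scenarios. The inventory numbers are invented for illustration.

decisions = range(0, 11)                     # decision variable: order quantity 0..10
scenarios = [(3, 0.2), (5, 0.5), (8, 0.3)]   # (stochastic demand, probability)

def satisfies(order, demand):
    return order >= demand                   # hard constraint within one scenario

best = None
for order in decisions:                      # deterministic search over decisions
    # Probability mass of the scenarios in which the constraint holds.
    p_sat = sum(p for demand, p in scenarios if satisfies(order, demand))
    if p_sat >= 0.8:                         # chance constraint: hold with prob >= 0.8
        if best is None or order < best[0]:
            best = (order, p_sat)

print(best)  # the smallest order meeting demand with probability at least 0.8
```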