 Vadodara


Robot DOG makes an appearance at the Met Gala - dressed in a tuxedo and adorned with a 1,000-carat diamond leash

Daily Mail - Science & tech

At New York's Met Gala, guests are known for attention-grabbing outfits, from Katy Perry's human chandelier dress to Kim Kardashian's all-black body suit. But one attendee in particular has stolen the limelight this year – and he's not even human. Indian-American entrepreneur Mona Patel rocked up to the annual event on Monday night with an adorable robotic dachshund in tow. Vector the robo-dog, developed by scientists at MIT, has a 1,000-carat diamond-studded leash and his own cute little specially-fitted tuxedo. Powered by AI and equipped with sensors, Vector has customised movement patterns and 'just the right amount of sass', Vogue India reports.


YINYANG-ALIGN: Benchmarking Contradictory Objectives and Proposing Multi-Objective Optimization based DPO for Text-to-Image Alignment

arXiv.org Artificial Intelligence

Precise alignment in Text-to-Image (T2I) systems is crucial to ensure that generated visuals not only accurately encapsulate user intents but also conform to stringent ethical and aesthetic benchmarks. Incidents like the Google Gemini fiasco, where misaligned outputs triggered significant public backlash, underscore the critical need for robust alignment mechanisms. In contrast, Large Language Models (LLMs) have achieved notable success in alignment. Building on these advancements, researchers are eager to apply similar alignment techniques, such as Direct Preference Optimization (DPO), to T2I systems to enhance image generation fidelity and reliability. We present YinYangAlign, an advanced benchmarking framework that systematically quantifies the alignment fidelity of T2I systems, addressing six fundamental and inherently contradictory design objectives. Each pair represents fundamental tensions in image generation, such as balancing adherence to user prompts with creative modifications or maintaining diversity alongside visual coherence. YinYangAlign includes detailed axiom datasets featuring human prompts, aligned (chosen) responses, misaligned (rejected) AI-generated outputs, and explanations of the underlying contradictions.
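The abstract names Direct Preference Optimization (DPO) without stating its objective. As a rough illustration, the sketch below computes the standard single-objective DPO loss for one (chosen, rejected) pair, as used in the LLM setting. It is not the multi-objective variant the paper proposes, and the function name, argument names, and the default `beta` are assumptions made for this example.

```python
import math


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    Inputs are log-probabilities of the chosen and rejected samples under the
    policy being trained and under a frozen reference model (hypothetical
    scalars here; in practice they come from the generative model).
    """
    # Implicit reward margin: how much more strongly the policy prefers the
    # chosen sample over the rejected one, relative to the reference model.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # loss = -log(sigmoid(beta * margin)) = softplus(-beta * margin),
    # written in a numerically stable form.
    z = -beta * margin
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


if __name__ == "__main__":
    # When the policy equals the reference, the margin is 0 and the loss is log(2).
    print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
    # A policy that favours the chosen sample more than the reference gets a lower loss.
    print(dpo_loss(-0.5, -2.0, -1.0, -1.0))
```

The loss shrinks as the policy raises the likelihood of the chosen output relative to the rejected one, which is the behaviour a benchmark with chosen and rejected responses is meant to supervise.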


Real Time Monitoring and Forecasting of COVID 19 Cases using an Adjusted Holt based Hybrid Model embedded with Wavelet based ANN

arXiv.org Machine Learning

Since the emergence of the SARS-CoV-2 (COVID-19) novel coronavirus, considerable time and effort has been devoted to estimating its trajectory and, where possible, forecasting with a reasonable degree of accuracy the number of cases, recoveries, and deaths. The model proposed in this paper is a step in the same direction. The primary model in question is a Hybrid Holt's Model embedded with a Wavelet-based ANN. To test its forecasting ability, we have compared three separate models: the first a simple ARIMA model, the second an ARIMA model with a wavelet-based function, and the third the proposed model. We have also compared the forecast accuracy of this model with that of a modern-day Vanilla LSTM recurrent neural network model. We have tested the proposed model on the number of daily confirmed cases for the entire country as well as 6 hotspot states. We have also proposed a simple adjustment algorithm in addition to the hybrid model so that daily and/or weekly forecasts can be produced for the country as a whole, as well as a moving-window performance metric based on out-of-sample forecasts. In order to have a more rounded approach to the analysis of COVID-19 dynamics, focus has also been given to the estimation of the Basic Reproduction Number, $R_0$, using a compartmental epidemiological model (SIR). Lastly, we have also given substantial attention to estimating the shelf-life of the proposed model. An accurate model of this kind can ensure better allocation of healthcare resources and enable the government to take necessary measures ahead of time.
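For readers unfamiliar with the base method, the following is a minimal sketch of plain Holt's linear trend method (double exponential smoothing). It omits the wavelet-based ANN and the adjustment algorithm described in the abstract, so it should not be read as the authors' model. The smoothing parameters, the initialisation, and the case counts are made up for illustration.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Forecast `horizon` steps ahead with Holt's linear trend method.

    alpha smooths the level and beta smooths the trend, both in (0, 1].
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Simple initialisation (an assumption): first value as the level,
    # first difference as the trend.
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        # Blend the new observation with the one-step-ahead prediction.
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        # Blend the newly observed slope with the previous trend estimate.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # Extrapolate the final level along the final trend.
    return [level + h * trend for h in range(1, horizon + 1)]


if __name__ == "__main__":
    daily_cases = [100, 120, 145, 160, 190, 210]  # hypothetical counts
    print(holt_forecast(daily_cases, alpha=0.8, beta=0.2, horizon=3))
```

On a perfectly linear series the method reproduces the line exactly, which makes a convenient sanity check before fitting real data.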


Automatic location detection based on deep learning

arXiv.org Artificial Intelligence

The proliferation of digital images and the advancements in deep learning have paved the way for innovative solutions in various domains, especially in the field of image classification. Our project presents an in-depth study and implementation of an image classification system specifically tailored to identify and classify images of Indian cities. Drawing from an extensive dataset, our model classifies images into five classes (Ahmedabad, Delhi, Kerala, Kolkata, and Mumbai), recognizing the distinct features and characteristics of each city or state. To achieve high precision and recall rates, we adopted two approaches: first a vanilla Convolutional Neural Network (CNN), and then transfer learning with the VGG16 model. The vanilla CNN achieved commendable accuracy and the VGG16 model achieved a test accuracy of 63.6%. Evaluations highlighted the strengths and potential areas of improvement, positioning our model as not only competitive but also scalable for broader applications. With an emphasis on open-source ethos, our work aims to contribute to the community, encouraging further development and diverse applications. Our findings demonstrate the potential applications in tourism, urban planning, and even real-time location identification systems, among others.


A Strictly Bounded Deep Network for Unpaired Cyclic Translation of Medical Images

arXiv.org Artificial Intelligence

Medical image translation is an ill-posed problem. Unlike existing paired unbounded unidirectional translation networks, in this paper, we consider unpaired medical images and provide a strictly bounded network that yields a stable bidirectional translation. We propose a patch-level concatenated cyclic conditional generative adversarial network (pCCGAN) embedded with adaptive dictionary learning. It consists of two cyclically connected CGANs of 47 layers each, where both generators (each of 32 layers) are conditioned on a concatenation of alternate unpaired patches from input and target modality images (not ground truth) of the same organ. The key idea is to exploit cross-neighborhood contextual feature information that bounds the translation space and boosts generalization. The generators are further equipped with adaptive dictionaries learned from the contextual patches to reduce possible degradation. Discriminators are 15-layer deep networks that employ a minimax function to validate the translated imagery. A combined loss function is formulated with adversarial, non-adversarial, forward-backward cyclic, and identity losses that further minimize the variance of the proposed learning machine. Qualitative, quantitative, and ablation analyses show superior results on real CT and MRI.


An Annexure to the Paper "Driving the Technology Value Stream by Analyzing App Reviews"

arXiv.org Artificial Intelligence

This paper presents a novel framework that utilizes Natural Language Processing (NLP) techniques to understand user feedback on mobile applications. The framework allows software companies to drive their technology value stream based on user reviews, which can highlight areas for improvement. The framework is analyzed in depth, and its modules are evaluated for their effectiveness. The proposed approach is demonstrated to be effective through an analysis of reviews for sixteen popular Android Play Store applications over a long period of time.


FedGrad: Optimisation in Decentralised Machine Learning

arXiv.org Artificial Intelligence

Federated Learning is a machine learning paradigm where we aim to train machine learning models in a distributed fashion. Many clients/edge devices collaborate with each other to train a single model on a central server. Clients do not share their own datasets with each other, so computation stays on the same device as the data. In this paper, we propose yet another adaptive federated optimization method and some other ideas in the field of federated learning. We also perform experiments using these methods and showcase the improvement in the overall performance of federated learning.
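The abstract does not describe the proposed optimizer, so the sketch below shows only the baseline aggregation step that adaptive federated methods typically build on: a FedAvg-style weighted average of client parameters. It is not FedGrad. The function name and the flat-list representation of model weights are assumptions for this example.

```python
def fed_avg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    client_weights: one flat list of floats per client (same length each).
    client_sizes:   number of local training examples per client.
    Only parameters are sent to the server; raw data stays on the clients.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two hypothetical clients; the second holds three times as much data,
    # so the average is pulled towards its parameters.
    print(fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

An adaptive method would usually treat the difference between this average and the previous global model as a pseudo-gradient and feed it to a server-side optimizer instead of applying it directly.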


Exploration of Interpretability Techniques for Deep COVID-19 Classification using Chest X-ray Images

arXiv.org Artificial Intelligence

The outbreak of COVID-19 has shocked the entire world with its fairly rapid spread and has challenged different sectors. One of the most effective ways to limit its spread is the early and accurate diagnosis of infected patients. Medical imaging such as X-ray and Computed Tomography (CT) combined with the potential of Artificial Intelligence (AI) plays an essential role in supporting the medical staff in the diagnosis process. Thereby, five different deep learning models (ResNet18, ResNet34, InceptionV3, InceptionResNetV2, and DenseNet161) and their Ensemble have been used in this paper to classify COVID-19, pneumoniae and healthy subjects using Chest X-Ray images. Multi-label classification was performed to predict multiple pathologies for each patient, if present. Foremost, the interpretability of each of the networks was thoroughly studied using local interpretability methods (occlusion, saliency, input × gradient, guided backpropagation, integrated gradients, and DeepLIFT) and using a global technique (neuron activation profiles). The mean Micro-F1 score of the models for COVID-19 classifications ranges from 0.66 to 0.875, and is 0.89 for the Ensemble of the network models. The qualitative results depicted the ResNets to be the most interpretable models. This research demonstrates the importance of using interpretability methods to compare different models before making the decision regarding the best-performing model.
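Since the results are reported as Micro-F1 in a multi-label setting, a small sketch of how that metric is computed may help. Micro-averaging pools true positives, false positives, and false negatives over all labels and samples before computing F1. The label layout and the toy predictions are hypothetical and unrelated to the paper's data.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for multi-label predictions.

    y_true, y_pred: lists of equal-length 0/1 rows, one row per sample
    and one column per label.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1  # label present and predicted
            elif p:
                fp += 1  # label predicted but absent
            elif t:
                fn += 1  # label present but missed
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when nothing is present or predicted.
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Columns (assumed order): COVID-19, pneumonia, healthy.
    y_true = [[1, 0, 0], [0, 1, 0], [1, 1, 0]]
    y_pred = [[1, 0, 0], [0, 0, 1], [1, 1, 0]]
    print(micro_f1(y_true, y_pred))  # 3 TP, 1 FP, 1 FN -> 0.75
```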


Design of Economical Fuzzy Logic Controller for Washing Machine

arXiv.org Artificial Intelligence

Things are becoming more advanced as technology advances, and machines now perform the majority of the manual work. The most often used home appliance is the washing machine for clothes. Modification and research in this field are essential since they pertain to the amount of time, water, and electricity required for washing. In this work, a Fuzzy Logic Controller has been developed for smart washing machines. The objective of this paper is to optimize the consumption of electricity, water, and detergent for washing machines. The type of dirt, volume of clothes, and type of cloth play a vital role in saving water, electricity, and detergent. However, none of the existing work on Fuzzy Logic Controllers provides a design procedure with the specified inputs and outputs implemented in Python. In this paper, we used the Mamdani approach and created an algorithm based on multi-input multi-output. The algorithm is implemented in Python. The results of this simulation show that the washing machine provides better execution at a low computation cost.
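To make the Mamdani approach concrete, here is a deliberately reduced sketch with two inputs and one output. The paper's controller is multi-input multi-output, and its membership functions and rule base are not given in the abstract, so every set, range, and rule below is an assumption chosen only to illustrate the inference steps: fuzzify, fire rules with min/max, clip the output sets, aggregate with max, and defuzzify by centroid.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def wash_time(dirt, load, steps=601):
    """Mamdani inference: dirt and load in [0, 10] -> wash time in minutes [0, 60].

    Returns None if no rule fires for the given inputs.
    """
    # Fuzzify the inputs (hypothetical sets on a 0..10 scale).
    dirt_low, dirt_med, dirt_high = tri(dirt, -5, 0, 5), tri(dirt, 0, 5, 10), tri(dirt, 5, 10, 15)
    load_low, load_high = tri(load, -5, 0, 5), tri(load, 5, 10, 15)
    # Rule base: (firing strength, output triangle). AND is min, OR is max.
    rules = [
        (min(dirt_low, load_low), (0, 15, 30)),     # low dirt AND low load -> short
        (dirt_med, (15, 30, 45)),                   # medium dirt -> medium
        (max(dirt_high, load_high), (30, 45, 60)),  # high dirt OR high load -> long
    ]
    # Clip each output set at its rule strength, aggregate with max, and
    # take the centroid over a discretised output universe.
    num = den = 0.0
    for i in range(steps):
        t = 60.0 * i / (steps - 1)
        mu = max(min(strength, tri(t, *shape)) for strength, shape in rules)
        num += t * mu
        den += mu
    return num / den if den else None


if __name__ == "__main__":
    print(wash_time(1, 1))  # lightly soiled, small load
    print(wash_time(9, 9))  # heavily soiled, large load
```

A fuller controller would add the cloth-type input and separate outputs for water, electricity, and detergent, each with its own output sets and rules.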


Infographics Wizard: Flexible Infographics Authoring and Design Exploration

arXiv.org Artificial Intelligence

Infographics are an aesthetic visual representation of information following specific design principles of human perception. Designing infographics can be a tedious process for non-experts and time-consuming, even for professional designers. With the help of designers, we propose a semi-automated infographic framework for general structured and flow-based infographic design generation. For novice designers, our framework automatically creates and ranks infographic designs for a user-provided text with no requirement for design input. However, expert designers can still provide custom design inputs to customize the infographics. We will also contribute an individual visual group (VG) designs dataset (in SVG), along with a 1k complete infographic image dataset with segmented VGs in this work. Evaluation results confirm that by using our framework, designers from all expertise levels can generate generic infographic designs faster than existing methods while maintaining the same quality as hand-designed infographics templates.