

Distilled Wasserstein Learning for Word Embedding and Topic Modeling

Hongteng Xu, Wenlin Wang, Wei Liu, Lawrence Carin

Neural Information Processing Systems

The word distributions of topics, their optimal transports to the word distributions of documents, and the embeddings of words are learned in a unified framework. When learning the topic model, we leverage a distilled underlying distance matrix to update the topic distributions and smoothly calculate the corresponding optimal transports.
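The optimal transports described above are commonly computed with entropy-regularized (Sinkhorn) iterations. A minimal stdlib-Python sketch under toy assumptions: a 3-word vocabulary and a hand-picked distance matrix standing in for the paper's learned, distilled one.

```python
import math

def sinkhorn(a, b, cost, eps=0.5, iters=500):
    """Entropy-regularized optimal transport between a topic's word
    distribution `a` and a document's word distribution `b`, given an
    underlying word-distance matrix `cost`."""
    K = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * len(a), [1.0] * len(b)
    for _ in range(iters):
        Kv = [sum(K[i][j] * v[j] for j in range(len(b))) for i in range(len(a))]
        u = [a[i] / Kv[i] for i in range(len(a))]      # scale rows toward a
        Ku = [sum(K[i][j] * u[i] for i in range(len(a))) for j in range(len(b))]
        v = [b[j] / Ku[j] for j in range(len(b))]      # scale columns toward b
    return [[u[i] * K[i][j] * v[j] for j in range(len(b))] for i in range(len(a))]

# Toy 3-word vocabulary: transport a topic distribution onto a document distribution.
a = [0.5, 0.3, 0.2]                  # topic word distribution
b = [0.4, 0.4, 0.2]                  # document word distribution
cost = [[0.0, 1.0, 2.0],
        [1.0, 0.0, 1.0],
        [2.0, 1.0, 0.0]]             # illustrative word distances
P = sinkhorn(a, b, cost)
row_sums = [sum(row) for row in P]   # row marginals recover a
```

The transport plan's marginals match the two input distributions, which is what lets the plan couple topic words to document words.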


TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning

Neural Information Processing Systems

We propose a novel approach to interactive theorem-proving (ITP) using deep reinforcement learning. The proposed framework is able to learn proof search strategies as well as tactic and argument prediction in an end-to-end manner. We formulate the process of ITP as a Markov decision process (MDP) in which each state represents a set of potential derivation paths. This structure allows us to introduce a novel backtracking mechanism which enables the agent to efficiently discard (predicted) dead-end derivations and restart the derivation from promising alternatives. We implement the framework in the HOL theorem prover. Experimental results show that the framework using learned search strategies outperforms existing automated theorem provers (i.e., hammers) available in HOL when evaluated on unseen problems. We further elaborate on the role of key components of the framework using ablation studies.
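The backtracking mechanism can be sketched as a search whose fringe keeps alternative derivation states; when a state dead-ends, it is discarded and search resumes from a promising alternative. A toy illustration in Python, with a numeric goal and hand-written "tactics" standing in for the learned policy and real HOL tactics:

```python
def prove(goal, tactics, max_steps=100):
    """Backtracking proof search: the fringe holds alternative derivation
    states; dead ends are dropped and search restarts from a promising
    alternative (here: the state with the smallest remaining goal)."""
    fringe = [(goal, [])]              # each state: (open goal, tactic trace)
    for _ in range(max_steps):
        if not fringe:
            return None                # every derivation path was a dead end
        # pick the most promising alternative (stand-in for a learned policy)
        fringe.sort(key=lambda s: s[0])
        n, trace = fringe.pop(0)
        if n == 0:
            return trace               # goal closed: a complete proof
        successors = [(name, fn(n)) for name, fn in tactics.items()]
        # keep only viable subgoals; if none survive, this path dies here
        fringe += [(m, trace + [name]) for name, m in successors if m >= 0]
    return None

# Toy domain: "prove" a number by subtracting it down to exactly zero.
tactics = {"sub5": lambda n: n - 5, "sub3": lambda n: n - 3}
proof = prove(11, tactics)   # -> ["sub5", "sub3", "sub3"]
```

The search first explores 11 -> 6 -> 1, detects that 1 cannot be closed, discards that path, and backtracks to the alternative 11 -> 6 -> 3 -> 0.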


Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision

Neural Information Processing Systems

Recent AI-assistant agents, such as ChatGPT, predominantly rely on supervised fine-tuning (SFT) with human annotations and reinforcement learning from human feedback (RLHF) to align the output of large language models (LLMs) with human intentions, ensuring they are helpful, ethical, and reliable. However, this dependence can significantly constrain the true potential of AI-assistant agents due to the high cost of obtaining human supervision and the related issues of quality, reliability, diversity, self-consistency, and undesirable biases. To address these challenges, we propose a novel approach called SELF-ALIGN, which combines principle-driven reasoning and the generative power of LLMs for the self-alignment of AI agents with minimal human supervision. Our approach encompasses four stages: first, we use an LLM to generate synthetic prompts, and a topic-guided method to augment the prompt diversity; second, we use a small set of human-written principles for AI models to follow, and guide the LLM through in-context learning from demonstrations (of principle application) to produce helpful, ethical, and reliable responses to users' queries; third, we fine-tune the original LLM with the high-quality self-aligned responses so that the resulting model can generate desirable responses for each query directly, without the principle set and the demonstrations; and finally, we offer a refinement step to address the issues of overly brief or indirect responses. Applying SELF-ALIGN to the LLaMA-65b base language model, we develop an AI assistant named Dromedary, built with fewer than 300 lines of human annotations (including < 200 seed prompts, 16 generic principles, and 5 exemplars for in-context learning).
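The second stage, principle-driven in-context prompting, amounts to assembling principles, demonstrations, and the user query into one prompt. A minimal sketch; the template, principle strings, and demonstration below are illustrative placeholders, not the paper's actual prompts.

```python
def build_self_align_prompt(principles, demonstrations, query):
    """Assemble a principle-driven in-context prompt: list the principles,
    show demonstrations of applying them, then append the new query."""
    parts = ["You are an AI assistant. Follow these principles:"]
    parts += [f"{i}. {p}" for i, p in enumerate(principles, 1)]
    for demo in demonstrations:
        parts += [f"User: {demo['query']}", f"Assistant: {demo['response']}"]
    parts += [f"User: {query}", "Assistant:"]
    return "\n".join(parts)

# Placeholder principles and demonstration (not the paper's 16 principles).
principles = ["be helpful", "refuse unethical requests", "admit uncertainty"]
demos = [{"query": "What is 2+2?", "response": "2+2 is 4."}]
prompt = build_self_align_prompt(principles, demos, "Explain RLHF briefly.")
```

In stage three, responses produced under such prompts are used as fine-tuning targets so the model no longer needs the principle list at inference time.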



DeepRAG: Building a Custom Hindi Embedding Model for Retrieval Augmented Generation from Scratch

M, Nandakishor

arXiv.org Artificial Intelligence

In this paper, I present our work on DeepRAG, a specialized embedding model we built specifically for the Hindi language in RAG systems. While LLMs have gotten really good at generating text, their performance in retrieval tasks still depends heavily on having quality embeddings - something that's been lacking for Hindi, despite Hindi being one of the world's most spoken languages. We tackled this by creating embeddings from the ground up rather than just fine-tuning existing models. Our process involved collecting diverse Hindi texts (over 2.7M samples), training a custom SentencePiece tokenizer that actually understands Hindi morphology, designing a transformer architecture with Hindi-specific attention mechanisms, and optimizing with contrastive learning. Results were honestly better than I expected - we saw a 23% improvement in retrieval precision compared to the multilingual models everyone's been using. The paper details our methodology, which I think could help others working with low-resource languages where the one-size-fits-all multilingual models fall short. We've also integrated our embeddings with LangChain to build complete Hindi RAG systems, which might be useful for practitioners. While there's still tons more to explore, I believe this work addresses a critical gap for Hindi NLP and demonstrates why language-specific approaches matter.
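The contrastive-learning step can be illustrated with an InfoNCE-style loss: pull a query embedding toward its relevant passage and away from negatives. A stdlib sketch with toy 3-dimensional vectors; the paper's exact loss formulation and temperature may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(query, positive, negatives, temperature=0.05):
    """InfoNCE-style objective: the loss is low when the query is much
    closer to its positive passage than to any negative passage."""
    sims = [cosine(query, positive)] + [cosine(query, n) for n in negatives]
    logits = [s / temperature for s in sims]
    m = max(logits)                          # stabilize the softmax
    exps = [math.exp(l - m) for l in logits]
    return -math.log(exps[0] / sum(exps))

# Toy embeddings: the positive is nearly parallel to the query.
q = [0.9, 0.1, 0.0]
pos = [1.0, 0.0, 0.0]
negs = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
loss = contrastive_loss(q, pos, negs)   # near zero: positive wins easily
```

Training minimizes this loss over many (query, positive, negatives) triples, which is what shapes the embedding space for retrieval.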


Code Point Net from Scratch in PyTorch

#artificialintelligence

In this article we will learn how to code Point Net from scratch in PyTorch. This version of Point Net will allow for classification or semantic segmentation. If you are not familiar with Point Net, please see this article. If you would just like to code it, please proceed; we will break down Point Net and try to understand it piece by piece. The code for this article is stored in this repository.
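The key piece the article breaks down is Point Net's permutation invariance: a shared per-point MLP followed by a symmetric max pooling over points. A toy stdlib-Python sketch of just that idea (the real network uses much deeper MLPs plus T-Net alignment blocks, and is written in PyTorch):

```python
def shared_mlp(point, weights):
    """Apply the same linear map + ReLU to one (x, y, z) point.
    Sharing `weights` across points is what makes the layer per-point."""
    return [max(0.0, sum(w * x for w, x in zip(row, point))) for row in weights]

def global_feature(cloud, weights):
    """Point Net's key idea: a symmetric function (elementwise max) over
    per-point features gives a permutation-invariant cloud descriptor."""
    feats = [shared_mlp(p, weights) for p in cloud]
    return [max(f[k] for f in feats) for k in range(len(weights))]

# Toy 4-dimensional feature map over a 3-point cloud.
weights = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0],
           [0.0, 0.0, 1.0],
           [1.0, 1.0, 1.0]]
cloud = [[0.1, 0.2, 0.3], [0.9, 0.1, 0.4], [0.2, 0.8, 0.5]]

# Reordering the points leaves the global feature unchanged.
assert global_feature(cloud, weights) == global_feature(cloud[::-1], weights)
```

This invariance is why Point Net can consume raw, unordered point clouds; the classification and segmentation heads are then built on top of this global feature.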


Data Science Roadmap

#artificialintelligence

It's easy to feel overwhelmed by the amount of tools and skills required to become a data scientist. While it can take years to master everything, there are clear steps you can take to get started towards your goal. As with any big goal, keep in mind that it might not be possible to get there overnight: much like climbing a mountain or running a marathon, becoming a data scientist will require patience, grit, and practice. But if you're motivated by the prospect of working with data for a living, let this guide serve as the map for the journey ahead. Programming is an important part of working as a data scientist.


How to Approach CNN Architecture from Scratch? - Analytics Vidhya

#artificialintelligence

This article was published as a part of the Data Science Blogathon. As a consequence of the large quantity of data accessible, particularly in the form of photographs and videos, the need for Deep Learning is growing by the day. Many advanced architectures have been designed for diverse objectives, but Convolutional Neural Networks, a core Deep Learning technique, are the foundation for everything. So that'll be the topic of today's piece. Deep learning is an area of machine learning and artificial intelligence (AI) that mimics how people learn.
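The operation at the heart of any CNN can itself be written from scratch in a few lines. A minimal valid-mode 2-D convolution in plain Python (cross-correlation, as in most deep-learning libraries), shown with a toy vertical-edge detector:

```python
def conv2d(image, kernel):
    """Valid 2-D convolution: slide the kernel over the image and sum
    elementwise products at every position."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            out[i][j] = sum(kernel[a][b] * image[i + a][j + b]
                            for a in range(kh) for b in range(kw))
    return out

# A tiny image with a dark-to-bright vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
kernel = [[-1, 1],
          [-1, 1]]          # responds to left-to-right brightness increase
edges = conv2d(image, kernel)   # strongest response along the edge column
```

A CNN stacks many such learned kernels (plus nonlinearities and pooling), but each layer's core computation is exactly this sliding-window sum.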


Python and Data Science from Scratch With Real Life Exercises

#artificialintelligence

Welcome to my "Python and Data Science from Scratch With Real Life Exercises" course. OAK Academy offers highly-rated data science courses that will help you learn how to visualize and respond to new data, as well as develop innovative new technologies. Whether you're interested in machine learning, data mining, or data analysis, OAK Academy has a course for you. Better data science practices are allowing corporations to cut unnecessary costs, automate computing, and analyze markets. Essentially, data science is the key to getting ahead in a competitive global climate. Python instructors at OAK Academy specialize in everything from software development to data analysis and are known for their effective, friendly instruction for students of all levels. Whether you work in machine learning or finance, or are pursuing a career in web development or data science, Python is one of the most important skills you can learn. Python's simple syntax is especially suited for desktop, web, and business applications, and its design philosophy emphasizes readability and usability.


Theoretical Machine Learning From Scratch - Linear Models

#artificialintelligence

This course will be your guide to learning how to use the power of theory, math, and Python to create linear regression and logistic regression, two of the most popular and useful machine learning models, from scratch. The course is designed for folks with some programming experience, or for experienced developers looking to make the jump to data science and machine learning; I'll teach you how to dive deep into the math behind linear models in an easy and understandable way. Once you have understood the inner workings of the linear models and uncovered the black box, you are ready to code everything from the ground up without using any ready-made machine learning libraries, and yes, you will be taught that too! The course helps you understand machine learning concepts deeply rather than just using some library to get results, and it will guide you in the right direction for learning many other machine learning and deep learning algorithms. Since it covers all the required basics, you will be well on your way to becoming an expert Data Scientist! Because this course goes deep into the math and involves coding from scratch, basic-to-intermediate coding knowledge is a must, along with a good grasp of derivatives (calculus), linear algebra (matrix multiplication), and basic probability.
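The "from scratch" linear regression such a course teaches boils down to gradient descent on the mean-squared-error loss. A minimal stdlib-Python sketch on synthetic data generated from y = 3x + 1:

```python
def fit_linear(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w*x + b from scratch with batch gradient descent on the
    mean-squared-error loss MSE = (1/n) * sum((w*x + b - y)^2)."""
    w, b, n = 0.0, 0.0, len(xs)
    for _ in range(epochs):
        preds = [w * x + b for x in xs]
        # partial derivatives of the MSE with respect to w and b
        dw = (2 / n) * sum((p - y) * x for p, y, x in zip(preds, ys, xs))
        db = (2 / n) * sum(p - y for p, y in zip(preds, ys))
        w -= lr * dw
        b -= lr * db
    return w, b

# Noise-free data from y = 3x + 1; gradient descent recovers w ~ 3, b ~ 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 4.0, 7.0, 10.0, 13.0]
w, b = fit_linear(xs, ys)
```

Logistic regression follows the same template, swapping the prediction for a sigmoid and the loss for cross-entropy; the gradient-descent loop is unchanged.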