Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient
Deep reinforcement learning (RL) algorithms typically parameterize the policy as a deep network that outputs either a deterministic action or a stochastic one modeled as a Gaussian distribution, thereby restricting learning to a single behavioral mode. Meanwhile, diffusion models have emerged as a powerful framework for multimodal learning. However, the use of diffusion policies in online RL is hindered by the intractability of policy likelihood approximation, as well as the greedy objective of RL methods, which can easily skew the policy toward a single mode. This paper presents Deep Diffusion Policy Gradient (DDiffPG), a novel actor-critic algorithm that learns multimodal policies, parameterized as diffusion models, from scratch while discovering and maintaining versatile behaviors. DDiffPG explores and discovers multiple modes through off-the-shelf unsupervised clustering combined with novelty-based intrinsic motivation.
DeepRAG: Building a Custom Hindi Embedding Model for Retrieval Augmented Generation from Scratch
In this paper, we present our work on DeepRAG, a specialized embedding model we built specifically for the Hindi language in RAG systems. While LLMs have become very good at generating text, their performance in retrieval tasks still depends heavily on having quality embeddings, something that has been lacking for Hindi despite it being one of the world's most spoken languages. We tackled this by creating embeddings from the ground up rather than just fine-tuning existing models. Our process involved collecting diverse Hindi texts (over 2.7M samples), training a custom SentencePiece tokenizer that actually understands Hindi morphology, designing a transformer architecture with Hindi-specific attention mechanisms, and optimizing with contrastive learning. The results were honestly better than we expected: a 23% improvement in retrieval precision compared to the multilingual models everyone has been using. The paper details our methodology, which we think could help others working with low-resource languages where one-size-fits-all multilingual models fall short. We have also integrated our embeddings with LangChain to build complete Hindi RAG systems, which may be useful for practitioners. While there is still plenty more to explore, we believe this work addresses a critical gap for Hindi NLP and demonstrates why language-specific approaches matter.
Eagle 2: Building Post-Training Data Strategies from Scratch for Frontier Vision-Language Models
Li, Zhiqi, Chen, Guo, Liu, Shilong, Wang, Shihao, VS, Vibashan, Ji, Yishen, Lan, Shiyi, Zhang, Hao, Zhao, Yilin, Radhakrishnan, Subhashree, Chang, Nadine, Sapra, Karan, Deshmukh, Amala Sanjay, Rintamaki, Tuomas, Le, Matthieu, Karmanov, Ilia, Voegtle, Lukas, Fischer, Philipp, Huang, De-An, Roman, Timo, Lu, Tong, Alvarez, Jose M., Catanzaro, Bryan, Kautz, Jan, Tao, Andrew, Liu, Guilin, Yu, Zhiding
Recently, promising progress has been made by open-source vision-language models (VLMs) in bringing their capabilities closer to those of proprietary frontier models. However, most open-source models only publish their final model weights, leaving the critical details of data strategies and implementation largely opaque. In this work, we address VLM post-training from a data-centric perspective, showing the key role of data strategy in developing frontier VLMs. By studying and building our post-training data strategy from scratch, we share detailed insights into the development processes, aiming to benefit the development of competitive models for the open-source community. Our introduced data strategy, together with training recipes and model design, leads to a family of performant VLMs named Eagle2. Specifically, Eagle2-9B achieves state-of-the-art results across various multimodal benchmarks, matching certain competitive models with up to 70B parameters.
TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning
We propose a novel approach to interactive theorem proving (ITP) using deep reinforcement learning. The proposed framework is able to learn proof search strategies as well as tactic and argument prediction in an end-to-end manner. We formulate the process of ITP as a Markov decision process (MDP) in which each state represents a set of potential derivation paths. This structure allows us to introduce a novel backtracking mechanism that enables the agent to efficiently discard (predicted) dead-end derivations and restart the derivation from promising alternatives. We implement the framework in the HOL theorem prover.
Building A Recurrent Neural Network From Scratch In Python
We can think of the RNN model shown in figure 1 as a repeated use of the single cell shown in figure 2. First, we will implement a single cell, and then we can loop through it, stacking multiple of these single cells over each other, to create the forward pass of the RNN model. The basic RNN cell takes two inputs: the current input x⟨t⟩ and the hidden state a⟨t−1⟩ passed in from the previous cell. Let's implement the RNN cell shown in figure 2, starting with the softmax activation function:
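The article's code listing did not survive extraction, so here is a minimal NumPy sketch of the softmax function and a single RNN cell forward step. The parameter names (Wax, Waa, Wya, ba, by) follow the common convention for basic RNN cells and are assumptions, not taken from the article's figures:

```python
import numpy as np

def softmax(x):
    # subtract the column-wise max before exponentiating for numerical stability
    e = np.exp(x - np.max(x, axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def rnn_cell_forward(xt, a_prev, params):
    # one forward step of the basic RNN cell:
    #   a<t> = tanh(Wax @ x<t> + Waa @ a<t-1> + ba)
    #   y<t> = softmax(Wya @ a<t> + by)
    Wax, Waa, Wya = params["Wax"], params["Waa"], params["Wya"]
    ba, by = params["ba"], params["by"]
    a_next = np.tanh(Wax @ xt + Waa @ a_prev + ba)   # new hidden state
    yt = softmax(Wya @ a_next + by)                  # prediction at this step
    return a_next, yt
```

Looping `rnn_cell_forward` over the time dimension, feeding each returned `a_next` back in as `a_prev`, gives the full forward pass described above.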
Code Point Net from Scratch in Pytorch
In this article we will learn how to code Point Net from scratch in PyTorch. This version of Point Net supports both classification and semantic segmentation. If you are not familiar with Point Net, please see this article; if you would just like to code it, proceed forward: we will break Point Net down and try to understand it piece by piece. The code for this article is stored in this repository.
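Before breaking the network down piece by piece, it may help to see the one idea everything else hangs on: a shared per-point MLP followed by a symmetric max pool, which makes the global feature invariant to the ordering of the input points. The sketch below is illustrative NumPy rather than the repository's PyTorch code, and all names and shapes are assumptions:

```python
import numpy as np

def shared_mlp(points, W, b):
    # apply the SAME weights to every point independently ("shared MLP");
    # points: (N, 3) xyz coordinates, W: (3, F), b: (F,)
    return np.maximum(points @ W + b, 0.0)  # ReLU

def global_feature(points, W, b):
    # max pooling over the point axis is a symmetric function, so the
    # resulting (F,) global feature does not depend on point order
    return shared_mlp(points, W, b).max(axis=0)
```

In the real network the shared MLP is several layers deep (implemented with 1D convolutions in PyTorch) and the global feature feeds either a classification head or, concatenated with per-point features, a segmentation head, but the permutation-invariance argument is exactly this.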
[FREE] Deep Learning: Neural Networks In Javascript From Scratch
This course will teach you how to build and train an Artificial Neural Network from scratch using only JavaScript (no libraries). We will use only an IDE and a browser. The course is structured to help you genuinely learn Deep Learning, starting from the basics and working up to advanced concepts. We will learn and code every component of a deep learning architecture from scratch, uncovering all the magic behind Artificial Neural Networks. To prepare students for real-world work, we will develop our ANN framework following the TensorFlow API and compare our implementation with TensorFlow.js; this way you will know what is under the hood of deep learning libraries.