mojo
MojoBench: Language Modeling and Benchmarks for Mojo
Raihan, Nishat, Santos, Joanna C. S., Zampieri, Marcos
The recently introduced Mojo programming language (PL) by Modular has received significant attention in the scientific community due to its claimed significant speed boost over Python. Despite advancements in code Large Language Models (LLMs) across various PLs, Mojo remains unexplored in this context. To address this gap, we introduce MojoBench, the first framework for Mojo code generation. MojoBench includes HumanEval-Mojo, a benchmark dataset designed for evaluating code LLMs on Mojo, and Mojo-Coder, the first LLM pretrained and finetuned for Mojo code generation, which supports instructions in 5 natural languages (NLs). Our results show that Mojo-Coder achieves a 30-35% performance improvement over leading models like GPT-4o and Claude-3.5-Sonnet. Furthermore, we provide insights into LLM behavior with underrepresented and unseen PLs, offering potential strategies for enhancing model adaptability. MojoBench contributes to our understanding of LLM capabilities and limitations in emerging programming paradigms, fostering more robust code generation systems.
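MojoBench's exact harness is not reproduced here; as a rough illustration of how HumanEval-style benchmarks such as HumanEval-Mojo are typically scored, the sketch below computes pass@1 by running each generated completion against the task's unit tests. The JSONL field names ("prompt", "test"), the `mojo run` invocation, and the `generate` callable are assumptions, not MojoBench's documented interface.

```python
import json
import pathlib
import subprocess
import tempfile

def pass_at_1(problems_path: str, generate) -> float:
    """Score pass@1: `generate(prompt) -> str` wraps any code LLM."""
    passed = total = 0
    with open(problems_path) as fh:
        for line in fh:
            task = json.loads(line)  # assumed fields: "prompt", "test"
            # Assemble a self-contained program: prompt + completion + tests.
            program = task["prompt"] + generate(task["prompt"]) + "\n" + task["test"]
            with tempfile.NamedTemporaryFile("w", suffix=".mojo", delete=False) as f:
                f.write(program)
            try:
                # Tests are assumed to abort with a nonzero exit code on failure.
                ok = subprocess.run(["mojo", "run", f.name],
                                    capture_output=True, timeout=30).returncode == 0
            except subprocess.TimeoutExpired:
                ok = False
            pathlib.Path(f.name).unlink()
            passed += ok
            total += 1
    return passed / total if total else 0.0
```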
B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory
Zancato, Luca, Seshadri, Arjun, Dukler, Yonatan, Golatkar, Aditya, Shen, Yantao, Bowman, Benjamin, Trager, Matthew, Achille, Alessandro, Soatto, Stefano
We describe a family of architectures to support transductive inference by allowing memory to grow to a finite but a-priori unknown bound while making efficient use of finite resources for inference. Current architectures use such resources to represent data either eidetically over a finite span ("context" in Transformers), or fading over an infinite span (in State Space Models, or SSMs). Recent hybrid architectures have combined eidetic and fading memory, but with limitations that do not allow the designer or the learning process to seamlessly modulate the two, nor to extend the eidetic memory span. We leverage ideas from Stochastic Realization Theory to develop a class of models called B'MOJO to seamlessly combine eidetic and fading memory within an elementary composable module. The overall architecture can be used to implement models that can access short-term eidetic memory "in-context," permanent structural memory "in-weights," fading memory "in-state," and long-term eidetic memory "in-storage" by natively incorporating retrieval from an asynchronously updated memory. We show that Transformers, existing SSMs such as Mamba, and hybrid architectures such as Jamba are special cases of B'MOJO and describe a basic implementation, to be open sourced, that can be stacked and scaled efficiently in hardware. We test B'MOJO on transductive inference tasks, such as associative recall, where it outperforms existing SSMs and Hybrid models; as a baseline, we test ordinary language modeling where B'MOJO achieves perplexity comparable to similarly-sized Transformers and SSMs up to 1.4B parameters, while being up to 10% faster to train. Finally, we test whether models trained inductively on a-priori bounded sequences (up to 8K tokens) can still perform transductive inference on sequences many-fold longer. B'MOJO's ability to modulate eidetic and fading memory results in better inference on longer sequences tested up to 32K tokens, four-fold the length of the longest sequences seen during training.
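The authors' implementation is not reproduced here; the following is a minimal, heavily simplified sketch of the core idea: a composable block that pairs fading memory (a per-channel linear recurrence) with eidetic memory (attention over a sliding window plus a few tokens retained verbatim when the fading state predicts them poorly). All class, parameter, and function names are hypothetical; the Python time loop stands in for the hardware-efficient scan the paper describes, and the batched top-k over prediction errors is a stand-in for the paper's online innovation test.

```python
import torch
import torch.nn as nn

class HybridMemoryBlock(nn.Module):
    """Fading memory (diagonal linear recurrence) + eidetic memory
    (causal attention over a sliding window and a few retained tokens)."""

    def __init__(self, d_model: int = 128, window: int = 32, n_eidetic: int = 4):
        super().__init__()
        self.window, self.n_eidetic = window, n_eidetic
        self.decay_logit = nn.Parameter(torch.randn(d_model))  # per-channel decay
        self.in_proj = nn.Linear(d_model, d_model)
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(2 * d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, D = x.shape
        a = torch.sigmoid(self.decay_logit)            # decay in (0, 1)
        u = self.in_proj(x)
        h = x.new_zeros(B, D)
        states, innovations = [], []
        for t in range(T):
            innovations.append((u[:, t] - a * h).norm(dim=-1))  # prediction error
            h = a * h + u[:, t]                        # exponentially fading state
            states.append(h)
        fading = torch.stack(states, dim=1)            # (B, T, D)
        innov = torch.stack(innovations, dim=1)        # (B, T)

        # Retain the most "surprising" tokens verbatim (simplified offline
        # top-k in place of the paper's online innovation test).
        eid_idx = innov.topk(min(self.n_eidetic, T), dim=1).indices  # (B, k)

        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-1, -2) / D ** 0.5    # (B, T, T)
        t_idx = torch.arange(T, device=x.device)
        causal = t_idx[None, :] <= t_idx[:, None]                # j <= t
        local = (t_idx[:, None] - t_idx[None, :]) < self.window  # within window
        mask = (causal & local).unsqueeze(0).expand(B, T, T).clone()
        # Open the retained (eidetic) columns to every query...
        mask.scatter_(2, eid_idx.unsqueeze(1).expand(B, T, -1), True)
        mask &= causal.unsqueeze(0)                    # ...but never the future.
        attn = torch.softmax(scores.masked_fill(~mask, float("-inf")), dim=-1) @ v
        return self.out(torch.cat([fading, attn], dim=-1))
```

Consistent with the abstract's claim that Transformers and SSMs are special cases: setting `n_eidetic = 0` with `window >= T` reduces this sketch to ordinary causal attention, while dropping the attention path leaves a pure SSM-style fading memory.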
Revamping Python for an AI World
Python is one of the most popular programming languages in existence. Easy to learn and easy to use, it has been around for years, so there is a large community of Python developers to support each other, and it has built up an ecosystem of libraries that allow users to drop in the functionalities they need. It does, however, come with downsides: its programs tend to run slowly, and because it is inefficient at running processes in parallel, it is not well suited to some of the latest artificial intelligence (AI) programming. Hoping to overcome those difficulties, computer scientist Chris Lattner set out to create a new language, Mojo, which offers the ease of use of Python, but the performance of more complex languages such as C or Rust. He teamed up with Tim Davis, whom he had met when they both worked for Google, to form Modular in January 2022.
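The parallelism limitation mentioned above stems from CPython's global interpreter lock (GIL), which allows only one thread to execute Python bytecode at a time. The toy timing below (not from the article) makes it concrete: four threads doing CPU-bound work finish in roughly four times the single-thread time rather than running in parallel.

```python
import time
from threading import Thread

def busy(n: int = 10_000_000) -> None:
    # Pure-Python arithmetic: each bytecode step holds the GIL.
    while n:
        n -= 1

start = time.perf_counter()
busy()
print(f"1 thread : {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
threads = [Thread(target=busy) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Four times the work, but no parallel speedup: expect roughly 4x the time.
print(f"4 threads: {time.perf_counter() - start:.2f}s")
```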
Top Startups developing AI for Recruiting
Recruiting is a tough job. Navigating through a pool of talent and selecting the candidate best suited for a particular role is a difficult and time-consuming process. However, the application of Artificial Intelligence is now largely automating the process and making it simpler. The use of AI streamlines the process and reduces much of the manual work involved in recruitment. Artificial Intelligence is here not only to change the role of recruiters but also to improve the relationship between candidates and hiring managers.
This AI-powered microscope is really good at counting sperm
Mohamed Taha's ambition to disrupt the fertility industry began after his first sperm test. He had just been diagnosed with a kidney disease (which later turned out to be a misdiagnosis) and his doctor advised him to freeze his sperm as a precaution. According to the World Health Organisation, a normal sperm count is around 15 million sperm per millilitre (m/ml) of semen. Taha's sperm count was 15 times lower. Concerned, he chose to do a second analysis at a different clinic and, to his surprise, the result was far more positive: 20 m/ml.
A medtech perspective: how can male fertility testing be improved?
There has been little innovation in the analysis of male fertility since the 1950s. Why? Tobias Boecker, a Paris-based medtech entrepreneur and co-founder of Nanovare – a system that uses AI to help medical professionals improve and speed up male fertility analysis – explores the current crisis in this field and what can be done to change things. Birth rates are dropping, notably in the Western world. Between 1960 and 2015, the birth rate per woman in Germany decreased from 2.4 to 1.5, in the UK from 2.7 to 1.8, and in the US from 3.7 to 1.8. There are many external factors that influence this trend, such as the increased age of marriage, the focus on professional careers, or the modernisation of social systems, which eliminates parents' need to have many children to support them in old age. Another factor that is becoming increasingly important is our own reproductive health, which depends on our lifestyle choices (eg smoking), environmental influences (eg pollution), and medical factors (eg genetic preconditions).