Melbourne
- North America > United States (0.14)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Asia > China > Hong Kong (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > South Carolina (0.04)
- Asia > Middle East > Oman > Muscat Governorate > Muscat (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology (1.00)
- Government > Voting & Elections (0.67)
- Media > News (0.53)
- Government > Regional Government > North America Government > United States Government (0.45)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Ukraine > Kyiv Oblast > Kyiv (0.05)
- Asia > Middle East > Syria (0.04)
- (18 more...)
- Government (1.00)
- Law (0.93)
- Leisure & Entertainment > Sports > Basketball (0.67)
- Law Enforcement & Public Safety (0.67)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > China > Hong Kong > Sha Tin (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (5 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
Welcome to the dark side of crypto's permissionless dream
Jean-Paul Thorbjornsen is a leader of THORChain, a blockchain that is not supposed to have any leaders--and one that is reeling from a series of expensive controversies.

"We can do whatever we want," Jean-Paul Thorbjornsen tells me from the pilot's seat of his Aston Martin helicopter. As we fly over suburbs outside Melbourne, Australia, it's becoming clear that doing whatever he wants is Thorbjornsen's MO. Upper-middle-class homes give way to vineyards, and Thorbjornsen points out our landing spot outside a winery. "They're going to ask for a shot now," he says, used to the attention drawn by his luxury helicopter, emblazoned with the tail letters "BTC" for bitcoin. (The price tag of $5 million in Australian dollars--$3.5 million in US dollars today--was perhaps reasonable for someone who claims a previous crypto project made more than AU$400 million, although he also says those funds were tied up in the company.) Thorbjornsen is a founder of THORChain, a blockchain through which users can swap ...
- Asia > North Korea (0.47)
- Oceania > Australia > Victoria > Melbourne (0.24)
- Europe > Germany (0.14)
- (7 more...)
- Government > Regional Government > North America Government > United States Government (1.00)
- Banking & Finance > Trading (1.00)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Data Science > Data Quality (0.94)
- (6 more...)
- Europe > Italy (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > Middle East > Jordan (0.04)
- (11 more...)
- Research Report > Experimental Study (0.93)
- Overview (0.68)
- Information Technology (0.46)
- Health & Medicine (0.46)
Parallelizing Linear Transformers with the Delta Rule over Sequence Length
Songlin Yang, Bailin Wang, Yu Zhang, Yikang Shen, Yoon Kim
Massachusetts Institute of Technology; Soochow University
Transformers with linear attention (i.e., linear transformers) and state-space models have recently been suggested as a viable linear-time alternative to transformers with softmax attention. However, these models still underperform transformers, especially on tasks that require in-context retrieval. While more expressive variants of linear transformers which replace the additive update in linear transformers with the delta rule [DeltaNet; 101] have been found to be more effective at associative recall, existing algorithms for training such models do not parallelize over sequence length and are thus inefficient to train on modern hardware. This work describes a hardware-efficient algorithm for training linear transformers with the delta rule, which exploits a memory-efficient representation for computing products of Householder matrices [11]. This algorithm allows us to scale up DeltaNet to standard language modeling settings. We train a 1.3B model for 100B tokens and find that it outperforms recent linear-time baselines such as Mamba [31] and GLA [124] in terms of perplexity and zero-shot performance on downstream tasks. We also experiment with two hybrid models which combine DeltaNet layers with (1) sliding-window attention layers every other layer or (2) two global attention layers, and find that these hybrids outperform strong transformer baselines.
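The delta-rule update the abstract contrasts with additive linear attention can be sketched in its naive sequential form -- the very form the paper's chunked, Householder-based algorithm is designed to avoid. The sketch below is illustrative, not the paper's implementation: the function name and NumPy framing are assumptions, and it follows the standard DeltaNet recurrence S_t = S_{t-1} - beta_t (S_{t-1} k_t - v_t) k_t^T.

```python
import numpy as np

def deltanet_recurrent(q, k, v, beta):
    """Naive O(T) sequential DeltaNet recurrence (illustrative sketch).

    q, k, v: (T, d) arrays of queries, keys, and values.
    beta:    (T,) per-step writing strengths in [0, 1].

    The fast-weight state S (d x d) is updated by the delta rule,
        S_t = S_{t-1} - beta_t * (S_{t-1} k_t - v_t) k_t^T,
    a rank-1 (generalized Householder) update that moves the value
    currently stored under key k_t toward v_t, rather than simply
    adding v_t k_t^T as plain linear attention does.
    """
    T, d = q.shape
    S = np.zeros((d, d))
    out = np.empty((T, d))
    for t in range(T):
        kt = k[t]
        # prediction error of the current memory for key k_t
        err = S @ kt - v[t]
        # delta-rule write: rank-1 correction of the fast weights
        S = S - beta[t] * np.outer(err, kt)
        # read out with the query
        out[t] = S @ q[t]
    return out
```

With beta = 1 and orthonormal keys, each write stores an exact key-to-value association without disturbing the others, which is one intuition for why the delta rule helps associative recall; the sequential loop over `t` is also exactly the dependency the paper removes by reformulating the products of these rank-1 updates as products of Householder matrices.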
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Africa > Rwanda > Kigali > Kigali (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- (19 more...)
- Education (0.67)
- Health & Medicine (0.48)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > Michigan (0.04)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.93)
- Information Technology (0.67)
- Banking & Finance (0.67)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Europe > Italy (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- South America > Colombia > Meta Department > Villavicencio (0.04)
- (9 more...)
- Law (1.00)
- Banking & Finance (0.92)
- Government (0.92)
- (3 more...)