ExpandNets: Linear Over-parameterization to Train Compact Convolutional Networks - Supplementary Material - A. Complementary Experiments

Neural Information Processing Systems

However, with deep networks, initialization can have an important effect on the final results. While designing an initialization strategy specifically for compact networks is an unexplored research direction, our ExpandNets can be initialized in a natural manner. Note that this strategy yields an additional accuracy boost to our approach. The output of the last layer is passed through a fully-connected layer with 64 units, followed by a logit layer with either 10 or 100 units. We used standard stochastic gradient descent (SGD) with a momentum of 0.9 and a learning rate of 0.01, divided by 10 at epochs 50 and 100.
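The training recipe above can be sketched as a small helper (a hypothetical sketch: only the hyperparameters — base rate 0.01, momentum 0.9, drops at epochs 50 and 100, and the 64-unit head — come from the text; the function name and structure are illustrative):

```python
def learning_rate(epoch, base_lr=0.01, milestones=(50, 100), gamma=0.1):
    """Step schedule from the text: SGD starts at a learning rate of 0.01,
    and the rate is divided by 10 at epochs 50 and 100."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

# Classifier head described above, in pseudo-layer terms:
#   features -> Linear(64) -> Linear(num_classes), num_classes in {10, 100}
```

Combined with SGD at momentum 0.9, this schedule is equivalent to, e.g., PyTorch's `MultiStepLR(optimizer, milestones=[50, 100], gamma=0.1)`.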


Simultaneous Missing Value Imputation and Structure Learning with Groups

Neural Information Processing Systems

Learning structures between groups of variables from data with missing values is an important task in the real world, yet difficult to solve. One typical scenario is discovering the structure among topics in the education domain to identify learning pathways. Here, the observations are student performances for questions under each topic which contain missing values. However, most existing methods focus on learning structures between a few individual variables from the complete data. In this work, we propose VISL, a novel scalable structure learning approach that can simultaneously infer structures between groups of variables under missing data and perform missing value imputations with deep learning. Particularly, we propose a generative model with a structured latent space and a graph neural network-based architecture, scaling to a large number of variables. Empirically, we conduct extensive experiments on synthetic, semi-synthetic, and real-world education data sets. We show improved performances on both imputation and structure learning accuracy compared to popular and recent approaches.



SoftBank shares surge on AI hope and sign of Stargate progress

The Japan Times

SoftBank Group's shares jumped as much as 8% on Tuesday on bets that the tech investor would be able to capitalize on its yearslong focus on artificial intelligence. The Tokyo-based company is the unnamed buyer of Foxconn Technology Group's electric vehicle plant in Ohio and plans to incorporate the facility into its $500 billion Stargate data center project with OpenAI and Oracle. That's spurring optimism that SoftBank may be able to kick-start the stalled Stargate endeavor and benefit from a rush to build AI hardware in the U.S. Its stock is up for the fifth straight day and on track to close at a record. SoftBank has been gradually cashing in on some of its Vision Fund bets in recent years. SoftBank's stock also received a boost after a report emerged that the Japanese company has picked investment banks for a possible initial public offering for Japanese payments app operator PayPay.


"Mountainhead" Channels the Absurdity of the Tech Bro

The New Yorker

Four tech billionaires walk into a mansion. It sounds like the setup for a punch line, but it also forms nearly the entire conceit behind "Mountainhead," a savagely entertaining but somewhat shallow new satire written and directed by Jesse Armstrong, the creator of "Succession." The film, which is streaming on HBO's Max, is a sort of chamber play, its stage a modernist castle in Utah--the Mountainhead of the title--overlooking snowy peaks. The players are a quartet of friends, or, more accurately, frenemies, who resemble a mishmash of real-world Silicon Valley founders. Steve Carell plays Randall Garrett, the group's Peter Thiel-esque mentor who, not unlike the late Steve Jobs, has cancer that his doctor tells him is incurable.


Amazon profits surge on strong trading season and cloud computing growth

The Guardian

Profits at Amazon have surged on strong seasonal trading and robust growth in its powerhouse cloud computing business. The world's largest retailer generated revenue of $170bn in the three months to December, up 14% on the same period of 2022, and clearing expectations on Wall Street of some $166bn. Net income hit $10.6bn in the fourth quarter, from $278m a year previously, after the company moved to cut costs and draw a line under years of rapid expansion following the onset of the pandemic. Earnings per share hit $1.03. Shares in the business rose 5.5% during out-of-hours trading in New York.


Yemen's Houthis say they targeted two Israeli ships in Red Sea: Report

Al Jazeera

Yemen's Houthi movement says it has targeted two Israeli ships with an armed drone and a naval missile, reports a spokesperson for the group's military. The spokesperson said the two ships, Unity Explorer and Number Nine, were targeted after they rejected warnings from the group's navy, the Reuters news agency reported on Sunday. British maritime security company Ambrey said a bulk carrier ship had been hit by at least two drones while sailing in the Red Sea. Another container ship reportedly suffered damage from a drone attack about 101km (63 miles) northwest of the northern Yemeni port of Hodeida, it added. The Pentagon also said a US warship and multiple commercial ships came under attack in the Red Sea, potentially marking a major escalation in a series of maritime attacks since the Israel-Hamas war began on October 7. "We are aware of reports regarding attacks on the USS Carney and commercial vessels in the Red Sea and will provide information as it becomes available," the Pentagon said.


Listen to the 'final' Beatles track, made with machine learning and archival recordings

Engadget

The Beatles are back, sort of. The fab four just released a new song, the group's first since 1995. "Now and Then" is being advertised as the final Beatles track, which makes sense given that two of the members have passed and the other two are well over 80 years old. The song was built using a demo track from John Lennon dating back to the 1970s and a guitar track from George Harrison from 1995. The surviving Beatles, Paul McCartney and Ringo Starr, finished off the tune with the help of modern machine learning technology.


Ternary Singular Value Decomposition as a Better Parameterized Form in Linear Mapping

Chen, Boyu, Chen, Hanxuan, He, Jiao, Sun, Fengyu, Jui, Shangling

arXiv.org Artificial Intelligence

We present a simple yet novel parameterized form of linear mapping that achieves remarkable network compression performance: a pseudo-SVD called Ternary SVD (TSVD). Unlike vanilla SVD, TSVD limits the $U$ and $V$ matrices in SVD to ternary matrices in $\{\pm 1, 0\}$. This means that instead of using expensive multiplication instructions, TSVD only requires addition instructions when computing $U(\cdot)$ and $V(\cdot)$. We provide direct and training transition algorithms for TSVD, analogous to Post Training Quantization and Quantization Aware Training respectively. Additionally, we analyze the convergence of the direct transition algorithms in theory. In experiments, we demonstrate that TSVD can achieve state-of-the-art network compression performance on various types of networks and tasks, including current baseline models such as ConvNeXt, Swin, BERT, and large language models like OPT.
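The core idea — factor a weight matrix as $U \,\mathrm{diag}(s)\, V^T$ with $U$, $V$ restricted to $\{\pm 1, 0\}$ so that applying them needs only additions — can be illustrated with a crude sketch. This is not the paper's transition algorithm: the thresholding rule, the least-squares refit of the diagonal, and all names here are hypothetical stand-ins:

```python
import numpy as np

def ternarize(M, tau=0.7):
    """Project a real matrix onto {-1, 0, +1} by magnitude thresholding
    (a crude stand-in for the paper's transition algorithms)."""
    thresh = tau * np.mean(np.abs(M))
    T = np.zeros(M.shape)
    T[M > thresh] = 1.0
    T[M < -thresh] = -1.0
    return T

def ternary_svd_approx(W, rank):
    """Hypothetical TSVD-style factorization: W ~= Ut @ diag(s) @ Vt
    with ternary Ut and Vt, so Ut(x) and Vt(x) need no multiplications."""
    U, _, Vh = np.linalg.svd(W, full_matrices=False)
    Ut = ternarize(U[:, :rank])
    Vt = ternarize(Vh[:rank, :])
    # Refit the diagonal by least squares given the fixed ternary factors:
    # each rank-1 term contributes the outer product of one column/row pair.
    A = np.stack([np.outer(Ut[:, i], Vt[i, :]).ravel() for i in range(rank)],
                 axis=1)
    s, *_ = np.linalg.lstsq(A, W.ravel(), rcond=None)
    return Ut, s, Vt
```

Because the refit is a least-squares problem with $s = 0$ always feasible, the reconstruction error never exceeds $\|W\|_F$; how close it gets to the vanilla SVD optimum depends on how much signal survives ternarization.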


Google touts AI supercomputer; Nvidia tops MLPerf 3.0 tests

#artificialintelligence

The war of words among AI supercomputer vendors escalated this week with Google claiming that its TPU-based system is faster and more efficient than Nvidia's A100-based entry, according to its own testing. Nvidia countered that its H100 system is faster based on testing conducted by the independent MLCommons using MLPerf 3.0. Google researchers reported that its Tensor Processing Unit-based supercomputer v4 is 1.2 to 1.7 times faster than Nvidia's 3-year-old A100 system and uses 1.3 to 1.9 times less power. The MLPerf 3.0 benchmarks measured Nvidia's newer H100 against systems entered by 25 organizations, but Google's TPU-based v4 system was not one of them. A direct system-to-system comparison of the two companies' latest systems would have to be conducted by an independent organization running a variety of AI-based workloads for any benchmarks to be definitive, analysts said.