AITopics | adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

Xie, Xingyu, Zhou, Pan, Li, Huan, Lin, Zhouchen, Yan, Shuicheng

arXiv.org Artificial Intelligence, Feb-27-2023

In deep learning, different kinds of deep networks typically need different optimizers, which have to be chosen after multiple trials, making the training process inefficient. To relieve this issue and consistently improve the model training speed across deep networks, we propose the ADAptive Nesterov momentum algorithm, Adan for short. Adan first reformulates the vanilla Nesterov acceleration to develop a new Nesterov momentum estimation (NME) method, which avoids the extra overhead of computing the gradient at the extrapolation point. Then Adan adopts NME to estimate the gradient's first- and second-order moments in adaptive gradient algorithms for convergence acceleration. Moreover, we prove that Adan finds an $\epsilon$-approximate first-order stationary point within $O(\epsilon^{-3.5})$ stochastic gradient complexity on non-convex stochastic problems (e.g., deep learning problems), matching the best-known lower bound. Extensive experimental results show that Adan consistently surpasses the corresponding SoTA optimizers on vision, language, and RL tasks and sets new SoTAs for many popular networks and frameworks, e.g., ResNet, ConvNext, ViT, Swin, MAE, DETR, GPT-2, Transformer-XL, and BERT. More surprisingly, Adan can use half of the training cost (epochs) of SoTA optimizers to achieve higher or comparable performance on ViT, GPT-2, MAE, etc., and also shows great tolerance to a large range of minibatch sizes, e.g., from 1k to 32k. Code is released at https://github.com/sail-sg/Adan and has been used in multiple popular deep learning frameworks and projects.
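
As a rough illustration of the idea in the abstract above, the following minimal sketch applies an Adan-style update that uses the difference between consecutive gradients in place of a second gradient evaluation at an extrapolation point. The update rule and the default coefficients (beta1=0.02, beta2=0.08, beta3=0.01) are written from a reading of the paper and should be treated as assumptions; bias correction and the restart condition are omitted, and the function names adan_step and minimize_quadratic are hypothetical. The released code at the URL above remains the authoritative implementation.

```python
import math


def adan_step(theta, grad, prev_grad, state, lr=0.1,
              beta1=0.02, beta2=0.08, beta3=0.01,
              eps=1e-8, weight_decay=0.0):
    """Apply one Adan-style update to a list of floats and return the new list.

    state holds the running averages m, v, n as lists (updated in place).
    Coefficients and update form are assumptions, not the reference code.
    """
    m, v, n = state["m"], state["v"], state["n"]
    new_theta = []
    for i, (p, g, g_prev) in enumerate(zip(theta, grad, prev_grad)):
        diff = g - g_prev                          # gradient difference
        m[i] = (1 - beta1) * m[i] + beta1 * g      # average of gradients
        v[i] = (1 - beta2) * v[i] + beta2 * diff   # average of differences
        u = g + (1 - beta2) * diff                 # difference-corrected gradient
        n[i] = (1 - beta3) * n[i] + beta3 * u * u  # average of its square
        step = lr / (math.sqrt(n[i]) + eps)        # per-coordinate step size
        p = p - step * (m[i] + (1 - beta2) * v[i])
        new_theta.append(p / (1 + weight_decay * lr))  # decoupled weight decay
    return new_theta


def minimize_quadratic(steps=200):
    """Run the sketch on f(x) = sum(x_i^2); return (initial_loss, final_loss)."""
    theta = [3.0, -2.0]
    state = {key: [0.0] * len(theta) for key in ("m", "v", "n")}

    def loss(xs):
        return sum(x * x for x in xs)

    initial = loss(theta)
    prev_grad = None
    for _ in range(steps):
        grad = [2 * x for x in theta]
        if prev_grad is None:
            prev_grad = grad  # no difference term on the first step
        theta = adan_step(theta, grad, prev_grad, state)
        prev_grad = grad
    return initial, loss(theta)
```

Calling minimize_quadratic() returns the loss before and after the run, which is a quick way to check that the update moves toward the minimum on a toy problem. Each step needs only the current and previous gradients, so the cost per iteration stays at one gradient evaluation.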

large language model, machine learning, natural language

arXiv:2208.06677

Country:

Europe > Russia (0.04)
Asia > Russia (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Education (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Adan

AAAI Conferences, Feb-8-2022, 11:27:44 GMT

This complex manufacturing environment is characterized by a large product and batch size variety, numerous parallel machines with large capacity differences, sequence and machine dependent setup times and machine eligibility constraints. A hybrid genetic algorithm is proposed to improve the scheduling process, the main features of which are a local search enhanced crossover mechanism, two additional fast local search procedures and a user-controlled multi-objective fitness function. Testing with real-life production data shows that this multi-objective approach can strike the desired balance between production time, setup time and tardiness, yielding high-quality practically feasible production schedules.
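
One common way to let a user steer such a trade-off is a weighted sum of the three objectives. The sketch below assumes that form; the abstract does not say how the fitness function is actually built, so the weights, the ScheduleMetrics type, and the sample numbers are all hypothetical and serve only to show how changing a weight changes which schedule wins.

```python
from dataclasses import dataclass


@dataclass
class ScheduleMetrics:
    production_time: float  # total processing time of the schedule
    setup_time: float       # total time spent on machine setups
    tardiness: float        # total lateness against due dates


def fitness(metrics, w_production=1.0, w_setup=1.0, w_tardiness=1.0):
    """Weighted cost of a schedule (assumed weighted-sum form); lower is better."""
    return (w_production * metrics.production_time
            + w_setup * metrics.setup_time
            + w_tardiness * metrics.tardiness)


def best_schedule(candidates, **weights):
    """Return the name of the candidate with the lowest weighted cost."""
    return min(candidates, key=lambda name: fitness(candidates[name], **weights))


# Made-up candidate schedules for illustration only.
candidates = {
    "fewer_setups": ScheduleMetrics(production_time=100.0, setup_time=10.0, tardiness=8.0),
    "on_time": ScheduleMetrics(production_time=104.0, setup_time=18.0, tardiness=1.0),
}
```

With equal weights, best_schedule(candidates) picks "fewer_setups"; raising the tardiness weight, as in best_schedule(candidates, w_tardiness=5.0), shifts the choice to "on_time".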

adan, setup time

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.71)

Adversarial Domain Adaptation for Stable Brain-Machine Interfaces

Farshchian, Ali, Gallego, Juan A., Cohen, Joseph P., Bengio, Yoshua, Miller, Lee E., Solla, Sara A.

arXiv.org Machine Learning, Sep-28-2018

Brain-Machine Interfaces (BMIs) have recently emerged as a clinically viable option to restore voluntary movements after paralysis. These devices are based on the ability to extract information about movement intent from neural signals recorded using multi-electrode arrays chronically implanted in the motor cortices of the brain. However, the inherent loss and turnover of recorded neurons requires repeated recalibrations of the interface, which can potentially alter the day-to-day user experience. The resulting need for continued user adaptation interferes with the natural, subconscious use of the BMI. Here, we introduce a new computational approach that decodes movement intent from a low-dimensional latent representation of the neural data. We implement various domain adaptation methods to stabilize the interface over significantly long times. This includes Canonical Correlation Analysis used to align the latent variables across days; this method requires prior point-to-point correspondence of the time series across domains. Alternatively, we match the empirical probability distributions of the latent variables across days through the minimization of their Kullback-Leibler divergence. These two methods provide a significant and comparable improvement in the performance of the interface. However, implementation of an Adversarial Domain Adaptation Network trained to match the empirical probability distribution of the residuals of the reconstructed neural signals outperforms the two methods based on latent variables, while requiring remarkably few data points to solve the domain adaptation problem.
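
To make the distribution-matching idea concrete, here is a minimal one-dimensional sketch. It assumes each day's latent variable is approximately Gaussian and uses a closed-form moment-matching affine map as a stand-in for the learned alignment; the paper's actual aligner, latent dimensionality, and training procedure are not described in this abstract, so the function names and synthetic data below are illustrative assumptions only.

```python
import math
import random
from statistics import mean, pstdev


def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) between two univariate Gaussians."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
            - 0.5)


def fit_gaussian(samples):
    """Return the (mean, standard deviation) of the samples."""
    return mean(samples), pstdev(samples)


def align_affine(day_k, day_0):
    """Shift and rescale day_k so its mean and spread match day_0.

    For Gaussian fits this affine map drives the KL divergence to zero;
    it stands in for a learned alignment and is not the paper's method.
    """
    mu_k, sd_k = fit_gaussian(day_k)
    mu_0, sd_0 = fit_gaussian(day_0)
    return [(x - mu_k) / sd_k * sd_0 + mu_0 for x in day_k]


def demo(seed=0, n=500):
    """Return the KL divergence to day 0 before and after alignment."""
    rng = random.Random(seed)
    day_0 = [rng.gauss(0.0, 1.0) for _ in range(n)]  # synthetic reference day
    day_k = [rng.gauss(2.0, 1.5) for _ in range(n)]  # synthetic drifted day
    before = gaussian_kl(*fit_gaussian(day_k), *fit_gaussian(day_0))
    aligned = align_affine(day_k, day_0)
    after = gaussian_kl(*fit_gaussian(aligned), *fit_gaussian(day_0))
    return before, after
```

Unlike Canonical Correlation Analysis, this kind of matching compares only the two distributions, so it needs no point-to-point correspondence between the samples recorded on different days.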

artificial intelligence, latent variable, machine learning

arXiv:1810.00045

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Noon in the antilibrary

MIT Technology Review, Aug-18-2018, 14:25:08 GMT

Marius cursed and jammed a mic stand between the crash bars of the TV studio door. "If SWAT's on its way, we don't have much time," he said. Michaela, who up until a couple of minutes ago had been streaming their interview live, still sat on one of the oval chairs under the hot lights. "What are they talking about?" The cube-shaped television studio had black-painted walls surrounding the bright stage area. Big monitors on the walls were showing the same "live" feed as they had five minutes ago, but now a red banner flashed at the bottom of the screens: ACTIVE SHOOTER AT COMPLETE PICTURES BUILDING. Michaela pointed at a moving figure on the screen. "Apparently I like assault rifles." Adan, their cameraman, had called up a local news feed after the first shouts of panic and confusion filtered through the studio's thick doors. What it showed was entirely and completely not what the three of them were seeing. Marius was inside the windowless second-floor studio, empty-handed, yet the monitors showed what looked like a drone feed of him moving into and out of view through the building's windows on the 10th floor. He was armed, and every now and then he would pause and shoot, calmly and methodically. Marius shook his head in disgust. "Hey, Adan, could you give me a hand with this?" The cameraman was hunched over his laptop. "The same people who own the SWAT team," said Marius. "But forget what I said.

artificial intelligence, marius, michaela

Country:

North America > United States > New York (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Industry: Media > News (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence (0.69)
