Deep Multimodal Multilinear Fusion with High-order Polynomial Pooling
Ming Hou, Jiajia Tang, Jianhai Zhang, Wanzeng Kong, Qibin Zhao
More importantly, simply fusing features all at once ignores the complex local intercorrelations, leading to the deterioration of prediction. In this work, we first propose a polynomial tensor pooling (PTP) block for integrating multimodal features by considering high-order moments, followed by a tensorized fully connected layer. Treating PTP as a building block, we further establish a hierarchical polynomial fusion network (HPFN) to recursively transmit local correlations into global ones.
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > Japan (0.04)
- Asia > China (0.04)
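The pooling idea in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the function name `polynomial_tensor_pool`, the prepended constant 1, and the dense weight tensor are assumptions made for illustration, and the dense weights stand in for the tensorized fully connected layer the abstract mentions.

```python
import numpy as np

def polynomial_tensor_pool(features, weights, order):
    """Fuse feature vectors through an order-`order` outer product and a linear map.

    Hypothetical helper for illustration only; names and shapes are assumptions.
    """
    # Concatenate all modality features and prepend a constant 1 so the
    # outer product also contains every lower-order interaction term.
    f = np.concatenate([[1.0]] + [np.asarray(x, dtype=float) for x in features])
    # Build the order-P tensor f (x) f (x) ... (x) f.
    tensor = f
    for _ in range(order - 1):
        tensor = np.multiply.outer(tensor, f)
    # `weights` has shape (out_dim, d, ..., d) with `order` trailing axes of
    # size d = len(f); contract all of them against the feature tensor.
    return np.tensordot(weights, tensor, axes=order)

# Two one-dimensional "modalities", second-order pooling, all-ones weights.
fused = polynomial_tensor_pool([np.array([2.0]), np.array([3.0])],
                               np.ones((1, 3, 3)), order=2)
print(fused)  # [36.], since (1 + 2 + 3) ** 2 = 36
```

A dense weight tensor grows exponentially with the order, which is presumably why a tensorized (factorized) layer is used in practice; the sketch is only practical for small dimensions.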
Windows' Copilot AI can now read your Gmail and Google Calendar
The newest test version of Copilot can connect to your Google account to analyze data in Gmail, Drive, and Calendar. Microsoft really wants you to use Copilot, its branded AI platform. In fact, Microsoft wants you to use Copilot so much that it's fine if you want to use it on Google services, like Gmail, Google Drive, and Google Calendar. A new Windows Insider update facilitates that.
- North America > United States > Pennsylvania (0.05)
- North America > United States > California (0.05)
- Europe (0.05)
- Information Technology > Security & Privacy (0.39)
- Information Technology > Services (0.37)
Greener Deep Reinforcement Learning: Analysis of Energy and Carbon Efficiency Across Atari Benchmarks
Gardner, Jason, Dutta, Ayan, Roy, Swapnoneel, Kreidl, O. Patrick, Boloni, Ladislau
The growing computational demands of deep reinforcement learning (DRL) have raised concerns about the environmental and economic costs of training large-scale models. While algorithmic efficiency in terms of learning performance has been extensively studied, the energy requirements, greenhouse gas emissions, and monetary costs of DRL algorithms remain largely unexplored. In this work, we present a systematic benchmarking study of the energy consumption of seven state-of-the-art DRL algorithms, namely DQN, TRPO, A2C, ARS, PPO, RecurrentPPO, and QR-DQN, implemented using Stable Baselines. Each algorithm was trained for one million steps on each of ten Atari 2600 games, and power consumption was measured in real time to estimate total energy usage, CO2-equivalent emissions, and electricity cost based on the U.S. national average electricity price. Our results reveal substantial variation in energy efficiency and training cost across algorithms, with some achieving comparable performance while consuming up to 24% less energy (ARS vs. DQN), emitting nearly 68% less CO2, and incurring almost 68% lower monetary cost (QR-DQN vs. RecurrentPPO) than less efficient counterparts. We further analyze the trade-offs between learning performance, training time, energy use, and financial cost, highlighting cases where algorithmic choices can mitigate environmental and economic impact without sacrificing learning performance. This study provides actionable insights for developing energy-aware and cost-efficient DRL practices and establishes a foundation for incorporating sustainability considerations into future algorithmic design and evaluation.
- North America > United States > Florida > Orange County > Orlando (0.14)
- North America > United States > Florida > Duval County > Jacksonville (0.14)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Energy > Power Industry (1.00)
- Leisure & Entertainment > Games > Computer Games (0.69)
- Government > Regional Government > North America Government > United States Government (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
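The chain described in the benchmarking abstract above, from sampled power to energy, emissions, and cost, can be sketched as follows. This is a minimal illustration and not the study's measurement code: the function names, the evenly spaced sampling, and the example emission and price factors are all assumptions.

```python
def energy_kwh(power_watts, interval_s):
    """Integrate evenly spaced power samples (watts) into energy (kWh)."""
    joules = sum(power_watts) * interval_s  # W * s = J
    return joules / 3.6e6                   # 1 kWh = 3.6e6 J

def training_footprint(power_watts, interval_s, kg_co2e_per_kwh, usd_per_kwh):
    """Convert a power trace into energy, CO2-equivalent emissions, and cost."""
    kwh = energy_kwh(power_watts, interval_s)
    return {
        "energy_kwh": kwh,
        "co2e_kg": kwh * kg_co2e_per_kwh,
        "cost_usd": kwh * usd_per_kwh,
    }

# Hypothetical trace: a constant 1000 W draw sampled once per second for an hour.
# The factors 0.4 kg CO2e/kWh and 0.15 USD/kWh are placeholders, not study values.
report = training_footprint([1000.0] * 3600, 1.0,
                            kg_co2e_per_kwh=0.4, usd_per_kwh=0.15)
print(report)
```

Because emissions and cost are both linear in energy under fixed factors, relative savings in one carry over to the other, which is consistent with the matching 68% figures quoted for CO2 and monetary cost.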
Cog-GA: A Large Language Models-based Generative Agent for Vision-Language Navigation in Continuous Environments
Li, Zhiyuan, Lu, Yanfeng, Mu, Yao, Qiao, Hong
Vision Language Navigation in Continuous Environments (VLN-CE) represents a frontier in embodied AI, requiring agents to navigate freely in unbounded 3D spaces guided solely by natural language instructions. This task introduces distinct challenges in multimodal comprehension, spatial reasoning, and decision-making. To address these challenges, we introduce Cog-GA, a generative agent founded on large language models (LLMs) tailored for VLN-CE tasks. Cog-GA employs a dual-pronged strategy to emulate human-like cognitive processes. Firstly, it constructs a cognitive map, integrating temporal, spatial, and semantic elements, thereby facilitating the development of spatial memory within LLMs. Secondly, Cog-GA employs a predictive mechanism for waypoints, strategically optimizing the exploration trajectory to maximize navigational efficiency. Each waypoint is accompanied by a dual-channel scene description, categorizing environmental cues into 'what' and 'where' streams, as the brain does. This segregation enhances the agent's attentional focus, enabling it to discern pertinent spatial information for navigation. A reflective mechanism complements these strategies by capturing feedback from prior navigation experiences, facilitating continual learning and adaptive replanning. Extensive evaluations conducted on VLN-CE benchmarks validate Cog-GA's state-of-the-art performance and ability to simulate human-like navigation behaviors. This research significantly contributes to the development of strategic and interpretable VLN-CE agents.
- Research Report (0.82)
- Workflow (0.49)
To Navigate the Age of AI, the World Needs a New Turing Test
There was a time in the not too distant past--say, nine months ago--when the Turing test seemed like a pretty stringent detector of machine intelligence. Chances are you're familiar with how it works: Human judges hold text conversations with two hidden interlocutors, one human and one computer, and try to determine which is which. If the computer manages to fool at least 30 percent of the judges, it passes the test and is pronounced capable of thought. For 70 years, it was hard to imagine how a computer could pass the test without possessing what AI researchers now call artificial general intelligence, the entire range of human intellectual capacities. Then along came large language models such as GPT and Bard, and the Turing test suddenly began seeming strangely outmoded. OK, sure, a casual user today might admit with a shrug, GPT-4 might very well pass a Turing test if you asked it to impersonate a human.
The world's leading writing app is on sale for half off now
Research already shows that ChatGPT is helping to create better writers. But if you're trying to craft something longform, ChatGPT isn't going to help. The go-to app for best-selling novelists, screenwriters, nonfiction writers, journalists, lawyers, and more, Scrivener is all about helping you grow your manuscript or project your way. You can compose your text in any order, in sections as large or small as you like, allowing you to grow your project idea by idea. It gives you an organized place to access research, produce an outline, check for consistency, and even gives you the tools to print, self-publish, or export to popular formats.
Harnessing Semiotics and Discourse Communities to Understand User Intent - KDnuggets
In our previous article we set out the rationale for basing the Natural Language Understanding component of our digital assistant on the linguistic principles of semiotics and discourse communities. Semiotics helps us understand the importance of context in determining the meaning of a term, and discourse communities provide us with the background context (mental model) by which to interpret its meaning correctly. But how do we discover the discourses that exist for our smartphone application? In order to do this we focus on identifying the "jargon" terms that exist for each community. These are terms that are unique to that community and that help in effective communication.
Amazon, Microsoft's Awkward Partnership Sees Alexa Come To PCs
Just as Apple's iOS is forever linked to the iPhone, Alexa has been synonymous with Amazon's Echo speaker. But Amazon's digital assistant is becoming increasingly independent -- integrating into speakers made by Sonos, the Nest thermostat or lights made by Philips. Now it's also finding its way onto computer towers and notebooks made by PC makers, a move that could simultaneously ratchet up tensions between Amazon and Microsoft. PC makers like HP, ASUS and Acer are announcing Alexa integrations at the Consumer Electronics Show in Las Vegas this week, and in some cases the partnerships see hardware being upgraded to make Alexa more accessible, according to GeekWire. HP, for instance, plans to add a custom LED to its Pavilion Wave desktop computer tower that can glow when it hears Alexa's name, activating the digital assistant.
On Interface Requirements for Expert Systems
The user interface to a software system can spell the difference between success and failure. Sometimes, function does not seem to count. If the program does a good enough job, if the users see an easy to use, easy to learn, helpful, pleasant interface, they love it. The interface might be the most significant sales aspect of a software product (consider the spate of look-and-feel lawsuits!). This wasn't always the situation.
Computer-Aided Parts Estimation
Of all the parts that make up a Ford motor vehicle, the majority are actually manufactured by external suppliers, then purchased by Ford. To effectively manage this substantial vehicle cost component, Ford dedicates a whole division to this task. Purchase Cost Estimation and Analysis (PCE&A) employs a large number of estimators in Europe, typically production engineers, each one an expert in some area of vehicle component manufacture. The estimator is first involved at the design stage for future vehicle model programs. Working from initial engineering drawings, they provide feedback on production feasibility and economic considerations.
- Information Technology > Software (0.71)
- Automobiles & Trucks > Manufacturer (0.49)