AITopics

2410.18144

Country:

North America > Canada > Ontario > Middlesex County > London (0.14)
Europe > Austria > Vienna (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.56)

Tinio, Jan Nino G., Alaya, Mokhtar Z., Bouzebda, Salim

Bounds in Wasserstein distance for locally stationary processes

arXiv.org Machine Learning Dec-4-2024

Locally stationary processes (LSPs) provide a robust framework for modeling time-varying phenomena, allowing for smooth variations in statistical properties such as mean and variance over time. In this paper, we address the estimation of the conditional probability distribution of LSPs using Nadaraya-Watson (NW) type estimators. The NW estimator approximates the conditional distribution of a target variable given covariates through kernel smoothing techniques. We establish the convergence rate of the NW conditional probability estimator for LSPs in the univariate setting under the Wasserstein distance and extend this analysis to the multivariate case using the sliced Wasserstein distance. Theoretical results are supported by numerical experiments on both synthetic and real-world datasets, demonstrating the practical usefulness of the proposed estimators.
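As a rough illustration of the abstract above, the sketch below builds a Nadaraya-Watson style conditional distribution estimate as a weighted empirical measure and compares it with a reference sample in 1-Wasserstein distance. The product of a time kernel and a covariate kernel, the Gaussian kernel, the single shared bandwidth, and the toy data model are all assumptions made for this example; they are not taken from the paper.

```python
import math
import random


def gaussian_kernel(z):
    """Unnormalised Gaussian kernel; the constant cancels in the weights."""
    return math.exp(-0.5 * z * z)


def nw_weights(u, x, times, covariates, bandwidth):
    """Normalised kernel weights at rescaled time u and covariate value x."""
    raw = [
        gaussian_kernel((u - t) / bandwidth) * gaussian_kernel((x - c) / bandwidth)
        for t, c in zip(times, covariates)
    ]
    total = sum(raw)
    if total == 0.0:
        raise ValueError("no observations near (u, x); increase the bandwidth")
    return [w / total for w in raw]


def wasserstein1(points_a, weights_a, points_b, weights_b):
    """W1 between two weighted discrete measures on the real line.

    In one dimension W1 equals the integral of |F_a - F_b|, where F_a and
    F_b are the two cumulative distribution functions.
    """
    events = sorted(
        [(p, w, 0.0) for p, w in zip(points_a, weights_a)]
        + [(p, 0.0, w) for p, w in zip(points_b, weights_b)]
    )
    cdf_a = cdf_b = 0.0
    distance = 0.0
    previous = events[0][0]
    for point, mass_a, mass_b in events:
        # Both CDFs are constant on [previous, point).
        distance += abs(cdf_a - cdf_b) * (point - previous)
        cdf_a += mass_a
        cdf_b += mass_b
        previous = point
    return distance


# Toy locally stationary data (assumed model): Y_t = a(t/T) * X_t + noise,
# with a slowly varying coefficient a(u) = 1 + u.
random.seed(0)
T = 400
times = [t / T for t in range(1, T + 1)]
covariates = [random.uniform(-1.0, 1.0) for _ in range(T)]
responses = [(1.0 + u) * c + random.gauss(0.0, 0.1)
             for u, c in zip(times, covariates)]

u0, x0 = 0.5, 0.5
weights = nw_weights(u0, x0, times, covariates, bandwidth=0.1)

# Reference: a sample from the true conditional law of Y given (u0, x0).
reference = [(1.0 + u0) * x0 + random.gauss(0.0, 0.1) for _ in range(2000)]
reference_weights = [1.0 / len(reference)] * len(reference)

print("W1(estimate, reference) =",
      wasserstein1(responses, weights, reference, reference_weights))
```

The estimate is the measure that puts mass `weights[i]` on `responses[i]`. For multivariate targets, the sliced variant mentioned in the abstract would average this one-dimensional distance over random projection directions.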

estimation, estimator, wasserstein distance, (15 more...)

arXiv.org Machine Learning

2412.03414

Country:

Europe > France > Hauts-de-France > Oise > Compiègne (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
South America > Chile > Araucanía Region > Malleco Province (0.04)
(6 more...)

Genre: Research Report (0.63)

Industry:

Health & Medicine (1.00)
Banking & Finance > Economy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Zhang, Xianyang, Zhou, Huijuan

Generalization Bounds and Model Complexity for Kolmogorov-Arnold Networks

arXiv.org Machine Learning Dec-4-2024

Kolmogorov-Arnold Network (KAN) is a network structure recently proposed by Liu et al. (2024) that offers improved interpretability and a more parsimonious design in many science-oriented tasks compared to multi-layer perceptrons. This work provides a rigorous theoretical analysis of KAN by establishing generalization bounds for KAN equipped with activation functions that are either represented by linear combinations of basis functions or lying in a low-rank Reproducing Kernel Hilbert Space (RKHS). In the first case, the generalization bound accommodates various choices of basis functions in forming the activation functions in each layer of KAN and is adapted to different operator norms at each layer. For a particular choice of operator norms, the bound scales with the $l_1$ norm of the coefficient matrices and the Lipschitz constants for the activation functions, and it has no dependence on combinatorial parameters (e.g., number of nodes) outside of logarithmic factors. Moreover, our result does not require the boundedness assumption on the loss function and, hence, is applicable to a general class of regression-type loss functions. In the low-rank case, the generalization bound scales polynomially with the underlying ranks as well as the Lipschitz constants of the activation functions in each layer. These bounds are empirically investigated for KANs trained with stochastic gradient descent on simulated and real data sets. The numerical results demonstrate the practical relevance of these bounds.

activation function, arxiv preprint arxiv, complexity, (13 more...)

arXiv.org Machine Learning

2410.08026

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Latvia > Riga Municipality > Riga (0.04)
(2 more...)

Genre: Research Report > New Finding (0.86)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Al Jazeera Dec-3-2024, 21:09:05 GMT

Meta says AI had only 'modest' impact on global elections in 2024

Despite fears that artificial intelligence (AI) could influence the outcome of elections around the world, the United States technology giant Meta said it detected little impact across its platforms this year. That was in part due to defensive measures designed to prevent coordinated networks of accounts, or bots, from grabbing attention on Facebook, Instagram and Threads, Meta president of global affairs Nick Clegg told reporters on Tuesday. "I don't think the use of generative AI was a particularly effective tool for them to evade our trip wires," Clegg said of actors behind coordinated disinformation campaigns. In 2024, Meta says it ran several election operations centres around the world to monitor content issues, including during elections in the US, Bangladesh, Brazil, France, India, Indonesia, Mexico, Pakistan, South Africa, the United Kingdom and the European Union. Most of the covert influence operations it has disrupted in recent years were carried out by actors from Russia, Iran and China, Clegg said, adding that Meta took down about 20 "covert influence operations" on its platform this year.

artificial intelligence, machine learning, meta, (15 more...)

Al Jazeera

Country:

North America > United States (1.00)
Europe > Russia (0.26)
Asia > Russia (0.26)
(12 more...)

Industry:

Media (1.00)
Government > Voting & Elections (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.37)

Engadget Dec-3-2024, 13:00:42 GMT

Meta says AI-generated content was less than 1 percent of election misinformation

AI-generated content played a much smaller role in global election misinformation than what many officials and researchers had feared, according to a new analysis from Meta. In an update on its efforts to safeguard dozens of elections in 2024, the company said that AI content made up only a fraction of election-related misinformation that was caught and labeled by its fact checkers. "During the election period in the major elections listed above, ratings on AI content related to elections, politics and social topics represented less than 1% of all fact-checked misinformation," the company shared in a blog post, referring to elections in the US, UK, Bangladesh, Indonesia, India, Pakistan, France, South Africa, Mexico and Brazil, as well as the EU's Parliamentary elections. The update comes after numerous government officials and researchers for months raised the alarm about the role generative AI could play in supercharging election misinformation in a year when more than 2 billion people were expected to go to the polls. But those fears largely did not play out -- at least on Meta's platforms -- according to the company's President of Global Affairs, Nick Clegg.

artificial intelligence, machine learning, misinformation, (12 more...)

Engadget

Country:

North America > United States (1.00)
South America > Brazil (0.26)
North America > Mexico (0.26)
(6 more...)

Industry:

Media > News (1.00)
Government > Regional Government > North America Government > United States Government (0.73)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.39)

Trung, Quang Hoang, Phuc, Nguyen Van Hoang, Hoang, Le Trung, Hieu, Quang Huu, Duy, Vo Nguyen Le

Adaptive Two-Phase Finetuning LLMs for Japanese Legal Text Retrieval

Text Retrieval (TR) involves finding and retrieving text-based content relevant to a user's query from a large repository, with applications in real-world scenarios such as legal document retrieval. While most existing studies focus on English, limited work addresses Japanese contexts. In this paper, we introduce a new dataset specifically designed for Japanese legal contexts and propose a novel two-phase pipeline tailored to this domain. In the first phase, the model learns a broad understanding of global contexts, enhancing its generalization and adaptability to diverse queries. In the second phase, the model is fine-tuned to address complex queries specific to legal scenarios. Extensive experiments are conducted to demonstrate the superior performance of our method, which outperforms existing baselines. Furthermore, our pipeline proves effective in English contexts, surpassing comparable baselines on the MS MARCO dataset. We have made our code publicly available on GitHub, and the model checkpoints are accessible via HuggingFace.

large language model, machine learning, natural language, (18 more...)

2412.13205

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan (0.14)
Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.04)
(7 more...)

Genre: Research Report (1.00)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Nobrega, Lucas Nogueira, de Oliveira, Ewerton, Saska, Martin, Nascimento, Tiago

Proximal Control of UAVs with Federated Learning for Human-Robot Collaborative Domains

Human-robot interaction (HRI) is a growing area of research. In HRI, complex command (action) classification is still an open problem that usually prevents the real applicability of such a technique. The literature presents some works that use neural networks to detect these actions. However, occlusion is still a major issue in HRI, especially when using uncrewed aerial vehicles (UAVs), since, during the robot's movement, the human operator is often out of the robot's field of view. Furthermore, in multi-robot scenarios, distributed training is also an open problem. To address these issues, this work proposes an action recognition and control approach based on Long Short-Term Memory (LSTM) Deep Neural Networks with two layers in association with three densely connected layers and Federated Learning (FL) embedded in multiple drones. FL enabled our approach to be trained in a distributed fashion, i.e., with access to data without the need for cloud or other repositories, which facilitates the multi-robot system's learning. Furthermore, our multi-robot approach also prevented occlusion situations, and experiments with real robots achieved an accuracy greater than 96%.
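The distributed training idea in the abstract above can be sketched with federated averaging. This is a minimal illustration under stated assumptions: the abstract does not name its aggregation rule, so weighted parameter averaging (FedAvg) is assumed here, and a two-parameter linear model stands in for the LSTM network so that the example stays self-contained. The function names `local_update`, `federated_average`, and `train` are hypothetical.

```python
def local_update(params, data, lr=0.2, epochs=5):
    """Run a few gradient steps on one client's private data.

    The model is y = w * x + b with squared-error loss (a stand-in for the
    LSTM classifier described in the abstract).
    """
    w, b = params
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(((w * x + b) - y) * x for x, y in data) / n
        grad_b = sum(((w * x + b) - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(client_params, client_sizes):
    """Average client parameters, weighted by local sample counts."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(p[i] * s for p, s in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


def train(client_datasets, rounds=100):
    """Alternate local training and averaging; raw data never leaves a client."""
    global_params = [0.0, 0.0]
    sizes = [len(d) for d in client_datasets]
    for _ in range(rounds):
        updates = [local_update(global_params, d) for d in client_datasets]
        global_params = federated_average(updates, sizes)
    return global_params


# Three hypothetical drones, each holding samples of y = 2x + 1 from a
# different region of the input space.
clients = [
    [(x, 2.0 * x + 1.0) for x in (-1.0, -0.5, 0.0)],
    [(x, 2.0 * x + 1.0) for x in (0.0, 0.5, 1.0)],
    [(x, 2.0 * x + 1.0) for x in (-1.0, -0.5, 0.5, 1.0)],
]
print("global parameters:", train(clients))
```

Only parameter vectors are exchanged between rounds, which matches the abstract's point that training proceeds without a cloud or shared data repository.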

artificial intelligence, machine learning, uav, (14 more...)

doi: 10.1109/LRA.2024.3491417

2412.02863

Country:

South America > Brazil > Paraíba (0.14)
Europe > Czechia > Prague (0.04)
Europe > Switzerland (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Multi-Agent Framework for Extensible Structured Text Generation in PLCs

Yang, Donghao, Wu, Aolang, Zhang, Tianyi, Zhang, Li, Liu, Fang, Lian, Xiaoli, Ren, Yuming, Tian, Jiaji

Programmable Logic Controllers (PLCs) are microcomputers essential for automating factory operations. Structured Text (ST), a high-level language adhering to the IEC 61131-3 standard, is pivotal for PLCs due to its ability to express logic succinctly and to seamlessly integrate with other languages within the same standard. However, vendors develop their own customized versions of ST, and the lack of comprehensive and standardized documentation for the full semantics of ST has contributed to inconsistencies in how the language is implemented. Consequently, the steep learning curve associated with ST, combined with ever-evolving industrial requirements, presents significant challenges for developers. In response to these issues, we present AutoPLC, an LLM-based approach designed to automate the generation of vendor-specific ST code. To facilitate effective code generation, we first built a comprehensive knowledge base, including the Rq2ST Case Library (requirements and corresponding implementations) and Instruction libraries. Then we developed a retrieval module to incorporate the domain-specific knowledge by identifying pertinent cases and instructions, guiding the LLM to generate code that meets the requirements. In order to verify and improve the quality of the generated code, we designed an adaptable code checker. If errors are detected, we initiate an iterative self-improvement process to instruct the LLM to revise the generated code. We evaluate AutoPLC's performance against seven state-of-the-art baselines using three benchmarks, one for open-source basic ST and two for commercial Structured Control Language (SCL) from Siemens. The results show that our approach consistently achieves superior performance across all benchmarks. An ablation study emphasizes the significance of our modules. Further manual analysis confirms the practical utility of the ST code generated by AutoPLC.
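The retrieve, generate, check, and revise cycle described above can be sketched as a small control loop. Everything in this example is hypothetical: the function names, the word-overlap retriever, the toy checker, and the stub generator are placeholders for illustration and do not reflect AutoPLC's actual interfaces or any real LLM API.

```python
def retrieve(requirement, case_library, k=2):
    """Rank (requirement, code) cases by word overlap with the new requirement.

    A stand-in for a real retrieval module.
    """
    wanted = set(requirement.lower().split())
    ranked = sorted(
        case_library,
        key=lambda case: -len(wanted & set(case[0].lower().split())),
    )
    return ranked[:k]


def generate_with_repair(requirement, case_library, generate, check, max_rounds=3):
    """Generate code, run the checker, and feed errors back until it passes.

    Returns (code, remaining_errors, rounds_used).
    """
    cases = retrieve(requirement, case_library)
    errors = []
    code = ""
    for round_number in range(1, max_rounds + 1):
        code = generate(requirement, cases, errors)
        errors = check(code)
        if not errors:
            return code, errors, round_number
    return code, errors, max_rounds


def toy_check(code):
    """Toy checker: an IF statement must be closed by END_IF."""
    errors = []
    if code.startswith("IF ") and "END_IF" not in code:
        errors.append("missing END_IF")
    return errors


def toy_generate(requirement, cases, feedback):
    """Stub generator that only fixes its output after seeing checker feedback."""
    body = "IF start THEN motor := TRUE;"
    if "missing END_IF" in feedback:
        body += " END_IF;"
    return body


library = [
    ("blink lamp every second", "(* example implementation omitted *)"),
    ("start motor when button pressed", "(* example implementation omitted *)"),
]
code, errors, rounds = generate_with_repair(
    "start the motor", library, toy_generate, toy_check
)
print(rounds, code)
```

Passing `generate` and `check` as callables keeps the loop independent of any particular model or vendor dialect, which is one way to read the abstract's emphasis on an adaptable checker.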

benchmark, large language model, machine learning, (19 more...)

2412.0241

Country:

South America (0.04)
North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > New York > New York County > New York City (0.04)
(7 more...)

Genre: Research Report > New Finding (0.87)

Industry: Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

3D Interaction Geometric Pre-training for Molecular Relational Learning

Lee, Namkyeong, Oh, Yunhak, Noh, Heewoong, Na, Gyoung S., Xu, Minkai, Wang, Hanchen, Fu, Tianfan, Park, Chanyoung

Molecular relational learning (MRL) focuses on understanding the interaction dynamics between molecules and has gained significant attention from researchers thanks to its diverse applications [20]. For instance, understanding how a medication dissolves in different solvents (medication-solvent interaction) is vital in pharmacy [30, 26, 3], while predicting the optical and photophysical properties of chromophores in various solvents (chromophore-solvent interaction) is essential for material discovery [16]. Because of the expensive time and financial costs associated with conducting wet lab experiments to test the interaction behavior of all possible molecular pairs [31], machine learning methods have been quickly embraced for MRL. Despite recent advancements in MRL, previous works tend to ignore molecules' 3D geometric information and instead focus solely on their 2D topological structures. However, in molecular science, the 3D geometric information of molecules (Figure 1 (a)) is crucial for understanding and predicting molecular behavior across various contexts, ranging from physical properties [1] to biological functions [10, 46]. This is particularly important in MRL, as geometric information plays a key role in molecular interactions by determining how molecules recognize, interact, and bind with one another in their interaction environment [34]. In traditional molecular dynamics simulations, explicit solvent models, which directly consider the detailed environment of molecular interaction, have demonstrated superior performance compared to implicit solvent models, which simplify the solvent as a continuous medium, highlighting the significance of explicitly modeling the complex geometries of interaction environments [47]. However, acquiring stereochemical structures of molecules is often very costly, resulting in limited availability of such 3D geometric information for downstream tasks [23].

artificial intelligence, machine learning, molecule, (19 more...)

2412.02957

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Minnesota (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Materials > Chemicals (1.00)
Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Amancio, Diego R., Machicao, Jeaneth, Quispe, Laura V. C.

Probing the statistical properties of enriched co-occurrence networks

Recent studies have explored the addition of virtual edges to word co-occurrence networks using word embeddings to enhance graph representations, particularly for short texts. While these enriched networks have demonstrated some success, the impact of incorporating semantic edges into traditional co-occurrence networks remains uncertain. In this study, we investigate two key statistical properties of text-based network models. First, we assess whether network metrics can effectively distinguish between meaningless and meaningful texts. Second, we analyze whether these metrics are more sensitive to syntactic or semantic aspects of the text. Our results show that incorporating virtual edges can have both positive and negative effects, depending on the specific network metric. For instance, the informativeness of the average shortest path and closeness centrality improves in short texts, while the clustering coefficient's informativeness decreases as more virtual edges are added. Additionally, we found that including stopwords affects the statistical properties of enriched networks. Our results can serve as a guideline for determining which network metrics are most appropriate for specific applications, depending on the typical text size and the nature of the problem.
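To make the enrichment step concrete, the following sketch builds a word co-occurrence network, adds virtual edges between words whose embeddings are similar, and recomputes two of the metrics named above. The cosine-similarity threshold rule, the adjacency window, and the tiny hand-made embeddings are assumptions for illustration; the abstract does not specify these details.

```python
import math
from collections import deque
from itertools import combinations


def cooccurrence_edges(tokens, window=1):
    """Link each word to the words that follow it within the window."""
    edges = set()
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != word:
                edges.add(frozenset((word, tokens[j])))
    return edges


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def virtual_edges(embeddings, existing, threshold):
    """Propose edges between unlinked words with similar embeddings."""
    added = set()
    for u, v in combinations(sorted(embeddings), 2):
        edge = frozenset((u, v))
        if edge not in existing and cosine(embeddings[u], embeddings[v]) >= threshold:
            added.add(edge)
    return added


def adjacency(edges):
    adj = {}
    for edge in edges:
        u, v = tuple(edge)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for neighbours in adj.values():
        k = len(neighbours)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_shortest_path(adj):
    """Mean BFS distance over all reachable ordered node pairs."""
    total = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Toy text and hypothetical 2-d embeddings ("a" and "c" are near-synonyms).
tokens = ["a", "b", "c", "d"]
embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 0.1], "d": [-1.0, 0.0]}

base = cooccurrence_edges(tokens)
enriched = base | virtual_edges(embeddings, base, threshold=0.9)

for name, edges in (("base", base), ("enriched", enriched)):
    adj = adjacency(edges)
    print(name, average_clustering(adj), average_shortest_path(adj))
```

On this four-word chain the single virtual edge shortens paths and creates a triangle, so both metrics shift. Whether such shifts make a metric more or less informative is the empirical question the study addresses.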

artificial intelligence, machine learning, natural language, (18 more...)

2412.02664

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
South America > Brazil > São Paulo (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.36)