AITopics | crossentropy

crossentropy

Learning a Condensed Frame for Memory-Efficient Video Class-Incremental Learning: Supplementary Materials

Neural Information Processing Systems | Feb-11-2026, 21:31:14 GMT

We observe that the learned prompts have no intuitive semantics.

artificial intelligence, dataset, learning a condensed frame for memory-efficient video class-incremental learning supplementary material, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.49)

96671501524948bc3937b4b30d0e57b9-Paper.pdf

Neural Information Processing Systems | Feb-9-2026, 10:26:11 GMT

BERT is incapable of processing long texts due to its quadratically increasing memory and time consumption. The most natural ways to address this problem, such as slicing the text by a sliding window or simplifying transformers, suffer from insufficient long-range attentions or need customized CUDA kernels.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > China > Jiangsu Province > Changzhou (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Deep Learning and Machine Learning, Advancing Big Data Analytics and Management: Tensorflow Pretrained Models

Chen, Keyu, Bi, Ziqian, Niu, Qian, Liu, Junyu, Peng, Benji, Zhang, Sen, Liu, Ming, Li, Ming, Pan, Xuanhe, Xu, Jiawei, Wang, Jinlang, Feng, Pohsun

arXiv.org Artificial Intelligence | Dec-10-2024

The application of TensorFlow pre-trained models in deep learning is explored, with an emphasis on practical guidance for tasks such as image classification and object detection. The study covers modern architectures, including ResNet, MobileNet, and EfficientNet, and demonstrates the effectiveness of transfer learning through real-world examples and experiments. A comparison of linear probing and model fine-tuning is presented, supplemented by visualizations using techniques like PCA, t-SNE, and UMAP, allowing for an intuitive understanding of the impact of these approaches. The work provides complete example code and step-by-step instructions, offering valuable insights for both beginners and advanced users. By integrating theoretical concepts with hands-on practice, the paper equips readers with the tools necessary to address deep learning challenges efficiently.

artificial intelligence, machine learning, tensorflow, (17 more...)

arXiv.org Artificial Intelligence

2409.13566

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(3 more...)

Genre:

Research Report > Promising Solution (0.67)
Instructional Material > Training Manual (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Evaluation of large language models for assessing code maintainability

Dillmann, Marc, Siebert, Julien, Trendowicz, Adam

arXiv.org Artificial Intelligence | Jan-23-2024

Increased availability of open-source software repositories and recent advances in code analysis using large language models (LLMs) have triggered a wave of new work to automate software engineering tasks that were previously very difficult to automate. In this paper, we investigate a recent line of work that hypothesises that comparing the probability of code generated by LLMs with the probability the current code would have had can indicate potential quality problems. We investigate the association between the cross-entropy of code generated by ten different models (based on GPT2 and Llama2) and the following quality aspects: readability, understandability, complexity, modularisation, and overall maintainability, as assessed by experts and available in a benchmark dataset. Our results show that, controlling for the number of logical lines of code (LLOC), cross-entropy computed by LLMs is indeed a predictor of maintainability at the class level (the higher the cross-entropy, the lower the maintainability). However, this relation is reversed when one does not control for LLOC (e.g., comparing small classes with longer ones). Furthermore, while the complexity of LLMs affects the range of cross-entropy (smaller models tend to have a wider range of cross-entropy), this plays a significant role in predicting maintainability aspects. Our study limits itself to ten different pretrained models (based on GPT2 and Llama2) and to maintainability aspects collected by Schnappinger et al. When controlling for logical lines of code (LLOC), cross-entropy is a predictor of maintainability. However, while related work has shown the potential usefulness of cross-entropy at the level of tokens or short sequences, at the class level this criterion alone may prove insufficient to predict maintainability, and further research is needed to make best use of this information in practice.
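As a rough illustration of the quantity discussed in this abstract, the sketch below computes the mean per-token cross-entropy (in bits) of a token sequence from the probabilities a language model assigned to each actual token. The function name `mean_cross_entropy` and the probability values are hypothetical; the paper's own models, tokenization, and aggregation are not reproduced here.

```python
import math


def mean_cross_entropy(token_probs):
    """Average negative log2-probability (bits per token) of a token sequence.

    token_probs holds the probability the model assigned to each token that
    actually occurs in the code (hypothetical inputs, not real model output).
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    return -sum(math.log2(p) for p in token_probs) / len(token_probs)


# Made-up per-token probabilities for two classes of code.
predictable_class = [0.9, 0.8, 0.95, 0.85]  # the model finds this code likely
surprising_class = [0.2, 0.1, 0.3, 0.25]    # the model finds this code unlikely

ce_predictable = mean_cross_entropy(predictable_class)
ce_surprising = mean_cross_entropy(surprising_class)
print(f"predictable: {ce_predictable:.3f} bits/token")
print(f"surprising:  {ce_surprising:.3f} bits/token")
```

A higher value means the model found the code less predictable. Per the abstract, such values relate to maintainability only when class size (LLOC) is controlled for, so raw comparisons between classes of very different length should be read with care.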

lloc, maintainability, probability, (15 more...)

arXiv.org Artificial Intelligence

2401.12714

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Germany (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

On the Rényi Cross-Entropy

Thierrin, Ferenc Cole, Alajaji, Fady, Linder, Tamás

arXiv.org Artificial Intelligence | Aug-6-2022

The Rényi cross-entropy measure between two distributions, a generalization of the Shannon cross-entropy, was recently used as a loss function for the improved design of deep learning generative adversarial networks. In this work, we examine the properties of this measure and derive closed-form expressions for it when one of the distributions is fixed and when both distributions belong to the exponential family. We also analytically determine a formula for the cross-entropy rate for stationary Gaussian processes and for finite-alphabet Markov sources.
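To make the "generalization of the Shannon cross-entropy" concrete, here is a minimal sketch for finite alphabets. It assumes the commonly used definition H_α(p; q) = (1/(1−α)) ln Σ_x p(x) q(x)^(α−1) for α > 0, α ≠ 1; the abstract itself does not state the formula, so treat this form, the function names, and the example distributions as assumptions.

```python
import math


def shannon_cross_entropy(p, q):
    """H(p; q) = -sum_x p(x) * ln q(x), in nats."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def renyi_cross_entropy(p, q, alpha):
    """Assumed definition: (1 / (1 - alpha)) * ln sum_x p(x) * q(x)**(alpha - 1)."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = sum(pi * qi ** (alpha - 1) for pi, qi in zip(p, q) if pi > 0)
    return math.log(total) / (1 - alpha)


# Hypothetical distributions over a three-symbol alphabet.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

# As alpha approaches 1, the Renyi value approaches the Shannon value.
for alpha in (0.5, 0.9, 0.999, 1.001, 2.0):
    print(alpha, renyi_cross_entropy(p, q, alpha))
print("Shannon:", shannon_cross_entropy(p, q))
```

Under this definition the Shannon cross-entropy is recovered in the limit α → 1, which is what the loop above illustrates numerically.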

differential cross-entropy, Rényi differential, Rényi differential cross-entropy, (13 more...)

arXiv.org Artificial Intelligence

2206.14329

Country: North America > Canada > Ontario > Kingston (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.55)

Deep Learning Fundamental Terms

#artificialintelligence | Jan-5-2022, 08:30:13 GMT

AI: The effort to automate intellectual tasks normally performed by humans. AI encompasses machine learning and deep learning. ML: In ML, humans input data together with the answers expected from that data, and the output is the rules. These rules can be applied to new data to produce original answers. A machine-learning system is trained rather than explicitly programmed.

artificial intelligence, loss function, machine learning, (17 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

MetaPix: Domain Transfer for Semantic Segmentation by Meta Pixel Weighting

Jian, Yiren, Gao, Chongyang

arXiv.org Artificial Intelligence | Oct-4-2021

Training a deep neural model for semantic segmentation requires collecting a large amount of pixel-level labeled data. To alleviate the data scarcity problem presented in the real world, one could utilize synthetic data whose labels are easy to obtain. Previous work has shown that the performance of a semantic segmentation model can be improved by training jointly with real and synthetic examples with a proper weighting on the synthetic data. Such weighting was learned by a heuristic to maximize the similarity between synthetic and real examples. In our work, we instead learn a pixel-level weighting of the synthetic data by meta-learning, i.e., the weighting is learned solely to minimize the loss on the target task. We achieve this with a gradient-on-gradient technique that propagates the target loss back into the parameters of the weighting model. The experiments show that our method with only one single meta module can outperform a complicated combination of an adversarial feature alignment, a reconstruction loss, plus a hierarchical heuristic weighting at pixel, region and image levels.
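The sketch below shows only what a pixel-level weighting of a cross-entropy loss looks like once the weights are given. It is not the paper's method: the meta-learned weighting model and the gradient-on-gradient update are omitted, the weights are fixed hypothetical numbers, and normalizing by the sum of weights is an assumption made here for readability.

```python
import math


def weighted_pixel_cross_entropy(probs, labels, weights):
    """Weighted mean of per-pixel cross-entropy (in nats).

    probs   -- per pixel, a list of predicted class probabilities
    labels  -- per pixel, the index of the true class
    weights -- per pixel, a non-negative weight (hypothetical values here)
    """
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(
        w * -math.log(p[y]) for p, y, w in zip(probs, labels, weights)
    )
    return weighted / total_weight


# Two synthetic pixels with two classes: the first is predicted well,
# the second poorly (made-up numbers).
probs = [[0.9, 0.1], [0.2, 0.8]]
labels = [0, 0]

uniform = weighted_pixel_cross_entropy(probs, labels, [1.0, 1.0])
downweighted = weighted_pixel_cross_entropy(probs, labels, [1.0, 0.1])
print(uniform, downweighted)
```

Down-weighting the poorly matching pixel reduces its contribution to the training loss, which is the lever a learned weighting would adjust.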

adaptation, computer vision, semantic segmentation, (15 more...)

arXiv.org Artificial Intelligence

2110.01777

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(12 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Distilling Neuron Spike with High Temperature in Reinforcement Learning Agents

Zhang, Ling, Cao, Jian, Zhang, Yuan, Zhou, Bohan, Feng, Shuo

arXiv.org Artificial Intelligence | Aug-5-2021

Spiking neural networks (SNNs), compared with deep neural networks (DNNs), offer faster processing, lower energy consumption, and greater biological interpretability, and are expected to approach strong AI. Reinforcement learning (RL) is similar to learning in biology, so it is of great significance to study the combination of SNNs and RL. We propose a reinforcement learning method, the spike distillation network (SDN), with STBP. This method uses distillation to effectively avoid the weakness of STBP, which can achieve SOTA performance in classification, and yields a smaller, faster-converging, lower-power SNN reinforcement learning model. Experiments show that our method converges about 1000 epochs faster than traditional SNN and DNN reinforcement learning methods and obtains an SNN 200 times smaller than the DNN. We also deploy SDN to the PKU nc64c chip, which shows that SDN has lower power consumption than DNN; on large-scale devices, the power consumption of SDN is more than 600 times lower than that of DNN. SDN provides a new way of SNN reinforcement learning and can achieve SOTA performance, which demonstrates the potential for further development of SNN reinforcement learning.
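The abstract does not spell out its distillation loss, so the following is a generic sketch of temperature-softened distillation, not the SDN method itself: teacher and student logits are divided by a temperature before the softmax, and the loss is the cross-entropy between the two softened distributions. All names and logit values are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature, shifted by the max for stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature):
    """Cross-entropy between softened teacher and student distributions."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher, student))


# Made-up logits over three actions or classes.
teacher_logits = [4.0, 1.0, 0.0]
student_logits = [1.0, 2.0, 0.5]

sharp = softmax(teacher_logits, temperature=1.0)
soft = softmax(teacher_logits, temperature=10.0)
loss = distillation_loss(teacher_logits, student_logits, temperature=10.0)
print(sharp, soft, loss)
```

A higher temperature flattens the teacher's distribution, so the student also receives signal about the relative ranking of the non-preferred outputs.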

reinforcement, sdn, student network, (16 more...)

arXiv.org Artificial Intelligence

2108.10078

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.83)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Multi-modal Residual Perceptron Network for Audio-Video Emotion Recognition

Chang, Xin, Skarbek, Władysław

arXiv.org Artificial Intelligence | Jul-30-2021

Audio-Video Emotion Recognition is now attacked with Deep Neural Network modeling tools. In published papers, as a rule, the authors show only cases where multi-modality is superior to audio-only or video-only modality. However, cases where uni-modality is superior can also be found. In our research, we hypothesize that for fuzzy categories of emotional events, the within-modal and inter-modal noisy information represented indirectly in the parameters of the modeling neural network impedes better performance in the existing late fusion and end-to-end multi-modal network training strategies. To take advantage of both solutions and overcome their deficiencies, we define a Multi-modal Residual Perceptron Network which performs end-to-end learning from multi-modal network branches, generalizing better multi-modal feature representation. With the proposed Multi-modal Residual Perceptron Network and the novel time augmentation for streaming digital movies, the state-of-the-art average recognition rate was improved to 91.4% for the Ryerson Audio-Visual Database of Emotional Speech and Song dataset and to 83.15% for the Crowd-sourced Emotional Multi-modal Actors dataset. Moreover, the Multi-modal Residual Perceptron Network concept shows its potential for multi-modal applications dealing with signal sources not only of optical and acoustical types.

information, modality, recognition, (15 more...)

arXiv.org Artificial Intelligence

2107.10742

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Tutorial On Keras Tokenizer For Text Classification in NLP

#artificialintelligence | Aug-31-2020, 16:45:25 GMT

Now we will compile the model with stochastic gradient descent as the optimizer, cross-entropy as the loss, and accuracy as the performance metric. After compiling, we will train the model and check its performance on the validation data, using a batch size of 64 and 10 epochs.
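The tutorial's own code is in Keras and is not shown in this excerpt. As a library-free sketch of what those settings mean mechanically, the example below runs mini-batch stochastic gradient descent on a binary cross-entropy loss with a batch size of 64 for 10 epochs, then reports loss and accuracy on held-out validation data. The one-feature logistic model, the synthetic data, the learning rate, and every function name are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_data(n, rng):
    """Synthetic 1-D data: the label is 1 when the feature is positive."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 1 if x > 0 else 0))
    return data


def train(data, epochs=10, batch_size=64, lr=0.5, seed=0):
    """Mini-batch SGD on mean binary cross-entropy for p = sigmoid(w*x + b)."""
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # new batch composition every epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of the mean cross-entropy: mean((p - y) * x), mean(p - y).
            errors = [(sigmoid(w * x + b) - y, x) for x, y in batch]
            grad_w = sum(e * x for e, x in errors) / len(batch)
            grad_b = sum(e for e, _ in errors) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


def evaluate(data, w, b):
    """Return (mean cross-entropy, accuracy) on a dataset."""
    eps = 1e-12
    loss, correct = 0.0, 0
    for x, y in data:
        p = min(max(sigmoid(w * x + b), eps), 1 - eps)
        loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
        correct += int((p >= 0.5) == (y == 1))
    return loss / len(data), correct / len(data)


rng = random.Random(42)
train_data = make_data(640, rng)
val_data = make_data(128, rng)
w, b = train(train_data, epochs=10, batch_size=64)
val_loss, val_acc = evaluate(val_data, w, b)
print(f"validation loss {val_loss:.3f}, accuracy {val_acc:.3f}")
```

With 640 training examples and a batch size of 64, each epoch performs 10 parameter updates, so 10 epochs give 100 updates in total.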

machine learning, natural language, test, (15 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.34)
