huffman
Millions Use It Every Day. It's One of the Internet's Most Important Websites. Bots Are Destroying It, Piece by Piece.
In the years since ChatGPT's debut transformed Silicon Valley into an artificial intelligence hype factory, the internet's most vibrant communities have puzzled over how to adapt to the ensuing deluge of A.I. slop, especially as autogenerated outputs become more sophisticated. Perhaps no platform exemplifies this conundrum better than Reddit, the anonymized message-board network that has been connecting millions of humans across the world for 20 years now, even as many users there increasingly wonder whether they are, indeed, still connecting with other humans. Such concerns aren't new, but they've been heightened by a shocking exercise in A.I.-powered manipulation. In late April, the moderation team for the popular subreddit r/ChangeMyView disclosed that researchers from the University of Zurich had conducted an "unauthorized experiment" on community members that "deployed AI-generated comments to study how AI could be used to change views."
- Europe > Switzerland > Zürich > Zürich (0.27)
- North America > United States > California (0.24)
- Media > News (1.00)
- Information Technology > Security & Privacy (0.70)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.52)
Lossless Compression for LLM Tensor Incremental Snapshots
Waddington, Daniel, Constantinescu, Cornel
During the training of Large Language Models (LLMs), tensor data is periodically "checkpointed" to persistent storage to allow recovery of work done in the event of failure. The volume of data that must be copied during each checkpoint, even when using reduced-precision representations such as bfloat16, often reaches hundreds of gigabytes. Furthermore, the data must be moved across a network and written to a storage system before the next epoch occurs. With a view to ultimately building an optimized checkpointing solution, this paper presents experimental analysis of checkpoint data used to derive a design that maximizes the use of lossless compression to reduce the volume of data. We examine how tensor data and its compressibility evolve during model training and evaluate the efficacy of existing common off-the-shelf general purpose compression engines combined with known data optimization techniques such as byte-grouping and incremental delta compression. Leveraging our analysis, we have built an effective compression solution, known as Language Model Compressor (LMC), which is based on byte-grouping and Huffman encoding. LMC offers better compression than the best alternative (BZ2), while reducing the time needed to perform the compression by an order of magnitude. We show that a 16-core parallel implementation of LMC can attain compression and decompression throughput of 2.78 GiB/s and 3.76 GiB/s respectively. This increase in performance ultimately reduces the CPU resources needed and provides more time to copy the data to the storage system before the next epoch, thus allowing for higher-frequency checkpoints.
- North America > United States > Colorado > Denver County > Denver (0.04)
- Europe > Germany (0.04)
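The two ingredients the LMC abstract names, byte-grouping and Huffman coding, can be sketched in a few lines. This is a minimal illustration, not LMC's implementation: the toy "bfloat16" payload (float32 truncated to its top two bytes), the stream width, and the length-only codebook are all assumptions for the example.

```python
import heapq
import random
import struct
from collections import Counter

def huffman_lengths(data: bytes) -> dict:
    """Per-symbol Huffman code lengths, via the standard merge procedure."""
    freq = Counter(data)
    if len(freq) == 1:                       # degenerate single-symbol stream
        return {next(iter(freq)): 1}
    heap = [(n, i, [sym]) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    depth = {sym: 0 for sym in freq}
    tiebreak = len(heap)
    while len(heap) > 1:
        n1, _, syms1 = heapq.heappop(heap)
        n2, _, syms2 = heapq.heappop(heap)
        for sym in syms1 + syms2:            # each merge deepens every leaf below it
            depth[sym] += 1
        heapq.heappush(heap, (n1 + n2, tiebreak, syms1 + syms2))
        tiebreak += 1
    return depth

def huffman_bits(data: bytes) -> int:
    """Size in bits of `data` under its own Huffman code (payload only)."""
    freq = Counter(data)
    lengths = huffman_lengths(data)
    return sum(freq[s] * lengths[s] for s in freq)

def byte_group(data: bytes, width: int = 2) -> list:
    """Split interleaved fixed-width values into one stream per byte position."""
    return [data[i::width] for i in range(width)]

# Toy stand-in for a bfloat16 tensor: truncate big-endian float32 to two bytes.
random.seed(0)
raw = b"".join(struct.pack(">f", random.gauss(0.0, 1.0))[:2] for _ in range(4096))

plain = huffman_bits(raw)                            # one codebook, mixed bytes
grouped = sum(huffman_bits(g) for g in byte_group(raw, 2))
print(plain, grouped)                                # grouped stream is smaller
```

Grouping helps because the sign/exponent byte of every value is drawn from a narrow distribution while the mantissa byte is close to uniform; coding the two positions with separate codebooks avoids mixing those distributions under one code.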
Goal-Oriented Source Coding using LDPC Codes for Compressed-Domain Image Classification
In the emerging field of goal-oriented communications, the focus has shifted from reconstructing data to directly performing specific learning tasks, such as classification, segmentation, or pattern recognition, on the received coded data. In the commonly studied scenario of classification from compressed images, a key objective is to enable learning directly on entropy-coded data, thereby bypassing the computationally intensive step of data reconstruction. Conventional entropy-coding methods, such as Huffman and arithmetic coding, are effective for compression but disrupt the data structure, making them less suitable for direct learning without decoding. This paper investigates the use of low-density parity-check (LDPC) codes, originally designed for channel coding, as an alternative entropy-coding approach. It is hypothesized that the structured nature of LDPC codes can be leveraged more effectively by deep learning models for tasks like classification. At the receiver side, gated recurrent unit (GRU) models are trained to perform image classification directly on LDPC-coded data. Experiments on datasets like MNIST, Fashion-MNIST, and CIFAR show that LDPC codes outperform Huffman and arithmetic coding in classification tasks, while requiring significantly smaller learning models. Furthermore, the paper analyzes why LDPC codes preserve data structure more effectively than traditional entropy-coding techniques and explores the impact of key code parameters on classification performance. These results suggest that LDPC-based entropy coding offers an optimal balance between learning efficiency and model complexity, eliminating the need for prior decoding.
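The structure-preservation argument above can be made concrete with syndrome-based source coding, the standard way to use a parity-check code as a compressor. The construction below is illustrative only: the random sparse matrix, its row weight, and the 64-to-32 rate are assumptions, not the paper's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_parity_check(m: int, n: int, row_weight: int = 4) -> np.ndarray:
    """Random sparse binary parity-check matrix H (m x n), `row_weight` ones per row."""
    H = np.zeros((m, n), dtype=np.uint8)
    for i in range(m):
        H[i, rng.choice(n, size=row_weight, replace=False)] = 1
    return H

def syndrome_encode(H: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Source coding by syndrome: s = H x (mod 2). The output has fixed length,
    and each coded bit depends on only `row_weight` source bits -- unlike Huffman
    or arithmetic coding, where one output bit can depend on the whole input."""
    return (H @ x) % 2

n, m = 64, 32                      # e.g. 64 binarized pixels -> 32 syndrome bits
H = sparse_parity_check(m, n)
x = rng.integers(0, 2, size=n, dtype=np.uint8)
s = syndrome_encode(H, x)
print(s.shape)                     # (32,)
```

In the paper's setting, a GRU is then trained to classify directly from vectors like `s`; the sparse, local dependence of each coded bit on the source is what a learned model can exploit.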
DQA: An Efficient Method for Deep Quantization of Deep Neural Network Activations
Hu, Wenhao, Henderson, Paul, Cano, José
Quantization of Deep Neural Network (DNN) activations is a commonly used technique to reduce compute and memory demands during DNN inference, which can be particularly beneficial on resource-constrained devices. To achieve high accuracy, existing methods for quantizing activations rely on complex mathematical computations or perform extensive searches for the best hyper-parameters. However, these expensive operations are impractical on devices with limited computation capabilities, memory capacities, and energy budgets. Furthermore, many existing methods do not focus on sub-6-bit (or deep) quantization. To fill these gaps, in this paper we propose DQA (Deep Quantization of DNN Activations), a new method that focuses on sub-6-bit quantization of activations and leverages simple shifting-based operations and Huffman coding to be efficient and achieve high accuracy. We evaluate DQA with 3, 4, and 5-bit quantization levels and three different DNN models for two different tasks, image classification and image segmentation, on two different datasets. DQA shows significantly better accuracy (up to 29.28%) compared to the direct quantization method and the state-of-the-art NoisyQuant for sub-6-bit quantization.
- Europe > United Kingdom > Scotland > City of Glasgow > Glasgow (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
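The DQA abstract names two mechanisms, shifting-based operations and Huffman coding, without giving details, so the following is only a generic sketch of the idea: quantize activations to a sub-6-bit grid whose scale is a power of two, so that (de)scaling is a bit-shift rather than a multiply. The scale rule and function names are assumptions for the example.

```python
import numpy as np

def shift_quantize(x: np.ndarray, bits: int = 4):
    """Quantize non-negative activations to 2**bits levels with a
    power-of-two scale, so rescaling is a bit-shift rather than a multiply."""
    qmax = (1 << bits) - 1
    # smallest power-of-two scale whose range covers x
    shift = int(np.ceil(np.log2(x.max() / qmax))) if x.max() > 0 else 0
    q = np.clip(np.round(x / 2.0 ** shift), 0, qmax).astype(np.uint8)
    return q, shift

def shift_dequantize(q: np.ndarray, shift: int) -> np.ndarray:
    return q.astype(np.float32) * 2.0 ** shift

rng = np.random.default_rng(0)
acts = np.maximum(rng.normal(0.0, 1.0, size=4096), 0.0)   # ReLU-like activations
q, shift = shift_quantize(acts, bits=4)
err = np.abs(shift_dequantize(q, shift) - acts).max()
print(err <= 0.5 * 2.0 ** shift)   # → True: within half a quantization step
```

Huffman coding the quantized values then exploits their skew: after a ReLU, most entries of `q` are zero or small, so the entropy is well below the nominal 4 bits per value.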
The Morning After: Squid Game returns on December 26
After the live experiences, TV shows based on TV shows and a boom in childhood South Korean games and hobbies, Squid Game returns for season two, almost three years after the bleak, lightly anti-capitalist drama became a massive hit in the US. Season two will hit Netflix on December 26, with a final third season coming sometime in 2025. In a letter, series director and writer Hwang Dong-hyuk teased the continuation of Seong Gi-hun's revenge as he faces off against the Front Man. We're expecting more death, betrayal and enough delicious Korean food to make me want to take a trip to Seoul.
- North America > United States (0.54)
- Asia > South Korea > Seoul > Seoul (0.26)
- Leisure & Entertainment (0.96)
- Media > Television (0.80)
- Government (0.78)
Reddit shares priced at $34 in largest IPO by social media company in years
Reddit will enter a new era as a publicly traded company with a market value of $6.4bn after the social media platform's initial public offering was priced at $34 per share. The price, announced late on Wednesday, came in at the top of the target range set by Reddit's investment bankers as they spent the past few weeks gauging investor demand for the stock. It sets the stage for Reddit's shares to begin trading Thursday on the New York Stock Exchange under the ticker symbol RDDT in the largest initial public offering by a social media company in years. The platform, which is hoping to raise $748m, is set to sell 22m shares. The company's latest $6.4bn valuation is a drop from 2021, when it was valued at $10bn during a private funding round.
- Media (1.00)
- Banking & Finance > Trading (1.00)
Zero-shot Generative Linguistic Steganography
Lin, Ke, Luo, Yiyang, Zhang, Zijian, Luo, Ping
Generative linguistic steganography attempts to hide secret messages within covertext. Previous studies have generally focused on the statistical differences between the covertext and stegotext; however, ill-formed stegotext can readily be identified by humans. In this paper, we propose a novel zero-shot approach based on in-context learning for linguistic steganography to achieve better perceptual and statistical imperceptibility. We also design several new metrics and reproducible language evaluations to measure the imperceptibility of the stegotext. Our experimental results indicate that our method produces $1.926\times$ more innocent and intelligible stegotext than any other method.
- North America > United States > Wisconsin > Milwaukee County > West Allis (0.04)
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
- (6 more...)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Information Technology > Security & Privacy (0.67)
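The paper's zero-shot method drives an LLM via in-context learning, which cannot be reproduced in a few lines. As a self-contained illustration of the underlying idea that generative steganography builds on, hiding bits in choices among plausible continuations, here is a toy sketch in which a hard-coded two-candidate "model" stands in for the language model; the vocabulary and candidate table are invented for the example.

```python
# At each step the "model" proposes two equally plausible continuations;
# the next secret bit selects one. A receiver running the same model
# recovers the bits from which candidate was chosen.
CANDS = {
    "the": ("quick", "lazy"),
    "quick": ("fox", "dog"),
    "lazy": ("fox", "dog"),
    "fox": ("jumps", "runs"),
    "dog": ("jumps", "runs"),
    "jumps": ("over", "around"),
    "runs": ("over", "around"),
}

def embed(bits, start="the"):
    """Encode a bit list as a fluent-looking word sequence (stegotext)."""
    words, w = [start], start
    for b in bits:
        w = CANDS[w][b]
        words.append(w)
    return " ".join(words)

def extract(text):
    """Recover the bits by replaying the model's candidate choices."""
    words = text.split()
    return [CANDS[prev].index(cur) for prev, cur in zip(words, words[1:])]

stego = embed([1, 0, 1])
print(stego)             # → the lazy fox runs
print(extract(stego))    # → [1, 0, 1]
```

The statistical-imperceptibility problem the abstract describes is visible even here: if the two candidates are not actually equiprobable under a real language model, the embedded bits skew the output distribution, which is what detectors exploit.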
Narcan, rare books and citizenship: How L.A.'s chief librarian is meeting the city's needs
The sparrows fled the courtyard. It was quiet amid the classics. John Szabo stepped out of the elevator and walked through the sunlit atrium of the Central Library. He passed a slumbering homeless man and, with the efficiency of a spy, disappeared into stacks of bound archives, hundreds of thousands of relevant and obscure pages -- including the 1991 "Journal of the American Chamber of Commerce in Japan." A tall man with sparks of gray in his goatee, Szabo, the city librarian, oversees 72 branches, a $241.8 million budget, 17,000 restaurant menus, 64 ukuleles, a Shakespeare volume from 1685, and lockers of puppets for a children's theater. He stopped at a shelf holding years of "Family Handyman" magazines. Founded in 1951 for those who grout tile and hang cabinets, the periodical was no match for Prince Harry's memoir or a Stephen King novel.
- Asia > Japan (0.24)
- North America > United States > California > Los Angeles County > Los Angeles (0.07)
- North America > United States > Ohio (0.04)
- (7 more...)
Optimal and Efficient Binary Questioning for Human-in-the-Loop Annotation
Marchesoni-Acland, Franco, Morel, Jean-Michel, Kherroubi, Josselin, Facciolo, Gabriele
Even though data annotation is extremely important for interpretability, research and development of artificial intelligence solutions, most research efforts such as active learning or few-shot learning focus on the sample efficiency problem. This paper studies the neglected complementary problem of getting annotated data given a predictor. For the simple binary classification setting, we present the spectrum ranging from optimal general solutions to practical efficient methods. The problem is framed as the full annotation of a binary classification dataset with the minimal number of yes/no questions when a predictor is available. For the case of general binary questions the solution is found in coding theory, where the optimal questioning strategy is given by the Huffman encoding of the possible labelings. However, this approach is computationally intractable even for small dataset sizes. We propose an alternative practical solution based on several heuristics and lookahead minimization of proxy cost functions. The proposed solution is analysed, compared with optimal solutions and evaluated on several synthetic and real-world datasets. On these datasets, the method allows a significant improvement ($23-86\%$) in annotation efficiency.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
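The optimal-strategy claim in the abstract above can be made concrete: given per-sample label probabilities from a predictor, Huffman-code the $2^n$ possible complete labelings, and the expected code length is the expected number of yes/no questions. The predictor confidences below are made-up values for a three-sample toy case; the expected length is computed via the identity that it equals the sum of merged weights over all Huffman merges.

```python
import heapq
from itertools import product

def huffman_expected_length(probs):
    """Expected Huffman code length = sum of merged weights over all merges."""
    heap = [(p, i) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    expected, tiebreak = 0.0, len(heap)
    while len(heap) > 1:
        p1, _ = heapq.heappop(heap)
        p2, _ = heapq.heappop(heap)
        expected += p1 + p2
        heapq.heappush(heap, (p1 + p2, tiebreak))
        tiebreak += 1
    return expected

# Hypothetical predictor confidences: P(label_i = 1) for three samples.
p = [0.9, 0.8, 0.6]

# Probability of each of the 2^3 complete labelings (independence assumed).
labelings = list(product([0, 1], repeat=len(p)))
probs = []
for lab in labelings:
    q = 1.0
    for pi, yi in zip(p, lab):
        q *= pi if yi else 1.0 - pi
    probs.append(q)

optimal = huffman_expected_length(probs)   # optimal general yes/no questioning
print(round(optimal, 3))                   # → 2.192, vs 3 questions one-per-sample
```

The abstract's intractability point is also visible here: the code operates on all $2^n$ labelings, so it only works for toy $n$; the paper's heuristics exist precisely because this enumeration explodes.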
The Morning After: OpenAI and Microsoft aren't happy
Microsoft may own almost half of OpenAI, but a recent exposé hints that the pair aren't the happiest of bedfellows. The Wall Street Journal claims the AI company warned Microsoft not to incorporate GPT-4 into Bing search without further training, but it did so anyway. It resulted in several high-profile examples of odd behavior, including bots arguing with users, and at least one instance of a user being urged to dissolve their marriage and elope with Bing instead. There's resentment, too, on Microsoft's side, which finds its own internal AI projects overlooked in favor of OpenAI, a partner that, despite the close financial ties, is very much free to work with Microsoft's rivals in plenty of fields.
- Leisure & Entertainment (1.00)
- Media > News (0.38)
- Media > Music (0.32)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.83)