 lingo


Do you know your 2025 lingo? As 'parasocial' is named word of the year, take the test to see if you can keep up with this year's trending language

Daily Mail - Science & tech



Efficient and Scalable Fine-Tune of Language Models for Genome Understanding

Zhan, Huixin, Wu, Ying Nian, Zhang, Zijun

arXiv.org Artificial Intelligence

Although DNA foundation models have advanced the understanding of genomes, they still face significant challenges from the limited scale and diversity of genomic data. This limitation starkly contrasts with the success of natural language foundation models, which thrive on substantially larger scales. Furthermore, genome understanding involves numerous downstream genome annotation tasks with inherent data heterogeneity, thereby necessitating more efficient and robust fine-tuning methods tailored for genomics. Lingo further accommodates numerous, heterogeneous downstream fine-tuning tasks by an adaptive rank sampling method that prunes and stochastically reintroduces pruned singular vectors within small computational budgets. Adaptive rank sampling outperformed existing fine-tuning methods on all 14 benchmarked genome understanding tasks, while requiring fewer than 2% of trainable parameters as genomic-specific adapters. Impressively, applying these adapters on natural language foundation models matched or even exceeded the performance of DNA foundation models. Lingo presents a new paradigm of efficient and scalable genome understanding via genomic-specific adapters on language models. DNA foundation models, such as DNABERT [1], DNABERT-2 [2], and Nucleotide Transformer (NT) [3], have made significant progress in decoding the linguistic intricacies of the genome. An important paradigm for utilizing such DNA foundation models is "pre-training+finetuning", i.e., pre-training on unlabeled genomic sequences and then adapting to a particular genome understanding task. A critical aspect of genome annotation and downstream tasks is their considerable number and diversity. For example, state-of-the-art deep learning models in epigenetics alone can encompass nearly 22,000 individual tasks [4].
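The idea of pruning singular vectors and stochastically reintroducing them under a fixed budget can be illustrated with a toy sketch. This is not the authors' implementation: the function name `sample_active_ranks`, the per-vector `importance` scores, and the eviction rule (a revived vector displaces the weakest kept one) are all assumptions made for illustration.

```python
import random


def sample_active_ranks(importance, budget, reintro_prob, rng):
    """Pick which singular-vector indices stay active for one training step.

    Hypothetical helper: `importance` holds one score per singular vector,
    `budget` caps how many may be active, and `reintro_prob` is the chance
    that a pruned vector is brought back for this step.
    """
    # Rank indices from most to least important.
    order = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    kept, pruned = order[:budget], order[budget:]

    # Stochastically revive pruned vectors, never more than the budget allows.
    revived = [i for i in pruned if rng.random() < reintro_prob][:budget]

    # Each revived vector displaces the weakest kept one, so the total
    # number of active vectors never exceeds the budget.
    survivors = kept[: budget - len(revived)]
    return sorted(survivors + revived)


rng = random.Random(0)
active = sample_active_ranks([0.9, 0.1, 0.5, 0.3], budget=2, reintro_prob=0.25, rng=rng)
print(active)
```

With `reintro_prob=0` the sketch reduces to plain top-k pruning; raising it gives pruned directions a chance to be re-evaluated without increasing the number of trainable parameters per step.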


LINGO : Visually Debiasing Natural Language Instructions to Support Task Diversity

Arunkumar, Anjana, Sharma, Shubham, Agrawal, Rakhi, Chandrasekaran, Sriram, Bryan, Chris

arXiv.org Artificial Intelligence

Cross-task generalization is a significant outcome that defines mastery in natural language understanding. Humans show a remarkable aptitude for this, and can solve many different types of tasks, given definitions in the form of textual instructions and a small set of examples. Recent work with pre-trained language models mimics this learning style: users can define and exemplify a task for the model to attempt as a series of natural language prompts or instructions. While prompting approaches have led to higher cross-task generalization compared to traditional supervised learning, analyzing 'bias' in the task instructions given to the model is a difficult problem, and has thus been relatively unexplored. For instance, are we truly modeling a task, or are we modeling a user's instructions? To help investigate this, we develop LINGO, a novel visual analytics interface that supports an effective, task-driven workflow to (1) help identify bias in natural language task instructions, (2) alter (or create) task instructions to reduce bias, and (3) evaluate pre-trained model performance on debiased task instructions. To robustly evaluate LINGO, we conduct a user study with both novice and expert instruction creators, over a dataset of 1,616 linguistic tasks and their natural language instructions, spanning 55 different languages. For both user groups, LINGO promotes the creation of more difficult tasks for pre-trained models, that contain higher linguistic diversity and lower instruction bias. We additionally discuss how the insights learned in developing and evaluating LINGO can aid in the design of future dashboards that aim to minimize the effort involved in prompt creation across multiple domains.


Text Mining Through Label Induction Grouping Algorithm Based Method

Saleem, Gulshan, Ahmed, Nisar, Qamar, Usman

arXiv.org Artificial Intelligence

The main goal of information retrieval methods is to provide accurate, efficient, and cost-effective results. LINGO (Label Induction Grouping Algorithm) is a clustering algorithm that aims to present search results as quality clusters, but it has a few limitations. In this paper, we focus on achieving more meaningful results and improving the overall performance of the algorithm. LINGO works in two main steps: cluster label induction using Latent Semantic Indexing (LSI), and cluster content discovery using the Vector Space Model (VSM). Because LINGO uses VSM in cluster content discovery, our first task is to replace VSM with LSI for cluster content discovery and to analyze the feasibility of using LSI with Okapi BM25. The next task is to compare the results of the modified method with those of the original LINGO method. The research is applied to five different text-based data sets to obtain more reliable results for every method. The results show that LINGO produces 40-50% better results when using LSI for content discovery. On theoretical grounds, using Okapi BM25 as the scoring method in LSI (LSI+Okapi BM25) for cluster content discovery instead of VSM also yields better cluster generation in terms of scalability and performance when compared with both the VSM and LSI results.
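As a rough illustration of the scoring step mentioned above, the sketch below ranks search snippets against a candidate cluster label using the standard Okapi BM25 formula. It is not the paper's pipeline: the function name `bm25_scores`, the whitespace tokenizer, the defaults `k1=1.5` and `b=0.75`, and the non-negative IDF variant are assumptions chosen for a self-contained example, and the LSI step is omitted entirely.

```python
import math
from collections import Counter


def bm25_scores(label_terms, docs, k1=1.5, b=0.75):
    """Score each document against the terms of a candidate cluster label.

    Hypothetical helper using whitespace tokenization and the
    log(1 + ...) IDF variant, which keeps every term weight non-negative.
    """
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in label_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


snippets = [
    "latent semantic indexing for clustering",
    "vector space model",
    "semantic clustering of search results",
]
print(bm25_scores(["semantic", "clustering"], snippets))
```

A snippet would then be assigned to the cluster whose label gives it the highest score, possibly subject to a minimum threshold; that assignment rule is likewise an assumption, not something the abstract specifies.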


Introducing "Project: Machine Learning in a Box"

@machinelearnbot

In 2017, the very first "Machine Learning" oriented content based on the SAP Predictive Service was rolled out on the SAP Developer Center, with a dedicated series of tutorials and a brand new CodeJam topic that was delivered at over 12 locations. And the feedback I got was consistent and simple: we want more! And because with every new year come new resolutions, new projects, new content, and new ways to engage with the developer community (from the SAP ecosystem and beyond), we decided that 2018 is in no way different. So, let's start something different! The goal will be to let you open the "black box" that a lot of people think Machine Learning is and help you find out what is inside.


Flying planes: Why Fintech A.I should be more human

#artificialintelligence

I don't know if you've ever tried to fly a plane but it is awesome. The idea of having complete control, flying through clouds and handling the most complicated machinery that man has ever built is bliss. The most terrifying aspect of flying is the passenger. Yes, I didn't expect that either but it is true. People are very unpredictable when they are under stress.


6 'data' buzzwords you need to understand

PCWorld

Take one major trend spanning the business and technology worlds, add countless vendors and consultants hoping to cash in, and what do you get? In the world of big data, the surrounding hype has spawned a brand-new lingo. Read on for a glossary of sorts highlighting some of the main data types you should understand. The shining star in this constellation of terms is "fast data," which is popping up with increasing frequency. It refers to "data whose utility is going to decline over time," said Tony Baer, a principal analyst at Ovum who says he coined the term back in 2012.


Tranzzl!n9o: A Human Computation Approach to English Translation of Internet Lingo

Hong, Ming-Tung (National Taiwan University) | Hsu, Yung-Jen (National Taiwan University)

AAAI Conferences

Lingo is an emerging language on the Internet. Providing a standardized definition remains difficult because its nature changes continuously. We propose Tranzzl!n9o, a crossword puzzle game that engages crowds to translate Internet lingo. Players provide explanations for lingo in parallel and iteratively verify the explanations from other players. The crowd-sourced translations are very informative, containing explanations as well as examples of lingo usage.