Collaborating Authors

 Varol, Onur


TurkishBERTweet: Fast and Reliable Large Language Model for Social Media Analysis

arXiv.org Artificial Intelligence

Turkish is one of the most widely spoken languages in the world. The wide use of this language on social media platforms such as Twitter, Instagram, and TikTok, together with the country's strategic position in world politics, makes it appealing to social network researchers and industry. To address this need, we introduce TurkishBERTweet, the first large-scale pre-trained language model for Turkish social media, built using almost 900 million tweets. The model shares the same architecture as the base BERT model but with a smaller input length, making TurkishBERTweet lighter than BERTurk with significantly lower inference time. We trained our model using the same approach as the RoBERTa model and evaluated it on two text classification tasks: sentiment classification and hate speech detection. We demonstrate that TurkishBERTweet outperforms the other available alternatives in generalizability, and its lower inference time gives it a significant advantage when processing large-scale datasets. We also compared our model with commercial OpenAI solutions in terms of cost and performance to demonstrate that TurkishBERTweet is a scalable and cost-effective solution. As part of our research, we release TurkishBERTweet and fine-tuned LoRA adapters for the mentioned tasks under the MIT License to facilitate future research and applications on Turkish social media. Our TurkishBERTweet model is available at: https://github.com/ViralLab/TurkishBERTweet
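
The abstract does not spell out how the released model is used; as a rough illustration, a RoBERTa-style pretrained model like this one could be queried through the Hugging Face transformers library. This is a minimal sketch: the hub identifier "VRLLab/TurkishBERTweet" and the example tweet are assumptions, not details taken from the abstract.

```python
# Minimal sketch: querying a RoBERTa-style Turkish tweet model via transformers.
# The hub identifier below is an assumption; substitute the identifier published
# in the GitHub repository if it differs.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="VRLLab/TurkishBERTweet")

# RoBERTa-style models typically use "<mask>" as the mask token.
for prediction in fill_mask("Bugün hava çok <mask>."):
    print(prediction["token_str"], round(prediction["score"], 3))
```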


Network Medicine Framework for Identifying Drug Repurposing Opportunities for COVID-19

arXiv.org Machine Learning

The current pandemic has highlighted the need for methodologies that can quickly and reliably prioritize clinically approved compounds for their potential effectiveness against SARS-CoV-2 infections. In the past decade, network medicine has developed and validated multiple predictive algorithms for drug repurposing, exploiting the sub-cellular network-based relationships between a drug's targets and disease genes. Here, we deployed algorithms relying on artificial intelligence, network diffusion, and network proximity, tasking each of them to rank 6,340 drugs for their expected efficacy against SARS-CoV-2. To test the predictions, we used as ground truth 918 drugs that had been experimentally screened in VeroE6 cells, as well as the list of drugs in clinical trials, which captures the medical community's assessment of drugs with potential COVID-19 efficacy. We find that while most algorithms offer predictive power for these ground truth data, no single method offers consistently reliable outcomes across all datasets and metrics. This prompted us to develop a multimodal approach that fuses the predictions of all algorithms, showing that a consensus among the different predictive methods consistently exceeds the performance of the best individual pipelines. We find that 76 of the 77 drugs that successfully reduced viral infection do not bind the proteins targeted by SARS-CoV-2, indicating that these drugs rely on network-based actions that cannot be identified using docking-based strategies. These advances offer a methodological pathway for identifying repurposable drugs for future pathogens and for neglected diseases that the costs and extended timelines of de novo drug development leave underserved.
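
For context, the network-proximity ingredient mentioned above is commonly computed as the average shortest-path distance from each disease gene to its closest drug target in the protein interactome. Below is a minimal sketch of one common variant of that measure using networkx; the toy interactome and the target/gene sets are illustrative placeholders, not the paper's data.

```python
# Sketch of a network-proximity measure for drug repurposing:
# d(S, T) = (1/|T|) * sum over disease genes t of min over drug targets s of d(s, t),
# where distances are shortest paths in the protein-protein interactome.
# The toy interactome and node sets below are illustrative only.
import networkx as nx

interactome = nx.Graph([("A", "B"), ("B", "C"), ("C", "D"), ("B", "E"), ("E", "F")])
drug_targets = {"A", "F"}      # proteins bound by the candidate drug (assumed)
disease_genes = {"C", "D"}     # proteins associated with the disease (assumed)

def proximity(graph, targets, genes):
    """Average distance from each disease gene to its nearest drug target."""
    total = 0.0
    for gene in genes:
        total += min(nx.shortest_path_length(graph, s, gene) for s in targets)
    return total / len(genes)

print(proximity(interactome, drug_targets, disease_genes))  # 2.5 on this toy graph
```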


L2P: An Algorithm for Estimating Heavy-tailed Outcomes

arXiv.org Machine Learning

Many real-world prediction tasks have outcome (a.k.a. target or response) variables with characteristic heavy-tailed distributions. Examples include copies of books sold, auction prices of art pieces, etc. By learning heavy-tailed distributions, "big and rare" instances (e.g., the best-sellers) will have accurate predictions. Most existing approaches are not dedicated to learning heavy-tailed distributions; thus, they heavily under-predict such instances. To tackle this problem, we introduce Learning to Place (L2P), which exploits the pairwise relationships between instances to learn from a proportionally higher number of rare instances. L2P consists of two stages. In Stage 1, L2P learns a pairwise preference classifier: is instance A > instance B? In Stage 2, L2P learns to place a new instance into an ordinal ranking of known instances. Based on its placement, the new instance is then assigned a value for its outcome variable. Experiments on real data show that L2P outperforms competing approaches in terms of accuracy and capability to reproduce heavy-tailed outcome distributions. In addition, L2P can provide an interpretable model with explainable outcomes by placing each predicted instance in context with its comparable neighbors.
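
A minimal sketch of the two-stage idea follows, under simplifying assumptions: logistic regression on feature differences for Stage 1, a simple vote-counting placement for Stage 2, and synthetic data throughout. It illustrates the structure of the approach, not the paper's exact procedure.

```python
# Sketch of the two-stage Learning-to-Place idea.
# Stage 1 trains a pairwise "is A > B?" classifier on feature differences;
# Stage 2 places a new instance into the ranked training set and assigns it
# an outcome from the instances ranked around it. All data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.exp(2 * X[:, 0] + rng.normal(size=200))   # heavy-tailed outcome

# Stage 1: pairwise preference classifier on feature differences.
idx_a, idx_b = rng.integers(0, 200, 500), rng.integers(0, 200, 500)
pair_features = X[idx_a] - X[idx_b]
pair_labels = (y[idx_a] > y[idx_b]).astype(int)
clf = LogisticRegression().fit(pair_features, pair_labels)

# Stage 2: place a new instance by comparing it against every known instance,
# then read the outcome off the instances ranked around its placement.
order = np.argsort(y)                        # known instances, sorted by outcome
def predict(x_new, k=5):
    wins = clf.predict(x_new - X[order])     # 1 where x_new is predicted larger
    position = int(wins.sum())               # how many known instances it beats
    neighbors = order[max(0, position - k): position + k]
    return float(np.median(y[neighbors]))

print(predict(rng.normal(size=(1, 5))))
```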


Detection of Promoted Social Media Campaigns

AAAI Conferences

Information spreading on social media contributes to the formation of collective opinions. Millions of social media users are exposed every day to popular memes: some generated organically by grassroots activity, others sustained by advertising, information campaigns, or coordinated efforts of varying transparency. While most information campaigns are benign, some may have nefarious purposes, including terrorist propaganda, political astroturf, and financial market manipulation. This poses a crucial technological challenge with deep social implications: can we detect whether the spreading of a viral meme is being sustained by a promotional campaign? Here we study trending memes that attract attention either organically or by means of advertisement. We designed a machine learning framework capable of detecting promoted campaigns and separating them from organic ones in their early stages. Using a dataset of millions of posts associated with trending Twitter hashtags, we show that remarkably accurate early detection is possible, achieving a 95% AUC score. Feature selection analysis reveals that network diffusion patterns and content cues are powerful early detection signals.
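
The abstract does not specify the framework's internals; the sketch below only illustrates the general shape of the setup, a campaign classifier scored by cross-validated AUC. The four placeholder features and the synthetic labels are assumptions, not the paper's feature set.

```python
# Sketch of the supervised setup described above: each trending hashtag is
# represented by early diffusion and content features, and a classifier is
# scored by AUC. Features and labels here are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 400
# Placeholder features per hashtag: retweet-network size, mean follower count,
# content entropy, fraction of URLs in early posts (all assumed, not the paper's).
X = rng.normal(size=(n, 4))
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(size=n) > 0).astype(int)  # 1 = promoted

model = RandomForestClassifier(n_estimators=200, random_state=0)
auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
print(f"cross-validated AUC: {auc:.2f}")
```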


The DARPA Twitter Bot Challenge

arXiv.org Artificial Intelligence

A number of organizations, ranging from terrorist groups such as ISIS to politicians and nation states, reportedly conduct explicit campaigns to influence opinion on social media, posing a risk to democratic processes. There is thus a growing need to identify and eliminate "influence bots" (realistic, automated identities that illicitly shape discussion on sites like Twitter and Facebook) before they become too influential. Spurred by such events, DARPA held a four-week competition in February/March 2015 in which multiple teams supported by the DARPA Social Media in Strategic Communications program competed to identify a set of "influence bots", planted in advance to serve as ground truth, on a specific topic within Twitter. Past work on influence bots often has difficulty supporting claims about accuracy, since ground truth is limited (though some exceptions do exist [3,7]). Moreover, with the exception of [3], no past work has looked specifically at identifying influence bots on a specific topic. This paper describes the DARPA Challenge and the methods used by the three top-ranked teams.