AITopics | Çano, Erion

Collaborating Authors

Çano, Erion

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Transparent NLP: Using RAG and LLM Alignment for Privacy Q&A

Leschanowsky, Anna, Kolagar, Zahra, Çano, Erion, Habernal, Ivan, Hallinan, Dara, Habets, Emanuël A. P., Popp, Birgit

arXiv.org Artificial IntelligenceFeb-10-2025

The transparency principle of the General Data Protection Regulation (GDPR) requires data processing information to be clear, precise, and accessible. While language models show promise in this context, their probabilistic nature complicates truthfulness and comprehensibility. This paper examines state-of-the-art Retrieval Augmented Generation (RAG) systems enhanced with alignment techniques to fulfill GDPR obligations. We evaluate RAG systems incorporating an alignment module like Rewindable Auto-regressive Inference (RAIN) and our proposed multidimensional extension, MultiRAIN, using a Privacy Q&A dataset. Responses are optimized for preciseness and comprehensibility and are assessed through 21 metrics, including deterministic and large language model-based evaluations. Our results show that RAG systems with an alignment module outperform baseline RAG systems on most metrics, though none fully match human answers. Principal component analysis of the results reveals complex interactions between metrics, highlighting the need to refine metrics. This study provides a foundation for integrating advanced natural language processing systems into legal compliance frameworks.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2502.06652

Country:

North America > Canada (0.14)
Europe > Denmark (0.14)
Asia > China (0.14)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.48)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Comprehensive Survey on Legal Summarization: Challenges and Future Directions

Akter, Mousumi, Çano, Erion, Weber, Erik, Dobler, Dennis, Habernal, Ivan

arXiv.org Artificial IntelligenceJan-29-2025

The constant engagement with extensive written materials is fundamental and immensely time-consuming [104]. Legal professionals often spend hours, if not days, combing through documents to find precedents or relevant cases that could be pivotal to their current cases. This laborious process is a significant part of the workload of legal professionals like lawyers and judges, taking up lots of time that could be invested otherwise. Automatic summarization tools could help to condense lengthy legal documents into concise summaries, helping to save both time and costs. Moreover, integrating advanced Natural Language Processing (NLP) techniques into legal research holds significant promise for democratizing access to legal information. Figure 1 shows the general pipeline for legal summarization. Compared to other domains, legal texts present unique challenges that distinguish them from other document types. Legal documents tend to be longer and more detailed than those from other domains.

information retrieval, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2501.1783

Country:

North America > Canada (1.00)
Europe (1.00)
Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Law > Litigation (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Regional Government > Europe Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

AlbNews: A Corpus of Headlines for Topic Modeling in Albanian

Çano, Erion, Lamaj, Dario

arXiv.org Artificial IntelligenceFeb-6-2024

The scarcity of available text corpora for low-resource languages like Albanian is a serious hurdle for research in natural language processing tasks. This paper introduces AlbNews, a collection of 600 topically labeled news headlines and 2600 unlabeled ones in Albanian. The data can be freely used for conducting topic modeling research. We report the initial classification scores of some traditional machine learning classifiers trained with the AlbNews samples. These results show that basic models outrun the ensemble learning ones and can serve as a baseline for future experiments.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2402.04028

Country:

Europe (1.00)
Asia (0.69)
North America > United States > New York > New York County > New York City (0.15)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.72)

Add feedback

AlbNER: A Corpus for Named Entity Recognition in Albanian

Çano, Erion

arXiv.org Artificial IntelligenceSep-15-2023

Scarcity of resources such as annotated text corpora for under-resourced languages like Albanian is a serious impediment in computational linguistics and natural language processing research. This paper presents AlbNER, a corpus of 900 sentences with labeled named entities, collected from Albanian Wikipedia articles. Preliminary results with BERT and RoBERTa variants fine-tuned and tested with AlbNER data indicate that model size has slight impact on NER performance, whereas language transfer has a significant one. AlbNER corpus and these obtained results should serve as baselines for future experiments.

artificial intelligence, computational linguistic, natural language, (14 more...)

arXiv.org Artificial Intelligence

2309.08741

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Add feedback

MemeGraphs: Linking Memes to Knowledge Graphs

Kougia, Vasiliki, Fetzel, Simon, Kirchmair, Thomas, Çano, Erion, Baharlou, Sina Moayed, Sharifzadeh, Sahand, Roth, Benjamin

arXiv.org Artificial IntelligenceJun-26-2023

Memes are a popular form of communicating trends and ideas in social media and on the internet in general, combining the modalities of images and text. They can express humor and sarcasm but can also have offensive content. Analyzing and classifying memes automatically is challenging since their interpretation relies on the understanding of visual elements, language, and background knowledge. Thus, it is important to meaningfully represent these sources and the interaction between them in order to classify a meme as a whole. In this work, we propose to use scene graphs, that express images in terms of objects and their visual relations, and knowledge graphs as structured representations for meme classification with a Transformer-based architecture. We compare our approach with ImgBERT, a multimodal model that uses only learned (instead of structured) representations of the meme, and observe consistent improvements. We further provide a dataset with human graph annotations that we compare to automatically generated graphs and entity linking. Analysis shows that automatic methods link more entities than human annotators and that automatically generated graphs are better suited for hatefulness classification in memes.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2305.18391

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

AlbMoRe: A Corpus of Movie Reviews for Sentiment Analysis in Albanian

Çano, Erion

arXiv.org Artificial IntelligenceJun-14-2023

Lack of available resources such as text corpora for low-resource languages seriously hinders research on natural language processing and computational linguistics. This paper presents AlbMoRe, a corpus of 800 sentiment annotated movie reviews in Albanian. Each text is labeled as positive or negative and can be used for sentiment analysis research. Preliminary results based on traditional machine learning classifiers trained with the AlbMoRe samples are also reported. They can serve as comparison baselines for future research experiments.

artificial intelligence, computational linguistic, natural language, (15 more...)

arXiv.org Artificial Intelligence

2306.08526

Country:

Asia > Middle East > Qatar (0.15)
North America > United States > Oregon (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.52)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.76)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.76)

Add feedback

Batch Layer Normalization, A new normalization layer for CNNs and RNN

Ziaee, Amir, Çano, Erion

arXiv.org Artificial IntelligenceSep-19-2022

This study introduces a new normalization layer termed Batch Layer Normalization (BLN) to reduce the problem of internal covariate shift in deep neural network layers. As a combined version of batch and layer normalization, BLN adaptively puts appropriate weight on mini-batch and feature normalization based on the inverse size of mini-batches to normalize the input to a layer during the learning process. It also performs the exact computation with a minor change at inference times, using either mini-batch statistics or population statistics. The decision process to either use statistics of mini-batch or population gives BLN the ability to play a comprehensive role in the hyper-parameter optimization process of models. The key advantage of BLN is the support of the theoretical analysis of being independent of the input data, and its statistical configuration heavily depends on the task performed, the amount of training data, and the size of batches. Test results indicate the application potential of BLN and its faster convergence than batch normalization and layer normalization in both Convolutional and Recurrent Neural Networks. The code of the experiments is publicly available online (https://github.com/A2Amir/Batch-Layer-Normalization).

artificial intelligence, machine learning, normalization, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3571560.3571566

2209.08898

Country:

North America > United States > New York (0.14)
North America > United States > Hawaii (0.14)
Europe > Austria > Vienna (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Focused Contrastive Training for Test-based Constituency Analysis

Roth, Benjamin, Çano, Erion

arXiv.org Artificial IntelligenceSep-30-2021

We propose a scheme for self-training of grammaticality models for constituency analysis based on linguistic tests. A pre-trained language model is fine-tuned by contrastive estimation of grammatical sentences from a corpus, and ungrammatical sentences that were perturbed by a syntactic test, a transformation that is motivated by constituency theory. We show that consistent gains can be achieved if only certain positive instances are chosen for training, depending on whether they could be the result of a test transformation. This way, the positives, and negatives exhibit similar characteristics, which makes the objective more challenging for the language model, and also allows for additional markup that indicates the position of the test application within the sentence.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2109.15159

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.47)

Add feedback