 South America


Offensive Language Detection with BERT-based models, By Customizing Attention Probabilities

arXiv.org Artificial Intelligence

This paper describes a novel study on using the 'Attention Mask' input in transformers to detect offensive content in both English and Persian. Its principal focus is a methodology for enhancing the performance of BERT-based models on the 'Offensive Language Detection' task. To that end, we customize attention probabilities by changing the 'Attention Mask' input to create more effective word embeddings. We first tokenize the training set of each dataset with the BERT tokenizer. Then, we apply Multinomial Naive Bayes to map these tokens to two probabilities. These probabilities indicate the likelihood that a text is non-offensive or offensive, given that it contains that token. Afterwards, we use these probabilities to define a new term, the Offensive Score. Next, because the employed datasets differ in type, we create a separate equation for each language, based on Offensive Scores, to re-distribute the 'Attention Mask' input so that more attention is paid to more offensive phrases. Finally, we use the F1-macro score as our evaluation metric and fine-tune several combinations of BERT with ANNs, CNNs and RNNs to examine the effect of this methodology on each combination. The results indicate that all models improve with this methodology. The largest improvements were 2% and 10% for English and Persian, respectively.
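The token-scoring step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not give the Offensive Score formula or the per-language attention equations, so `offensive_scores` and `soft_attention_mask` are hypothetical stand-ins, and whitespace-split tokens stand in for BERT tokenizer output. Only the smoothed Multinomial Naive Bayes likelihoods follow the standard definition.

```python
from collections import Counter

def token_class_probabilities(docs, labels, alpha=1.0):
    """Multinomial Naive Bayes token likelihoods with Laplace smoothing.

    docs: list of token lists; labels: 0 = non-offensive, 1 = offensive.
    Returns {token: (P(token | non-offensive), P(token | offensive))}.
    """
    counts = {0: Counter(), 1: Counter()}
    for tokens, label in zip(docs, labels):
        counts[label].update(tokens)
    vocab = set(counts[0]) | set(counts[1])
    # Smoothed denominator: class token count plus alpha per vocabulary entry.
    totals = {c: sum(counts[c].values()) + alpha * len(vocab) for c in (0, 1)}
    return {
        t: tuple((counts[c][t] + alpha) / totals[c] for c in (0, 1))
        for t in vocab
    }

def offensive_scores(probs):
    """ASSUMPTION: score = p_off / (p_off + p_non), a value in (0, 1).
    The paper's actual Offensive Score definition is not given in the abstract."""
    return {t: p_off / (p_off + p_non) for t, (p_non, p_off) in probs.items()}

def soft_attention_mask(tokens, scores, unknown=0.5):
    """ASSUMPTION: weight each token by 1 + score so that tokens with a higher
    score get more attention. The paper uses its own per-language equations."""
    return [1.0 + scores.get(t, unknown) for t in tokens]

# Toy training set (hypothetical data, for illustration only).
docs = [["you", "idiot"], ["nice", "day"], ["idiot", "fool"], ["have", "nice", "day"]]
labels = [1, 0, 1, 0]

scores = offensive_scores(token_class_probabilities(docs, labels))
mask = soft_attention_mask(["you", "nice", "stranger"], scores)
```

In a real pipeline the resulting weights would replace the usual 0/1 attention mask passed to the model; how they are scaled and combined is exactly what the paper's per-language equations specify.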


Estimating IRI based on pavement distress type, density, and severity: Insights from machine learning techniques

arXiv.org Machine Learning

Surface roughness is a primary measure of pavement performance that has been associated with ride quality and vehicle operating costs. Of all the surface roughness indicators, the International Roughness Index (IRI) is the most widely used. However, it is costly to measure IRI, and for this reason, certain road classes are excluded from IRI measurements at a network level. Higher levels of distress are generally associated with higher roughness. However, for a given roughness level, pavement data typically exhibit a great deal of variability in distress type, density, and severity. It is hypothesized that the IRI of a pavement section can be estimated from its distress types and their respective densities and severities. To investigate this hypothesis, this paper uses data from in-service pavements and machine learning methods to ascertain the extent to which IRI can be predicted given a set of pavement attributes. The results suggest that machine learning can be used reliably to estimate IRI based on the measured distress types and their respective densities and severities. The analysis also showed that IRI estimated this way depends on the pavement type and functional class. The paper also includes an exploratory section that addresses the reverse situation, that is, estimating the probability of pavement distress type distribution and occurrence severity/extent based on a given roughness level.


Explainable Fact-checking through Question Answering

arXiv.org Artificial Intelligence

Misleading or false information has been creating chaos in some places around the world. To mitigate this issue, many researchers have proposed automated fact-checking methods to fight the spread of fake news. However, most methods cannot explain the reasoning behind their decisions, and so fail to build trust between machines and the humans using such technology. Trust is essential for fact-checking to be applied in the real world. Here, we address fact-checking explainability through question answering. In particular, we propose generating questions and answers from claims and answering the same questions from evidence. We also propose an answer comparison model with an attention mechanism attached to each question. Leveraging question answering as a proxy, we break down automated fact-checking into several steps; this separation aids models' explainability, as it allows for more detailed analysis of their decision-making processes. Experimental results show that the proposed model can achieve state-of-the-art performance while providing reasonable explainability.


Brazil lawmakers approve bill regulating artificial intelligence

#artificialintelligence

Brazil's House of Representatives has approved a bill that sets out legal regulations for artificial intelligence (AI). Bill No. 21/20 outlines guidelines for developing and using AI in Brazil. The bill will regulate transparency regarding the use of AI in the public sector, promote the creation of AI for the public sector, and require the "adoption of regulatory instruments that promote innovation." AI can predict and make decisions when implemented into computer systems and machines. The author of the project, Deputy Eduardo Bismarck (PDT-CE), welcomed both the innovation and the newly introduced regulations. He stated that "the time is now to outline principles: rights and duties and responsibilities" to account for this innovation already integrated into reality.


In Israel, Google tests making traffic lights more efficient

#artificialintelligence

Google is running a pilot project in Israel that could use artificial intelligence to make traffic lights 10% to 20% more efficient for drivers. In a blog post touting the technology giant's current efforts in the field of environmentally sustainable solutions, Google and Alphabet CEO Sundar Pichai wrote: "We're finding ways to make routes more efficient, across an entire city, with early research into using artificial intelligence to optimize the efficiency of traffic lights. We've been piloting this research in Israel to predict traffic conditions and improve the timing of when traffic lights change. So far, we are seeing a 10%-20% reduction in fuel consumption and delay time at intersections. We're excited to expand these pilots to Rio de Janeiro and beyond."


AI: the Inverse Tower of Babel

#artificialintelligence

I've always found it striking that the acronym for artificial intelligence in English, AI, is surprisingly similar to the first two characters for that word in simplified Chinese: '人工智能'. The first two characters, 人工, mean 'people' and 'work' individually, but together they mean 'artificial', while '智能' means 'intelligent.' This is quite a fascinating linguistic experiment, and it's interesting that the two most widely used languages in the world came up with a similar acronym or character for one of the most important technologies ever invented by man. Perhaps there is some weird universal synergy going on, or maybe there's an easy answer hidden somewhere deep within the linguistic annals of these two languages. Either way, this got me thinking about language.


Cognitive/Artificial Intelligence Systems Market Analysis by Recent Developments and Demand 2021 to 2027 - Amite Tangy Digest

#artificialintelligence

The Cognitive/Artificial Intelligence Systems Market report includes a comprehensive analysis of the global market. It investigates past progress, ongoing market scenarios, and future prospects, and includes data on the products, strategies, and market share of the leading companies in this market. The report provides a 360-degree overview of the global market's competitive landscape. It further predicts the size and valuation of the global market during the forecast period.


"AI for Impact" lives up to its name

#artificialintelligence

For entrepreneurial MIT students looking to put their skills to work for a greater good, the Media Arts and Sciences class MAS.664 (AI for Impact) has been a destination point. With the onset of the pandemic, that goal came into even sharper focus. Just weeks before the campus shut down in 2020, a team of students from the class launched a project that would make significant strides toward an open-source platform to identify coronavirus exposures without compromising personal privacy. Their work was at the heart of Safe Paths, one of the earliest contact tracing apps in the United States. The students joined with volunteers from other universities, medical centers, and companies to publish their code, alongside a well-received white paper describing the privacy-preserving, decentralized protocol, all while working with organizations wishing to launch the app within their communities.



How to Create Dummy Data in Python

#artificialintelligence

Dummy data is randomly generated data that can be substituted for live data. Whether you are a developer, software engineer, or data scientist, you sometimes need dummy data to test what you have built, whether that is a web app, a mobile app, or a machine learning model. If you are using Python, you can use the Faker package to create dummy data of many types, for example dates, transactions, names, texts, and times. Faker is a simple Python package that generates fake data of different data types. It is heavily inspired by PHP Faker, Perl Faker, and Ruby Faker.
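As a minimal sketch of what this can look like in practice, the snippet below builds a few dummy user records with Faker. The helper name `make_dummy_users` and the record fields are illustrative assumptions, not part of Faker itself, and the standard-library fallback exists only so the snippet still runs where Faker is not installed (`pip install Faker`).

```python
import datetime
import random

try:
    from faker import Faker  # third-party package
except ImportError:  # fall back to the standard library if Faker is missing
    Faker = None

def make_dummy_users(n, seed=0):
    """Return n dummy user records (hypothetical helper for illustration)."""
    if Faker is not None:
        Faker.seed(seed)  # seed the shared generator for reproducible output
        fake = Faker()
        return [
            {
                "name": fake.name(),
                "email": fake.email(),
                "signup_date": fake.date(),          # string, 'YYYY-MM-DD' by default
                "bio": fake.text(max_nb_chars=80),
            }
            for _ in range(n)
        ]
    # Fallback: crude stand-in built from random choices over fixed lists.
    rng = random.Random(seed)
    first = ["Ana", "Luis", "Maria", "Joao"]
    last = ["Silva", "Costa", "Lima", "Rocha"]
    users = []
    for _ in range(n):
        name = f"{rng.choice(first)} {rng.choice(last)}"
        day = datetime.date(2020, 1, 1) + datetime.timedelta(days=rng.randrange(365))
        users.append({
            "name": name,
            "email": name.lower().replace(" ", ".") + "@example.com",
            "signup_date": day.isoformat(),
            "bio": "Placeholder text for testing.",
        })
    return users

users = make_dummy_users(3)
for user in users:
    print(user)
```

Seeding makes the generated records repeatable between test runs, which is usually what you want when the dummy data feeds automated tests.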