bulgaria
Ants can be used to make 'tangy' yogurt
An old family recipe from Bulgaria goes under the microscope. Typically, we humans do our best to keep ants out of our kitchens and away from our food. But an old and almost forgotten recipe harnesses the power of these hard-working insects to make yogurt. The recipe, which was once common across Turkey (or Türkiye) and the Balkans, has been recreated in a study published today in the journal .
- Asia > Middle East > Republic of Türkiye (0.56)
- Europe > Bulgaria (0.27)
- Europe > Denmark > Capital Region > Copenhagen (0.05)
- (11 more...)
Rare cataclysmic exploding star spotted by citizen scientists
Two years ago, a team of astronomers requested help from citizen scientists around the world for the Kilonova Seekers Project. Launched in July 2023, the endeavor tasks volunteers with parsing through all-sky survey images captured daily by the telescopes of the Gravitational-wave Optical Transient Observer (GOTO), located on opposite sides of the planet. Within six months, more than 2,000 Kilonova Seekers volunteers contributed over 600,000 classifications to researchers, resulting in a total of 20 new discoveries. Now, astronomers have announced the project's first major published find in Astronomy & Astrophysics: a brilliant exploding star observed in near real-time.
ParsiPy: NLP Toolkit for Historical Persian Texts in Python
Farsi, Farhan, Fazel, Parnian, Haghighi, Sepand, Sabouri, Sadra, Goshtasb, Farzaneh, Hajipour, Nadia, Asgari, Ehsaneddin, Sameti, Hossein
The study of historical languages presents unique challenges due to their complex orthographic systems, fragmentary textual evidence, and the absence of standardized digital representations of text in those languages. Tackling these challenges requires specialized NLP tools that can handle phonetic transcriptions and analyze ancient texts. This work introduces ParsiPy, an NLP toolkit designed to facilitate the analysis of historical Persian languages by offering modules for tokenization, lemmatization, part-of-speech tagging, phoneme-to-transliteration conversion, and word embedding. We demonstrate the utility of our toolkit through the processing of Parsig (Middle Persian) texts, highlighting its potential for expanding computational methods in the study of historical languages. Through this work, we contribute to computational philology, offering tools that can be adapted for the broader study of ancient texts and their digital preservation.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > California (0.14)
- Europe > Bulgaria > Varna Province > Varna (0.05)
- (8 more...)
BgGPT 1.0: Extending English-centric LLMs to other languages
Alexandrov, Anton, Raychev, Veselin, Dimitrov, Dimitar I., Zhang, Ce, Vechev, Martin, Toutanova, Kristina
We present BgGPT-Gemma-2-27B-Instruct and BgGPT-Gemma-2-9B-Instruct: continually pretrained and fine-tuned versions of Google's Gemma-2 models, specifically optimized for Bulgarian language understanding and generation. Leveraging Gemma-2's multilingual capabilities and over 100 billion tokens of Bulgarian and English text data, our models demonstrate strong performance in Bulgarian language tasks, setting a new standard for language-specific AI models. Our approach maintains the robust capabilities of the original Gemma-2 models, ensuring that the English language performance remains intact. To preserve the base model capabilities, we incorporate continual learning strategies based on recent Branch-and-Merge techniques as well as thorough curation and selection of training data. We provide detailed insights into our methodology, including the release of model weights with a commercial-friendly license, enabling broader adoption by researchers, companies, and hobbyists. Further, we establish a comprehensive set of benchmarks based on non-public educational data sources to evaluate models on Bulgarian language tasks as well as safety and chat capabilities. Our findings demonstrate the effectiveness of fine-tuning state-of-the-art models like Gemma 2 to enhance language-specific AI applications while maintaining cross-lingual capabilities.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Bulgaria (0.05)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- (13 more...)
Are Large Language Models Chameleons?
Geng, Mingmeng, He, Sihong, Trotta, Roberto
Do large language models (LLMs) have their own worldviews and personality tendencies? Simulations in which an LLM was asked to answer subjective questions were conducted more than 1 million times. Comparison of the responses from different LLMs with real data from the European Social Survey (ESS) suggests that the effect of prompts on bias and variability is fundamental, highlighting major cultural, age, and gender biases. Methods for measuring the difference between LLMs and survey data are discussed, such as calculating weighted means and a new proposed measure inspired by Jaccard similarity. We conclude that it is important to analyze the robustness and variability of prompts before using LLMs to model individual decisions or collective behavior, as their imitation abilities are approximate at best.
- Europe > Bulgaria (0.15)
- North America > United States (0.14)
- Europe > Germany (0.05)
- (31 more...)
- Research Report (1.00)
- Questionnaire & Opinion Survey (1.00)
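The abstract above mentions comparing LLM answers with survey data through weighted means and a measure inspired by Jaccard similarity. The following is a minimal Python sketch of those two building blocks only, assuming numeric survey answers with per-respondent weights and answer sets represented as Python sets. The function names are hypothetical, and the plain Jaccard index shown here is the standard definition, not the paper's proposed measure, which the abstract does not specify.

```python
def weighted_mean(values, weights):
    """Weighted mean of numeric answers (e.g. survey responses with design weights)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


def jaccard_similarity(a, b):
    """Standard Jaccard index |A ∩ B| / |A ∪ B| between two collections of answers."""
    a, b = set(a), set(b)
    if not a and not b:
        # Two empty sets are treated as identical (a convention assumed here).
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical data: two answers on a numeric scale with unequal weights.
    print(weighted_mean([2, 4], [1, 3]))
    # Hypothetical answer sets from a survey and from a model.
    print(jaccard_similarity({1, 2, 3}, {2, 3, 4}))
```

A weighted mean summarizes the central tendency of a weighted sample, while the Jaccard index compares which answers appear at all, so the two capture different kinds of disagreement between simulated and real responses.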
cantnlp@LT-EDI-2024: Automatic Detection of Anti-LGBTQ+ Hate Speech in Under-resourced Languages
Wong, Sidney G. -J., Durward, Matthew
This paper describes our system for detecting homophobia and transphobia in social media comments, developed as part of the shared task at LT-EDI-2024. We took a transformer-based approach to develop our multiclass classification model for ten language conditions (English, Spanish, Gujarati, Hindi, Kannada, Malayalam, Marathi, Tamil, Tulu, and Telugu). We introduced synthetic and organic instances of script-switched language data during domain adaptation to mirror the linguistic realities of social media language as seen in the labelled training data. Our system ranked second for Gujarati and Telugu, with varying levels of performance for other language conditions. The results suggest that incorporating elements of paralinguistic behaviour such as script-switching may improve the performance of language detection systems, especially for under-resourced language conditions.
- Oceania > New Zealand (0.05)
- Europe > Bulgaria > Varna Province > Varna (0.05)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- (4 more...)
Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2023): Workshop and Shared Task Report
Hürriyetoğlu, Ali, Tanev, Hristo, Mutlu, Osman, Thapa, Surendrabikram, Tan, Fiona Anting, Yörük, Erdem
We provide a summary of the sixth edition of the CASE workshop, held in the scope of RANLP 2023. The workshop consists of regular papers, three keynotes, working papers of shared task participants, and shared task overview papers. This workshop series has been bringing together all aspects of event information collection across technical and social science fields. In addition to contributing to the progress in text-based event extraction, the workshop provides a space for the organization of a multimodal event information collection task.
- Europe > Ukraine (0.14)
- Asia > Russia (0.14)
- Europe > Bulgaria > Varna Province > Varna (0.07)
- (15 more...)
- Research Report (1.00)
- Instructional Material > Course Syllabus & Notes (0.88)
On the trail of the Dark Avenger: the most dangerous virus writer in the world
In the 1980s, there was no better place than Bulgaria for virus lovers. The socialist country – plagued by hyperinflation, crumbling infrastructure, food and petrol rationing, daily blackouts and packs of wild dogs in its streets – had become one of the hottest hi-tech zones on the planet. Legions of young Bulgarian programmers were tinkering on their pirated IBM PC clones, pumping out computer viruses that managed to travel to the gleaming and prosperous west. In 1989, an article appeared in Bulgaria's leading computer magazine saying the media's treatment of computer viruses was sensationalist and inaccurate. The article, in the January issue of Bulgaria's Computer for You magazine, titled The Truth About Computer Viruses, was written by Vesselin Bontchev, a 29-year-old researcher at the Institute of Industrial Cybernetics and Robotics at the Bulgarian Academy of Sciences in Sofia. Fear of computer viruses, Bontchev wrote, was turning into "mass psychosis". Any competent programmer, Bontchev claimed, could tell when files are corrupted by a virus. Infected files are bigger than uninfected files. They do strange things, such as play tunes, draw Christmas trees on the screen and reboot computers. It was hard to miss a virus! Prevention through basic cyber hygiene was simple: "Do not allow other people to use your computer; do not use suspicious software products; do not use software products acquired illegally."
- Europe > Austria > Vienna (0.06)
- North America > United States > Missouri > St. Louis County > St. Louis (0.04)
- Europe > Russia (0.04)
- (2 more...)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Robots (0.54)
Senior Data Engineer at SumUp - Sofia, Bulgaria
The team does so by providing real-time models and batch applications in the realm of risk and financial crime over the whole SumUp life cycle and products. Together with the risk platform squad, the risk modelling team builds the necessary platform foundations for scalable and reliable ML model serving and development in SumUp. The platform enables a global approach supported by local specifics. Are you up for the challenge? At SumUp, we are driven to empower small businesses across the globe by de-hassling their lives and helping them to succeed.
- Europe > Bulgaria > Sofia City Province > Sofia (0.40)
- South America (0.06)
- North America > United States (0.06)
- Banking & Finance (0.55)
- Law Enforcement & Public Safety (0.39)
Machine Learning Consultant at Experian - Sofia, Bulgaria
Experian is the world's leading global information services company. During life's big moments -- from buying a home or a car to sending a child to college to growing a business by connecting with new customers -- we empower consumers and our clients to manage their data with confidence. We have 20,000 people operating across 44 countries. By investing in our people, technology and innovation, we can help transform businesses, help communities prosper, enable more people to feel included in the financial opportunities that should be available to them, and help people to thrive. We're looking for inspired employees who want to make an impact on people and business.