Alliance
Geological Inference from Textual Data using Word Embeddings
Linphrachaya, Nanmanas, Gómez-Méndez, Irving, Siripatana, Adil
This research explores the use of Natural Language Processing (NLP) techniques to locate geological resources, with a specific focus on industrial minerals. By using word embeddings trained with the GloVe model, we extract semantic relationships between target keywords and a corpus of geological texts. The text is filtered to retain only words with geographical significance, such as city names, which are then ranked by their cosine similarity to the target keyword. Dimensional reduction techniques, including Principal Component Analysis (PCA), Autoencoder, Variational Autoencoder (VAE), and VAE with Long Short-Term Memory (VAE-LSTM), are applied to enhance feature extraction and improve the accuracy of semantic relations. For benchmarking, we calculate the proximity between the ten cities most semantically related to the target keyword and identified mine locations using the haversine equation. The results demonstrate that combining NLP with dimensional reduction techniques provides meaningful insights into the spatial distribution of natural resources. Although the result shows to be in the same region as the supposed location, the accuracy has room for improvement.
- Europe > United Kingdom (0.05)
- Asia > Indonesia > Java > Jakarta > Jakarta (0.05)
- North America > Canada > British Columbia (0.04)
- (32 more...)
- Energy (0.94)
- Materials > Metals & Mining > Lithium (0.50)
Data Science Education in Undergraduate Physics: Lessons Learned from a Community of Practice
Shah, Karan, Butler, Julie, Knaub, Alexis, Zenginoğlu, Anıl, Ratcliff, William, Soltanieh-ha, Mohammad
It is becoming increasingly important that physics educators equip their students with the skills to work with data effectively. However, many educators may lack the necessary training and expertise in data science to teach these skills. To address this gap, we created the Data Science Education Community of Practice (DSECOP), bringing together graduate students and physics educators from different institutions and backgrounds to share best practices and lessons learned from integrating data science into undergraduate physics education. In this article we present insights and experiences from this community of practice, highlighting key strategies and challenges in incorporating data science into the introductory physics curriculum. Our goal is to provide guidance and inspiration to educators who seek to integrate data science into their teaching, helping to prepare the next generation of physicists for a data-driven world.
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Ohio > Stark County > Alliance (0.04)
- (5 more...)
- Research Report (1.00)
- Instructional Material > Course Syllabus & Notes (1.00)
Neural Approaches to Entity-Centric Information Extraction
Artificial Intelligence (AI) has huge impact on our daily lives with applications such as voice assistants, facial recognition, chatbots, autonomously driving cars, etc. Natural Language Processing (NLP) is a cross-discipline of AI and Linguistics, dedicated to study the understanding of the text. This is a very challenging area due to unstructured nature of the language, with many ambiguous and corner cases. In this thesis we address a very specific area of NLP that involves the understanding of entities (e.g., names of people, organizations, locations) in text. First, we introduce a radically different, entity-centric view of the information in text. We argue that instead of using individual mentions in text to understand their meaning, we should build applications that would work in terms of entity concepts. Next, we present a more detailed model on how the entity-centric approach can be used for the entity linking task. In our work, we show that this task can be improved by considering performing entity linking at the coreference cluster level rather than each of the mentions individually. In our next work, we further study how information from Knowledge Base entities can be integrated into text. Finally, we analyze the evolution of the entities from the evolving temporal perspective.
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- North America > United States > Ohio > Stark County > Alliance (0.04)
- (25 more...)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Research Report > Experimental Study (0.92)
- Law (1.00)
- Information Technology (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- (10 more...)
Machine Learning Technique Predicting Video Streaming Views to Reduce Cost of Cloud Services
Video streams tremendously occupied the highest portion of online traffic. Multiple versions of a video are created to fit the user's device specifications. In cloud storage, Keeping all versions of frequently accessed video streams in the repository for the long term imposes a significant cost paid by video streaming providers. Generally, the popularity of a video changes each period of time, which means the number of views received by a video could be dropped, thus, the video must be deleted from the repository. Therefore, in this paper, we develop a method that predicts the popularity of each video stream in the repository in the next period. On the other hand, we propose an algorithm that utilizes the predicted popularity of a video to compute the storage cost, and then it decides whether the video will be kept or deleted from the cloud repository. The experiment results show a cost reduction of the cloud services by 15% compared to keeping all video streams.
Determining Sentencing Recommendations and Patentability Using a Machine Learning Trained Expert System
Brown, Logan, Pezewski, Reid, Straub, Jeremy
This paper presents two studies that use a machine learning expert system (MLES). One focuses on a system to advise to United States federal judges for regarding consistent federal criminal sentencing, based on both the federal sentencing guidelines and offender characteristics. The other study aims to develop a system that could prospectively assist the U.S. Patent and Trademark Office automate their patentability assessment process. Both studies use a machine learning-trained rule-fact expert system network to accept input variables for training and presentation and output a scaled variable that represents the system recommendation (e.g., the sentence length or the patentability assessment). This paper presents and compares the rule-fact networks that have been developed for these projects. It explains the decision-making process underlying the structures used for both networks and the pre-processing of data that was needed and performed. It also, through comparing the two systems, discusses how different methods can be used with the MLES system.
- North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.04)
- North America > United States > Ohio > Stark County > Alliance (0.04)
- North America > United States > North Dakota > Cass County > Fargo (0.04)
- (5 more...)