South America
NLP Stemming
In my February blog post, I explained how to use tokenization, a Natural Language Processing (NLP) technique, to predict whether a particular Tweet could be geolocated to a particular neighborhood in the city of Caracas, Venezuela. Almost 37,000 Spanish-language Tweets tagged with a latitude and longitude inside Caracas were used to observe reactions to the food shortages within each of the city's five municipalities from December 2014 to October 2016.
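As a rough illustration of the tokenization step mentioned above, the following minimal sketch splits a Spanish Tweet into lowercase tokens with a regular expression. The pattern, the function name `tokenize`, and the sample Tweet are assumptions made for illustration; they are not taken from the blog post, which may rely on a dedicated NLP library instead.

```python
import re

# Hypothetical pattern: an optional '#' or '@' prefix followed by word
# characters. In Python 3, \w also matches accented Spanish letters.
TOKEN_PATTERN = re.compile(r"[#@]?\w+")


def tokenize(tweet: str) -> list[str]:
    """Lowercase a Tweet and split it into word-like tokens."""
    return TOKEN_PATTERN.findall(tweet.lower())


# Invented sample Tweet, used only to exercise the function.
example = "No hay harina en #Caracas, la cola es larguísima"
tokens = tokenize(example)
print(tokens)
```

Keeping hashtags and mentions as single tokens is a design choice that tends to suit Tweets; a real pipeline would also need to handle URLs and emoji.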
Netherlands Dataset: A New Public Dataset for Machine Learning in Seismic Interpretation
Silva, Reinaldo Mozart, Baroni, Lais, Ferreira, Rodrigo S., Civitarese, Daniel, Szwarcman, Daniela, Brazil, Emilio Vital
Machine learning and, more specifically, deep learning algorithms have seen remarkable growth in their popularity and usefulness in recent years. This is arguably due to three main factors: powerful computers, new techniques to train deeper networks, and larger datasets. Although the first two are readily available in modern computers and ML libraries, the last one remains a challenge for many domains. Big data is a reality in almost all fields nowadays, and the geosciences are no exception. However, to achieve the success of general-purpose applications such as ImageNet, for which there are more than 14 million labeled images for 1000 target classes, we need not only more data but more high-quality labeled data. In the oil and gas industry, confidentiality issues hamper the sharing of datasets even further. In this work, we present the Netherlands interpretation dataset, a contribution to the development of machine learning in seismic interpretation. The Netherlands F3 dataset was acquired in the North Sea, offshore the Netherlands. The data is publicly available and contains post-stack data, 8 horizons, and well logs from 4 wells. For the purposes of our machine learning tasks, the original dataset was reinterpreted, generating 9 horizons separating different seismic facies intervals. The interpreted horizons were used to generate approximately 190,000 labeled images for inlines and crosslines. Finally, we present two deep learning applications in which the proposed dataset was employed and produced compelling results.
On the Functional Equivalence of TSK Fuzzy Systems to Neural Networks, Mixture of Experts, CART, and Stacking Ensemble Regression
Wu, Dongrui, Lin, Chin-Teng, Huang, Jian, Zeng, Zhigang
Fuzzy systems have achieved great success in numerous applications. However, there are still many challenges in designing an optimal fuzzy system, e.g., how to efficiently train its parameters, how to improve its performance without adding too many parameters, how to balance the trade-off between cooperation and competition among the rules, how to overcome the curse of dimensionality, etc. The literature has shown that by making appropriate connections between fuzzy systems and other machine learning approaches, good practices from other domains may be used to improve fuzzy systems, and vice versa. This paper gives an overview of the functional equivalence between Takagi-Sugeno-Kang fuzzy systems and four classic machine learning approaches -- neural networks, mixture of experts, classification and regression trees, and stacking ensemble regression -- for regression problems. We also point out some promising new research directions, inspired by the functional equivalence, that could lead to solutions to the aforementioned problems. To our knowledge, this is so far the most comprehensive overview of the connections between fuzzy systems and other popular machine learning approaches, and we hope it will stimulate more hybridization between different machine learning algorithms.
Iteratively Learning Embeddings and Rules for Knowledge Graph Reasoning
Zhang, Wen, Paudel, Bibek, Wang, Liang, Chen, Jiaoyan, Zhu, Hai, Zhang, Wei, Bernstein, Abraham, Chen, Huajun
Reasoning is essential for the development of large knowledge graphs, especially for completion, which aims to infer new triples based on existing ones. Both rules and embeddings can be used for knowledge graph reasoning, and each has its own advantages and difficulties. Rule-based reasoning is accurate and explainable, but rule learning that searches over the graph is inefficient because of the huge search space. Embedding-based reasoning is more scalable and efficient because the reasoning is conducted via computation between embeddings, but it has difficulty learning good representations for sparse entities because a good embedding relies heavily on data richness. Based on this observation, in this paper we explore how embedding and rule learning can be combined so that the advantages of each compensate for the difficulties of the other. We propose a novel framework, IterE, that iteratively learns embeddings and rules: rules are learned from embeddings with a proper pruning strategy, and embeddings are learned from existing triples and new triples inferred by rules. Evaluations of the embedding quality of IterE show that rules help improve the quality of sparse entity embeddings and their link prediction results. We also evaluate the efficiency of rule learning and the quality of rules from IterE compared with AMIE+, showing that IterE is capable of generating high-quality rules more efficiently. Experiments show that iteratively learned embeddings and rules benefit each other during learning and prediction.
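To make the idea of "new triples inferred by rules" concrete, here is a minimal sketch that applies one chain-shaped rule to a toy knowledge graph. It is not the IterE algorithm: the function `apply_chain_rule`, the relation names, and the toy triples are all hypothetical, and the rule is given by hand rather than learned from embeddings.

```python
def apply_chain_rule(triples, body1, body2, head):
    """Infer (x, head, z) whenever (x, body1, y) and (y, body2, z) both exist.

    Only triples that are not already in the graph are returned.
    """
    # Index the objects of the second body relation by their subject.
    objects_by_subject = {}
    for s, r, o in triples:
        if r == body2:
            objects_by_subject.setdefault(s, set()).add(o)

    inferred = set()
    for s, r, o in triples:
        if r == body1:
            for z in objects_by_subject.get(o, ()):
                candidate = (s, head, z)
                if candidate not in triples:
                    inferred.add(candidate)
    return inferred


# Toy graph with invented entities and relations.
kg = {
    ("alice", "born_in", "lyon"),
    ("bob", "born_in", "lyon"),
    ("lyon", "city_of", "france"),
    ("bob", "nationality", "france"),
}

# Hand-written rule: born_in(x, y) and city_of(y, z) imply nationality(x, z).
new_triples = apply_chain_rule(kg, "born_in", "city_of", "nationality")
print(new_triples)
```

In an iterative scheme of the kind the abstract describes, triples like these would be added to the training data for the next round of embedding learning.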
Polynomial and Exponential Bounded Logic Programs with Function Symbols: Some New Decidable Classes
Asuncion, Vernon, Zhang, Yan, Zhang, Heng, Li, Ruixuan
A logic program with function symbols is called finitely ground if there is a finite propositional logic program whose stable models are exactly the same as the stable models of this program. Finite groundability is an important property for logic programs with function symbols because it makes it feasible to compute such programs' stable models using traditional ASP solvers. In this paper, we introduce new decidable classes of finitely ground programs called poly-bounded and k-EXP-bounded programs, which, to the best of our knowledge, strictly contain all other decidable classes of finitely ground programs discovered so far in the literature. We also study the relevant complexity properties for these classes of programs. We prove that the membership complexities for poly-bounded and k-EXP-bounded programs are EXPTIME-complete and (k+1)-EXPTIME-complete, respectively.
Extracting Frequent Gradual Patterns Using Constraints Modeling
Lonlac, Jerry, Jabbour, Saïd, Nguifo, Engelbert Mephu, Saïs, Lakhdar, Raddaoui, Badran
In this paper, we propose a constraint-based modeling approach for the problem of discovering frequent gradual patterns in a numerical dataset. This SAT-based declarative approach makes it possible to benefit from recent progress in satisfiability testing and to exploit the efficiency of modern SAT solvers for enumerating all frequent gradual patterns in a numerical dataset. Our approach can easily be extended with extra constraints, such as temporal constraints, in order to extract more specific patterns in a broad range of gradual pattern mining applications. We show the practical feasibility of our SAT model by running experiments on two real-world datasets.
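For readers unfamiliar with gradual patterns (statements such as "the higher the age, the higher the salary"), the sketch below computes the support of one pattern by brute force. It is not the SAT encoding proposed in the paper. It assumes one common definition of support, the fraction of row pairs that can be ordered so that every attribute varies in the required direction; the paper may use a different definition. The function name and the data are hypothetical.

```python
from itertools import combinations


def gradual_support(rows, pattern):
    """Fraction of row pairs that respect a gradual pattern.

    pattern is a list of (attribute_index, direction) items, where
    direction is '+' (increasing) or '-' (decreasing).
    """
    def respects(a, b):
        # True if going from row a to row b moves every attribute
        # strictly in its required direction.
        return all(
            (a[i] < b[i]) if direction == "+" else (a[i] > b[i])
            for i, direction in pattern
        )

    pairs = list(combinations(rows, 2))
    if not pairs:
        return 0.0
    concordant = sum(1 for a, b in pairs if respects(a, b) or respects(b, a))
    return concordant / len(pairs)


# Invented (age, salary) rows.
data = [(20, 1200), (25, 1500), (30, 1400), (35, 2000)]

# Pattern: "the higher the age, the higher the salary".
support = gradual_support(data, [(0, "+"), (1, "+")])
print(support)
```

A pattern would be reported as frequent when its support reaches a user-chosen threshold; enumerating all such patterns efficiently is the part the abstract delegates to a SAT solver.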
Discovering Heterogeneous Subsequences for Trajectory Classification
Ferrero, Carlos Andres, Petry, Lucas May, Alvares, Luis Otavio, Zalewski, Willian, Bogorny, Vania
However, these works are limited to considering only three dimensions: space, time, and semantics. Ferrero in [2] introduced the concept of Multiple Aspect Trajectory Analysis, which consists of analyzing trajectory data by integrating other movement aspects to further enrich it, such as more information about the visited places, the transportation modes, the weather conditions, and the social interactions. The proposal in [2] is that the time has come to integrate all relevant information about movement in trajectories and to explore trajectory analysis over several layers of information. An example of this new kind of trajectory is shown in Figure 1.
How AI, blockchain and autonomous vehicles will shake up insurance: Allianz exec
Dogan Kaleli, head of programs for Allianz Global Corporate and Specialty in North America, has an active role in Allianz's innovation efforts, driving multiple InsurTech projects within Allianz North America. He is also a cofounder of multiple tech startups specializing in AI, quantum computing and digitalized ecosystems. Previously, Kaleli was Allianz's Head of Risk Management in the Americas, and he began his insurance career with Allianz in Turkey and Brazil. He actively mentors up-and-coming Turkish entrepreneurs through the Hamdi Ulukaya Startup Support Program (HUG). He spoke to Digital Insurance about issues in insurtech. How does your work in the program business connect to your interest in emerging tech and innovation in insurance?
9 Artificial Intelligence Startups in Lebanon - Nanalyze
With roughly the same population as the State of Missouri, Lebanon is a small country of six million people that borders Syria and Israel. Due to its location, the country has been subjected to a multitude of political and religious factions inhabiting the state. People frequently fight over whose invisible friend is better, and the country has faced long periods of instability including wars with Israel, civil wars and internal conflicts, and most recently some spillover from the Syrian war – which means lots of Syrians flying around on motorcycles. All of this turmoil has contributed to structural problems in the economy such as chronic fiscal deficits that have increased Lebanon's debt-to-GDP ratio to the third highest in the world. Economic growth has slowed to 1-2% over the past decade which constrains government investments in necessary infrastructure improvements. Notwithstanding these challenges, day to day life in Lebanon is pretty awesome.
Artificial intelligence to boost Latin American and Caribbean economies: IDB study
BUENOS AIRES – Artificial intelligence can facilitate trade negotiations and add one percentage point of growth to Latin American and Caribbean economies, according to a study by the Inter-American Development Bank (IDB). The study – put together by the IDB's Institute for the Integration of Latin America and the Caribbean (INTAL) – predicts artificial intelligence could boost regional GDP growth in the mid-term to 4 percent from the current projection of 3 percent. Brazil, for instance, could grow 4.1 percent a year instead of 3.2 percent. Colombia's economy could hit 4.5 percent growth instead of 3.7 percent. Overall, economic growth of countries that embrace artificial intelligence is expected to be 25 percent higher, on average, than that of countries that do not, the report finds.