Albania
661c1c090ff5831a647202397c61d73c-Paper.pdf
Recent results in the literature indicate that a residual network (ResNet) composed of a single residual block outperforms linear predictors, in the sense that all local minima in its optimization landscape are at least as good as the best linear predictor. However, these results are limited to a single residual block (i.e., shallow ResNets) rather than deep ResNets composed of multiple residual blocks. We take a step towards extending this result to deep ResNets. We start with two motivating examples. First, we show that there exist datasets for which all local minima of a fully-connected ReLU network are no better than the best linear predictor, whereas a ResNet has strictly better local minima. Second, we show that even at the global minimum, the representations obtained from the residual block outputs of a 2-block ResNet do not necessarily improve monotonically over subsequent blocks, which highlights a fundamental difficulty in analyzing deep ResNets. Our main theorem on deep ResNets shows that, under simple geometric conditions, any critical point in the optimization landscape either (i) is at least as good as the best linear predictor, or (ii) has a Hessian with a strictly negative eigenvalue. Notably, our theorem shows that a chain of multiple skip-connections can improve the optimization landscape, whereas existing results study direct skip-connections to the last hidden layer or output layer. Finally, we complement our results by showing benign properties of the "near-identity regions" of deep ResNets: depth-independent upper bounds on the risk attained at critical points and on the Rademacher complexity.
Integrating Natural Language Processing Techniques of Text Mining Into Financial System: Applications and Limitations
Millo, Denisa, Vika, Blerina, Baci, Nevila
The financial sector, a pivotal force in economic development, increasingly uses intelligent technologies such as natural language processing to enhance data processing and insight extraction. Through a review of the literature from 2018 to 2023, this paper explores the use of text mining as a natural language processing technique in various components of the financial system, including asset pricing, corporate finance, derivatives, risk management, and public finance, and highlights in the discussion section the specific problems that need to be addressed. We observe that most of the reviewed studies combine probabilistic models with vector-space models, and text data with numerical data. The most used information processing technique is information classification, and the most used algorithms include long short-term memory and bidirectional encoder models. We also observe that new specialized algorithms are being developed and that research on the financial system focuses mainly on the asset pricing component. The paper further proposes a path, from an engineering perspective, for researchers who need to analyze financial text. Text mining challenges such as data quality, context adaptation, and model interpretability need to be solved in order to integrate advanced natural language processing models and techniques into financial analysis and prediction. Keywords: Financial System (FS), Natural Language Processing (NLP), Software and Text Engineering, Probabilistic, Vector-Space, Models, Techniques, Text Data, Financial Analysis.
Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms
Hanna, Michael, Pezzelle, Sandro, Belinkov, Yonatan
Many recent language model (LM) interpretability studies have adopted the circuits framework, which aims to find the minimal computational subgraph, or circuit, that explains LM behavior on a given task. Most studies determine which edges belong in an LM's circuit by performing causal interventions on each edge independently, but this scales poorly with model size. Edge attribution patching (EAP), a gradient-based approximation to interventions, has emerged as a scalable but imperfect solution to this problem. In this paper, we introduce a new method, EAP with integrated gradients (EAP-IG), that aims to better maintain a core property of circuits: faithfulness. A circuit is faithful if all model edges outside the circuit can be ablated without changing the model's performance on the task; faithfulness is what justifies studying circuits rather than the full model. Our experiments demonstrate that circuits found using EAP are less faithful than those found using EAP-IG, even though both have high node overlap with circuits found previously using causal interventions. We conclude more generally that when using circuits to compare the mechanisms models use to solve tasks, faithfulness, not overlap, is what should be measured.
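The gap between a single-gradient approximation and an integrated-gradients approximation can be illustrated with a toy sketch. This is not the authors' implementation: the scalar "activation", the saturating `metric` function, and all names below are assumptions chosen for illustration. The example compares the true effect of swapping a clean activation for a corrupted one against (a) a first-order estimate using the gradient at the clean point only, and (b) an estimate that averages the gradient along the straight path between the two activations.

```python
import math

def metric(a):
    # Hypothetical task metric as a function of one scalar activation.
    # tanh saturates, so its gradient is nearly zero far from the origin.
    return math.tanh(3.0 * a)

def metric_grad(a):
    # Analytic derivative of metric: 3 * (1 - tanh(3a)^2).
    t = math.tanh(3.0 * a)
    return 3.0 * (1.0 - t * t)

def single_gradient_estimate(a_clean, a_corrupt):
    # First-order estimate: activation difference times gradient at the clean point.
    return (a_corrupt - a_clean) * metric_grad(a_clean)

def integrated_gradient_estimate(a_clean, a_corrupt, steps=200):
    # Average the gradient at `steps` midpoints on the straight line
    # from the clean activation to the corrupted one.
    delta = a_corrupt - a_clean
    total = 0.0
    for k in range(steps):
        alpha = (k + 0.5) / steps
        total += metric_grad(a_clean + alpha * delta)
    return delta * total / steps

a_clean, a_corrupt = 2.0, -2.0
true_effect = metric(a_corrupt) - metric(a_clean)
single_est = single_gradient_estimate(a_clean, a_corrupt)
ig_est = integrated_gradient_estimate(a_clean, a_corrupt)
print(true_effect, single_est, ig_est)
```

In this toy setting the clean activation sits in a saturated region, so the single-gradient estimate is close to zero even though the true effect of the swap is large, while the path-averaged estimate tracks the true effect closely.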
ChatGPT's app for iOS is now available in the UK and 10 more countries
Namely, these are Albania, Croatia, France, Germany, Ireland, Jamaica, Korea, New Zealand, Nicaragua, Nigeria, and the UK, and OpenAI says there are more to come "soon." OpenAI launched its ChatGPT app for iOS a week ago, initially limiting it to the U.S. market. The app syncs your history across devices, and can take voice input via OpenAI's speech-recognition system Whisper. If you're subscribed to ChatGPT Plus, you'll get GPT-4 capabilities in the iOS app, too. One thing is still missing, though: an Android ChatGPT app.
ChatGPT for iOS is now available in 11 more countries
OpenAI first launched its ChatGPT iOS app across the US in mid-May, and it has now made good on its promise to expand to more countries in the "coming weeks" by launching in 11 new countries. The countries are a global mix: iOS users in Albania, Croatia, France, Germany, Ireland, Jamaica, Korea, New Zealand, Nicaragua, Nigeria, and the UK can now access the app. The ChatGPT app works and looks like the website, with conversation history synced between the computer and iPhone. ChatGPT Plus subscribers can access GPT-4 through the app and receive faster responses.
Bayesian community detection for networks with covariates
Shen, Luyi, Amini, Arash, Josephs, Nathaniel, Lin, Lizhen
The increasing prevalence of network data in a wide variety of fields and the need to extract useful information from them have spurred fast developments in related models and algorithms. Among the various learning tasks with network data, community detection, the discovery of node clusters or "communities," has arguably received the most attention in the scientific community. In many real-world applications, the network data often come with additional information in the form of node or edge covariates that should ideally be leveraged for inference. In this paper, we add to a limited literature on community detection for networks with covariates by proposing a Bayesian stochastic block model with a covariate-dependent random partition prior. Under our prior, the covariates are explicitly expressed in specifying the prior distribution on the cluster membership. Our model has the flexibility to model the uncertainty of all parameter estimates, including the community membership. Importantly, and unlike the majority of existing methods, our model can learn the number of communities via posterior inference without having to assume it to be known. Our model can be applied to community detection in both dense and sparse networks, with both categorical and continuous covariates, and our MCMC algorithm is very efficient with good mixing properties. We demonstrate the superior performance of our model over existing models in a comprehensive simulation study and an application to two real datasets.
Transform Our Cities' Relationship With Nature With Advanced Technology
Our cities can no longer afford to be at war with nature: they need to rapidly become places where people and nature co-exist and thrive. Fortunately, there is growing recognition that nature-based solutions to cities' various challenges offer far wider benefits than traditional engineered 'grey' solutions, including improved resilience, better health for their citizens, and a faster path to net zero. In our recent report with the World Economic Forum, BiodiverCities by 2030: Transforming Cities' Relationship with Nature, we highlighted that nature-based solutions are on average 50% more cost-effective than purely man-made alternatives, and deliver 28% more added value in both direct and environmental benefits. But what will wean us off our addiction to traditional 'grey' concrete solutions, and move us towards approaches that better regenerate nature and reduce carbon? I believe that the innovation and fresh opportunities that come from using advanced digital tools can provide the answer.
Multiway Spherical Clustering via Degree-Corrected Tensor Block Models
We consider the problem of multiway clustering in the presence of unknown degree heterogeneity. Such data problems arise commonly in applications such as recommendation systems, neuroimaging, community detection, and hypergraph partitions in social networks. The allowance of degree heterogeneity provides great flexibility in clustering models, but the extra complexity poses significant challenges in both statistics and computation. Here, we develop a degree-corrected tensor block model with estimation accuracy guarantees. We present the phase transition of clustering performance based on the notion of angle separability, and we characterize three signal-to-noise regimes corresponding to different statistical-computational behaviors. In particular, we demonstrate that an intrinsic statistical-to-computational gap emerges only for tensors of order three or greater. Further, we develop an efficient polynomial-time algorithm that provably achieves exact clustering under mild signal conditions. The efficacy of our procedure is demonstrated through two data applications, one on the human brain connectome project and another on a Peru Legislation network dataset.
Facebook tackles deepfake spread and troll farms in latest moderation push
Facebook has removed a troll farm, spreaders of misinformation, and creators of deepfake images in its latest moderation efforts. The company's latest Coordinated Inauthentic Behavior (CIB) report, published this week, lists Facebook's most recent efforts to reduce coordinated, inauthentic behavior across the network. According to the March CIB report, Facebook investigated and wiped out a "long-running" troll farm located in Albania. The troll farm's members primarily targeted an Iranian audience and are thought to have ties to Mojahedin-e Khalq (MEK), a political-militant group made up of several thousand members. MEK was exiled to Albania in the 1980s and now appears to be running a network made up of both genuine and fake accounts to spread information that is critical of the Iranian government and that praises MEK's activities.
Implicit Normalizing Flows
Lu, Cheng, Chen, Jianfei, Li, Chongxuan, Wang, Qiuhao, Zhu, Jun
Normalizing flows define a probability distribution by an explicit invertible transformation $\mathbf{z}=f(\mathbf{x})$. In this work, we present implicit normalizing flows (ImpFlows), which generalize normalizing flows by allowing the mapping to be implicitly defined by the roots of an equation $F(\mathbf{z}, \mathbf{x})= \mathbf{0}$. ImpFlows build on residual flows (ResFlows) with a proper balance between expressiveness and tractability. Through theoretical analysis, we show that the function space of ImpFlows is strictly richer than that of ResFlows. Furthermore, for any ResFlow with a fixed number of blocks, there exists some function for which the ResFlow has a non-negligible approximation error, yet which is exactly representable by a single-block ImpFlow. We propose a scalable algorithm to train and draw samples from ImpFlows. Empirically, we evaluate ImpFlows on several classification and density modeling tasks, and they outperform ResFlows with a comparable number of parameters on all the benchmarks.
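To make the idea of an implicitly defined mapping concrete, here is a minimal scalar sketch. It is not the authors' algorithm: the particular form $F(z, x) = g_x(x) - g_z(z) + x - z$, the toy functions `g_x` and `g_z`, and the use of plain fixed-point iteration as the root solver are all assumptions made for illustration. Because both toy functions have Lipschitz constant 0.5, each fixed-point iteration is a contraction, so the root exists, is unique, and the map is invertible.

```python
import math

def g_x(x):
    # Hypothetical residual function on the input side (Lipschitz constant 0.5).
    return 0.5 * math.tanh(x)

def g_z(z):
    # Hypothetical residual function on the output side (Lipschitz constant 0.5).
    return 0.5 * math.sin(z)

def F(z, x):
    # Assumed implicit equation; z is defined as the root of F(z, x) = 0.
    return g_x(x) - g_z(z) + x - z

def forward(x, iters=100):
    # Solve F(z, x) = 0 for z via the fixed-point iteration z <- x + g_x(x) - g_z(z).
    z = x
    for _ in range(iters):
        z = x + g_x(x) - g_z(z)
    return z

def inverse(z, iters=100):
    # Solve F(z, x) = 0 for x via the fixed-point iteration x <- z + g_z(z) - g_x(x).
    x = z
    for _ in range(iters):
        x = z + g_z(z) - g_x(x)
    return x

x0 = 1.3
z0 = forward(x0)
x_rec = inverse(z0)
print(z0, F(z0, x0), x_rec)
```

Setting `g_z` to zero in this sketch gives the explicit residual form `z = x + g_x(x)`, which is one way to see how an implicit definition can contain the explicit one as a special case.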