kale
BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions
Awadalla, Anas, Xue, Le, Shu, Manli, Yan, An, Wang, Jun, Purushwalkam, Senthil, Shen, Sheng, Lee, Hannah, Lo, Oscar, Park, Jae Sung, Guha, Etash, Savarese, Silvio, Schmidt, Ludwig, Choi, Yejin, Xiong, Caiming, Xu, Ran
Table 1: Comparison of open-source synthetic image-text datasets: We compare various datasets in terms of scale (number of samples), density (average number of words per sample), whether they are knowledge-augmented (meaning that the caption includes information found in the image's web-scraped alt-text), and the size of the captioning model used to generate the descriptions. For KALE, we create an initial pool of 100M captions from a 17B parameter model and use it to distill a 2B parameter model that matches the performance of the larger 17B model. We introduce BLIP3-KALE, a dataset of 218 million image-text pairs that advances the state of knowledge-augmented image captioning. KALE builds upon recent work in this area, particularly CapsFusion [28], which pioneered the use of large language models to fuse synthetically generated captions with alt-text to incorporate real-world knowledge.
KALE: An Artwork Image Captioning System Augmented with Heterogeneous Graph
Jiang, Yanbei, Ehinger, Krista A., Lau, Jey Han
Exploring the narratives conveyed by fine-art paintings is a challenge in image captioning, where the goal is to generate descriptions that not only precisely represent the visual content but also offer an in-depth interpretation of the artwork's meaning. The task is particularly complex for artwork images due to their diverse interpretations and varied aesthetic principles across different artistic schools and styles. In response to this, we present KALE (Knowledge-Augmented vision-Language model for artwork Elaborations), a novel approach that enhances existing vision-language models by integrating artwork metadata as additional knowledge. KALE incorporates the metadata in two ways: firstly as direct textual input, and secondly through a multimodal heterogeneous knowledge graph. To optimize the learning of graph representations, we introduce a new cross-modal alignment loss that maximizes the similarity between the image and its corresponding metadata. Experimental results demonstrate that KALE achieves strong performance (when evaluated with CIDEr, in particular) over existing state-of-the-art work across several artwork datasets. Source code of the project is available at https://github.com/Yanbei-Jiang/Artwork-Interpretation.
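The abstract above does not give the form of the cross-modal alignment loss, so the following is only a minimal sketch of one common way to "maximize similarity" between two embeddings: minimizing one minus their cosine similarity. The function names, the choice of cosine similarity, and the example vectors are all assumptions made for illustration, not the authors' implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def alignment_loss(image_emb, metadata_emb):
    """Hypothetical loss: 1 - cosine similarity.

    Zero when the two embeddings point in the same direction, so
    minimizing it maximizes their similarity.
    """
    return 1.0 - cosine_similarity(image_emb, metadata_emb)

# Made-up embeddings, for illustration only.
image_emb = [0.3, 0.8, -0.1]
aligned_metadata = [0.6, 1.6, -0.2]    # same direction as image_emb
unrelated_metadata = [-0.8, 0.3, 0.0]  # orthogonal to image_emb

loss_aligned = alignment_loss(image_emb, aligned_metadata)
loss_unrelated = alignment_loss(image_emb, unrelated_metadata)
```

In this toy setup the aligned pair yields a loss near zero and the orthogonal pair a loss near one, which is the behavior a similarity-maximizing objective needs.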
Quick Dense Retrievers Consume KALE: Post Training Kullback Leibler Alignment of Embeddings for Asymmetrical dual encoders
Campos, Daniel, Magnani, Alessandro, Zhai, ChengXiang
In this paper, we consider the problem of improving the inference latency of language model-based dense retrieval systems by introducing structural compression and model size asymmetry between the context and query encoders. First, we investigate the impact of pre- and post-training compression on the MSMARCO, Natural Questions, TriviaQA, SQUAD, and SCIFACT datasets, finding that asymmetry in the dual encoders in dense retrieval can lead to improved inference efficiency. Knowing this, we introduce Kullback Leibler Alignment of Embeddings (KALE), an efficient and accurate method for increasing the inference efficiency of dense retrieval methods by pruning and aligning the query encoder after training. Specifically, KALE extends traditional Knowledge Distillation after bi-encoder training, allowing for effective query encoder compression without full retraining or index generation. Using KALE and asymmetric training, we can generate models that exceed the performance of DistilBERT despite having 3x faster inference.
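To make the idea of aligning a pruned query encoder to the original one concrete, here is a minimal sketch of a KL-based alignment loss between two embeddings. It assumes each embedding is turned into a probability distribution by a softmax over its dimensions; the abstract does not state how the distributions are formed, so that step, the function names, and the example vectors are illustrative assumptions rather than the paper's procedure.

```python
import math

def log_softmax(values):
    """Numerically stable log-softmax of a list of floats."""
    peak = max(values)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in values))
    return [v - log_norm for v in values]

def kl_alignment_loss(teacher_emb, student_emb):
    """KL(teacher || student) between softmax-normalized embeddings.

    Assumption: softmax over embedding dimensions maps each vector to a
    distribution. The teacher is the original (frozen) query encoder's
    output; the student is the pruned query encoder's output.
    """
    log_p = log_softmax(teacher_emb)
    log_q = log_softmax(student_emb)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

# Made-up embeddings for one query, for illustration only.
teacher = [0.2, -1.0, 0.7, 0.1]
student = [0.1, -0.8, 0.9, 0.0]

loss_same = kl_alignment_loss(teacher, teacher)  # identical outputs
loss_diff = kl_alignment_loss(teacher, student)  # mismatched outputs
```

Because only the query encoder is adjusted, the context embeddings already stored in the index remain valid, which matches the abstract's point that no index regeneration is needed.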
Kale
In the age of big data, data analytics expertise is increasingly valuable. This expertise includes not only formal knowledge, such as algorithms and statistics, but also practical skills that are learned through practice and are difficult to teach in classroom settings: management and preparation of data sets, feature design, and iterative exploratory analysis. Semantic workflows are a valuable tool for empowering non-expert users to carry out systematic analytics on large datasets using sophisticated machine learning methods captured in the workflows and their semantic constraints. In this paper we motivate and illustrate the role of visualizations in the usability of workflows by non-experts as well as their role in learning practical data analytics skills to gain interesting insights into data and methods. This capability is particularly important when confronting large datasets, where the selection of appropriate methods and their configuration with the best parameter and algorithm selections can be crucial in obtaining useful results.
The Future Of Artificial Intelligence Is Now - Liwaiwai
Imagine if doctors, nurses, and health care researchers had the ability to interrogate both the healthy and diseased states of a patient's biology and then use that data to uncover a network of causal relationships between historical, molecular, and other data types to approach treatment or develop the right type of drugs. BERG Health is using this information with a platform that uses artificial intelligence (AI) and machine learning to examine disparate sets of data from patient biology and electronic medical records. "Artificial intelligence has the potential to disrupt many industries, but perhaps most importantly is its impact on health care, where the unsolved challenge is getting the right treatments to the right patients by utilizing tremendous amounts of experimental and observational data," says Niven Narain, co-founder, president and CEO of BERG Health. "By comparing individual patient health data to the greater population health data, we can develop prescriptive analytics that can determine what treatments will work best for that patient, while also warning patients of potential side effects." AI is a set of complex algorithms and technologies that enables machines, systems and software to make human-like decisions.
KALE Flow: A Relaxed KL Gradient Flow for Probabilities with Disjoint Support
Glaser, Pierre, Arbel, Michael, Gretton, Arthur
We study the gradient flow for a relaxed approximation to the Kullback-Leibler (KL) divergence between a moving source and a fixed target distribution. This approximation, termed the KALE (KL approximate lower-bound estimator), solves a regularized version of the Fenchel dual problem defining the KL over a restricted class of functions. When using a Reproducing Kernel Hilbert Space (RKHS) to define the function class, we show that the KALE continuously interpolates between the KL and the Maximum Mean Discrepancy (MMD). Like the MMD and other Integral Probability Metrics, the KALE remains well defined for mutually singular distributions. Nonetheless, the KALE inherits from the limiting KL a greater sensitivity to mismatch in the support of the distributions, compared with the MMD. These two properties make the KALE gradient flow particularly well suited when the target distribution is supported on a low-dimensional manifold. Under an assumption of sufficient smoothness of the trajectories, we show the global convergence of the KALE flow. We propose a particle implementation of the flow given initial samples from the source and the target distribution, which we use to empirically confirm the KALE's properties.
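The Fenchel dual mentioned above can be checked numerically on a small discrete example. The sketch below uses the standard variational form KL(P || Q) = sup_h { 1 + E_P[h] - E_Q[exp(h)] }, whose maximizer is h = log(dP/dQ). It omits the two ingredients that define the KALE itself, namely the restriction of h to a function class such as an RKHS and the regularization term, so it illustrates only the unregularized dual. The distributions are made-up numbers.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) for discrete distributions with q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def dual_objective(h, p, q):
    """1 + E_P[h] - E_Q[exp(h)]: a lower bound on KL(P || Q) for any h."""
    expect_p = sum(pi * hi for pi, hi in zip(p, h))
    expect_q = sum(qi * math.exp(hi) for qi, hi in zip(q, h))
    return 1.0 + expect_p - expect_q

# Made-up distributions over three outcomes.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

kl = kl_divergence(p, q)

# At the optimal witness h* = log(p / q) the bound is tight.
h_star = [math.log(pi / qi) for pi, qi in zip(p, q)]
bound_at_optimum = dual_objective(h_star, p, q)

# Any other h, here the zero function, gives a smaller value.
bound_at_zero = dual_objective([0.0, 0.0, 0.0], p, q)
```

If some outcome had p > 0 and q = 0, the optimal h would be unbounded there and the KL infinite. Restricting and regularizing h, as the abstract describes, is what keeps the estimator finite for distributions with disjoint support.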
Neural Network Gaussian Process Considering Input Uncertainty for Composite Structures Assembly
Lee, Cheolhei, Wu, Jianguo, Wang, Wenjia, Yue, Xiaowei
Developing machine-learning-enabled smart manufacturing is promising for the composite structures assembly process. To improve the production quality and efficiency of the assembly process, accurate predictive analysis of the dimensional deviations and residual stress of the composite structures is required. The novel composite structures assembly involves two challenges: (i) the highly nonlinear and anisotropic properties of composite materials; and (ii) inevitable uncertainty in the assembly process. To overcome those problems, we propose a neural network Gaussian process model considering input uncertainty for composite structures assembly. The deep architecture of our model allows us to approximate a complex process better, and consideration of input uncertainty enables robust modeling with complete incorporation of the process uncertainty. Based on simulation and a case study, the NNGPIU can outperform other benchmark methods when the response function is nonsmooth and nonlinear. Although we use composite structure assembly as an example, the proposed methodology is applicable to other engineering systems with intrinsic uncertainties.
The way you version control your ML projects is wrong
A Data Scientist spends most of his time inside a Jupyter Notebook exploring the data and drafting ideas. Usually, when we try to version our work, we end up with a bunch of duplicated ipynb files, assuming different naming schemes. Can we have something that automatically snapshots our work, before and after every step in an ML pipeline? Moreover, can we get started using it without a ton of configuration needed? Just open a Notebook, do our thing and be sure that everything else will take care of itself.