AITopics

Most contributions on Few-Shot Object Detection (FSOD) evaluate their methods on natural images only, yet the transferability of the announced performance is not guaranteed for applications on other kinds of images. We demonstrate this with an in-depth analysis of existing FSOD methods on aerial images and observed a large performance gap compared to natural images. Small objects, more numerous in aerial images, are the cause for the apparent performance gap between natural and aerial images. As a consequence, we improve FSOD performance on small objects with a carefully designed attention mechanism. In addition, we also propose a scale-adaptive box similarity criterion, that improves the training and evaluation of FSOD methods, particularly for small objects. We also contribute to generic FSOD with two distinct approaches based on metric learning and fine-tuning. Impressive results are achieved with the fine-tuning method, which encourages tackling more complex scenarios such as Cross-Domain FSOD. We conduct preliminary experiments in this direction and obtain promising results. Finally, we address the deployment of the detection models inside COSE's systems. Detection must be done in real-time in extremely large images (more than 100 megapixels), with limited computation power. Leveraging existing optimization tools such as TensorRT, we successfully tackle this engineering challenge.

large language model, machine learning, neural information processing system 33, (25 more...)

2310.10433

Country:

Europe > France (0.45)
North America > United States (0.45)
Europe > Serbia (0.14)
(5 more...)

Genre:

Summary/Review (1.00)
Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
(2 more...)

Industry:

Transportation > Ground > Road (1.00)
Leisure & Entertainment > Sports (1.00)
Health & Medicine (1.00)
(5 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(11 more...)

Intelligent Software Tooling for Improving Software Development

Cooper, Nathan

combining visual and textual information, duplicate bug report detection, ieee 25th international conference, (14 more...)

Software has eaten the world with many of the necessities and quality of life services people use requiring software. Therefore, tools that improve the software development experience can have a significant impact on the world such as generating code and test cases, detecting bugs, question and answering, etc. The success of Deep Learning (DL) over the past decade has shown huge advancements in automation across many domains, including Software Development processes. One of the main reasons behind this success is the availability of large datasets such as open-source code available through GitHub or image datasets of mobile Graphical User Interfaces (GUIs) with RICO [112] and ReDRAW [267] to be trained on. Therefore, the central research question my dissertation explores is: In what ways can the software development process be improved through leveraging DL techniques on the vast amounts of unstructured software engineering artifacts? We coin the approaches that leverage DL to automate or augment various software development task as Intelligent Software Tools.

2310.10921

Country:

North America > United States > New York > New York County > New York City (0.14)
Oceania > Australia > New South Wales > Sydney (0.14)
North America > United States > Maryland > Baltimore (0.13)
(29 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.87)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Education (1.00)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Software Engineering (1.00)
Information Technology > Data Science > Data Quality (1.00)
(12 more...)

DemoSG: Demonstration-enhanced Schema-guided Generation for Low-resource Event Extraction

Zhao, Gang, Gong, Xiaocheng, Yang, Xinjie, Dong, Guanting, Lu, Shudong, Li, Si

Most current Event Extraction (EE) methods focus on the high-resource scenario, which requires a large amount of annotated data and can hardly be applied to low-resource domains. To address EE more effectively with limited resources, we propose the Demonstration-enhanced Schema-guided Generation (DemoSG) model, which benefits low-resource EE from two aspects: Firstly, we propose the demonstration-based learning paradigm for EE to fully use the annotated data, which transforms them into demonstrations to illustrate the extraction process and help the model learn effectively. Secondly, we formulate EE as a natural language generation task guided by schema-based prompts, thereby leveraging label semantics and promoting knowledge transfer in low-resource scenarios. We conduct extensive experiments under in-domain and domain adaptation low-resource settings on three datasets, and study the robustness of DemoSG. The results show that DemoSG significantly outperforms current methods in low-resource scenarios.

demonstration, demosg, extraction, (13 more...)

2310.10481

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > South Korea > Seoul > Seoul (0.05)
North America > United States > Washington > King County > Seattle (0.04)
(7 more...)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.47)

From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Model to Pre-trained Machine Reader

Xu, Weiwen, Li, Xin, Zhang, Wenxuan, Zhou, Meng, Lam, Wai, Si, Luo, Bing, Lidong

We present Pre-trained Machine Reader (PMR), a novel method for retrofitting pre-trained masked language models (MLMs) to pre-trained machine reading comprehension (MRC) models without acquiring labeled data. PMR can resolve the discrepancy between model pre-training and downstream fine-tuning of existing MLMs. To build the proposed PMR, we constructed a large volume of general-purpose and high-quality MRC-style training data by using Wikipedia hyperlinks and designed a Wiki Anchor Extraction task to guide the MRC-style pre-training. Apart from its simplicity, PMR effectively solves extraction tasks, such as Extractive Question Answering and Named Entity Recognition. PMR shows tremendous improvements over existing approaches, especially in low-resource scenarios. When applied to the sequence classification task in the MRC formulation, PMR enables the extraction of high-quality rationales to explain the classification process, thereby providing greater prediction explainability. PMR also has the potential to serve as a unified model for tackling various extraction and classification tasks in the MRC formulation.

computational linguistic, pmr, proceedings, (13 more...)

2212.04755

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Syria (0.05)
Asia > Japan (0.05)
(9 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment > Sports > Football (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(2 more...)

Efficiently Enhancing Zero-Shot Performance of Instruction Following Model via Retrieval of Soft Prompt

Ye, Seonghyeon, Jang, Joel, Kim, Doyoung, Jo, Yongrae, Seo, Minjoon

Enhancing the zero-shot performance of instruction-following models requires heavy computation, either by scaling the total number of training datasets or the model size. In this work, we explore how retrieval of soft prompts obtained through prompt tuning can efficiently assist hard prompts in zero-shot task generalization. Specifically, we train soft prompt embeddings for each prompt through prompt tuning, store the samples of the training instances mapped with the prompt embeddings, and retrieve the corresponding prompt embedding of the training instance closest to the query instance during inference. While only adding 0.007% additional parameters, retrieval of soft prompt enhances the performance of T0 on unseen tasks by outperforming it on 10 out of 11 datasets as well as improving the mean accuracy of T0 on BIG-bench benchmark by 2.39% points. Also, we report an interesting finding that retrieving source embeddings trained on similar answer choice formats is more important than those on similar task types.

dataset, paw, qa show choice, (13 more...)

2210.03029

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > China > Hong Kong (0.04)
(13 more...)

Genre: Research Report (0.82)

Industry:

Media (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.97)

Boll, Bastian, Schnörr, Christoph

On Certified Generalization in Structured Prediction

arXiv.org Machine LearningOct-16-2023

In structured prediction, target objects have rich internal structure which does not factorize into independent components and violates common i.i.d. assumptions. This challenge becomes apparent through the exponentially large output space in applications such as image segmentation or scene graph generation. We present a novel PAC-Bayesian risk bound for structured prediction wherein the rate of generalization scales not only with the number of structured examples but also with their size. The underlying assumption, conforming to ongoing research on generative models, is that data are generated by the Knothe-Rosenblatt rearrangement of a factorizing reference measure. This allows to explicitly distill the structure between random output variables into a Wasserstein dependency matrix. Our work makes a preliminary step towards leveraging powerful generative models to establish generalization bounds for discriminative downstream tasks in the challenging setting of structured prediction.

artificial intelligence, inductive learning, machine learning, (15 more...)

arXiv.org Machine Learning

2306.09112

Country:

North America > United States > New York > New York County > New York City (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Michigan (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
(2 more...)

arXiv.org Machine LearningOct-16-2023

Relabeling Minimal Training Subset to Flip a Prediction

Yang, Jinghan, Xu, Linjie, Yu, Lequan

When facing an unsatisfactory prediction from a machine learning model, it is crucial to investigate the underlying reasons and explore the potential for reversing the outcome. We ask: To flip the prediction on a test point $x_t$, how to identify the smallest training subset $\mathcal{S}_t$ we need to relabel? We propose an efficient procedure to identify and relabel such a subset via an extended influence function. We find that relabeling fewer than 2% of the training points can always flip a prediction. This mechanism can serve multiple purposes: (1) providing an approach to challenge a model prediction by altering training points; (2) evaluating model robustness with the cardinality of the subset (i.e., $|\mathcal{S}_t|$); we show that $|\mathcal{S}_t|$ is highly related to the noise ratio in the training set and $|\mathcal{S}_t|$ is correlated with but complementary to predicted probabilities; (3) revealing training points lead to group attribution bias. To the best of our knowledge, we are the first to investigate identifying and relabeling the minimal training subset required to flip a given prediction.

artificial intelligence, machine learning, prediction, (18 more...)

arXiv.org Machine Learning

2305.12809

Country:

Europe > Russia (0.04)
Asia > Russia (0.04)
Asia > China > Hong Kong (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

arXiv.org Artificial IntelligenceOct-15-2023

CAPro: Webly Supervised Learning with Cross-Modality Aligned Prototypes

Qin, Yulei, Chen, Xingyu, Shen, Yunhang, Fu, Chaoyou, Gu, Yun, Li, Ke, Sun, Xing, Ji, Rongrong

Webly supervised learning has attracted increasing attention for its effectiveness in exploring publicly accessible data at scale without manual annotation. However, most existing methods of learning with web datasets are faced with challenges from label noise, and they have limited assumptions on clean samples under various noise. For instance, web images retrieved with queries of tiger cat (a cat species) and drumstick (a musical instrument) are almost dominated by images of tigers and chickens, which exacerbates the challenge of fine-grained visual concept learning. In this case, exploiting both web images and their associated texts is a requisite solution to combat real-world noise. In this paper, we propose Cross-modality Aligned Prototypes (CAPro), a unified prototypical contrastive learning framework to learn visual representations with correct semantics. For one thing, we leverage textual prototypes, which stem from the distinct concept definition of classes, to select clean images by text matching and thus disambiguate the formation of visual prototypes. For another, to handle missing and mismatched noisy texts, we resort to the visual feature space to complete and enhance individual texts and thereafter improve text matching. Such semantically aligned visual prototypes are further polished up with high-quality samples, and engaged in both cluster regularization and noise removal. Besides, we propose collective bootstrapping to encourage smoother and wiser label reference from appearance-similar instances in a manner of dictionary look-up. Extensive experiments on WebVision1k and NUS-WIDE (Web) demonstrate that CAPro well handles realistic noise under both single-label and multi-label scenarios. CAPro achieves new state-of-the-art performance and exhibits robustness to open-set recognition. Codes are available at https://github.com/yuleiqin/capro.

computer vision, proceedings, prototype, (11 more...)

2310.09761

Country:

South America (0.04)
North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
Asia > Singapore (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
(2 more...)

Wu, Zhuofeng, Xiao, Chaowei, Vydiswaran, VG Vinod

HiCL: Hierarchical Contrastive Learning of Unsupervised Sentence Embeddings

arXiv.org Artificial IntelligenceOct-14-2023

In this paper, we propose a hierarchical contrastive learning framework, HiCL, which considers local segment-level and global sequence-level relationships to improve training efficiency and effectiveness. Traditional methods typically encode a sequence in its entirety for contrast with others, often neglecting local representation learning, leading to challenges in generalizing to shorter texts. Conversely, HiCL improves its effectiveness by dividing the sequence into several segments and employing both local and global contrastive learning to model segment-level and sequence-level relationships. Further, considering the quadratic time complexity of transformers over input tokens, HiCL boosts training efficiency by first encoding short segments and then aggregating them to obtain the sequence representation. Extensive experiments show that HiCL enhances the prior top-performing SNCSE model across seven extensively evaluated STS tasks, with an average increase of +0.2% observed on BERT-large and +0.44% on RoBERTa-large.

proceedings, representation, sequence, (14 more...)

2310.0972

Country:

North America > United States > Michigan (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(10 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Koloski, Boshko, Lavrač, Nada, Pollak, Senja, Škrlj, Blaž

Latent Graphs for Semi-Supervised Learning on Biomedical Tabular Data

arXiv.org Artificial IntelligenceOct-14-2023

In the domain of semi-supervised learning, the current approaches insufficiently exploit the potential of considering inter-instance relationships among (un)labeled data. In this work, we address this limitation by providing an approach for inferring latent graphs that capture the intrinsic data relationships. By leveraging graph-based representations, our approach facilitates the seamless propagation of information throughout the graph, effectively incorporating global and local knowledge. Through evaluations on biomedical tabular datasets, we compare the capabilities of our approach to other contemporary methods. Our work demonstrates the significance of inter-instance relationship discovery as practical means for constructing robust latent graphs to enhance semi-supervised learning techniques. The experiments show that the proposed methodology outperforms contemporary state-of-the-art methods for (semi-)supervised learning on three biomedical datasets.

dataset, graph, latent graph, (10 more...)

2309.15757

Country:

Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
North America > United States (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)