

Boosting Reward Model with Preference-Conditional Multi-Aspect Synthetic Data Generation

Shen, Jiaming, Xu, Ran, Jun, Yennie, Qin, Zhen, Liu, Tianqi, Yang, Carl, Liang, Yi, Baumgartner, Simon, Bendersky, Michael

arXiv.org Artificial Intelligence

Reward models (RMs) are crucial for aligning large language models (LLMs) with human preferences. They are trained using preference datasets where each example consists of one input prompt, two responses, and a preference label. As curating a high-quality, human-labeled preference dataset is both time-consuming and expensive, practitioners often rely on existing powerful LLMs for preference label generation. This can potentially introduce noise and impede RM training. In this work, we present RMBoost, a novel synthetic preference data generation paradigm to boost reward model quality. Unlike traditional methods, which generate two responses before obtaining the preference label, RMBoost first generates one response and selects a preference label, and then generates the second, more (or less) preferred response conditioned on the pre-selected preference label and the first response. This approach offers two main advantages. First, RMBoost reduces labeling noise since preference pairs are constructed intentionally. Second, RMBoost facilitates the creation of more diverse responses by incorporating various quality aspects (e.g., helpfulness, relevance, completeness) into the prompts. We conduct extensive experiments across three diverse datasets and demonstrate that RMBoost outperforms other synthetic preference data generation techniques and significantly boosts the performance of four distinct reward models.
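
A minimal sketch of the generation order described in the abstract, assuming a generic instruction-following LLM behind a hypothetical llm_generate helper; the prompt wording and aspect list below are illustrative stand-ins, not the authors' actual prompts or implementation:

```python
import random

# Quality aspects used to diversify the second response (examples from the abstract).
ASPECTS = ["helpfulness", "relevance", "completeness"]

def llm_generate(prompt: str) -> str:
    """Placeholder for a call to any instruction-following LLM (hypothetical)."""
    raise NotImplementedError

def rmboost_pair(x: str) -> dict:
    # Step 1: generate the first response as usual.
    y1 = llm_generate(f"Answer the following prompt:\n{x}")

    # Step 2: pre-select the preference label before the second response exists.
    label = random.choice(["more_preferred", "less_preferred"])
    aspect = random.choice(ASPECTS)

    # Step 3: generate the second response conditioned on the first response,
    # the pre-selected label, and a quality aspect to vary.
    direction = "better" if label == "more_preferred" else "worse"
    y2 = llm_generate(
        f"Prompt:\n{x}\n\nExisting response:\n{y1}\n\n"
        f"Write a new response that is {direction} than the existing one, "
        f"differing mainly in {aspect}."
    )

    # Step 4: the preference label is known by construction, so no separate
    # (potentially noisy) LLM judgment step is needed.
    chosen, rejected = (y2, y1) if label == "more_preferred" else (y1, y2)
    return {"prompt": x, "chosen": chosen, "rejected": rejected}
```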


SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation

Zhou, Junkai, Pang, Liang, Shen, Huawei, Cheng, Xueqi

arXiv.org Artificial Intelligence

Language models trained on large-scale corpora can generate remarkably fluent results in open-domain dialogue. However, for the persona-based dialogue generation task, consistency and coherence are also key factors, and they pose great challenges for language models. Existing works mainly focus on filtering valuable data, modifying model structures, or designing objective functions, but their improvements are limited and hard to generalize across pre-trained language models. We find, however, that language models can produce consistent and coherent responses if we consider enough generations. Thus, the problems lie in large-scale response generation and target response selection. In this work, a simple but effective two-stage SimOAP strategy is proposed, i.e., over-sampling and post-evaluation. The over-sampling stage efficiently draws large-scale candidate responses from existing trained models via off-the-shelf distillation and compression methods, and the post-evaluation stage selects a good response from the large-scale candidates based on multiple well-designed evaluation metrics. Experimental results show that the proposed plug-in SimOAP strategy improves the backbone models and outperforms the baseline strategies in both automatic and human evaluations.
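
A minimal sketch of the two-stage idea (over-sampling, then post-evaluation), assuming hypothetical callables for the backbone generator and the scoring metrics; the exact metrics and selection rule in the paper may differ:

```python
from typing import Callable, List

def simoap_select(
    generate: Callable[[str], str],                  # backbone dialogue model (possibly distilled/compressed)
    coherence_score: Callable[[str, str], float],    # coherence of a candidate given the dialogue context
    consistency_score: Callable[[str, str], float],  # agreement of a candidate with the persona description
    context: str,
    persona: str,
    num_samples: int = 100,
) -> str:
    # Stage 1: over-sampling -- draw a large pool of candidate responses.
    candidates: List[str] = [generate(context) for _ in range(num_samples)]

    # Stage 2: post-evaluation -- score each candidate on multiple metrics
    # and return the best one.
    def total(candidate: str) -> float:
        return coherence_score(context, candidate) + consistency_score(persona, candidate)

    return max(candidates, key=total)
```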


IIT Bombay earns good response to its fund-raising initiative

#artificialintelligence

NEW DELHI: The Indian Institute of Technology (IIT), Bombay, has seen a good response to its fund-raising initiatives in the current academic year, with Rs26.16 crore contributed by alumni currently residing in the US. Sharad Saraf, chairman, and Sudarshan Saraf, co-chairman of Technocraft Group, have given Rs15 crore to build a 'Technocraft Centre for Applied Artificial Intelligence' at IIT Bombay. The donors believe that the future of technology lies in the growth of Artificial Intelligence (AI) and that it is necessary to expose students to AI through a dedicated AI centre, according to IIT Bombay. A campaign spearheaded by the IIT Bombay Heritage Foundation in the US received a good response, the institute said. "This year IIT Bombay alumni in the USA have contributed $3.6 million (Rs26.16 crore)," it added. IIT Bombay had initiated an annual fund-raising drive, 'Cherish IIT Bombay', for donors all over India and the world. It has been raising funds for various causes for the benefit of its students and faculty. According to IIT Bombay, the campaign that stands out this year is the IT hardware campaign. "As IIT Bombay moved to online classes in response to the pandemic, many of our students couldn't access the online classes as they couldn't afford the investment in IT hardware at their respective homes."


Machine Learning in Cybersecurity

#artificialintelligence

Our technical report provides an overview of the relevant parts of an ML lifecycle--selecting the right problem, the right data, and the right math, and summarizing the model output for consumption--as well as questions that relate to those areas of focus. As the federally funded research and development center (FFRDC) known for AI engineering, and with its long experience in cybersecurity, the SEI has the expertise to advise you--the decision makers adopting these tools--on evaluating the adequacy of ML tools applied to cybersecurity. To that end, we structured the report around the questions you should ask about ML tools. We chose this framing, rather than proposing a detailed guide to building an ML system for cybersecurity, because we want to enable you to learn what a good tool looks like. When decision makers have difficulty identifying a good tool, the market will usually stop providing good tools.