axcel
AXCEL: Automated eXplainable Consistency Evaluation using LLMs
Sreekar, P Aditya; Verma, Sahil; Chopra, Suransh; Ghazarian, Sarik; Persad, Abhishek; Sadagopan, Narayanan
Large Language Models (LLMs) are widely used in both industry and academia for various tasks, yet evaluating the consistency of generated text responses continues to be a challenge. Traditional metrics like ROUGE and BLEU show a weak correlation with human judgment. More sophisticated metrics using Natural Language Inference (NLI) have shown improved correlations but are complex to implement, require domain-specific training due to poor cross-domain generalization, and lack explainability. More recently, prompt-based metrics using LLMs as evaluators have emerged; while they are easier to implement, they still lack explainability and depend on task-specific prompts, which limits their generalizability. This work introduces Automated eXplainable Consistency Evaluation using LLMs (AXCEL), a prompt-based consistency metric that explains its consistency scores by providing detailed reasoning and pinpointing inconsistent text spans. AXCEL is also a generalizable metric that can be adapted to multiple tasks without changing the prompt. AXCEL outperforms both non-prompt and prompt-based state-of-the-art (SOTA) metrics in detecting inconsistencies across summarization by 8.7%, free text generation by 6.2%, and data-to-text conversion tasks by 29.4%. We also evaluate the influence of the underlying LLMs on prompt-based metric performance and recalibrate the SOTA prompt-based metrics with the latest LLMs for a fair comparison. Further, we show that AXCEL demonstrates strong performance using open-source LLMs.
Webinars – Axcel
Demand forecasting is critical to the success of a company. It drives decision-making from supply chain management to marketing campaigns. In this webinar, we discuss the elements of a time series, including trend, seasonality, and random components. We explain how to design, implement, and validate a demand forecasting model in Excel. We also present how you can level up your analysis with Axcel AI by running advanced decomposition and time series models, such as the autoregressive integrated moving average (ARIMA), and visualizing the results with a single function in Excel.
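The split into trend, seasonality, and a random component can be illustrated with a minimal sketch in plain Python. This is not the webinar's Excel workbook or the Axcel AI function: the name `decompose_additive`, the additive model (series = trend + seasonal + residual), and the centered moving average used for the trend are all assumptions chosen for illustration.

```python
from statistics import mean


def decompose_additive(series, period):
    """Hypothetical helper: classical additive decomposition.

    Assumes series[t] = trend[t] + seasonal[t] + residual[t].
    Returns three lists of len(series); trend and residual hold None
    at the edges where the moving-average window does not fit.
    """
    n = len(series)
    if period < 2 or n < 2 * period:
        raise ValueError("need period >= 2 and at least two full periods")

    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        if period % 2 == 0:
            # Even period: centered average with half weight on both ends.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
            trend[t] = total / period
        else:
            trend[t] = mean(window)

    # Average the detrended values for each position in the cycle.
    raw = []
    for k in range(period):
        values = [
            series[t] - trend[t]
            for t in range(n)
            if trend[t] is not None and t % period == k
        ]
        raw.append(mean(values))

    # Shift the seasonal pattern so it sums to zero over one period.
    offset = mean(raw)
    pattern = [s - offset for s in raw]
    seasonal = [pattern[t % period] for t in range(n)]

    # Whatever trend and seasonality do not explain is the random part.
    residual = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, residual
```

For quarterly demand data one would call `decompose_additive(demand, 4)`; a forecasting model such as ARIMA would then typically be fitted on top of, or instead of, this simple decomposition.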
- Information Technology > Artificial Intelligence > Machine Learning (0.93)
- Information Technology > Data Science (0.73)
- Information Technology > Communications > Web (0.66)
- Information Technology > Software (0.66)