Enhancing Public Speaking Skills in Engineering Students Through AI
Harsh, Amol, Prince, Brainerd, Siddharth, Siddharth, Muthirayan, Deepan Raj Prabakar, Bhalla, Kabir S, Gupta, Esraaj Sarkar, Sahu, Siddharth
This research-to-practice full paper was inspired by the persistent challenge of effective communication among engineering students. Public speaking is a necessary skill for future engineers, who must communicate technical knowledge to diverse stakeholders. While universities offer courses or workshops, they cannot provide sustained, personalized training to every student: giving comprehensive feedback on both the verbal and non-verbal aspects of public speaking is time-intensive, making consistent, individualized assessment impractical. This study integrates research on verbal and non-verbal cues in public speaking to develop an AI-driven assessment model for engineering students. Our approach combines speech analysis, computer vision, and sentiment detection into a multi-modal AI system that provides assessment and feedback. The model evaluates (1) verbal communication (pitch, loudness, pacing, intonation), (2) non-verbal communication (facial expressions, gestures, posture), and (3) expressive coherence, a novel integration ensuring alignment between speech and body language. Unlike previous systems that assess these aspects separately, our model fuses multiple modalities to deliver personalized, scalable feedback. Preliminary testing showed that the AI-generated feedback was moderately aligned with expert evaluations. Among the state-of-the-art Large Language Models (LLMs) evaluated, including Gemini and OpenAI models, Gemini Pro performed best, showing the strongest agreement with human annotators. By eliminating reliance on human evaluators, this AI-driven public speaking trainer enables repeated practice, helping students naturally align their speech with body language and emotion, which is crucial for impactful and professional communication.
- Education > Curriculum > Subject-Specific Education (1.00)
- Health & Medicine > Therapeutic Area (0.68)
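The verbal features the model above evaluates (pitch, loudness) are standard signal-level measures. As a minimal sketch of how they could be computed — with plain NumPy on a synthetic tone, not the paper's actual pipeline or parameters — frame-wise RMS approximates loudness and an autocorrelation peak estimates pitch:

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def rms_loudness(x, frame_len=1024, hop=512):
    """Frame-wise RMS energy, a simple proxy for loudness."""
    frames = frame_signal(x, frame_len, hop)
    return np.sqrt((frames ** 2).mean(axis=1))

def autocorr_pitch(x, sr, fmin=80.0, fmax=400.0):
    """Estimate fundamental frequency from the autocorrelation peak
    within the plausible speech range [fmin, fmax]."""
    x = x - x.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1 :]  # lags 0..N-1
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + np.argmax(ac[lo:hi])
    return sr / lag

# Synthetic 220 Hz tone standing in for one second of speech.
sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 220.0 * t)
pitch = autocorr_pitch(tone, sr)
loudness = rms_loudness(tone)
```

A real system would track these per frame over a recording and add pacing (e.g. syllable rate) and intonation (pitch contour) on top.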
Speaking the Right Language: The Impact of Expertise Alignment in User-AI Interactions
Palta, Shramay, Chandrasekaran, Nirupama, Rudinger, Rachel, Counts, Scott
Using a sample of 25,000 Bing Copilot conversations, we study how the agent responds to users of varying levels of domain expertise and the resulting impact on user experience along multiple dimensions. Our findings show that across a variety of topical domains, the agent largely responds at proficient or expert levels of expertise (77% of conversations) which correlates with positive user experience regardless of the user's level of expertise. Misalignment, such that the agent responds at a level of expertise below that of the user, has a negative impact on overall user experience, with the impact more profound for more complex tasks. We also show that users engage more, as measured by the number of words in the conversation, when the agent responds at a level of expertise commensurate with that of the user. Our findings underscore the importance of alignment between user and AI when designing human-centered AI systems, to ensure satisfactory and productive interactions.
- North America > United States > Florida > Miami-Dade County > Miami (0.05)
- Asia > Singapore (0.04)
- North America > United States > Maryland > Prince George's County > College Park (0.04)
- (4 more...)
- Information Technology > Human Computer Interaction (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.80)
Gemini Pro Defeated by GPT-4V: Evidence from Education
Lee, Gyeong-Geon, Latif, Ehsan, Shi, Lehong, Zhai, Xiaoming
This study compared the classification performance of Gemini Pro and GPT-4V in educational settings. Employing visual question answering (VQA) techniques, the study examined both models' abilities to read text-based rubrics and then automatically score student-drawn models in science education. We conducted both quantitative and qualitative analyses on a dataset of student-drawn scientific models, using NERIF (Notation-Enhanced Rubric Instruction for Few-shot Learning) prompting methods. The findings reveal that GPT-4V significantly outperforms Gemini Pro in terms of scoring accuracy and Quadratic Weighted Kappa. The qualitative analysis suggests that the differences may stem from the models' ability to process fine-grained text in images and their overall image classification performance. Even when the NERIF approach was adapted by further downsizing the input images, Gemini Pro could not match GPT-4V. The findings suggest GPT-4V has superior capability in handling complex multimodal educational tasks. The study concludes that while both models represent advancements in AI, GPT-4V's higher performance makes it a more suitable tool for educational applications involving multimodal data interpretation.
- North America > United States > Georgia > Clarke County > Athens (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Health & Medicine (1.00)
- Education > Educational Setting (0.66)
- Education > Curriculum > Subject-Specific Education (0.34)
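The Quadratic Weighted Kappa on which the two models above are compared is a chance-corrected agreement measure for ordinal scores, penalizing large disagreements quadratically. A minimal NumPy version (illustrative, not the paper's evaluation code) looks like:

```python
import numpy as np

def quadratic_weighted_kappa(a, b, n_classes):
    """QWK between two integer ratings a and b in [0, n_classes)."""
    a, b = np.asarray(a), np.asarray(b)
    # Observed confusion matrix.
    O = np.zeros((n_classes, n_classes))
    for i, j in zip(a, b):
        O[i, j] += 1
    # Quadratic disagreement weights, 0 on the diagonal.
    W = np.array([[(i - j) ** 2 for j in range(n_classes)]
                  for i in range(n_classes)]) / (n_classes - 1) ** 2
    # Expected matrix under independent marginals.
    E = np.outer(O.sum(axis=1), O.sum(axis=0)) / O.sum()
    return 1.0 - (W * O).sum() / (W * E).sum()
```

Perfect agreement yields 1.0, chance-level agreement 0, and systematic opposite scoring goes negative.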
NERIF: GPT-4V for Automatic Scoring of Drawn Models
Lee, Gyeong-Geon, Zhai, Xiaoming
Scoring student-drawn models is time-consuming. The recently released GPT-4V provides a unique opportunity to advance scientific modeling practices by leveraging its powerful image processing capability. To test this ability specifically for automatic scoring, we developed a method, NERIF (Notation-Enhanced Rubric Instruction for Few-shot Learning), employing instructional notes and rubrics to prompt GPT-4V to score students' drawn models of science phenomena. We randomly selected a balanced dataset (N = 900) that includes student-drawn models for six modeling assessment tasks. Each model received a score from GPT-4V at one of three levels, 'Beginning,' 'Developing,' or 'Proficient,' according to the scoring rubrics. GPT-4V scores were compared with human experts' scores to calculate scoring accuracy. Results show that GPT-4V's average scoring accuracy was .51 (SD = .037). Specifically, average scoring accuracy was .64 for the 'Beginning' class, .62 for the 'Developing' class, and .26 for the 'Proficient' class, indicating that more proficient models are more challenging to score. A further qualitative study reveals how GPT-4V retrieves information from the image input, including the problem context, example evaluations provided by human coders, and students' drawn models. We also uncovered how GPT-4V captures the characteristics of student-drawn models and narrates them in natural language. Finally, we demonstrated how GPT-4V assigns scores to student-drawn models according to the given scoring rubric and instructional notes. Our findings suggest that NERIF is an effective approach for employing GPT-4V to score drawn models. Even though there is room for GPT-4V to improve scoring accuracy, some mis-assigned scores seemed interpretable to experts. The results of this study show that utilizing GPT-4V for automatic scoring of student-drawn models is promising.
- North America > United States > Georgia > Clarke County > Athens (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Health & Medicine (1.00)
- Education > Curriculum > Subject-Specific Education (1.00)
- Education > Educational Setting > K-12 Education (0.93)
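The per-class accuracies reported above (.64, .62, .26) are, for each human-assigned level, the fraction of models GPT-4V scored identically. A small sketch with toy labels (not the paper's data) makes the computation concrete:

```python
from collections import Counter

LEVELS = ("Beginning", "Developing", "Proficient")

def per_class_accuracy(human, model):
    """For each human-assigned level, the fraction of cases the
    model scored the same as the human expert."""
    totals, hits = Counter(human), Counter()
    for h, m in zip(human, model):
        if h == m:
            hits[h] += 1
    return {lvl: hits[lvl] / totals[lvl] for lvl in LEVELS if totals[lvl]}

# Toy human-vs-model score pairs for five drawn models.
human = ["Beginning", "Beginning", "Developing", "Proficient", "Proficient"]
model = ["Beginning", "Developing", "Developing", "Beginning", "Proficient"]
acc = per_class_accuracy(human, model)
```

Breaking accuracy out per class is what exposes the pattern the paper highlights: aggregate accuracy can mask a much weaker 'Proficient' class.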
Applying Large Language Models and Chain-of-Thought for Automatic Scoring
Lee, Gyeong-Geon, Latif, Ehsan, Wu, Xuansheng, Liu, Ninghao, Zhai, Xiaoming
This study investigates the application of large language models (LLMs), specifically GPT-3.5 and GPT-4, with Chain-of-Thought (CoT) in the automatic scoring of student-written responses to science assessments. We focused on overcoming the challenges of accessibility, technical complexity, and lack of explainability that have previously limited the use of automatic assessment tools among researchers and educators. We used a testing dataset comprising six assessment tasks (three binomial and three trinomial) with 1,650 student responses. We employed six prompt engineering strategies, combining zero-shot or few-shot learning with CoT, either alone or alongside the item stem and scoring rubrics. Results indicated that few-shot learning (acc = .67) outperformed zero-shot learning (acc = .60), a 12.6% increase. CoT used without the item stem and scoring rubrics did not significantly affect scoring accuracy (acc = .60). However, CoT prompting paired with contextual item stems and rubrics proved to be a significant contributor to scoring accuracy (a 13.44% increase for zero-shot; 3.7% for few-shot). Using a novel approach, PPEAS, we found a more balanced accuracy across different proficiency categories, highlighting the importance of domain-specific reasoning in enhancing the effectiveness of LLMs in scoring tasks. We also found that GPT-4 demonstrated superior performance over GPT-3.5 in various scoring tasks, an 8.64% difference. The study revealed that the single-call strategy with GPT-4, particularly using greedy sampling, outperformed other approaches, including ensemble voting strategies. This study demonstrates the potential of LLMs in facilitating automatic scoring, emphasizing that CoT enhances accuracy, particularly when used with the item stem and scoring rubrics.
- North America > United States > Georgia > Clarke County > Athens (0.14)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Education > Educational Setting (0.94)
- Education > Assessment & Standards > Student Performance (0.68)
- Education > Curriculum > Subject-Specific Education (0.47)
- Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.46)
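The prompting conditions compared above (zero-/few-shot, with or without CoT, item stem, and rubric) amount to assembling different prompt texts before the model call. A hypothetical template sketch — the paper's exact wording and templates are not reproduced here — might compose them like this:

```python
def build_scoring_prompt(item_stem, rubric, examples, response, use_cot=True):
    """Assemble a scoring prompt from the item stem, rubric, optional
    few-shot examples, and the student response to score."""
    parts = [f"Item: {item_stem}", f"Rubric: {rubric}"]
    for ex_response, ex_score in examples:  # empty list -> zero-shot
        parts.append(f"Student response: {ex_response}\nScore: {ex_score}")
    parts.append(f"Student response: {response}")
    parts.append("Let's think step by step, then give the score."
                 if use_cot else "Score:")
    return "\n\n".join(parts)

# Few-shot + CoT condition (toy item, not from the paper's dataset).
few_shot_cot = build_scoring_prompt(
    "Explain why dew forms overnight.",
    "2 = mentions cooling and condensation; 1 = one of the two; 0 = neither.",
    [("Air cools and water vapor condenses.", 2)],
    "Water appears because it gets cold.",
    use_cot=True,
)
# Zero-shot, no-CoT condition: no examples, direct score request.
zero_shot = build_scoring_prompt(
    "Explain why dew forms overnight.",
    "2 = mentions cooling and condensation; 1 = one of the two; 0 = neither.",
    [],
    "Water appears because it gets cold.",
    use_cot=False,
)
```

Crossing these options (examples present or not, CoT cue present or not, stem/rubric included or withheld) yields the kind of strategy grid the study evaluates.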
Senior Communications Software Engineer - Fort Wayne, IN
Secmation is an equal opportunity employer. Qualified Applicants will be considered without regard to age, race, creed, color, national origin, ancestry, marital status, sex, affectional or sexual orientation, gender identity or expression, disability, nationality, or veteran status. Certain positions at Secmation may require specific physical/mental abilities. Additional information and provisions for reasonable accommodation will be provided by the hiring manager.
- North America > United States > Indiana > Allen County > Fort Wayne (0.76)
- North America > United States > Texas > Bexar County > San Antonio (0.06)
- North America > United States > North Carolina > Wake County > Raleigh (0.06)
- (2 more...)
Intern, Data Scientist at Western Digital - San Jose, CA, United States
At Western Digital, our vision is to power global innovation and push the boundaries of technology to make what you thought was once impossible, possible. At our core, Western Digital is a company of problem solvers. People achieve extraordinary things given the right technology. For decades, we've been doing just that. Our technology helped people put a man on the moon.
- Information Technology > Artificial Intelligence (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.36)
Venture Build Engineer (Open to Remote) - Remote Tech Jobs
At American Family Insurance, we believe people are an organization's most valuable asset, and their ideas and experiences matter. From our CEO to our agency force, we're committed to growing a diverse and inclusive culture that empowers innovation that will inspire, protect, and restore our customers' dreams in ways never imagined. American Family Insurance is driven by our customers and employees. That's why we provide more than just a job – we provide opportunity. Every dream is a journey that starts with a single step.
"Proficient in Machine Learning" is a Must-Have on Your Resume
Training an algorithm to predict future outcomes, using a PCA algorithm to uncover clients' personality traits, uploading a corpus of text to extract sentiment, and grouping 650,000 lines of CRM and weblog data to cluster clients with machine learning all sounded like unreachable rocket science to me a while ago. Now I do it as easily as I use Excel or Illustrator. They are my new secret weapons. In 5 years, "proficient in machine learning" will be a must-have on any manager's resume (just like project management or strategic thinking/analytical thinking is today). Harvard Business Review says "The most important general-purpose technology of our era is artificial intelligence, particularly machine learning (ML)." But for now it's still a rare skill, a strong competitive advantage that only early adopters hold. And this might surprise you, but it's already within arm's reach. It's changed my brain a bit. As our lead data scientist Bernardo Nunes puts it…I've invested in a "growth ...