Enhancing Public Speaking Skills in Engineering Students Through AI
Harsh, Amol, Prince, Brainerd, Siddharth, Siddharth, Muthirayan, Deepan Raj Prabakar, Bhalla, Kabir S, Gupta, Esraaj Sarkar, Sahu, Siddharth
This research-to-practice full paper was inspired by the persistent challenge of effective communication among engineering students. Public speaking is a necessary skill for future engineers, who must communicate technical knowledge to diverse stakeholders. While universities offer courses or workshops, they cannot provide sustained, personalized training to every student: giving comprehensive feedback on both the verbal and non-verbal aspects of public speaking is time-intensive, making consistent, individualized assessment impractical. This study integrates research on verbal and non-verbal cues in public speaking to develop an AI-driven assessment model for engineering students. Our approach combines speech analysis, computer vision, and sentiment detection into a multi-modal AI system that provides assessment and feedback. The model evaluates (1) verbal communication (pitch, loudness, pacing, intonation), (2) non-verbal communication (facial expressions, gestures, posture), and (3) expressive coherence, a novel integration ensuring alignment between speech and body language. Unlike previous systems that assess these aspects separately, our model fuses multiple modalities to deliver personalized, scalable feedback. Preliminary testing showed that the AI-generated feedback was moderately aligned with expert evaluations. Among the state-of-the-art Large Language Models (LLMs) evaluated, including Gemini and OpenAI models, Gemini Pro performed best, showing the strongest agreement with human annotators. By eliminating reliance on human evaluators, this AI-driven public speaking trainer enables repeated practice, helping students naturally align their speech with body language and emotion, which is crucial for impactful and professional communication.
- Education > Curriculum > Subject-Specific Education (1.00)
- Health & Medicine > Therapeutic Area (0.68)
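The verbal features the model above evaluates (pitch, loudness) are standard signal-level measures. As a minimal sketch of how they could be computed — with plain NumPy on a synthetic tone, not the paper's actual pipeline or parameters — frame-wise RMS approximates loudness and an autocorrelation peak estimates pitch:

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def rms_loudness(x, frame_len=1024, hop=512):
    """Frame-wise RMS energy, a simple proxy for loudness."""
    frames = frame_signal(x, frame_len, hop)
    return np.sqrt((frames ** 2).mean(axis=1))

def autocorr_pitch(x, sr, fmin=80.0, fmax=400.0):
    """Estimate fundamental frequency from the autocorrelation peak
    within the plausible speech range [fmin, fmax]."""
    x = x - x.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1 :]  # lags 0..N-1
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + np.argmax(ac[lo:hi])
    return sr / lag

# Synthetic 220 Hz tone standing in for one second of speech.
sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 220.0 * t)
pitch = autocorr_pitch(tone, sr)
loudness = rms_loudness(tone)
```

A real system would track these per frame over a recording and add pacing (e.g. syllable rate) and intonation (pitch contour) on top.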
Speaking the Right Language: The Impact of Expertise Alignment in User-AI Interactions
Palta, Shramay, Chandrasekaran, Nirupama, Rudinger, Rachel, Counts, Scott
Using a sample of 25,000 Bing Copilot conversations, we study how the agent responds to users of varying levels of domain expertise and the resulting impact on user experience along multiple dimensions. Our findings show that across a variety of topical domains, the agent largely responds at proficient or expert levels of expertise (77% of conversations) which correlates with positive user experience regardless of the user's level of expertise. Misalignment, such that the agent responds at a level of expertise below that of the user, has a negative impact on overall user experience, with the impact more profound for more complex tasks. We also show that users engage more, as measured by the number of words in the conversation, when the agent responds at a level of expertise commensurate with that of the user. Our findings underscore the importance of alignment between user and AI when designing human-centered AI systems, to ensure satisfactory and productive interactions.
- North America > United States > Florida > Miami-Dade County > Miami (0.05)
- Asia > Singapore (0.04)
- North America > United States > Maryland > Prince George's County > College Park (0.04)
- (4 more...)
- Information Technology > Human Computer Interaction (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.80)
Gemini Pro Defeated by GPT-4V: Evidence from Education
Lee, Gyeong-Geon, Latif, Ehsan, Shi, Lehong, Zhai, Xiaoming
This study compared the classification performance of Gemini Pro and GPT-4V in educational settings. Employing visual question answering (VQA) techniques, the study examined both models' abilities to read text-based rubrics and then automatically score student-drawn models in science education. We conducted both quantitative and qualitative analyses on a dataset of student-drawn scientific models, using NERIF (Notation-Enhanced Rubric Instruction for Few-shot Learning) prompting methods. The findings reveal that GPT-4V significantly outperforms Gemini Pro in terms of scoring accuracy and Quadratic Weighted Kappa. The qualitative analysis suggests that the differences may stem from the models' ability to process fine-grained text in images and their overall image classification performance. Even when the NERIF approach was adapted by further downsizing the input images, Gemini Pro could not match GPT-4V. The findings suggest GPT-4V has superior capability in handling complex multimodal educational tasks. The study concludes that while both models represent advancements in AI, GPT-4V's higher performance makes it a more suitable tool for educational applications involving multimodal data interpretation.
- North America > United States > Georgia > Clarke County > Athens (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Health & Medicine (1.00)
- Education > Educational Setting (0.66)
- Education > Curriculum > Subject-Specific Education (0.34)
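The Quadratic Weighted Kappa on which the two models above are compared is a chance-corrected agreement measure for ordinal scores, penalizing large disagreements quadratically. A minimal NumPy version (illustrative, not the paper's evaluation code) looks like:

```python
import numpy as np

def quadratic_weighted_kappa(a, b, n_classes):
    """QWK between two integer ratings a and b in [0, n_classes)."""
    a, b = np.asarray(a), np.asarray(b)
    # Observed confusion matrix.
    O = np.zeros((n_classes, n_classes))
    for i, j in zip(a, b):
        O[i, j] += 1
    # Quadratic disagreement weights, 0 on the diagonal.
    W = np.array([[(i - j) ** 2 for j in range(n_classes)]
                  for i in range(n_classes)]) / (n_classes - 1) ** 2
    # Expected matrix under independent marginals.
    E = np.outer(O.sum(axis=1), O.sum(axis=0)) / O.sum()
    return 1.0 - (W * O).sum() / (W * E).sum()
```

Perfect agreement yields 1.0, chance-level agreement 0, and systematic opposite scoring goes negative.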
NERIF: GPT-4V for Automatic Scoring of Drawn Models
Lee, Gyeong-Geon, Zhai, Xiaoming
Scoring student-drawn models is time-consuming. The recently released GPT-4V provides a unique opportunity to advance scientific modeling practices by leveraging its powerful image processing capability. To test this ability specifically for automatic scoring, we developed a method, NERIF (Notation-Enhanced Rubric Instruction for Few-shot Learning), employing instructional notes and rubrics to prompt GPT-4V to score students' drawn models of science phenomena. We randomly selected a balanced dataset (N = 900) that includes student-drawn models for six modeling assessment tasks. Each model received a score from GPT-4V at one of three levels, 'Beginning,' 'Developing,' or 'Proficient,' according to the scoring rubrics. GPT-4V scores were compared with human experts' scores to calculate scoring accuracy. Results show that GPT-4V's average scoring accuracy was .51 (SD = .037). Specifically, average scoring accuracy was .64 for the 'Beginning' class, .62 for the 'Developing' class, and .26 for the 'Proficient' class, indicating that more proficient models are more challenging to score. A further qualitative study reveals how GPT-4V retrieves information from the image input, including the problem context, example evaluations provided by human coders, and students' drawn models. We also uncovered how GPT-4V captures the characteristics of student-drawn models and narrates them in natural language. Finally, we demonstrated how GPT-4V assigns scores to student-drawn models according to the given scoring rubric and instructional notes. Our findings suggest that NERIF is an effective approach for employing GPT-4V to score drawn models. Even though there is room for GPT-4V to improve scoring accuracy, some mis-assigned scores seemed interpretable to experts. The results of this study show that utilizing GPT-4V for automatic scoring of student-drawn models is promising.
- North America > United States > Georgia > Clarke County > Athens (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Health & Medicine (1.00)
- Education > Curriculum > Subject-Specific Education (1.00)
- Education > Educational Setting > K-12 Education (0.93)
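The per-class accuracies reported above (.64, .62, .26) are, for each human-assigned level, the fraction of models GPT-4V scored identically. A small sketch with toy labels (not the paper's data) makes the computation concrete:

```python
from collections import Counter

LEVELS = ("Beginning", "Developing", "Proficient")

def per_class_accuracy(human, model):
    """For each human-assigned level, the fraction of cases the
    model scored the same as the human expert."""
    totals, hits = Counter(human), Counter()
    for h, m in zip(human, model):
        if h == m:
            hits[h] += 1
    return {lvl: hits[lvl] / totals[lvl] for lvl in LEVELS if totals[lvl]}

# Toy human-vs-model score pairs for five drawn models.
human = ["Beginning", "Beginning", "Developing", "Proficient", "Proficient"]
model = ["Beginning", "Developing", "Developing", "Beginning", "Proficient"]
acc = per_class_accuracy(human, model)
```

Breaking accuracy out per class is what exposes the pattern the paper highlights: aggregate accuracy can mask a much weaker 'Proficient' class.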
Applying Large Language Models and Chain-of-Thought for Automatic Scoring
Lee, Gyeong-Geon, Latif, Ehsan, Wu, Xuansheng, Liu, Ninghao, Zhai, Xiaoming
This study investigates the application of large language models (LLMs), specifically GPT-3.5 and GPT-4, with Chain-of-Thought (CoT) in the automatic scoring of student-written responses to science assessments. We focused on overcoming the challenges of accessibility, technical complexity, and lack of explainability that have previously limited the use of automatic assessment tools among researchers and educators. We used a testing dataset comprising six assessment tasks (three binomial and three trinomial) with 1,650 student responses. We employed six prompt engineering strategies, combining zero-shot or few-shot learning with CoT, either alone or alongside the item stem and scoring rubrics. Results indicated that few-shot learning (acc = .67) outperformed zero-shot learning (acc = .60), a 12.6% increase. CoT used without the item stem and scoring rubrics did not significantly affect scoring accuracy (acc = .60). However, CoT prompting paired with contextual item stems and rubrics proved to be a significant contributor to scoring accuracy (a 13.44% increase for zero-shot; 3.7% for few-shot). Using a novel approach, PPEAS, we found a more balanced accuracy across different proficiency categories, highlighting the importance of domain-specific reasoning in enhancing the effectiveness of LLMs in scoring tasks. We also found that GPT-4 demonstrated superior performance over GPT-3.5 in various scoring tasks, an 8.64% difference. The study revealed that the single-call strategy with GPT-4, particularly using greedy sampling, outperformed other approaches, including ensemble voting strategies. This study demonstrates the potential of LLMs in facilitating automatic scoring, emphasizing that CoT enhances accuracy, particularly when used with the item stem and scoring rubrics.
- North America > United States > Georgia > Clarke County > Athens (0.14)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Education > Educational Setting (0.94)
- Education > Assessment & Standards > Student Performance (0.68)
- Education > Curriculum > Subject-Specific Education (0.47)
- Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.46)
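The prompting conditions compared above (zero-/few-shot, with or without CoT, item stem, and rubric) amount to assembling different prompt texts before the model call. A hypothetical template sketch — the paper's exact wording and templates are not reproduced here — might compose them like this:

```python
def build_scoring_prompt(item_stem, rubric, examples, response, use_cot=True):
    """Assemble a scoring prompt from the item stem, rubric, optional
    few-shot examples, and the student response to score."""
    parts = [f"Item: {item_stem}", f"Rubric: {rubric}"]
    for ex_response, ex_score in examples:  # empty list -> zero-shot
        parts.append(f"Student response: {ex_response}\nScore: {ex_score}")
    parts.append(f"Student response: {response}")
    parts.append("Let's think step by step, then give the score."
                 if use_cot else "Score:")
    return "\n\n".join(parts)

# Few-shot + CoT condition (toy item, not from the paper's dataset).
few_shot_cot = build_scoring_prompt(
    "Explain why dew forms overnight.",
    "2 = mentions cooling and condensation; 1 = one of the two; 0 = neither.",
    [("Air cools and water vapor condenses.", 2)],
    "Water appears because it gets cold.",
    use_cot=True,
)
# Zero-shot, no-CoT condition: no examples, direct score request.
zero_shot = build_scoring_prompt(
    "Explain why dew forms overnight.",
    "2 = mentions cooling and condensation; 1 = one of the two; 0 = neither.",
    [],
    "Water appears because it gets cold.",
    use_cot=False,
)
```

Crossing these options (examples present or not, CoT cue present or not, stem/rubric included or withheld) yields the kind of strategy grid the study evaluates.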
Senior Communications Software Engineer - Fort Wayne, IN
Secmation is an equal opportunity employer. Qualified Applicants will be considered without regard to age, race, creed, color, national origin, ancestry, marital status, sex, affectional or sexual orientation, gender identity or expression, disability, nationality, or veteran status. Certain positions at Secmation may require specific physical/mental abilities. Additional information and provisions for reasonable accommodation will be provided by the hiring manager.
- North America > United States > Indiana > Allen County > Fort Wayne (0.76)
- North America > United States > Texas > Bexar County > San Antonio (0.06)
- North America > United States > North Carolina > Wake County > Raleigh (0.06)
- (2 more...)
Intern, Data Scientist at Western Digital - San Jose, CA, United States
At Western Digital, our vision is to power global innovation and push the boundaries of technology to make what you thought was once impossible, possible. At our core, Western Digital is a company of problem solvers. People achieve extraordinary things given the right technology. For decades, we've been doing just that. Our technology helped people put a man on the moon.
- Information Technology > Artificial Intelligence (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.36)
Venture Build Engineer (Open to Remote) - Remote Tech Jobs
At American Family Insurance, we believe people are an organization's most valuable asset, and their ideas and experiences matter. From our CEO to our agency force, we're committed to growing a diverse and inclusive culture that empowers innovation that will inspire, protect, and restore our customers' dreams in ways never imagined. American Family Insurance is driven by our customers and employees. That's why we provide more than just a job – we provide opportunity. Every dream is a journey that starts with a single step.
"Proficient in Machine Learning" is a Must-Have on Your Resume
Training an algorithm to predict future outcomes, using a PCA algorithm to uncover clients' personality traits, uploading a corpus of text to extract sentiment, and grouping 650,000 lines of CRM and weblog data to cluster clients with machine learning all sounded like unreachable rocket science to me a while ago. Now I do it as easily as I use Excel or Illustrator. They are my new secret weapons. In 5 years, "proficient in machine learning" will be a must-have on any manager's resume (just like project management or strategic thinking/analytical thinking is today). Harvard Business Review says "The most important general-purpose technology of our era is artificial intelligence, particularly machine learning (ML)." But for now it's still a rare skill, a strong competitive advantage that only early adopters hold. And this might surprise you, but it's already within arm's reach. It's changed my brain a bit. As our lead data scientist Bernardo Nunes puts it…I've invested in a "growth ...