Learning about Machine Learning: An Extended Assignment to Classify Twitter Accounts
Mustafaraj, Eni (Wellesley College) | Anderson, Scott D. (Wellesley College)
We describe a four-week series of assignments in an undergraduate AI course at a liberal arts college developing a supervised learning solution to the problem of classifying Twitter accounts as either a person account or a non-person account (e.g. organization or spambot). This problem employs real data in an ongoing research project by the first author, yet is accessible to students with limited programming expertise. The students were able to experience a complete cycle of creating a machine learning solution: exploring raw data, creating a training set, engineering features, comparing different classifiers, evaluating the results, and performing error analysis. We received positive feedback from the students and intend to refine the assignment and make it available (together with the created training data) for use by the research community.
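The cycle described in this abstract (training set, feature engineering, classifier comparison, evaluation) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' assignment: the account records, the three features, the word lists, and the two classifiers (a majority-class baseline and a nearest-centroid rule) are all hypothetical choices made here for the example.

```python
import math
import re
from collections import Counter

# Hypothetical hand-labelled accounts: (bio, followers, following, label).
# All values are invented for illustration; they are not the course data.
ACCOUNTS = [
    ("Mom, runner, coffee lover", 180, 210, "person"),
    ("Grad student. I tweet about books", 95, 130, "person"),
    ("My views are my own", 420, 390, "person"),
    ("Dad and guitar player", 60, 85, "person"),
    ("Just me and my dog", 240, 260, "person"),
    ("Official news and deals from our store", 52000, 12, "non-person"),
    ("Breaking news 24/7", 910000, 40, "non-person"),
    ("Follow us for free offers", 15, 4800, "non-person"),
]

# Assumed word lists for two hand-engineered features.
FIRST_PERSON = {"i", "my", "me"}
ORG_WORDS = {"official", "us", "our", "news"}


def features(bio, followers, following):
    """Turn one raw account record into a numeric feature vector."""
    words = set(re.findall(r"[a-z]+", bio.lower()))
    return (
        1.0 if words & FIRST_PERSON else 0.0,  # first-person wording in bio
        1.0 if words & ORG_WORDS else 0.0,     # organization-style wording
        # Imbalance between followers and following (+1 avoids division by zero).
        abs(math.log10((followers + 1) / (following + 1))),
    )


def majority_predict(train, x):
    """Baseline: always predict the most common label in the training data."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def centroid_predict(train, x):
    """Predict the label whose mean feature vector is closest to x."""
    by_label = {}
    for vec, label in train:
        by_label.setdefault(label, []).append(vec)
    centroids = {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in by_label.items()
    }
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))


def leave_one_out_accuracy(predict, data):
    """Hold out each example once, train on the rest, and score the prediction."""
    hits = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += predict(train, x) == y
    return hits / len(data)


DATA = [(features(bio, fo, fg), label) for bio, fo, fg, label in ACCOUNTS]
BASELINE_ACC = leave_one_out_accuracy(majority_predict, DATA)
CENTROID_ACC = leave_one_out_accuracy(centroid_predict, DATA)

if __name__ == "__main__":
    print("majority baseline:", BASELINE_ACC)
    print("nearest centroid: ", CENTROID_ACC)
```

Comparing against a majority-class baseline is a common way to check that engineered features carry signal at all. Error analysis would then inspect the held-out accounts that a classifier gets wrong.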
A New Set of Eyes and a New Pair of Legs: A Robust Learning Environment for Advanced High School Robotics
Karnowski, Jeremy (University of California, San Diego) | Touretzky, David S. (Carnegie Mellon University)
Tekkotsu is an open source application development framework for intelligent mobile robots. The framework was originally designed for undergraduate computer science majors, and recent refinements to it have led us to explore its use with high school students. We developed a pilot course curriculum to introduce high level robotics to students with little or no programming experience in a way that provides improved feedback and error detection on multiple levels. The use of visualization tools and pair programming techniques scaffolds the learning process, provides a systematic way to introduce robotics to novices as a fun and worthwhile endeavor, and helps instructors efficiently address students’ concerns in real time.
Myro-C++: An Open Source C++ Library for CS Education Using AI
Hoare, John Robert (University of Tennessee) | Edwards, Richard E. (University of Tennessee) | MacLennan, Bruce J. (University of Tennessee) | Parker, Lynne E. (University of Tennessee)
In this paper we present Myro-C++, developed at the University of Tennessee. Myro-C++ is a C++ port of the Python Myro library that was written by the Institute for Personal Robots in Education (IPRE) at Georgia Tech and Bryn Mawr College. Myro-C++ is publicly available, open source software, released under the GPLv3 open source license. At the time of writing, the library has been used for six semesters in the CS1 course at the University of Tennessee, Knoxville. The library contains functions for control of the robot and access to sensor information, and provides the ability to display the live camera image from the robot in a video window. This library is used as a teaching tool in our CS1 course, where students learn basic programming fundamentals using multiple artificial intelligence based labs. In addition to the software, the IPRE book, Learning Computing with Robots, has been edited to use C++ examples and explanations, and is freely available. We also present example programs that we use as laboratory assignments in our Introduction to Computer Science course, which are also freely available.
Intentional Analysis of Medical Conversations for Community Engagement
Sahay, Saurav (Georgia Institute of Technology)
With the explosion of user-generated content in communities, information overload is increasing and the quality of readily available online content is deteriorating. There is an increasing need for intelligent systems that make use of implicit user-generated knowledge in communities for community engagement. We describe our approach based on modeling user utterances in communities to proactively target the community for exchange of questions and answers. We envision a system that automatically encourages user engagement and participation by routing relevant conversations to users based on individual and community activity levels. In this paper, we analyze health forum conversations from WebMD, a popular consumer health portal, and classify them into different speech acts using Verbal Response Modes (VRM) theory. We describe our approach for modeling an intelligent community recommender to engage participants based on observations from our analysis.
Geotagging Tweets Using Their Content
Paradesi, Sharon Myrtle (Massachusetts Institute of Technology)
Harnessing rich but unstructured information on social networks in real time and showing it to a relevant audience based on geographic location is a major challenge. The system developed, TwitterTagger, geotags tweets and shows them to users based on their current physical location. Experimental validation shows a performance improvement of three orders by TwitterTagger compared to that of the baseline model.
A Contrastive Corpus Analysis of Modern Art Criticism and Photography Criticism
Hullender, Arthur (University of Memphis) | McCarthy, Philip M. (University of Memphis)
In this study, we analyze two corpora of art critiques: one on the subject of photography and the other on the subject of modern art. We use two computational tools, the Gramulator and GPAT, to analyze both sets of texts. The Gramulator was used to show the indicative linguistic features that make photography criticism a distinct genre from modern art criticism. Results suggest that lexical features, structural formats, and genre consistency differed significantly between the two corpora. The findings provide information for teachers, students, publishers, and curriculum developers for creating more effective writing and teaching materials. This includes material for English for Specific Purposes (ESP) in the form of textbooks, workbooks, and other external learning material.
Differential Linguistic Features in U.S. Immigration Newspaper Articles: A Contrastive Corpus Analysis Using the Gramulator
Haertl, Barbara E. (The University of Memphis) | McCarthy, Philip M. (The University of Memphis)
Immigration is a national issue in the United States; however, regional implications differ because of immigrants' varying effects on local economies. These implications are made manifest in the reportage of local newspapers, which, while ostensibly portraying "objective" language, may reveal the narrative of local perspectives on national issues. Our corpus comprises 752 texts, culled from newspapers of U.S. border states (approximately 75 texts per state). Because four states border Mexico, we selected four matching states (of the 11) that border Canada. To do so, we considered the following criteria for all 15 terrestrial border states: total population, immigrant population, length of international border, and political leaning. These data were input into a custom PERL script designed to ...
Evaluating Conversational Characters Created through Question Generation
Chen, Grace (California State University Long Beach) | Tosch, Emma (Brandeis University) | Artstein, Ron (USC Institute for Creative Technologies) | Leuski, Anton (USC Institute for Creative Technologies) | Traum, David (USC Institute for Creative Technologies)
Question generation tools can be used to extract a question-answer database from text articles. We investigate how suitable this technique is for giving domain-specific knowledge to conversational characters. We tested these characters by collecting questions and answers from naive participants, running the questions through the character, and comparing the system responses to the participant answers. Characters gave a full or partial answer to 53% of the user questions which had an answer available in the source text, and 43% of all questions asked. Performance was better for questions asked after the user had read the source text, and also varied by question type: the best results were answers to who questions, while answers to yes/no questions were among the poorer performers. The results show that question generation is a promising method for creating a question answering conversational character from an existing text.
Number of Words Versus Number of Ideas: Finding a Better Predictor of Writing Quality
Weston, Jennifer L. (University of Memphis) | Crossley, Scott A. (Georgia State University) | McCarthy, Philip M. (University of Memphis) | McNamara, Danielle S. (University of Memphis)
This study examines the relation between the linguistic features of freewrites and human assessments of freewriting quality. It builds upon the authors’ previous studies, in which a model was developed based on the linguistic features of freewrites written by 9th and 11th grade students to predict freewrite quality. The current study reexamines this model using number of propositions as a predictor instead of number of words, because the number of propositions was expected to be a better proxy for number of ideas than simple text length. The results showed only slight advantages for using a measure of the number of propositions, suggesting that, from an artificial intelligence perspective, the number of words was the better measure.
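One simple way to compare two candidate predictors of a quality score is to compute each predictor's Pearson correlation with the human ratings. The sketch below illustrates that comparison only; it is not the authors' model, and the word counts, proposition counts, and quality scores are invented toy values that say nothing about the study's findings.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented toy data for six hypothetical freewrites.
word_counts = [120, 200, 150, 310, 260, 90]
proposition_counts = [30, 52, 41, 80, 63, 25]
quality_scores = [2, 4, 3, 5, 4, 2]

# Correlation of each candidate predictor with the human quality ratings.
R_WORDS = pearson_r(word_counts, quality_scores)
R_PROPS = pearson_r(proposition_counts, quality_scores)

if __name__ == "__main__":
    print("words vs. quality:       ", round(R_WORDS, 3))
    print("propositions vs. quality:", round(R_PROPS, 3))
```

Counting words is trivial to automate, whereas counting propositions requires parsing, so a predictor that is only slightly better may not justify the extra machinery.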