This work compares user collaboration with conversational personal assistants vs. teams of expert chatbots. Two studies were performed to investigate whether each approach affects accomplishment of tasks and collaboration costs. Participants interacted with two equivalent financial advice chatbot systems, one composed of a single conversational adviser and the other based on a team of four experts chatbots. Results indicated that users had different forms of experiences but were equally able to achieve their goals. Contrary to the expected, there were evidences that in the teamwork situation that users were more able to predict agent behavior better and did not have an overhead to maintain common ground, indicating similar collaboration costs. The results point towards the feasibility of either of the two approaches for user collaboration with conversational agents.
Crowdsourcing is a popular methodology to collect manual labels at scale. Such labels are often used to train AI models and, thus, quality control is a key aspect in the process. One of the most popular quality assurance mechanisms in paid micro-task crowdsourcing is based on gold questions: the use of a small set of tasks of which the requester knows the correct answer and, thus, is able to directly assess crowd work quality. In this paper, we show that such mechanism is prone to an attack carried out by a group of colluding crowd workers that is easy to implement and deploy: the inherent size limit of the gold set can be exploited by building an inferential system to detect which parts of the job are more likely to be gold questions. The described attack is robust to various forms of randomisation and programmatic generation of gold questions. We present the architecture of the proposed system, composed of a browser plug-in and an external server used to share information, and briefly introduce its potential evolution to a decentralised implementation. We implement and experimentally validate the gold detection system, using real-world data from a popular crowdsourcing platform. Our experimental results show that crowdworkers using the proposed system spend more time on signalled gold questions but do not neglect the others thus achieving an increased overall work quality. Finally, we discuss the economic and sociological implications of this kind of attack.
What if I told a story here, how would that story start?" Thus, the summarization prompt: "My second grader asked me what this passage means: …" When a given prompt isn't working and GPT-3 keeps pivoting into other modes of completion, that may mean that one hasn't constrained it enough by imitating a correct output, and one needs to go further; writing the first few words or sentence of the target output may be necessary.
Trump himself said his select Advisory Commission on Election Integrity -- vice-chaired by voter-fraud hard-liner Kris Kobach, Kansas' secretary of state -- had found evidence of widespread voter fraud in records it collected from about 20 states. That's despite multiple academic and independent studies showing the problem of illegal voting is miniscule at worst.