mmpi-2
Assessing the nature of large language models: A caution against anthropocentrism
Generative AI models garnered a large amount of public attention and speculation with the release of OpenAI's chatbot, ChatGPT. At least two opinion camps exist: one excited about the possibilities these models offer for fundamental changes to human tasks, and another highly concerned about the power these models seem to have. To address these concerns, we assessed several LLMs, primarily GPT-3.5, using standard, normed, and validated cognitive and personality measures. For this seedling project, we developed a battery of tests that allowed us to estimate the boundaries of some of these models' capabilities, how stable those capabilities are over a short period of time, and how they compare to humans. Our results indicate that LLMs are unlikely to have developed sentience, although their ability to respond to personality inventories is interesting. GPT-3.5 did display large variability in both cognitive and personality measures over repeated observations, which would not be expected if it had a human-like personality. Variability notwithstanding, LLMs display what in a human would be considered poor mental health, including low self-esteem, marked dissociation from reality, and in some cases narcissism and psychopathy, despite upbeat and helpful responses.
Minnesota Multiphasic Personality Inventory-2 (MMPI-2)
The original Minnesota Multiphasic Personality Inventory (MMPI) was published in 1940, and its revised second edition, the MMPI-2, was published in 1989. The MMPI-2 is the most widely used psychometric test for measuring adult psychopathology in the world, and it is used in mental health, medical, and employment settings. The test developers, Hathaway and McKinley, used an empirical test construction technique to develop the MMPI. Under this approach, an item was assigned to a scale (for example, the hypochondriasis scale) because responses to it actually differentiated people with hypochondriasis from 'normals', not because its content appeared relevant to the condition.
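The empirical construction technique described above can be illustrated with a minimal Python sketch. It is not the procedure Hathaway and McKinley actually followed: the function names (`endorsement_rate`, `key_items`, `raw_score`), the fixed `min_gap` threshold, and the toy true/false response data are all assumptions made for illustration. The idea it shows is only that an item is keyed to a scale when a criterion group and a comparison group answer it at clearly different rates.

```python
from typing import Dict, List

Responses = List[List[bool]]  # one row per respondent, one True/False answer per item


def endorsement_rate(responses: Responses, item: int) -> float:
    """Fraction of respondents who answered True to the given item."""
    return sum(row[item] for row in responses) / len(responses)


def key_items(criterion: Responses, normals: Responses,
              min_gap: float = 0.3) -> Dict[int, bool]:
    """Select items that separate the criterion group from the comparison group.

    Returns {item_index: keyed_answer}, where keyed_answer is the response
    that is more common in the criterion group. `min_gap` is a hypothetical
    cutoff standing in for whatever statistical criterion a real test uses.
    """
    keyed: Dict[int, bool] = {}
    for item in range(len(criterion[0])):
        gap = endorsement_rate(criterion, item) - endorsement_rate(normals, item)
        if abs(gap) >= min_gap:
            keyed[item] = gap > 0
    return keyed


def raw_score(answers: List[bool], keyed: Dict[int, bool]) -> int:
    """Count how many keyed items were answered in the keyed direction."""
    return sum(1 for item, direction in keyed.items() if answers[item] == direction)


# Toy data (invented): four respondents per group, four items.
criterion_group = [
    [True, True, False, True],
    [True, False, False, True],
    [True, True, False, False],
    [True, True, True, True],
]
normal_group = [
    [False, True, True, True],
    [False, False, True, False],
    [True, True, True, True],
    [False, True, True, True],
]

scale_key = key_items(criterion_group, normal_group)
print(scale_key)                                         # items kept for the scale
print(raw_score([True, False, False, True], scale_key))  # score for one respondent
```

In this toy data only items 0 and 2 separate the two groups, so only they enter the scale. Items 1 and 3 are dropped no matter what their wording suggests, which is the sense in which the method is empirical.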