A multidisciplinary group of UNAM scientists is making progress in a project to detect presumed suicidal ideas in texts from users of social networks such as Facebook and Twitter, using computational linguistic techniques. The research is led by Gerardo Sierra Martínez, head of the Engineering Linguistics group at the Institute of Engineering (II), and Patricia Andrade Palos, a graduate student at the Faculty of Psychology (FP) of UNAM.
One emerging way of expressing an intention to commit suicide is through social platforms and similar spaces on the Internet. On these sites the intention is expressed in writing, through discussion of the act and, in the worst cases, its promotion. It is therefore necessary to understand how it is typically expressed in these virtual environments and to use methods such as language analysis to develop detection tools that contribute to preventive work.
The project seeks linguistic features that can be identified and processed for risk detection, which would make it possible to detect people who may wish to make an attempt on their own lives. The lexicon of groups of Facebook and Twitter users was counted and compared against random texts on other topics. The users remain confidential: their profiles were not accessed, so their identities are not known.
We were able to establish the linguistic difference between people who signal some presumed risk and those who talk about any other everyday topic. How was this achieved? By counting words grouped into different linguistic and psychological categories. Among the findings, users at risk talk about themselves, always in the first person singular; they do not use the plural 'we' or the second person 'you'.
Sentences with presumed suicidal ideation may contain phrases such as "I feel this way"; "I am thinking"; "why is this happening to me"; "it has happened to me...". Concepts such as "crying", "despair", "loneliness", "frustration", "depressed", and "pessimistic" also appear. Likewise, there are categories of words that express anxiety, anguish, sadness, or death, but they are invariably accompanied by "I".
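The category count described above can be illustrated with a minimal sketch. The word lists below are a hypothetical toy lexicon in English, chosen only from the examples quoted in this article; the study's actual categories and its Mexican Spanish vocabulary are not reproduced here.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (assumption): not the study's real word lists.
CATEGORIES = {
    "first_person_singular": {"i", "me", "my", "myself"},
    "first_person_plural": {"we", "us", "our"},
    "second_person": {"you", "your"},
    "sadness": {"crying", "despair", "loneliness", "depressed", "pessimistic"},
}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def category_counts(text):
    """Count how many tokens of the text fall into each category."""
    counts = Counter()
    for token in tokenize(text):
        for name, words in CATEGORIES.items():
            if token in words:
                counts[name] += 1
    return counts

sample = "Why is this happening to me? I feel despair and I am crying."
counts = category_counts(sample)
```

In this invented sample, the first person singular and sadness categories receive hits while the plural and second person categories stay at zero, which is the kind of pattern the researchers describe.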
In sum, three different sets of texts were analyzed: on the one hand, texts about depression and suicide; on the other, texts about random topics. Comparing them yielded strong evidence of significant linguistic differences that are a sign of suicide risk.
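A comparison of this kind can be sketched as a rate of category words per 100 tokens in each group of texts. This is only an illustration under stated assumptions: the word list and both sets of example texts are invented, and the article does not say which statistical test the researchers used to establish significance, so none is shown.

```python
import re

# Hypothetical word list (assumption), standing in for one category.
FIRST_PERSON = {"i", "me", "my", "myself"}

def rate_per_100(texts, words):
    """Return occurrences of the given words per 100 tokens across all texts."""
    tokens = [t for text in texts for t in re.findall(r"[a-z']+", text.lower())]
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in words)
    return 100.0 * hits / len(tokens)

# Invented example texts, not data from the study.
risk_texts = ["I feel alone", "it has happened to me"]
control_texts = ["we went to the game", "the weather is nice today"]

risk_rate = rate_per_100(risk_texts, FIRST_PERSON)
control_rate = rate_per_100(control_texts, FIRST_PERSON)
```

Normalizing by the number of tokens keeps the comparison fair when the groups differ in size, which raw counts would not.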
The results of the project are unprecedented for the country and for Mexican Spanish. However, further research is needed to confirm and expand the data from this first approach to the phenomenon, in order to obtain conclusive evidence on the use of language for detecting cases at risk. To allow a more far-reaching analysis of language, a Netspeak dictionary (words and abbreviations used in Internet communication) was compiled, containing a variety of terms used in the blogosphere.
For the diagnosis of suicidal risk, this dictionary played an important role: these terms were frequent in the texts reviewed, and integrating them made it possible to evaluate the texts from a psychological perspective. Alongside this tool, a word counter was built based on the LIWC (Linguistic Inquiry and Word Count) program, which classifies words into a series of linguistic and psychological categories.
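One plausible way a Netspeak dictionary can feed a word counter is by expanding abbreviations before the words are classified. The sketch below assumes that approach; the article does not describe how the two tools were actually connected, and the entries shown are hypothetical English stand-ins rather than terms from the researchers' dictionary.

```python
import re

# Hypothetical Netspeak entries (assumption): English stand-ins only.
NETSPEAK = {
    "idk": "i do not know",
    "rn": "right now",
    "u": "you",
    "tbh": "to be honest",
}

def expand_netspeak(text):
    """Tokenize the text and replace known abbreviations with their full forms."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    expanded = []
    for tok in tokens:
        # Unknown tokens pass through unchanged; expansions may be several words.
        expanded.extend(NETSPEAK.get(tok, tok).split())
    return expanded

tokens = expand_netspeak("idk why im so sad rn")
```

After expansion, a category counter sees ordinary words such as "i" and "not" that would otherwise be hidden inside an abbreviation.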
The next step would be to develop software that performs this search in social networks continuously and automatically. Otherwise, a huge number of tweets and Facebook messages would have to be tracked and analyzed by hand, with the respective authorizations. The gradual development of these methods will lead to applications that help identify possible urgent cases requiring psychological care.
They will also help health professionals to design prevention programs based on clear and specific information about the thoughts and emotions that people experience. As a result of the first part of this research project, the scientific article "Suicide Risk Factors: A Language Analysis Approach in Social Networks" was written and published in the Journal of Language and Social Psychology.