Week 2: The nearest neighbours inherit all qualities from their gold1000 parent

Computational linguistics is the interdisciplinary field that deals with the analysis and modeling of natural language from the point of view of computation.

While its origins lie in Cold War urges and it still has much to do with the cybernetic dreams of an artificial intelligence, nowadays the field has many ramifications and many practical applications. Great progress has been made in linguistics and in language acquisition thanks to this discipline, but its outcomes also increasingly serve the needs of surveillance, economy and governance. These more mundane and materialistic sides generally deal with translating a multiplicity of written texts into comparable, processable data, and with profiling their authors.

Nevertheless, if we consider the historical ancestors of this discipline, where the wonder of logic and the magic of written language meet, this intertwining of knowledge and power is not new. If we could draw an imaginary line from the mathe-magical methods of the Kabbalah, through Ramon Llull’s rhetorical combinatorics, it would probably arrive at computational linguistics. On the periphery of this line, though, there are also many extraordinary examples of art, music, poetry and prose, attracted by the very same magic.

Starting somewhere near...

The computational linguists of the CLiPS research center at the University of Antwerp are working on various text analysis tools and datasets to address large corpora of natural language. Among their projects is research on recognizing gender- and age-based author profiles and on automatically detecting lies in written text, but they also release open-source libraries and tools for web mining and language processing. We think these tools and methodologies merit a closer look. This week could be a lovely occasion for that, so how can we crummify them? And what are the creative elements latent in those technologies? For example, if the average author, built out of thousands of different authors, were to write, what would the text look like? Is it possible to become a perfect liar by studying the lie-detection algorithms?

This week we want to look closely at, and experiment with, the non-pragmatic potential of tools that were designed as means of economy and surveillance. We will scry through the magic of code and language and try to find something other than quantification and determination.

How do we want to work? Some suggestions for what we might be busy with this week: