The Annotator
An example of how the process of annotation for one of the most popular features is usually documented: http://www.cs.cornell.edu/People/pabo/movie-review-data/
[[READ ME]]
[[annotation_source_list]]
[[annotator_scripts]]
Consequence of the annotation process
Roel: moment of discussion happens, overtly political moment.
Computational linguistics is removing the annotator
There is always a human involved
automation, extrapolation
Femke: annotator is middle person between systems of truth
systems of truth --> constructions
no construction before the moment of discovery
maybe the annotator is the negotiator?
making this process/figure visible/invisible
--> example: was it a human or a machine that wrote the text?
Cath: For me the annotator has the possibility to comment (so annotation is a conversational practice, rather than judgement)
Annotation adds richness, meta-layers.
The CLIPS version of annotation equals evaluation
Cristina: Interested in the point at which objectivity becomes subjective.
what is the wisdom of the crowd?
Manetta: Annotator becomes the author. What is the context? How? when? What classifiers?
Description of the classifying process, protocol. Annotation scheme.
Annotation protocols according to corpus in different languages
Annotation Scheme for Constructing Sentiment Corpus in Korean
http://www.aclweb.org/anthology/Y12-1019
or Creating an Annotated Corpus for Sentiment Analysis of German Product Reviews
http://www.ssoar.info/ssoar/handle/document/33939
Subjectivity and sentiment annotation of modern standard Arabic newswire
http://dl.acm.org/citation.cfm?id=2018979
Annotated Twitter Sentiment Dataset by Mechanical Turk workers
http://www.dai-labor.de/en/irml/datasets/annotierter_sentiment_datensatz/
Semantria applies Text and Sentiment Analysis to tweets, Facebook posts, surveys, reviews or enterprise content: https://semantria.com/
Projects are shaped by what the tool does. Develop an addition, a patch, a library
Sentiment is a function. Go through an annotation process ... of something more difficult than sentiment
--> showing the annotation of the annotation
--> insert it back into the pattern project
----> in order to reveal more than the outcome --> show the process
annotation as valuing a source
annotator as someone who deduces
an annotation process
need to think abt the source & classifiers
Task: To develop an additional function for pattern that would:
Allow the annotation process to become visible
That problematizes the annotation process as value judgment
That shows the consequence of the annotation process
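One way to picture the task above is a record that keeps every judgment next to the person who made it, instead of storing only the final score. The sketch below is a minimal illustration under our own assumptions: the `Annotation` class, its field names and the `summarize` helper are hypothetical and are not part of pattern.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Annotation:
    """One judgment by one annotator; all field names are hypothetical."""
    text: str          # the sentence being annotated
    source: str        # where the sentence was taken from
    annotator: str     # who made the judgment
    question: str      # e.g. "this statement is populist <-> elitist"
    value: int         # -1, 0 or +1
    comment: str = ""  # room for conversation, not only judgment


def summarize(annotations):
    """Return the averaged score together with the votes and comments behind it."""
    if not annotations:
        raise ValueError("no annotations to summarize")
    values = [a.value for a in annotations]
    votes = Counter(values)
    return {
        "score": sum(values) / len(values),
        "votes": dict(votes),            # how many annotators chose each value
        "unanimous": len(votes) == 1,    # False as soon as annotators disagree
        "annotators": sorted({a.annotator for a in annotations}),
        "comments": [(a.annotator, a.comment) for a in annotations if a.comment],
    }
```

Calling `summarize` on three annotations of the same sentence returns the usual single number, but also the split of votes and any comments, so a disagreement stays readable instead of disappearing into the average.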
Thinking from binaries
---------------------------------------
- what is the class consciousness of this text
- is this content harassing
- is this tweet racist -1 0 +1
- this opinion is nuanced vs simplistic / opinion-oriented
- this statement is populist <-> elitist
- this is common sense or a negation of it
- Richard Stallman is a misogynist
- This is very funny or not funny at all
- This is boring ... this is exciting
- This is criminal - this is legal
- Problematic - unproblematic
- I agree with this statement -- I do not agree with this statement
- This is likeable
- This is probable or unrealistic (or speculative)
- This is a good question
- This is an authentic statement (or maybe: this statement has integrity)
- This is an original statement
- This is a relevant statement
- this is a controversial statement
- How WIRED is this story (solutionism?)
- Equality ...
- This is spam - this is art
- Dystopic vs Utopia
What would work with Social Sciences on Gutenberg
- Is this progressive (when)
- Patronizing (paternalist?), Paternalism
"the system, principle, or practice of managing or governing individuals, businesses, nations, etc., in the manner of a father dealing benevolently and often intrusively with his children"
- Radical
Thinking through sources
---------------------------------------------
All questions on twitter / wikipedia / ...
Thread on Anita Sarkeesian / Comment threads
Thread on Zoe Quinn - Gamergate affair : http://dontdoitmag.co.uk/issue-7-good-worldsbad-worlds/problems-with-pixels/
NSA keywords http://attrition.org/misc/keywords.html
Something on Big Data?
The demise of Tom Preston-Werner (what hashtag?)
Charlie Hebdo, news site commenters
Wikipedia discussion pages / Iraq War discussion page (locked, disputed pages) / deleted pages
reddit, opinion spam
product reviews: books ... user submitted (was this helpful?)
(Historic) letter exchange ... (look for sth)
VS Naipaul finds no woman writer his literary match... : http://www.theguardian.com/books/2011/jun/02/vs-naipaul-jane-austen-women-writers
SPAM as source
Research paper abstracts
Famous quotes
http://www.gutenberg.org/wiki/Technology_%28Bookshelf%29
http://www.gutenberg.org/wiki/Sociology_%28Bookshelf%29
What would we not mind to annotate
-----------------------------------------------------------------
A topical source ... a story ... topical
Sentences, rather than long documents
Some potential for context markers (?)
Take author, time, location into account, number of followers
It should be social media / user generated content
- Twitter
- Comments
- Wikipedia discussions (hum, not social media?)
Roel: No twitter please
Modernist, universalist
- dystopic/utopic (...) paragraphs
--> the binaries are too 'drastic'?
--> progressive
What functions would we like to work on
-----------------------------------------------------------------
"from pattern.en import sentiment"
"from pattern.en import revolution"
"from pattern.en import utopia"
"from pattern.en import sensitivity"
"from pattern.en import nuance"
"from pattern.en import ambiguity"
"from pattern.en import relevance"
"from pattern.en import misogeny"
"from pattern.en import progressive"
"from pattern.en import racism"
"from pattern.en import frenchness"
"from pattern.en import americanism"
"from pattern.en import nationalism"
"from pattern.en import normality"
"from pattern.en import radical"
"from pattern.en import paternalist"
- --> would need a source that is constructed by multiple authors/sources
- --> maybe a function that speaks about what the source is about
- (rather than one that speaks about a way of speaking, like "nuanced")
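To make one of the imagined imports above concrete, here is a rough sketch of what a function like `paternalist` could return if it exposed its annotations along with its score. Everything in it is an assumption for illustration: the lexicon entries, scores, annotator names and notes are invented, and nothing here exists in pattern.en.

```python
import re

# Hypothetical hand-annotated lexicon: word -> (score, annotator, note).
# The entries are made up; a real one would come out of an annotation process.
LEXICON = {
    "should": (0.5, "annotator_a", "prescribes behaviour"),
    "protect": (0.5, "annotator_b", "benevolent but possibly intrusive"),
    "choose": (-0.5, "annotator_a", "leaves the decision to the reader"),
}


def paternalist(text, lexicon=LEXICON):
    """Return (score, trace): the averaged score and the annotations it rests on."""
    words = re.findall(r"[a-z']+", text.lower())
    # Each trace entry is (word, score, annotator, note) for a matched word.
    trace = [(w,) + lexicon[w] for w in words if w in lexicon]
    score = sum(entry[1] for entry in trace) / len(trace) if trace else 0.0
    return score, trace
```

For example, `paternalist("You should let us protect you.")` averages the two matched words and returns them in the trace with the annotator and note attached, so the value judgment behind the number can be read and contested.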
Valid
- - -
for later:
Annotation Guidelines for Compound Analysis (part of CLiPS)
http://www.clips.ua.ac.be/sites/default/files/techreport.aucopro.ctrsprotocols.1.0.5.bv_.2014-07-16.final_.pdf
"Gender quotient: 0.16666666666666666"