abstract concepts
Latent Semantic Analysis (LSA)
hypothesis: "words that appear in the same context may have the same meaning"
ex. The devaluation of the euro had begun
- The dollar is beginning to devaluate
-> feed in the entire Wikipedia collection
Advantage: you don't need annotated data
Disadvantage: the concept the words represent might not always be clear
--> how many clusters/contexts are needed?
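A minimal sketch of the LSA idea with scikit-learn (TfidfVectorizer + TruncatedSVD); the toy corpus and the number of latent dimensions are illustrative assumptions, a real run would use something like a Wikipedia dump:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD

    # Toy corpus; the two currency sentences share context words.
    docs = [
        "the devaluation of the euro had begun",
        "the dollar is beginning to devaluate",
        "the striker scored a late goal in the match",
    ]

    # Build a TF-IDF weighted term-document matrix.
    X = TfidfVectorizer().fit_transform(docs)

    # LSA = truncated SVD of that matrix; 2 latent "concepts" here.
    lsa = TruncatedSVD(n_components=2, random_state=0)
    doc_vecs = lsa.fit_transform(X)

    # Documents about the same concept end up close together in the latent space.
    print(doc_vecs)

Note that no labels are involved anywhere, which is the "no annotated data" advantage above.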
Deep Learning
based on neural networks
in the beginning of research: all nodes were fully connected between the vocabulary/input nodes and the hidden nodes
through training the weights of these connections are adjusted
-> each node in the hidden layer will have a % of activation
-> every word of the vocabulary is connected with a value from 0 to 1
= a way to encode 'world knowledge' into the vocabulary
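A small NumPy sketch of that picture (the vocabulary, layer sizes, and sigmoid activation are assumptions for illustration): a one-hot input word multiplied by the fully connected input-to-hidden weight matrix simply selects that word's row of weights, which is the vector later called its embedding.

    import numpy as np

    rng = np.random.default_rng(0)
    vocab = ["king", "queen", "man", "woman", "euro"]   # toy vocabulary
    V, H = len(vocab), 3                                # vocab size, hidden nodes

    W = rng.normal(size=(V, H))        # fully connected input -> hidden weights

    # One-hot encoding of the word "king".
    x = np.zeros(V)
    x[vocab.index("king")] = 1.0

    hidden = x @ W                          # = the row of W belonging to "king"
    activation = 1 / (1 + np.exp(-hidden))  # squashed to values between 0 and 1

    print(hidden)      # the word's weight vector (its embedding after training)
    print(activation)  # per-node "% of activation"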
queen ≈ king - man + woman
...
-> the resulting vectors are called embeddings
training on large corpora can take weeks
check: word2vec
-> embeddings are used for advertising/recommendation at Amazon/Facebook/Spotify
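A hedged sketch of how this looks with the gensim library (assuming gensim 4.x; the toy corpus is far too small to actually produce the queen analogy, which is why real setups train on huge corpora for a long time):

    from gensim.models import Word2Vec

    # Tokenised toy corpus; a real run would stream e.g. all of Wikipedia.
    sentences = [
        ["the", "king", "rules", "the", "country"],
        ["the", "queen", "rules", "the", "country"],
        ["the", "man", "walks", "with", "the", "woman"],
    ]

    # Train word2vec embeddings (skip-gram), 100-dimensional vectors.
    model = Word2Vec(sentences, vector_size=100, window=5,
                     min_count=1, sg=1, epochs=50)

    # The famous analogy query: king - man + woman ~ queen
    # (only meaningful after training on a large corpus).
    print(model.wv.most_similar(positive=["king", "woman"],
                                negative=["man"], topn=3))

The same kind of embeddings, computed over items or users instead of words, is what lies behind the recommendation/advertising use mentioned above.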