Latent Semantic Analysis hypothesis: "words that appear in the same contexts tend to have similar meanings"
ex. The devaluation of the euro had begun
The dollar is beginning to devaluate
-> feed in the entire Wikipedia collection
Advantage: you don't need annotated data. Disadvantage: the concept the words represent might not always be clear -> how many clusters/contexts are needed?
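The distributional hypothesis above can be sketched with a tiny co-occurrence count: words that share contexts (like "euro" and "dollar" in the devaluation examples) end up with similar context vectors. The mini-corpus and window size are hypothetical, just to make the idea concrete.

```python
from collections import Counter
from math import sqrt

# Hypothetical mini-corpus reusing the euro/dollar examples from the notes.
corpus = [
    "the devaluation of the euro had begun".split(),
    "the dollar is beginning to devaluate".split(),
    "the euro is beginning to devaluate".split(),
    "the cat sat on the mat".split(),
]

def context_vector(word, window=2):
    """Count words appearing within +/-window positions of `word`."""
    counts = Counter()
    for sent in corpus:
        for i, w in enumerate(sent):
            if w == word:
                for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                    if j != i:
                        counts[sent[j]] += 1
    return counts

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

euro, dollar, cat = (context_vector(w) for w in ("euro", "dollar", "cat"))
# The currency words share more contexts than "euro" and "cat" do.
print(cosine(euro, dollar) > cosine(euro, cat))  # → True
```

Full LSA would go further and factor the count matrix with SVD to get dense vectors, but the similarity signal already comes from shared contexts, with no annotation needed.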
Deep Learning, based on neural networks. In early research: all nodes are fully connected between the vocabulary/input nodes and the hidden nodes. Through training we adjust the weights of the connections -> each node in the hidden layer has a degree of activation -> every word of the vocabulary is connected with a value from 0 to 1 = a way to encode 'world knowledge' into the vocabulary
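A minimal sketch of that fully connected input-to-hidden layer (sizes and words are hypothetical): each vocabulary word is a one-hot input, and multiplying it by the weight matrix simply selects that word's row of weights — which is exactly the word's embedding.

```python
import random

random.seed(0)
vocab = ["king", "queen", "man", "woman"]  # hypothetical tiny vocabulary
vocab_size, hidden_size = len(vocab), 3

# Weight matrix of the fully connected layer:
# one row per vocabulary word, one column per hidden node.
W = [[random.uniform(0.0, 1.0) for _ in range(hidden_size)]
     for _ in range(vocab_size)]

def embed(word):
    """One-hot input times weight matrix: the hidden activations for `word`."""
    i = vocab.index(word)
    one_hot = [1.0 if j == i else 0.0 for j in range(vocab_size)]
    return [sum(one_hot[r] * W[r][c] for r in range(vocab_size))
            for c in range(hidden_size)]

# The matrix product just picks out the word's row of W.
print(embed("king") == W[vocab.index("king")])  # → True
```

Training nudges the rows of W so that words used in similar contexts end up with similar hidden activations.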
queen ≈ king - man + woman ... (vector arithmetic on the embeddings)
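The analogy arithmetic can be shown with hand-set 2-d vectors encoding (royalty, gender); real embeddings have hundreds of dimensions, so these numbers are purely illustrative.

```python
from math import sqrt

# Hypothetical 2-d embeddings: dimension 0 ~ royalty, dimension 1 ~ maleness.
vec = {
    "king":  [0.9, 0.9],
    "queen": [0.9, 0.1],
    "man":   [0.1, 0.9],
    "woman": [0.1, 0.1],
    "apple": [0.05, 0.5],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# king - man + woman, component-wise
target = [k - m + w for k, m, w in zip(vec["king"], vec["man"], vec["woman"])]

# Nearest word to the result, excluding the three query words.
best = max((w for w in vec if w not in ("king", "man", "woman")),
           key=lambda w: cosine(target, vec[w]))
print(best)  # → queen
```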
embeddings need some weeks of training -> check: word2vec, which is used for advertising at Amazon/Facebook/Spotify
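What word2vec's skip-gram variant actually trains on is (centre word, context word) pairs extracted with a sliding window; a sketch of that pair extraction, reusing the euro example sentence (window size is a hypothetical choice):

```python
# Skip-gram training pairs: each centre word predicts its window neighbours.
sentence = "the devaluation of the euro had begun".split()
window = 2  # hypothetical window size

pairs = [(sentence[i], sentence[j])
         for i in range(len(sentence))
         for j in range(max(0, i - window), min(len(sentence), i + window + 1))
         if i != j]

print(pairs[:2])  # → [('the', 'devaluation'), ('the', 'of')]
```

Training a network on millions of such pairs from a large corpus is what takes the weeks of compute mentioned above.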