# Python Assignment Help | AI, Ethics, and Society Homework Project #4

In this assignment, you’ll apply AI/ML algorithms related to two applications – word embeddings and
facial recognition.

Task Set #1: Here you will use distributional vectors trained using Google’s deep learning
Word2vec system.

1. Familiarize yourself with the original paper on word2vec – Mikolov et al. (2013)
(http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their
technologies.com/word2vec-tutorial/)

2. Install Gensim (Example: pip install gensim | pip install --upgrade gensim)

3. Download the provided reducedvector.bin file from Canvas, which is a pre-trained Word2vec model, and load it:

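from gensim.models import Word2Vec
import gensim.models
import nltk
# Load the pre-trained vectors; replace the placeholder with the path to your copy of the file
newmodel = gensim.models.KeyedVectors.load_word2vec_format(<path to reducedvector.bin>, binary=True)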

4. We can compute similarity measures associated with words within the model. For example, to find
different measures of similarity based on the data in the Word2vec model, we can use:

# Find the five nearest neighbors to the word man
newmodel.most_similar('man', topn=5)

# Compute a measure of similarity between woman and man
newmodel.similarity('woman', 'man')
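
The score returned by similarity is the cosine similarity between the two word vectors. A minimal sketch of that computation follows, using made-up 3-dimensional vectors so it runs without the model file. The `cosine_similarity` helper and the toy vectors are illustrative assumptions, not values from reducedvector.bin.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy vectors, for illustration only
toy_man = [1.0, 0.0, 1.0]
toy_woman = [1.0, 1.0, 1.0]

score = cosine_similarity(toy_man, toy_woman)
print(round(score, 4))
```

Scores range from -1 to 1, and a word compared with itself scores 1, which is worth keeping in mind when a target word also appears in the list being ranked.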

5. To complete analogies like man is to woman as king is to ??, we can use:
newmodel.most_similar(positive=['king', 'woman'], negative=['man'], topn=1)
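
Roughly, the analogy call adds the vectors of the positive words, subtracts those of the negative words, and returns the nearest remaining word by cosine similarity. The sketch below illustrates that arithmetic on a tiny hand-made vocabulary. The `analogy` helper and the toy vectors are assumptions made for illustration, and gensim's own implementation may differ in its details.

```python
import math

def unit(v):
    """Scale a vector to length 1."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def analogy(vectors, positive, negative, topn=1):
    """Rank words by cosine similarity to sum(positive) - sum(negative)."""
    dim = len(next(iter(vectors.values())))
    target = [0.0] * dim
    for word in positive:
        for i, x in enumerate(unit(vectors[word])):
            target[i] += x
    for word in negative:
        for i, x in enumerate(unit(vectors[word])):
            target[i] -= x
    target = unit(target)
    # The query words themselves are left out of the candidates
    exclude = set(positive) | set(negative)
    scored = [
        (word, sum(a * b for a, b in zip(unit(vec), target)))
        for word, vec in vectors.items()
        if word not in exclude
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]

# Hypothetical toy embedding: [royalty, maleness, femaleness]
toy_vectors = {
    "man": [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "king": [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "child": [0.0, 0.5, 0.5],
}

result = analogy(toy_vectors, positive=["king", "woman"], negative=["man"])
print(result)
```

Because the query words are excluded from the candidates, the call never returns king, woman, or man as the answer.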

Q1: We will use the target words – man and woman. Use the pre-trained word2vec model to rank the
following 15 words from the most similar to the least similar to each target word. For each word-target
word pair, provide the similarity score. Provide your results in table format.

wife
husband
child
queen
king
man
woman
birth
doctor
nurse
teacher
professor
engineer
scientist
president
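
One possible way to organize the ranking for Q1 is sketched below. The helpers `rank_by_similarity` and `format_table` are hypothetical names, and `toy_similarity` with its made-up scores stands in for the model so the sketch runs without reducedvector.bin. With the real model loaded, `newmodel.similarity` could be passed in its place.

```python
def rank_by_similarity(target, candidates, similarity):
    """Return (word, score) pairs sorted from most to least similar to target."""
    scored = [(word, similarity(word, target)) for word in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def format_table(target, ranked):
    """Render the ranking as a plain-text table."""
    lines = [f"{'Rank':<5}{'Word':<12}Similarity to '{target}'"]
    for rank, (word, score) in enumerate(ranked, start=1):
        lines.append(f"{rank:<5}{word:<12}{score:.4f}")
    return "\n".join(lines)

# Made-up scores for illustration only; not output from the real model
toy_scores = {"wife": 0.3, "husband": 0.5, "man": 1.0}

def toy_similarity(word, target):
    return toy_scores[word]

ranked = rank_by_similarity("man", list(toy_scores), toy_similarity)
print(format_table("man", ranked))
```

Running the same function once per target word yields the two rankings the question asks for.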

Q2: The Bigger Analogy Test Set (BATS) word analogy task has been one of the standard benchmarks
for word embeddings since 2013 (https://vecto.space/projects/BATS/). A) Select any file from the