# Sent2Vec

```
seqlearner.Sent2Vec(sequences, word_length, emb_dim, epochs, lr, wordNgrams, loss, neg, thread, t, dropoutK, bucket)
```

Sent2Vec embedding method. This class is a wrapper for the Sent2Vec embedding method, applied to a set of sequences.
It trains an embedding layer on the vocabulary to obtain embedding weights for each word, compressing each word into a vector of `emb_dim` dimensions with the `sent2vec_maker` method.
The `sent2vec_maker` method returns the embedding weights of the vocabulary. You can access the vocabulary via the `vocab` attribute.

### Arguments

- **sequences**: NumPy ndarray, list, or DataFrame, sequences of data such as protein sequences.
- **word_length**: Positive integer, the length of each word that the sequences are split into.
- **window_size**: Positive integer, size of the window for counting the number of neighbors.
- **emb_dim**: Positive integer, number of embedding vector dimensions.
- **epochs**: Positive integer, number of epochs for training the embedding.
- **lr**: Float, learning rate.
- **wordNgrams**: Positive integer, maximum length of word n-grams.
- **loss**: String, the loss function used during training; possible values are {"ns", "hs", "softmax"}.
- **neg**: Positive integer, number of negatives sampled.
- **thread**: Positive integer, number of threads.
- **t**: Float, sampling threshold.
- **dropoutK**: Positive integer, number of n-grams dropped when training a sent2vec model.
- **bucket**: Positive integer, number of hash buckets for the vocabulary.
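To make the `word_length` argument concrete, here is a minimal sketch of how a sequence might be split into fixed-length words. The helper `split_into_words` and its `overlapping` flag are hypothetical and written only for illustration; whether seqlearner itself uses overlapping or non-overlapping words is an assumption not confirmed by this page.

```python
def split_into_words(sequence, word_length, overlapping=False):
    """Split a sequence string into words of `word_length` characters.

    Hypothetical helper for illustration only; not part of seqlearner.
    Trailing characters that do not fill a whole word are dropped.
    """
    # Step by one character for overlapping words, by a full word otherwise.
    step = 1 if overlapping else word_length
    last_start = len(sequence) - word_length
    return [sequence[i:i + word_length] for i in range(0, last_start + 1, step)]


# Non-overlapping words of length 3 from a short protein fragment.
words = split_into_words("MKTAYIAKQR", word_length=3)
print(words)
```

With `word_length=3`, a 10-character sequence yields three non-overlapping words under this sketch, and the final leftover character is discarded.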

### sent2vec_maker

This is a method of the `Sent2Vec` class that you can use to embed your vocabulary.
It trains an embedding layer on the vocabulary to obtain embedding weights for each word, compressing each word into a vector of `emb_dim` dimensions.
This method accepts no arguments.

### Example: create embeddings for protein sequences

```
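import pandas as pd
from seqlearner import Sent2Vec

# Load the protein sequences (the CSV has no header row).
sequences = pd.read_csv("./protein_sequences.csv", header=None)
# Configure the wrapper; keyword names follow the class signature above.
sent2vec = Sent2Vec(sequences, word_length=3, emb_dim=25, epochs=100, lr=0.2, wordNgrams=5, loss="hs", neg=20, thread=10, t=0.0000005, dropoutK=2, bucket=4000000)
# Train and retrieve the embedding weights of the vocabulary.
encoding = sent2vec.sent2vec_maker()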
```