pip install -U sentence-transformers
Collecting sentence-transformers Downloading sentence-transformers-2.1.0.tar.gz (78 kB) |████████████████████████████████| 78 kB 3.1 MB/s Collecting transformers<5.0.0,>=4.6.0 Downloading transformers-4.11.2-py3-none-any.whl (2.9 MB) |████████████████████████████████| 2.9 MB 23.7 MB/s Collecting tokenizers>=0.10.3 Downloading tokenizers-0.10.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (3.3 MB) |████████████████████████████████| 3.3 MB 37.7 MB/s Requirement already satisfied: tqdm in /usr/local/lib/python3.7/dist-packages (from sentence-transformers) (4.62.3) Requirement already satisfied: torch>=1.6.0 in /usr/local/lib/python3.7/dist-packages (from sentence-transformers) (1.9.0+cu102) Requirement already satisfied: torchvision in /usr/local/lib/python3.7/dist-packages (from sentence-transformers) (0.10.0+cu102) Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from sentence-transformers) (1.19.5) Requirement already satisfied: scikit-learn in /usr/local/lib/python3.7/dist-packages (from sentence-transformers) (0.22.2.post1) Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from sentence-transformers) (1.4.1) Requirement already satisfied: nltk in /usr/local/lib/python3.7/dist-packages (from sentence-transformers) (3.2.5) Collecting sentencepiece Downloading sentencepiece-0.1.96-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB) |████████████████████████████████| 1.2 MB 46.3 MB/s Collecting huggingface-hub Downloading huggingface_hub-0.0.17-py3-none-any.whl (52 kB) |████████████████████████████████| 52 kB 1.7 MB/s Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from torch>=1.6.0->sentence-transformers) (3.7.4.3) Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.7/dist-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers) (2019.12.20) Collecting pyyaml>=5.1 Downloading PyYAML-5.4.1-cp37-cp37m-manylinux1_x86_64.whl (636 kB) |████████████████████████████████| 636 kB 45.4 MB/s Requirement already satisfied: filelock in /usr/local/lib/python3.7/dist-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers) (3.0.12) Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.7/dist-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers) (4.8.1) Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.7/dist-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers) (21.0) Collecting sacremoses Downloading sacremoses-0.0.46-py3-none-any.whl (895 kB) |████████████████████████████████| 895 kB 43.5 MB/s Requirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from transformers<5.0.0,>=4.6.0->sentence-transformers) (2.23.0) Requirement already satisfied: pyparsing>=2.0.2 in /usr/local/lib/python3.7/dist-packages (from packaging>=20.0->transformers<5.0.0,>=4.6.0->sentence-transformers) (2.4.7) Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata->transformers<5.0.0,>=4.6.0->sentence-transformers) (3.5.0) Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages (from nltk->sentence-transformers) (1.15.0) Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests->transformers<5.0.0,>=4.6.0->sentence-transformers) (1.24.3) Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->transformers<5.0.0,>=4.6.0->sentence-transformers) (2.10) Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests->transformers<5.0.0,>=4.6.0->sentence-transformers) (2021.5.30) Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->transformers<5.0.0,>=4.6.0->sentence-transformers) (3.0.4) Requirement already satisfied: joblib in /usr/local/lib/python3.7/dist-packages (from sacremoses->transformers<5.0.0,>=4.6.0->sentence-transformers) (1.0.1) Requirement already satisfied: click in /usr/local/lib/python3.7/dist-packages (from sacremoses->transformers<5.0.0,>=4.6.0->sentence-transformers) (7.1.2) Requirement already satisfied: pillow>=5.3.0 in /usr/local/lib/python3.7/dist-packages (from torchvision->sentence-transformers) (7.1.2) Building wheels for collected packages: sentence-transformers Building wheel for sentence-transformers (setup.py) ... done Created wheel for sentence-transformers: filename=sentence_transformers-2.1.0-py3-none-any.whl size=121000 sha256=4aac30d88f9fa91593c3755eab4eec62aa8081cd4dba84d9808c26cd2e05ee89 Stored in directory: /root/.cache/pip/wheels/90/f0/bb/ed1add84da70092ea526466eadc2bfb197c4bcb8d4fa5f7bad Successfully built sentence-transformers Installing collected packages: tokenizers, sacremoses, pyyaml, huggingface-hub, transformers, sentencepiece, sentence-transformers Attempting uninstall: pyyaml Found existing installation: PyYAML 3.13 Uninstalling PyYAML-3.13: Successfully uninstalled PyYAML-3.13 Successfully installed huggingface-hub-0.0.17 pyyaml-5.4.1 sacremoses-0.0.46 sentence-transformers-2.1.0 sentencepiece-0.1.96 tokenizers-0.10.3 transformers-4.11.2
import scipy
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('distilbert-base-nli-mean-tokens')
sentences = ['This framework generates embeddings for each input sentence',
    'Sentences are passed as a list of string.', 
    'The quick brown fox jumps over the lazy dog.']
sentence_embeddings = model.encode(sentences)

print("Sentence embeddings:")
print(sentence_embeddings.shape)
Downloading:   0%|          | 0.00/690 [00:00<?, ?B/s]
Downloading:   0%|          | 0.00/3.99k [00:00<?, ?B/s]
Downloading:   0%|          | 0.00/550 [00:00<?, ?B/s]
Downloading:   0%|          | 0.00/122 [00:00<?, ?B/s]
Downloading:   0%|          | 0.00/229 [00:00<?, ?B/s]
Downloading:   0%|          | 0.00/265M [00:00<?, ?B/s]
Downloading:   0%|          | 0.00/53.0 [00:00<?, ?B/s]
Downloading:   0%|          | 0.00/112 [00:00<?, ?B/s]
Downloading:   0%|          | 0.00/466k [00:00<?, ?B/s]
Downloading:   0%|          | 0.00/450 [00:00<?, ?B/s]
Downloading:   0%|          | 0.00/232k [00:00<?, ?B/s]
Downloading:   0%|          | 0.00/190 [00:00<?, ?B/s]
Sentence embeddings: (3, 768)
/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py:2227: UserWarning: `max_length` is ignored when `padding`=`True`. warnings.warn("`max_length` is ignored when `padding`=`True`.")
import pandas as pd
import numpy as np
data=pd.read_csv('/content/drugsComTrain_raw.csv')
/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py:2718: DtypeWarning: Columns (1,2,3,5) have mixed types.Specify dtype option on import or set low_memory=False. interactivity=interactivity, compiler=compiler, result=result)
data=data.drop_duplicates()