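I have already trained a model and saved it as a bin file. Now I want to use that bin file directly to get sentence vectors for the sentences in another file.
The code is as follows: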
import gensim, logging
import os
import sys
logging.basicConfig(format = '%(asctime)s : %(levelname)s : %(message)s', level = logging.INFO)
sentences = gensim.models.doc2vec.TaggedLineDocument('F:\\jj\\gj.txt')  # one sentence per line
model = gensim.models.Doc2Vec.load_word2vec_format('F:\\jj\\go.bin',binary=True)  # the call that fails
out = open('F:\\jj\\vector.txt', 'w')
for idx, docvec in enumerate(model.docvecs):
    # write one vector per line, values separated by spaces
    for value in docvec:
        out.write(str(value)+' ')
    out.write('\n')
out.close()
But it fails with the following error:
2016-08-24 21:46:38,813 : INFO : loading projection weights from F:\jj\go.bin
Traceback (most recent call last):
File "E:\java与python视频\python学习\练习\7.py", line 6, in <module>
model = gensim.models.Doc2Vec.load_word2vec_format('F:\\jj\\go.bin',binary=True)
File "D:\Users\GJ\AppData\Local\Programs\Python\Python35\lib\site-packages\gensim\models\word2vec.py", line 1085, in load_word2vec_format
header = utils.to_unicode(fin.readline(), encoding=encoding)
File "D:\Users\GJ\AppData\Local\Programs\Python\Python35\lib\site-packages\gensim\utils.py", line 217, in any2unicode
return unicode(text, encoding, errors=errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfa in position 2: invalid start byte
Could anyone tell me how to fix this?
"Trained a file"...
That bin file is the model I have already trained. I am using Python 3.5, so why do I get the error above?
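The traceback shows the loader failing while it decodes the first line of go.bin as UTF-8. A file in word2vec format starts with a plain-text header line holding the vocabulary size and the vector size, so a failure there suggests go.bin may not be in that format. Below is a minimal sketch, using only the standard library, of how one could check this before calling gensim. The helper names `read_word2vec_header` and `write_temp_file` are made up for illustration, and the two sample files are synthetic stand-ins, not the real go.bin.

```python
import os
import tempfile


def read_word2vec_header(path):
    """Return (vocab_size, vector_size) if the file starts with a
    word2vec-style text header, otherwise None."""
    with open(path, 'rb') as fin:
        first_line = fin.readline()
    try:
        # The header is expected to be two integers separated by a space.
        fields = first_line.decode('utf-8').split()
        if len(fields) != 2:
            return None
        return int(fields[0]), int(fields[1])
    except (UnicodeDecodeError, ValueError):
        # Not UTF-8 text, or not two integers: not a word2vec header.
        return None


def write_temp_file(data):
    """Hypothetical helper: write bytes to a temporary file, return its path."""
    handle, path = tempfile.mkstemp(suffix='.bin')
    with os.fdopen(handle, 'wb') as out:
        out.write(data)
    return path


# Synthetic example files (assumptions for the demo, not real models).
good_path = write_temp_file(b'3 4\n' + b'\x00' * 16)
bad_path = write_temp_file(b'ab\xfa\x01\x02\n')  # byte 0xfa at position 2

print(read_word2vec_header(good_path))  # (3, 4)
print(read_word2vec_header(bad_path))   # None

os.remove(good_path)
os.remove(bad_path)
```

If the check returns None for go.bin, the file was probably written some other way. For example, if it was produced by gensim's own `model.save()`, the matching loader would be `gensim.models.Doc2Vec.load()`, and vectors for new sentences could then come from `model.infer_vector()`. That is only a guess, since the post does not say how the file was created.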