Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

0 votes, 553 views
in Technique by (71.8m points)

python - How to load a fine tuned model from BertForSequenceClassification and use it to tokenize a sentence?

I'm following this tutorial (https://mccormickml.com/2019/07/22/BERT-fine-tuning/#a1-saving--loading-fine-tuned-model) to fine-tune a BertForSequenceClassification. After training the model, I want to load it and write a function classify_sentence(sentence) that takes a sentence and returns a vector of logits.

def classify_sentence(self, sentence):

    self.model = BertForSequenceClassification.from_pretrained(output_dir)
    self.tokenizer = BertTokenizer.from_pretrained(output_dir)

    encoded_dict = self.tokenizer.encode_plus(
                sentence,                       # Sentence to encode.
                add_special_tokens = True,      # Add '[CLS]' and '[SEP]'.
                max_length = 64,                # Pad & truncate all sentences.
                pad_to_max_length = True,
                return_attention_mask = True,   # Construct attention masks.
                return_tensors = 'pt',          # Return PyTorch tensors.
    )

    # With return_tensors='pt', these are already (1, 64) tensors for a
    # single sentence, so no torch.cat over a batch is needed here.
    input_id = encoded_dict['input_ids']
    # The attention mask simply differentiates padding from non-padding.
    attention_mask = encoded_dict['attention_mask']

    with torch.no_grad():
        output = self.model(input_id,
                            token_type_ids=None,
                            attention_mask=attention_mask)

    logits = output[0]

    return logits

output_dir is a directory that contains these files: config.json, pytorch_model.bin, special_tokens_map.json, tokenizer_config.json and vocab.txt.

When I run this function, I get an error:

AttributeError: 'BertTokenizer' object has no attribute 'encode_plus'

However, I used this same method to encode sentences during training. Is there an alternative way to tokenize a sentence after loading a trained BERT model?
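On recent `transformers` releases, `encode_plus` was folded into the tokenizer's `__call__`, so calling the tokenizer object directly accepts the same arguments and returns the same dictionary. A minimal sketch of that call (the throwaway vocab file is only there so the snippet runs without a saved model; with your fine-tuned model you would keep `BertTokenizer.from_pretrained(output_dir)` instead):

```python
import os
import tempfile
from transformers import BertTokenizer

# Throwaway vocab so this runs standalone; replace with
# BertTokenizer.from_pretrained(output_dir) for a real model directory.
vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "hello", "world"]
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("\n".join(vocab))
    vocab_file = f.name

tokenizer = BertTokenizer(vocab_file, do_lower_case=True)

# Calling the tokenizer directly is the modern replacement for encode_plus();
# padding="max_length" + truncation=True replace pad_to_max_length=True.
encoded = tokenizer(
    "hello world",
    add_special_tokens=True,
    max_length=8,
    padding="max_length",
    truncation=True,
    return_attention_mask=True,
    return_tensors="pt",
)

os.unlink(vocab_file)
```

The returned dict has the same `input_ids` and `attention_mask` keys, so it should drop into the existing classify_sentence body unchanged.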

question from:https://stackoverflow.com/questions/65846926/how-to-load-a-fine-tuned-model-from-bertforsequenceclassification-and-use-it-to


1 Reply

0 votes
by (71.8m points)
Waiting for answers
