A trained TfidfVectorizer can be reused in several ways without calling pickle directly:
Save the trained model and vectorizer as separate files: You can save the fitted TfidfVectorizer and the trained machine learning model as separate files in one directory. Each can then be loaded on its own in other projects, for example to reuse the vectorizer with a different model.
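Export the fitted vectorizer to JSON: Another way to reuse the trained TfidfVectorizer is to export its learned state as JSON. scikit-learn does not ship a JSON exporter (there is no json_utils module), so you write the fitted attributes yourself, namely the vocabulary_ mapping and the idf_ weights, and rebuild the vectorizer from them later. The constructor parameters (ngram_range, stop_words, and so on) are not part of that state and have to be recreated with the same values.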
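As a rough sketch of the JSON route, the snippet below writes vocabulary_ and idf_ to a file and rebuilds an equivalent vectorizer from it. The sample documents, the tfidf.json file name, and the layout of the JSON dictionary are assumptions made for illustration; the restore step relies on the vocabulary constructor argument and on assigning idf_, and it assumes the default constructor settings are used on both sides.

```python
import json
import os
import tempfile

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical training documents, used only for illustration.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats are pets",
]

vectorizer = TfidfVectorizer()
vectorizer.fit(docs)

# Collect the fitted state in JSON-friendly types.
state = {
    "vocabulary": {term: int(idx) for term, idx in vectorizer.vocabulary_.items()},
    "idf": vectorizer.idf_.tolist(),
}

# Assumed location: a temporary directory, so the sketch leaves no files behind.
json_path = os.path.join(tempfile.mkdtemp(), "tfidf.json")
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(state, f)

# Later, possibly in another project: rebuild the vectorizer from the JSON file.
with open(json_path, encoding="utf-8") as f:
    loaded = json.load(f)

restored = TfidfVectorizer(vocabulary=loaded["vocabulary"])
restored.idf_ = np.array(loaded["idf"])

# The restored vectorizer should produce the same matrix as the original one.
original_matrix = vectorizer.transform(docs).toarray()
restored_matrix = restored.transform(docs).toarray()
```

If the original vectorizer was created with non-default parameters, the same parameters would need to be passed when constructing the restored one, since they affect tokenization and weighting.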
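Use joblib or dill: Instead of calling pickle directly, you can save and load the trained TfidfVectorizer and machine learning model with joblib or dill. joblib handles objects that carry large NumPy arrays more efficiently than plain pickle, and dill can serialize more kinds of Python objects, such as lambdas. Both are built on pickle, so files should only be loaded from a trusted source and with compatible library versions.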
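A minimal sketch of the joblib variant, which also shows the vectorizer and the model stored as two separate files. The toy texts and labels, the choice of LogisticRegression, and the file names are assumptions for illustration only.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data, used only for illustration.
texts = ["good movie", "great film", "bad movie", "terrible film"]
labels = [1, 1, 0, 0]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)
clf = LogisticRegression().fit(X, labels)

# Save the vectorizer and the classifier as two separate files.
out_dir = tempfile.mkdtemp()
vec_path = os.path.join(out_dir, "vectorizer.joblib")
clf_path = os.path.join(out_dir, "classifier.joblib")
joblib.dump(vectorizer, vec_path)
joblib.dump(clf, clf_path)

# Later, possibly in another project: load both and predict on new text.
loaded_vectorizer = joblib.load(vec_path)
loaded_clf = joblib.load(clf_path)
reloaded_predictions = loaded_clf.predict(loaded_vectorizer.transform(texts))
```

The loading side needs scikit-learn installed in a version compatible with the one used for saving.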
Define the TfidfVectorizer and machine learning model in functions: If you wrap the construction of the TfidfVectorizer and the machine learning model in functions, you can reuse the same configuration in other projects simply by importing them. Nothing has to be saved to disk, but importing a function only brings the configuration with it, not the learned vocabulary or weights, so the model has to be fitted again on the training data.
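One way to read the function-based option is a small factory that returns an unfitted pipeline, which each project then fits on its own data. The function name build_text_classifier, the chosen parameters, and the toy data are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def build_text_classifier():
    """Return an unfitted TF-IDF + classifier pipeline with a fixed configuration."""
    return make_pipeline(
        TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000),
    )


# Hypothetical toy data; a real project would supply its own training set.
train_texts = ["good movie", "great film", "bad movie", "terrible film"]
train_labels = [1, 1, 0, 0]

# The factory only carries the configuration, so the pipeline must be fitted again.
model = build_text_classifier()
model.fit(train_texts, train_labels)
predictions = model.predict(["a great movie"])
```

In another project the function would be imported from a shared module instead of being redefined, and the fit step would run on that project's data.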