big new feature: ability to inspect any paper and see the raw tfidf tokens, with their weights, that summarize the paper and power the SVM recommendation engine. basically a debugging / insight feature, but also a really good sanity check that papers are being properly represented

Andrej Karpathy
2021-11-21 20:51:01 -08:00
parent e5798ddb2f
commit cf1bef6f53
6 changed files with 161 additions and 13 deletions
@@ -49,5 +49,7 @@ if __name__ == '__main__':
    features = {
        'pids': list(pdb.keys()),
        'x': x,
        'vocab': v.vocabulary_,
        'idf': v._tfidf.idf_,
    }
    pickle.dump(features, open('features.p', 'wb' ))
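
A minimal sketch of how the newly saved `vocab` and `idf` entries might be used to inspect one paper's top tokens. The helper name `top_tfidf_tokens` and the toy data are hypothetical and not part of this commit. It assumes `vocab` maps token to column index (as `vocabulary_` does) and that one paper's nonzero entries are available as parallel index and weight sequences, e.g. `row.indices` and `row.data` for a scipy CSR row of `x`.

```python
def top_tfidf_tokens(indices, weights, vocab, idf, k=10):
    """Return the k highest-weighted (token, weight, idf) triples for one paper.

    indices, weights: column indices and tfidf values of the paper's nonzero
        entries (assumed to come from one sparse row of features['x']).
    vocab: dict mapping token -> column index.
    idf: per-column idf values, indexable by column index.
    """
    # invert the vocabulary so column indices map back to tokens
    index_to_token = {int(col): tok for tok, col in vocab.items()}
    rows = [
        (index_to_token[int(col)], float(w), float(idf[int(col)]))
        for col, w in zip(indices, weights)
    ]
    # heaviest tokens first
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows[:k]


# toy data standing in for one paper (hypothetical values)
vocab = {'neural': 0, 'network': 1, 'the': 2}
idf = [2.5, 2.0, 1.0]
example = top_tfidf_tokens([0, 2, 1], [0.8, 0.1, 0.6], vocab, idf, k=2)
for token, weight, token_idf in example:
    print(f"{token:10s} weight={weight:.2f} idf={token_idf:.2f}")
```

Sorting by weight puts the most characteristic tokens first, which is the quick sanity check the commit message describes: a paper dominated by stopword-like tokens would suggest a problem with the representation.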