aeb7ecf96a
Change DB access flag to create/write to DB
Heiko Joerg Schick
2023-08-29 23:23:44 +02:00
d7a303b410
add thumbnails for papers, which apparently ppl like
master
Andrej Karpathy
2022-02-13 18:30:14 -08:00
f980c7947a
link to arxiv-sanity-lite instead of directly to arxiv
Andrej Karpathy
2022-01-22 16:36:11 -08:00
48a7e01aa2
we will only send emails to serious users
Andrej Karpathy
2022-01-22 16:33:13 -08:00
23b0e109bf
fix bug in script due to schema change of pids variable earlier
Andrej Karpathy
2021-12-19 11:50:50 -08:00
759f7e73e6
fix bug in pagination, clean up the approach a bit more
Andrej Karpathy
2021-12-18 23:25:45 -08:00
c3cb157c9f
first version of pagination w00t w00t! it's a bit hacky i think, should be possible to improve this code and make it smaller and cleaner and etc.
Andrej Karpathy
2021-12-18 21:14:44 -08:00
4210a73d15
remove recommendation by default on / endpoint. this was causing some unintended confusion, e.g. the search function was using time_filter 7 and giving bad results from the main page. also getting recommendations takes time and is slowish, so only do it on actual demand
Andrej Karpathy
2021-12-18 20:21:04 -08:00
0f21e69125
Merge branch 'ajdintrejic-master'
Andrej Karpathy
2021-12-18 15:06:36 -08:00
6e85778b24
change the way we recommend papers. we'll do it per tag instead of just mixing everything together. i believe this should produce higher quality results
Andrej Karpathy
2021-12-18 14:53:57 -08:00
7a438d9c46
Made site a bit more mobile friendly
Ajdin Trejic
2021-12-18 23:30:40 +01:00
65d4bb8415
some sleep is good for you
Andrej Karpathy
2021-12-10 20:28:34 -08:00
cd2a88f375
when a user is logged in and hits the default endpoint '/' show recommendations over the last week instead of just totally random recent papers, by default
Andrej Karpathy
2021-12-10 09:15:29 -08:00
33b2b018ab
kind of big changes here: we can now inspect & see the most positive and negative words for a trained svm, to help tune the value C. then there is also the UI for setting value C in the SVM. Finally the value of C I adjusted to default to 0.01 (was 0.1 before) because the results and the weights look more sensible based on manual inspection. We need some dataset of people libraries in order to potentially cross-validate a good value C automatically. For now there are not enough active users of the site that such a thing could be attempted and succeed. Doing my best now just by eyeballing
Andrej Karpathy
2021-12-08 00:17:37 -08:00
6be574a0fe
hah don't crash if a user submits an invalid query, better to return empty results. in the future we may want a helpful error message too
Andrej Karpathy
2021-12-07 22:56:09 -08:00
4ec9a75293
add a secret GET argument svm_c that changes the C value in the SVM. no UI for this yet :)
Andrej Karpathy
2021-12-07 22:52:13 -08:00
6308e0cfa4
list out the tags i think
Andrej Karpathy
2021-12-07 22:33:50 -08:00
28163bfcb7
adding the user account into the email, ty @Randl for the suggestion. I think this should be ok to include in plain text here because it is an email direct to the user
Andrej Karpathy
2021-12-07 22:27:14 -08:00
1153a70563
print the total number of emails sent
Andrej Karpathy
2021-12-04 00:39:55 -08:00
6df5f1d178
add requirements.txt ty @Prakyathkantharaju and @e-tornike for help
Andrej Karpathy
2021-12-02 16:08:05 -08:00
31e02f267c
handle the edge case of zero positives... sigh
Andrej Karpathy
2021-11-30 23:34:47 -08:00
fe96775e6e
fix issues and improve logic of removing tags
Andrej Karpathy
2021-11-30 22:57:59 -08:00
9c00c8dacd
disallow the special tag null, which can result if user escapes the prompt
Andrej Karpathy
2021-11-30 22:51:16 -08:00
a01c637cc8
tweak email sending
Andrej Karpathy
2021-11-29 20:45:24 -08:00
d5b91270a9
allow to use fewer documents for training tfidf features to prevent OOMs
Andrej Karpathy
2021-11-29 15:38:36 -08:00
e182dda381
make send_emails into a proper script and add a few more widgets
Andrej Karpathy
2021-11-29 09:29:59 -08:00
6d7f648384
visually distinguish the special 'all' tag
Andrej Karpathy
2021-11-28 08:23:51 -08:00
8947dd3fc5
tweak search scoring for better results, i think
Andrej Karpathy
2021-11-27 18:09:17 -08:00
aff7c89695
use the older arxiv-sanity colorscheme for header
Andrej Karpathy
2021-11-27 17:15:40 -08:00
a83dbb2138
it doesn't align, the ocd is killing me
Andrej Karpathy
2021-11-27 14:53:14 -08:00
5990a938a4
add few more stats
Andrej Karpathy
2021-11-27 14:47:28 -08:00
1769b7b02f
minor profile page html/css cleanup
Andrej Karpathy
2021-11-27 14:31:18 -08:00
d236a5dff9
small tweaks to not do useless work in special cases
Andrej Karpathy
2021-11-27 13:06:00 -08:00
a169f0fa61
adding an about page as well
Andrej Karpathy
2021-11-27 11:47:17 -08:00
6c87055349
slight readme tweak
Andrej Karpathy
2021-11-27 11:41:04 -08:00
6d4d1eb78d
tweak readme
Andrej Karpathy
2021-11-27 11:38:31 -08:00
6685b0a05f
i think this detail matters...
Andrej Karpathy
2021-11-27 11:23:35 -08:00
5eece3f992
add email sending script to repo
Andrej Karpathy
2021-11-27 11:19:55 -08:00
49995465df
add ability to store an email for a user for recommendations
Andrej Karpathy
2021-11-27 11:04:36 -08:00
970a9e24cf
maintain last active state for users
Andrej Karpathy
2021-11-27 09:37:02 -08:00
47fba66712
add a new stats page
Andrej Karpathy
2021-11-26 21:57:48 -08:00
6a0ff6a9bf
clarify even further
Andrej Karpathy
2021-11-26 21:10:22 -08:00
b52b64e0f7
don't show the +/- controls for tags when a person is not logged in, it's confusing
Andrej Karpathy
2021-11-26 21:09:10 -08:00
e4fe77d118
show user warning if they are not logged in that things won't work
Andrej Karpathy
2021-11-26 20:57:20 -08:00
c3ce3e3f70
update the screenshot since the interface changed quite a bit
Andrej Karpathy
2021-11-26 20:33:10 -08:00
4538eedb84
update the readme
Andrej Karpathy
2021-11-26 20:31:57 -08:00
9c88a4eea4
use the process exit code to communicate whether any updates successfully made it into the database at all
Andrej Karpathy
2021-11-26 20:19:48 -08:00
d2582c5b77
if the very first API call fails it is very unlikely that there are new papers at all, no need to try multiple pages
Andrej Karpathy
2021-11-26 20:10:32 -08:00
aa877c9397
when writing features do it safely and atomically
Andrej Karpathy
2021-11-26 20:00:37 -08:00
c3161b2a49
do not reveal username since they are kind of secret now
Andrej Karpathy
2021-11-26 17:11:19 -08:00
5b5b503a93
let things breathe a bit more
Andrej Karpathy
2021-11-26 16:44:21 -08:00
c5b5c7833c
and i think that's it, we now support user accounts (lite)! sweet.
Andrej Karpathy
2021-11-26 16:38:36 -08:00
54157c6e9c
ok here we go, the beginning of user profiles. for now we assume there is a single hard-coded user root
Andrej Karpathy
2021-11-26 11:07:16 -08:00
dc55146e03
few notes on some outstanding todos
Andrej Karpathy
2021-11-26 10:28:49 -08:00
3e2d1248d8
speed up time filtering by 20X by loading all of metas into memory once instead of querying it item by item in a massive loop
Andrej Karpathy
2021-11-26 10:22:52 -08:00
3d5bbb9851
optimization: read only metas if that's all that we need to complete the request. that said, i expected this to run much faster and it doesn't. i think the way i'm using sqlites here is not exactly efficient or appropriate and i have to re-think the database schema. e.g. getting random papers over the last 7 days should be almost instantaneous, but currently this request takes like 3 seconds. have to better understand why this is happening
Andrej Karpathy
2021-11-26 09:51:48 -08:00
b283c85c72
this was bothering me
Andrej Karpathy
2021-11-26 09:36:09 -08:00
8c74150337
eliminate a whole spurious endpoint of /search and remove code duplication
Andrej Karpathy
2021-11-26 09:31:06 -08:00
470bd00563
refactor and clean up the code in preparation for some consolidation, coming up
Andrej Karpathy
2021-11-26 09:10:35 -08:00
f565eba1c7
use jinja templates so there's no copy paste code all over the place
Andrej Karpathy
2021-11-25 14:45:08 -08:00
46f556caf6
example makefile
Andrej Karpathy
2021-11-25 13:51:52 -08:00
1ed6e3f1b0
ok now we can sequester all the database files into data/ folder so everything is nice and clean yay
Andrej Karpathy
2021-11-25 13:47:45 -08:00
77279e1777
sequester all file system IO ops only to db.py, so it's not total chaos
Andrej Karpathy
2021-11-25 13:28:04 -08:00
bd11865a49
add an option to break out early when we've pulled in all new papers most likely
Andrej Karpathy
2021-11-24 09:15:36 -08:00
cf1bef6f53
big new feature: ability to inspect any paper to see the raw tfidf tokens and their weights that summarize the paper, and which powers the SVM recommendation engine. basically a bit of a debugging / insight feature, but a really good sanity check that papers are being properly represented
Andrej Karpathy
2021-11-21 20:51:01 -08:00
e5798ddb2f
make the header link to root of the site as the case should always be ok cool
Andrej Karpathy
2021-11-21 14:03:17 -08:00
548ee210df
better default parameters, based on qualitative inspection of tfidf features and word vectors
Andrej Karpathy
2021-11-21 13:46:14 -08:00
8be24dae43
allow for full sorting, slicing and dicing with all options in the UI, and the previous shortcuts. now have a lot more flexibility to slice and dice as desired
Andrej Karpathy
2021-11-16 21:39:43 -08:00
abb83effed
add ability to remove a tag from a paper as well
Andrej Karpathy
2021-11-14 21:49:34 -08:00
7cbb90a480
minor logging changes
Andrej Karpathy
2021-11-12 22:49:29 -08:00
158ced647b
add a screenshot and rearrange the readme a bit
Andrej Karpathy
2021-11-12 21:36:45 -08:00
f02766be2c
oops need to open for writing if changing the tags dict
Andrej Karpathy
2021-11-12 21:23:22 -08:00
13a1d5ff48
sequester gross details about database instantiation in the filesystem away from the scripts
Andrej Karpathy
2021-11-12 21:12:09 -08:00
194b7f4b22
first leet codes
Andrej Karpathy
2021-11-12 20:40:19 -08:00
089adab199
Initial commit
Andrej
2021-11-12 20:34:22 -08:00