1.
Papers with Datasets
@paperswithdata
The Stack: a dataset for pre-training Code LLMs. It contains 3TB of permissively-licensed code in 30 programming l… twitter.com/i/web/status/1…
28 Oct 22
2.
Papers with Datasets
@paperswithdata
MTEB: a benchmark suite spanning 8 embedding tasks with 56 datasets and 112 languages.
Tasks include classificati… twitter.com/i/web/status/1…
18 Oct 22
7.
Papers with Datasets
@paperswithdata
ImDrug: a benchmark that consists of 4 imbalance settings, 11 datasets, 54 tasks and 16 baseline algorithms tailor… twitter.com/i/web/status/1…
05 Oct 22
Papers with Datasets
@paperswithdata
Multimodal Lecture Presentations (MLP) is a new large-scale benchmark dataset for testing the capabilities of M… twitter.com/i/web/status/1…
14 Sep 22
Papers with Datasets
@paperswithdata
Simulacra Aesthetic Captions: a dataset consisting of 238,000 synthetic AI generated images and 40K user submitte… twitter.com/i/web/status/1…
11 Jul 22
Papers with Datasets
@paperswithdata
ArtBench-10: a high-quality dataset for benchmarking artwork generation. It comprises 60,000 images of artwork fr… twitter.com/i/web/status/1…
30 Jun 22
Papers with Datasets
@paperswithdata
ToxiGen - a large-scale and machine-generated dataset of 274,186 toxic and benign statements for adversarial and i… twitter.com/i/web/status/1…
01 Jun 22
11.
Papers with Datasets
@paperswithdata
Top Trending ML Papers of the Month
Here is a thread to catchup on the top 10 trending papers of May on… twitter.com/i/web/status/1…
Retweet of status by @paperswithcode
31 May 22
Papers with Datasets
@paperswithdata
Vision-and-Language models are trending!
In this week’s newsletter: we highlight progress in vision-and-language… twitter.com/i/web/status/1…
Retweet of status by @paperswithcode
30 Mar 22
Papers with Datasets
@paperswithdata
BigDetection is a new large-scale benchmark to build more general and powerful object detection systems.
It leve… twitter.com/i/web/status/1…
25 Mar 22
Papers with Datasets
@paperswithdata
V2X-Sim: a synthetic collaborative perception dataset in autonomous driving to facilitate collaborative perception… twitter.com/i/web/status/1…
21 Feb 22