@MoAlQuraishi
Mohammed AlQuraishi
6 years
Excited to release a preliminary version of ProteinNet, a data set for doing machine learning on protein structure. The aim is to lower the barrier to entry for protein folding and spur more ML researchers to tackle the problem. More here: (1/3)

Replies

@MoAlQuraishi
Mohammed AlQuraishi
6 years
Protein structure can become a source of innovation for basic ML research. The data are clean with minimal noise, which is unusual for bio. It is a small-data regime that will stay that way for a long time ($50k+ per structure), and the complex inputs and outputs are helpful for pushing ML algos. (2/3)
@MoAlQuraishi
Mohammed AlQuraishi
6 years
I hope that a standardized data set will popularize the problem in traditional ML/CS circles. I think ProteinNet gets a lot of things right, but I'm open to suggestions on how to make future versions better. (3/3)