“Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even where no similar structure is known.”
Personally, I can confirm that every yeast protein I work with that does not have a structure, when fed through AlphaFold, produces absolute garbage, with mean predicted errors on the order of ten or twenty angstroms and obvious nonsense in the structure.
Granted, I work with a lot of repetitive, poorly structured proteins, which, in as well-studied a model organism as yeast, are about the only ones left without structures, and someone has to get unlucky… but still.
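As a side note on how one might spot those garbage regions programmatically: AlphaFold reports a per-residue confidence score (pLDDT, stored in the B-factor column of its output PDB files), and DeepMind describes scores below 50 as "very low" confidence, typical of disorder. A minimal sketch, assuming you have already extracted the pLDDT values into a list (the function name and the threshold default are illustrative, not part of any official API):

```python
def low_confidence_regions(plddt, threshold=50.0):
    """Return inclusive (start, end) index ranges where per-residue
    pLDDT stays below `threshold` -- typical of the disordered,
    repetitive regions described above."""
    regions, start = [], None
    for i, score in enumerate(plddt):
        if score < threshold:
            if start is None:
                start = i
        elif start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:  # close a run that reaches the end
        regions.append((start, len(plddt) - 1))
    return regions

# Synthetic example: a confident core flanked by low-confidence tails.
scores = [30, 35, 40, 85, 90, 92, 88, 42, 38]
print(low_confidence_regions(scores))  # → [(0, 2), (7, 8)]
```

On a real prediction, long runs flagged this way are exactly the stretches where the coordinates should not be trusted.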
Have you been able to try the academic copy (RoseTTAFold)?
Not yet; I used the Google project where they are posting predicted structures of every known human and yeast gene.
https://alphafold.ebi.ac.uk/
The example that made me laugh:
https://alphafold.ebi.ac.uk/entry/Q59W62
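For anyone wanting to pull these entries in bulk rather than browse them: the database serves per-entry model files under a predictable URL. A small sketch for constructing that URL from a UniProt accession; the `/files/AF-<id>-F1-model_v<n>.pdb` filename pattern and the version number are assumptions based on the current site layout, so check an entry page if a download 404s:

```python
def alphafold_model_url(uniprot_id, version=4):
    """Build the assumed per-entry PDB download URL for the
    AlphaFold Protein Structure Database."""
    return (f"https://alphafold.ebi.ac.uk/files/"
            f"AF-{uniprot_id}-F1-model_v{version}.pdb")

# The entry linked above:
print(alphafold_model_url("Q59W62"))
```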
Related development: https://www.nature.com/articles/d41586-021-01968-y
“Meanwhile, an academic team has developed its own protein-prediction tool inspired by AlphaFold 2, which is already gaining popularity with scientists. That system, called RoseTTAFold, performs nearly as well as AlphaFold 2, and is described in a Science paper also published on 15 July[2].”
Parallel HN thread: https://news.ycombinator.com/item?id=27848186
What this makes me think is that quantum computing is mostly doomed. The killer app for quantum computing is predicting molecules and electronic structures. (Perhaps someone would pay for Shor’s algorithm, but its coolness far outstrips its economic value.) But it’s probably a lot cheaper to train a machine-learning-based approximation on a bunch of painstakingly assembled data than it is to build enough 50-millikelvin cryostats. On this view, the physics labs that will win at superconductor prediction are not the ones working on quantum computers or on theoretical breakthroughs; they’re going to be the guys converting every phonon spectrum from the last 50 years into a common data format so they can spend $30K training a big 3D transformer on it.
Holy crap. I confess this one catches me by surprise; within my hopes, but beyond my expectations.
Pretty sure this is the same (impressive) news as from CASP14 ( https://www.blopig.com/blog/2020/12/casp14-what-google-deepminds-alphafold-2-really-achieved-and-what-it-means-for-protein-folding-biology-and-bioinformatics/ ). But with fancier figures (edit: and more technical details of how they made the predictions) :P
The previous AF2 discussions were largely a waste of space because the little abstract they had to provide for CASP14 provided hardly anything to go on. But now we have not just a full writeup but source code and models too! Now I consider it worth discussing.
As I recall, the accuracy measurement was something like an average deviation over the whole molecule, which could allow small (local) portions of the predicted shape to differ from the true shape by a good bit more.
First, is that a correct recollection? If so, does anyone know of any work exploring the importance of local deviations versus these globally averaged metrics? I would think that would be very important in this type of modeling.
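The concern above is easy to make concrete with a toy example: a single large local error barely moves a globally averaged metric like RMSD. A minimal sketch with made-up coordinates (stdlib only; the structures are hypothetical, not real protein data):

```python
import math

def rmsd(a, b):
    """Global root-mean-square deviation between two aligned
    coordinate sets (lists of (x, y, z) tuples)."""
    assert len(a) == len(b)
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def per_atom_deviation(a, b):
    """Per-atom distances -- the local view a single global number hides."""
    return [math.dist(u, v) for u, v in zip(a, b)]

# 99 atoms predicted perfectly, one atom off by 20 Å:
true_xyz = [(float(i), 0.0, 0.0) for i in range(100)]
pred_xyz = list(true_xyz)
pred_xyz[50] = (50.0, 20.0, 0.0)

print(round(rmsd(true_xyz, pred_xyz), 2))           # global: 2.0
print(max(per_atom_deviation(true_xyz, pred_xyz)))  # local max: 20.0
```

A 2 Å global RMSD sounds respectable, yet one region is off by 20 Å, which is why per-residue scores (e.g. lDDT-style local metrics) matter alongside the global average.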