by inglesp on 2/19/14, 8:26 AM with 61 comments
by Blahah on 2/19/14, 9:38 AM
They can take an ancient paper with very low quality diagrams of complex chemical structures, parse the image into an open markup language and reconstruct the chemical formula and the correct image. Chemical symbols are just one of many plugins for their core software, which interprets unstructured, information-rich data like raster diagrams. They also have plugins for phylogenetic trees, plots, species names, gene names and reagents. You can easily develop plugins for whatever you want, and they're recruiting open source contributors (see https://solvers.io/projects/QADhJNcCkcKXfiCQ6, https://solvers.io/projects/4K3cvLEoHQqhhzBan).
As a side effect of how their software works, it can detect tiny suggestive imperfections in images that reveal scientific fraud. I was shown a demo where a trace from a mass spec (like this http://en.wikipedia.org/wiki/File:ObwiedniaPeptydu.gif) was analysed. As well as reading the data from the plot, it revealed a peak that had been covered up with a square - the author had deliberately obscured a peak in their data that was inconvenient. Scientific fraud. It's terrifying that they find this in most chemistry papers they analyse.
Peter's group can analyse thousands or hundreds of thousands of papers an hour, automatically detecting errors and fraud and simultaneously making the data, which are facts and therefore not copyrightable, free. This is one of the best things that has happened to science in many years, except that publishers deliberately prevent it. Their work also made me realise it would be possible to continue Aaron Swartz' work on a much bigger scale (http://blahah.net/2014/02/11/knowledge-sets-us-free/).
Academic publishers who are suppressing this are literally the enemies of humanity.
by yoha on 2/19/14, 9:34 AM
by atmosx on 2/19/14, 3:01 PM
Of course it's universal. It's not like everything is a set-up, but it happens more often than most would likely imagine, especially since betting came into play.
So there you have it.
by JackFr on 2/19/14, 2:52 PM
Oh yeah -- and they're big enough to fight academic publishers.
by tomp on 2/19/14, 11:31 AM
by Shivetya on 2/19/14, 11:06 AM
However, this is more along the lines of validating what is published. Of any group, you would hope that scientists and their like would jump on technology like this so as to provide the most accurate representation of their work possible. The same goes for publishers: why wouldn't they want to brag that they use the most advanced interrogation methods for the papers they publish?
I guess they are people too, hypersensitive that fault will be found.
by _greim_ on 2/20/14, 12:04 AM
There are lots of uncaught errors floating around out there in scientific papers, and many of them can now be found with this software. But exposing the errors so that they can be corrected is tricky because: A) you have to have legal access to a paper in order to scan it, and B) even if you do have access, under the current rules only the publishers have the right to expose the errors, and they're not interested because they want to avoid the embarrassment.
Am I understanding it?
by Udo on 2/19/14, 12:33 PM
by sov on 2/19/14, 9:43 PM
by bloaf on 2/20/14, 1:41 AM
by nder on 2/19/14, 5:41 PM
by dflock on 2/19/14, 8:02 PM
by ylem on 2/20/14, 4:27 AM
by nl on 2/19/14, 8:57 PM
If the referees ran the software on the preprint it would find the same problem.
I agree this isn't as good, but it would be a step forward.
by bloaf on 2/20/14, 1:47 AM