Microtask crowdsourcing for annotating diseases in PubMed abstracts. A. I. Su, B. M. Good, M. Nanis. Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA.

   The scientific literature is massive, but only a tiny fraction of that biomedical knowledge is structured in a way that is amenable to computational analysis. Comprehensively annotating the literature with concepts and the relationships between them would yield a powerful knowledge base for biomedical research. Although many biological natural language processing (BioNLP) projects attempt to address this challenge, the state of the art in BioNLP still leaves much room for improvement. Expert curators are vital to the process of knowledge extraction but are chronically in short supply.
   Recent studies have shown that workers on general-purpose microtasking platforms such as Amazon's Mechanical Turk (AMT) can, in aggregate, generate high-quality annotations of biomedical text. Here, we investigated the use of AMT to capture disease mentions in PubMed abstracts, using the NCBI Disease corpus as a gold standard. After merging the responses of 5 AMT workers per abstract with a simple voting scheme, we achieved a maximum F-measure of 0.815 (precision 0.823, recall 0.807) relative to expert annotations on the same abstracts. The results can also be tuned to favor precision (up to 0.98) or recall (up to 0.89) by adjusting the voting threshold among AMT workers. We also found that giving AMT workers continuous feedback on their performance improved annotation quality over time. Finally, using AMT for biocuration had clear benefits in both time and cost, requiring just 7 days and less than $200 to complete all 593 abstracts in the test corpus (at $0.06/abstract).
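   To make the aggregation step concrete, the sketch below shows one way a vote-threshold merge and the corresponding precision/recall/F-measure calculation could be implemented. It is a minimal illustration under assumed conventions: annotations are represented as (start, end) character offsets, spans are compared by exact match, and the function names are hypothetical; the abstract does not specify these details.

from collections import Counter

def merge_by_vote(worker_spans, k):
    # worker_spans: list of per-worker sets of (start, end) offsets
    # for one abstract. Keep a span if at least k workers marked it.
    votes = Counter(span for spans in worker_spans for span in spans)
    return {span for span, n in votes.items() if n >= k}

def prf(predicted, gold):
    # Exact-match precision, recall, and F-measure against gold spans.
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Toy example: 5 workers annotated one abstract; require 3-of-5 agreement.
workers = [
    {(10, 18), (40, 52)},
    {(10, 18)},
    {(10, 18), (40, 52), (90, 97)},
    {(10, 18), (40, 52)},
    {(90, 97)},
]
gold = {(10, 18), (40, 52)}
merged = merge_by_vote(workers, k=3)
print(prf(merged, gold))  # -> (1.0, 1.0, 1.0) for this toy example

   In a scheme of this kind, lowering the threshold k toward 1 favors recall while raising it toward the number of workers favors precision, which matches the precision/recall tuning behavior reported above.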
   This experiment demonstrates that microtask-based crowdsourcing can be used successfully to identify disease mentions in biomedical text. Although there is room for improvement in the crowdsourcing protocol, AMT workers are clearly capable of performing this annotation task. Our experience with AMT motivates two orthogonal lines of ongoing research. First, we are investigating the use of AMT for other biocuration tasks, including relationship extraction. Second, these results strongly suggest that citizen scientists have both the skill and the motivation to help structure biomedical knowledge.
