NeuroTrends: Large-scale automated analysis of the neuroimaging literature; SciPy 2013 Presentation
Authors: Carp, Joshua, University of Michigan
Track: Medical Imaging
How do researchers design and analyze experiments? How should they? And how likely are their results to be reproducible? To investigate these questions, we developed NeuroTrends, a platform for large-scale analysis of research methods in the neuroimaging literature. NeuroTrends identifies relevant research reports using the PubMed API, downloads and parses full-text HTML and PDF documents, and extracts hundreds of methodological details from this unstructured text.
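As a rough illustration of the search and extraction steps described above, the following sketch builds a PubMed query URL for the NCBI E-utilities `esearch` endpoint and pulls two methodological details out of a sentence with regular expressions. It is not the NeuroTrends implementation: the function names, the regular expressions, and the two extracted fields (smoothing kernel width and analysis software) are assumptions chosen for illustration only.

```python
import re
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_url(term, retmax=100):
    """Return an esearch URL for a PubMed query (hypothetical helper)."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return ESEARCH_URL + "?" + urlencode(params)


# Illustrative patterns only; a real extractor would need many more variants.
FWHM_PATTERN = re.compile(
    r"(\d+(?:\.\d+)?)\s*-?\s*mm\s+(?:FWHM|full[- ]width)", re.IGNORECASE
)
SOFTWARE_PATTERN = re.compile(r"\b(SPM|FSL|AFNI)\d*\b")


def extract_methods(text):
    """Extract a smoothing kernel width and software names from free text."""
    fwhm = FWHM_PATTERN.search(text)
    return {
        "smoothing_fwhm_mm": float(fwhm.group(1)) if fwhm else None,
        "software": sorted(set(SOFTWARE_PATTERN.findall(text))),
    }


if __name__ == "__main__":
    print(build_search_url("fMRI"))
    sentence = "Data were analyzed in SPM8 and smoothed with an 8 mm FWHM Gaussian kernel."
    print(extract_methods(sentence))
```

Pattern matching of this kind is simple to audit against hand-coded labels, which is one reason to prefer it in a sketch, but it will miss phrasings the patterns do not anticipate.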
In the present study, NeuroTrends was evaluated using a corpus of over 16,000 journal articles. Automatically extracted methodological meta-data were validated against a hand-coded database. Overall, methodological details were extracted accurately, with a mean d-prime value of 3.53 (range: 1.12 to 6.18). Results revealed both variability and stability in methodological practices over time, with some methods increasing in prevalence, some decreasing, and others remaining consistent. Results also showed that design and analysis pipelines were highly variable across studies and have grown more variable over time.
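For readers unfamiliar with d-prime, the sketch below shows one common way to compute it when automated labels are compared against hand-coded ones: the z-transformed hit rate minus the z-transformed false-alarm rate. The function name and the log-linear correction (adding 0.5 to each count so that rates of exactly 0 or 1 stay finite) are assumptions; the abstract does not state how the reported values were computed.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Counts come from comparing automated extraction with hand-coded labels.
    A log-linear correction (an assumption of this sketch) keeps the rates
    strictly between 0 and 1 so the inverse normal CDF is defined.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(fa_rate)


if __name__ == "__main__":
    # Hypothetical counts for a single methodological feature.
    print(round(d_prime(90, 10, 10, 90), 2))
```

A value of 0 means the extractor cannot distinguish articles that report a feature from those that do not, and larger values indicate better discrimination.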
In sum, the present study confirms the feasibility of accurately extracting methodological meta-data from unstructured text. We also contend that variability in research methods across time and from study to study poses a challenge to reproducibility in the neuroimaging literature, and likely in many other fields as well. Future directions include improving the accuracy and coverage of the NeuroTrends platform, integrating with additional databases, and extending to research domains beyond neuroimaging.