- Number of videos:
Researchers have been modeling text difficulty for over 50 years. A variety of models have been developed, but few have focused on books for emerging readers (Grades K-2). We used Python for nearly every aspect of the project including collecting data from reading educators, analyzing text features and psychometric data, and creating a predictive model. Tools used include scipy, scikit-learn, pandas, and extensive use of the IPython Notebook which is demonstrated in the talk.
Many of you are probably familiar with NLTK, the wonderful Natural Language Toolkit for Python. You may not be familiar with Linkgrammar, which is a sentence parsing system created at Carnegie Melon university. Linkgrammar is quite robust and works "out of the box" in a way that NLTK does not for sentence parsing.