The field of text analytics is developing rapidly, and new tools and techniques appear almost daily. Making sense of it all can quickly become overwhelming. We provide a curated list of links to reviews, presentations and tools that we've found especially thought-provoking, engaging or useful. If you're a teacher or educational researcher who wants to explore the links between language, learning and teaching, these links are a good starting point. If you're handy with Python or R, we also provide links to some useful libraries and resources.
Writing and writing guides
The Writer's Diet by Professor Helen Sword is an accessible guide to writing - academic or otherwise.
Writing@uni guide to academic writing, and the companion teacher guide, from the University of Auckland.
The Writing Analytics blog from the team at the Connected Intelligence Centre at UTS, Sydney. Follow their research into supporting formative feedback on student writing at scale.
A delightful series of videos from Mary Norris, the New Yorker's 'Comma Queen', on the mechanics of language and writing.
Commentary on language use
The Language Log is a lively blog, written by linguists, that explores language in use.
Stephen Fry on language, language, language
The Noba project, interesting in itself for teachers in higher education, includes this resource on language and language use
The Writing, writing everywhere website and blog for 'teaching, learning about and experimenting with writing'.
Professional development related to language use and learning
A professional development resource for U.S. school teachers, but very relevant for teachers in higher education as well, is this interactive primer on disciplinary literacy
Our own chapter, Text analytic tools to illuminate student learning, in Learning Analytics in the Classroom: Translating Learning Analytics Research for Teachers, published by Routledge (2019).
Web sites to help you to analyse text and language
textalyser.net provides basic analysis, including word frequency, word length, number of syllables, multi-word frequency (two-word units, or bigrams) and a readability score (Gunning fog index).
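If you're handy with Python, the kinds of measures listed above can be approximated in a few lines. The following is only a rough sketch using the standard library, not how textalyser.net itself works. The function names (`tokenize`, `count_syllables`, `gunning_fog`, `basic_report`) are our own illustrative choices, and the syllable counter is a crude vowel-group heuristic, so the readability score is an estimate.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its words (letters, optional apostrophe part)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def count_syllables(word):
    """Rough estimate (an assumption): one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def gunning_fog(text):
    """Gunning fog index: 0.4 * (words per sentence + 100 * complex words / words).

    Simplification: a 'complex' word is any word with three or more estimated
    syllables; the usual exclusions (proper nouns, common suffixes) are ignored.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = tokenize(text)
    if not sentences or not words:
        return 0.0
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences) + 100 * len(complex_words) / len(words))


def basic_report(text, top_n=5):
    """Collect word frequency, bigram frequency, mean word length and fog index."""
    words = tokenize(text)
    # Bigrams are adjacent word pairs; this simple version lets them
    # cross sentence boundaries.
    bigrams = list(zip(words, words[1:]))
    return {
        "word_freq": Counter(words).most_common(top_n),
        "bigram_freq": Counter(bigrams).most_common(top_n),
        "mean_word_length": sum(map(len, words)) / len(words) if words else 0.0,
        "fog": gunning_fog(text),
    }


if __name__ == "__main__":
    sample = "The cat sat. The cat ran."
    for name, value in basic_report(sample).items():
        print(name, value)
```

Pasting a paragraph of student writing into `basic_report` gives a quick feel for what the web tools report, although their tokenisation and syllable counting will differ in detail.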
Concordancing, or exploring a specific word or phrase in context, provides valuable information about how the word or phrase is used. This web-based concordancer allows you to paste in a text to see selected words in context. It also identifies collocates (words which commonly occur in the vicinity of a selected word) and calculates word frequencies.
https://www.jasondavies.com/wordtree provides an alternative way of visualising keywords and phrases in context.
https://voyant-tools.org/ combines several text analytic tools in an intuitive and easy-to-use interface.
Marian Dörk and Dawn Knight at Newcastle University have produced http://wordwanderer.org/ where you can 'take your text for a walk' - an interactive exploration of words in context and the associations between words in a text.
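For readers who code, the idea of seeing a word in context, along with the words that tend to appear near it, can be sketched in plain Python. This is a minimal illustration under our own assumptions, not the method used by any of the tools above. The names `concordance` and `collocates` and the `window` parameter are hypothetical, and the collocates here are raw co-occurrence counts rather than a statistical association measure.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its words (letters, optional apostrophe part)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def concordance(text, keyword, window=3):
    """Return (left context, keyword, right context) for each occurrence of keyword."""
    tokens = tokenize(text)
    keyword = keyword.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


def collocates(text, keyword, window=3):
    """Count the words found within `window` tokens either side of the keyword."""
    tokens = tokenize(text)
    keyword = keyword.lower()
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts


if __name__ == "__main__":
    sample = "Students write essays. Teachers read essays and students revise essays."
    for left, word, right in concordance(sample, "essays", window=2):
        # Right-align the left context so the keyword lines up in a column.
        print(f"{left:>20} [{word}] {right}")
    print(collocates(sample, "essays", window=2).most_common(3))
```

A small `window` keeps the context tight around the keyword; widening it picks up looser associations at the cost of more noise from common words.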
There are many more tools, websites and apps that provide similar functions or variations on these themes. Try googling some of the words used on this page to discover additional tools for text analysis. Google itself provides some of these tools: try Google's https://books.google.com/ngrams to explore the Google Books corpus.
Comprehensive text analysis and corpus linguistics tools
AntConc and a range of text analysis tools and resources by Prof. Laurence Anthony.
Sketch Engine is a comprehensive tool and includes an enormous range of corpora. It also provides SkELL, which is a simple concordancer designed for ESL students and teachers.
Leximancer is designed to extract high-level concepts from texts. It is useful for researchers and for the analysis of survey data, and is an alternative to NVivo and similar tools.
Online courses, guides and reference books
If you really want to develop your skills with language and text analysis, the following are good places to start.
Led by Professor Tony McEnery, from the University of Lancaster, this is a thoughtful and accessible introduction to corpus linguistics. No prior knowledge of linguistics or computer coding is assumed. The course is taught by leaders in the field and offers a good mix of theory, examples and opportunities for practice. Well worth enrolling if you are serious about extending your educational research toolkit. https://www.futurelearn.com/courses/corpus-linguistics
The Natural Language Toolkit (NLTK) is a comprehensive library for creating Python programs to analyse language. It also includes a huge range of corpora and lexical resources. The companion book, Natural Language Processing with Python – Analyzing Text with the Natural Language Toolkit, by Steven Bird, Ewan Klein and Edward Loper, has become a classic. If you are interested in learning to code to help you to analyse text, it is the place to start.
If you are new to text analysis but well-versed in quantitative analysis, Julia Silge and David Robinson have produced an excellent introduction at https://www.tidytextmining.com. It takes a much more information-extraction-based approach to text and language analysis than the NLTK book or FutureLearn course above. You will need to be familiar with R.
Libraries and APIs
If you want to integrate text analysis with your own applications, we thoroughly recommend the wonderful spaCy. Also textacy and, of course, NLTK. All are libraries we use in Quantext!