Caltech Library Workshop – Text as Data: An Introduction to Natural Language Processing (Zoom Session)
Online Event
This introduction to Natural Language Processing (NLP) covers the management and analysis of text using the core Python programming language and the open-source libraries NLTK (Natural Language Toolkit) and spaCy. Some prior experience with Python programming is useful but not required. The three 75-minute workshops cover the following topics:
Friday May 5, 12:00-1:15pm: Text processing in Python
- strings and their properties
- strings as iterables, lists
- comparing and searching strings
- regular expressions
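As a small taste of the topics in this first session, the sketch below touches each one using only the Python standard library. The sample sentence and all variable names are invented for illustration and are not taken from the workshop materials.

```python
import re

# Sample text, invented for this illustration.
text = "Text as data: NLP turns raw text into tokens, counts, and patterns."

# Strings and their properties: length and case conversion.
length = len(text)
lowered = text.lower()

# Strings as iterables: a string yields its characters one at a time.
vowel_count = sum(1 for ch in lowered if ch in "aeiou")

# Lists: split() breaks a string into a list of whitespace-separated pieces.
pieces = text.split()

# Comparing and searching strings.
same_ignoring_case = "NLP".lower() == "nlp"
first_text_index = lowered.find("text")  # index of the first match, -1 if absent
has_tokens = "tokens" in text

# Regular expressions: extract runs of letters, dropping punctuation.
words = re.findall(r"[A-Za-z]+", text)

print(pieces[:3])
print(words[:4])
```

Note that split() keeps punctuation attached to the pieces, while the regular expression returns letters only. That difference is one reason text is usually cleaned before analysis.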
Friday May 19, 12:00-1:15pm: NLTK
- text preprocessing (spellchecking, stemming and lemmatization)
- word contexts, frequency distribution
- parts-of-speech tagging
- named entity recognition
- sentiment analysis
Wednesday May 24, 12:00-1:15pm: spaCy
- statistical modeling of text
- word vectors and similarity
- processing pipelines
Registration is required.
For more information, please contact Stephen Davison.