Difference between revisions of "TextAnalysis"
Revision as of 04:47, 24 February 2016
Resources for Exploring Text Analysis
- "Where to Start," courtesy of Ted Underwood
- Stanford's Introduction, from Tooling Up for Digital Humanities
- For the social scientists
Python
- Folgert Karsdorp, Python Programming for the Humanities
- Python for Informatics, an applied but comprehensive introductory Python text with sections on text parsing
- Download and install Python
- Download and install PyCharm, an Integrated Development Environment (IDE) for Python
R
- Matthew Jockers, Text Analysis With R for Students of Literature (PDF available for download via the NEU Library)
- Download and install R
- Download and install RStudio, an Integrated Development Environment (IDE) for R
- RSeek, a search tool for finding resources on R
- Simple data types in R
Topic Modeling
- Megan R. Brett's "Basic Introduction" (conceptual)
- Scott Weingart's "Guided Tour" (comprehensive, lots of links)
- Ben Schmidt's article on the limitations of Latent Dirichlet Allocation (LDA)
Tools
- MALLET (an open-source, Java-based LDA package)
- GUI tools that use MALLET
- Stanford Topic Modeling Toolbox
word2vec
- Ben Schmidt's blog post on vector space models, which links to his R wrapper package for word2vec
Miscellaneous text analysis tools
- Voyant Tools
- Laurence Anthony's AntConc, a GUI concordancing and text analysis toolkit
- CasualConc, a toolkit native to Mac OS X (the Mac version of AntConc, by contrast, is a port)
- David McClure's TextPlot, a Python package that produces a force-directed network of the words in a text, based on estimated kernel densities
- Bookworm
Corpus building
Some places to get text
Plain text
- Project Gutenberg
- Early English Books Online (EEBO) (some texts TEI-encoded)
- Early Caribbean Digital Archive (ECDA)