Difference between revisions of "TextAnalysis"

From Digital Scholarship Group
 
(13 intermediate revisions by the same user not shown)
Line 11: Line 11:
 
* [https://www.python.org/downloads/ Download and install Python]
 
* [https://www.python.org/downloads/ Download and install Python]
 
* [https://www.jetbrains.com/pycharm/ Download and install PyCharm], an Integrated Development Environment (IDE) for Python
 
* [https://www.jetbrains.com/pycharm/ Download and install PyCharm], an Integrated Development Environment (IDE) for Python
 +
* [https://ipython.org/ Download and install IPython], an interactive shell for Python
  
 
==R==
 
==R==
Line 21: Line 22:
  
 
==Topic Modeling==
 
==Topic Modeling==
*[http://journalofdigitalhumanities.org/2-1/topic-modeling-a-basic-introduction-by-megan-r-brett/ Megan R. Brett's "Basic Introduction"] (conceptual)
+
*[http://journalofdigitalhumanities.org/2-1/pacing-scholarly-conversations/ JDH's Special Issue] on Topic Modeling (2012)
 +
**[http://journalofdigitalhumanities.org/2-1/topic-modeling-a-basic-introduction-by-megan-r-brett/ Megan R. Brett's "Basic Introduction"] (conceptual)
 +
*[http://tedunderwood.com/2012/04/07/topic-modeling-made-just-simple-enough/ Ted Underwood, "Topic modeling made just simple enough"]
 
*[http://www.scottbot.net/HIAL/?p=19113 Scott Weingart's "Guided Tour"] (comprehensive, lots of links)
 
*[http://www.scottbot.net/HIAL/?p=19113 Scott Weingart's "Guided Tour"] (comprehensive, lots of links)
*[http://tedunderwood.com/2012/04/07/topic-modeling-made-just-simple-enough/ Ted Underwood, "Topic modeling made just simple enough"] (interpreting the results)
+
*[http://journalofdigitalhumanities.org/2-1/words-alone-by-benjamin-m-schmidt/ Ben Schmidt's article about Latent Dirichlet allocation's (LDA's) limitations] (also from the JDH special issue)
*[http://journalofdigitalhumanities.org/2-1/words-alone-by-benjamin-m-schmidt/ Ben Schmidt's article about Latent Dirichlet allocation's (LDA's) limitations]
+
 
  
 
===Tools===
 
===Tools===
Line 41: Line 44:
 
==Miscellaneous text analysis tools==
 
==Miscellaneous text analysis tools==
 
* [http://voyant-tools.org/ Voyant Tools], a simple web-based analysis and visualization tool
 
* [http://voyant-tools.org/ Voyant Tools], a simple web-based analysis and visualization tool
* [http://lexos.wheatoncollege.edu/upload Lexos], a tool that allows the user to scrub, chunk, and tokenize text, and to perform modest analysis and visualize clusters
+
* [http://lexos.wheatoncollege.edu/upload Lexos], a tool for scrubbing, chunking, and tokenizing text, as well as performing modest analysis and visualizing clusters
 +
** [http://scottkleinman.net/blog/2014/07/25/how-to-create-topic-clouds-with-lexos/ Scott Kleinman's blog post] on "How to Create Topic Clouds with Lexos"
 
* [http://www.laurenceanthony.net/software/antconc/ Laurence Anthony's AntConc], a GUI concordancing and text analysis toolkit
 
* [http://www.laurenceanthony.net/software/antconc/ Laurence Anthony's AntConc], a GUI concordancing and text analysis toolkit
 
* [https://sites.google.com/site/casualconc/ CasualConc], a Mac OS X-native toolkit (AntConc's Mac version is ported from the PC and has some bugs)
 
* [https://sites.google.com/site/casualconc/ CasualConc], a Mac OS X-native toolkit (AntConc's Mac version is ported from the PC and has some bugs)
Line 52: Line 56:
  
 
=Corpus building=
 
=Corpus building=
 +
*Amanda Rust's [http://subjectguides.lib.neu.edu/textdatamining Subject Guide] on "Text and Data Mining Library Databases" (Northeastern University Libraries)
  
 
==Some places to get texts==
 
==Some places to get texts==

Latest revision as of 04:14, 16 March 2016

Resources for Exploring Text Analysis

Python

R

Topic Modeling


Tools

word2vec

Miscellaneous text analysis tools

Corpus building

  • Amanda Rust's Subject Guide on "Text and Data Mining Library Databases" (Northeastern University Libraries)

Some places to get texts

Plain text

TEI-Encoded