Machine Learning Methodologies and Large Data Text Corpora

By Luke Barnesmoore and Jeffrey Huang.

Published in The International Journal of Communication and Linguistic Studies


With the rise of social media as a focal point for interaction in both global and local communities, “big data” has become a key feature of social science research in the 21st century. As corpora on sites like Facebook and Twitter have grown, a need has arisen for increasingly sophisticated computational tools to collect data and to identify and visualize statistical trends within it. After describing our integration of the Netvizz scraping software with our Statnews.org language analysis software, we provide a case study of the tool’s application by analyzing Libertarian Facebook data, in particular a discussion of “milk rights” in the US, through the lens of Barnesmoore’s History of Assemblage Model (HoAM). We then proceed, by way of thought experiment, to draw conclusions about the ways in which the ontological regime(s) implicit in the analyzed data are likely to structure norms of thought, behavior, and being within publics socialized under that regime.
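To illustrate the kind of trend identification described above, the following minimal Python sketch counts term frequencies in a hypothetical Netvizz-style tab-separated export of Facebook page posts. The file name page_posts.tab, the column name post_message, and the tokenization are illustrative assumptions only, not the schema or method of the authors’ actual pipeline.

```python
import csv
from collections import Counter

# Hypothetical paths and column names -- the paper does not specify the
# export schema, so these are illustrative assumptions only.
EXPORT_PATH = "page_posts.tab"   # Netvizz-style tab-separated export
TEXT_COLUMN = "post_message"     # free-text body of each Facebook post

def term_frequencies(path: str, column: str) -> Counter:
    """Count word frequencies across all posts in a scraped export."""
    counts: Counter = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            text = (row.get(column) or "").lower()
            # Crude whitespace tokenization with punctuation stripped;
            # a real analysis pipeline would use a proper tokenizer.
            tokens = (t.strip('.,!?"\'()') for t in text.split())
            counts.update(t for t in tokens if t)
    return counts

if __name__ == "__main__":
    freqs = term_frequencies(EXPORT_PATH, TEXT_COLUMN)
    for term, n in freqs.most_common(20):  # top 20 terms across the corpus
        print(f"{term}\t{n}")
```

Run against a scraped export, such a count surfaces the most frequent terms (e.g. in the case study one would expect terms like “milk” and “rights” to rank highly), which can then be tracked over time or across pages to visualize statistical trends.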

Keywords: Machine Learning Methodologies, History of Assemblage Model (HoAM), Statnews.org

The International Journal of Communication and Linguistic Studies, Volume 14, Issue 1, March 2016, pp. 1-16. Published online: October 30, 2015.

Luke Barnesmoore

Master's Candidate, Geography, University of British Columbia, Vancouver, British Columbia, Canada

Jeffrey Huang

Research Assistant, UC Berkeley Statnews.org Lab, Berkeley, California, USA