Searching & Text-Analysis

Intertwingularity is more than just a fun word to say; the concept behind it is quite compelling.  The idea that there are no “subjects,” and that knowledge is one giant mass, makes the possibilities of the web both incredible and completely overwhelming.  Intertwingularity holds that topics can’t be categorized and divided in a clear-cut way; knowledge is instead interconnected and overlapping.  What do we do about this when it comes to the web?  Give up on categorization?  Nope!  Sometimes I think the world would be better off without categorization, but if that were true, I don’t know how I would 1) conceive of things or 2) search for them on Google to learn more about them.

One basic idea I found interesting when it comes to text-analysis is that “The number of times we perceive something in a text, whether consciously or not, the more influence it has on our reading.”  Obviously there are stakes in analyzing the frequency of words and contexts in texts, but I am very curious about how the subconscious reading of words influences us.  I like that the “corpus analysis of meaning” section of this website pushes beyond frequency and into word usage and context.  It also walks you through how to begin creating a research question based on data from text-analysis.  It’s nice that cultural context is also taken into account, and that the process covers variants of different words as well.  This is incredibly helpful for breaking down the research components involved in digital humanities, and for understanding the ethics of this kind of research, writing, and representation.
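As a rough illustration of what moving from frequency to context can look like, here is a minimal Python sketch of a keyword-in-context (KWIC) listing.  This is not the method from the website; the function names, the simple tokenizer, and the sample sentence are all assumptions made up for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens.

    Assumption: words are runs of letters and apostrophes; a real
    project might need a more careful tokenizer.
    """
    return re.findall(r"[a-z']+", text.lower())


def keyword_in_context(text, keyword, window=3):
    """Return (left context, keyword, right context) for each occurrence.

    `window` is the number of words kept on each side of the keyword.
    """
    tokens = tokenize(text)
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits


# Hypothetical placeholder sentence, not taken from any real corpus.
sample = "A good man is hard to find, and a kind man is harder still."
for left, word, right in keyword_in_context(sample, "man", window=2):
    print(f"{left:>12} [{word}] {right}")
```

Reading the words on either side of each hit is one way to check whether a frequent word is actually being used the way the raw count suggests.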

For my corpus, I compiled the lyrics from Amy Winehouse’s three albums, and played around with some of the tools found on Voyant.  Here is what I got on Cirrus, one of the sites I found through Voyant: Amy Winehouse Lyrics.  Much of this is preliminary, and I am still working on how I can use these tools to gather and compile data in a more meaningful way.  From some of these graphs I can see how frequently the word “man” is used, for example.  By walking through some of the processes mentioned above, I can imagine new possibilities for understanding this data.  Here are some of the helpful charts and tools from Voyant: Amy Winehouse Data Charts.  Knots is also interesting because it allows you to see potential connections, tangles, and angles that come out of a corpus: Amy Winehouse–Knots.  I particularly like the visualization on this page as a new way to see things.  In thinking about what corpus to choose, I thought of Google Books as connected to some of my research.  Google Books has a section that shows, by size, how frequently a “significant” word appears in a text: Angela Davis’ Blues Legacies and Black Feminism.  I will definitely be using some of these tools for my final project to take a closer look at the frequency of these words in lyrics while also taking into account the historical time period, biographies, music, politics, economics, etc.  I intend to use “full texts” of songs in order to do this analysis.  This will give me a way to look at how frequently artists use particular words, and how that use changes over time.
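To give a sense of the kind of counting that tools like these do behind the scenes, here is a small Python sketch that compares how often a word appears across several texts.  It is only an approximation of the idea, not Voyant’s actual implementation.  The album labels and the placeholder text are hypothetical stand-ins (not real lyrics), and the stopword list is a tiny assumed one.

```python
import re
from collections import Counter

# Assumption: a very small stopword list, just for illustration.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "my"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text, stopwords=STOPWORDS):
    """Count each word in the text, skipping common stopwords."""
    return Counter(t for t in tokenize(text) if t not in stopwords)


def relative_frequency(word, text):
    """Share of all tokens in the text that are `word` (0.0 if empty)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return tokens.count(word.lower()) / len(tokens)


# Hypothetical placeholder texts keyed by made-up album labels.
corpus = {
    "album_one": "the man said no and the man walked away",
    "album_two": "love is a game and love is a loss",
    "album_three": "my man is gone and it is late",
}

for album, text in corpus.items():
    top = word_frequencies(text).most_common(3)
    share = relative_frequency("man", text)
    print(f"{album}: top words {top}, share of 'man' = {share:.3f}")
```

Using a relative share rather than a raw count makes albums of different lengths comparable, which matters when tracking a word over time.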

Page & Brin’s account of PageRank describes the process through which Web pages can be rated “objectively and mechanically, effectively measuring the human interest and attention devoted to them.”  I’m still a little bit confused about the role of backlinks in the PageRank process and their connection to bias in the filtering process.  One thing I’m curious about is how newer search engine features like “current location” work to filter results.  I thought the idea of PageRank acting as a kind of “peer review” process was interesting as well.  Understanding hubs and authorities is also helpful in this context.  While I certainly don’t understand all of the math behind it, the concepts in Kleinberg are useful.  This will be good to unpack a little further tomorrow 🙂
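One way to see what backlinks do is to run a toy version of the calculation.  The sketch below is a simplified, textbook-style PageRank iteration in Python, not the production algorithm any search engine uses.  The three-page link graph, the damping value of 0.85, and the function name are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Estimate PageRank scores by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to.
    Each page splits its current rank evenly among its outgoing links,
    so a page's score depends on the rank of the pages linking to it
    (its backlinks), not just on how many links it has.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: A and B both link to C, and C links back to A.
toy_web = {"A": ["C"], "B": ["C"], "C": ["A"]}
for page, score in sorted(pagerank(toy_web).items()):
    print(page, round(score, 3))
```

In this toy graph, C has two backlinks and ends up with the highest score, A benefits from being linked by the well-ranked C, and B, which nobody links to, keeps only the baseline share.  That is the sense in which a link works like a vote, and also where bias can creep in, since pages that are already heavily linked pass on more weight.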

