Monday, March 21, 2011

Information Overload as Propaganda

 There is a rather annoying tendency Google has when one is seeking specific information. As information consumers we can only use search terms. Now, anticipating the obvious uproar over such an elementary statement, I will say this: just because that's the way it is doesn't mean there are no consequences.

So what exactly are these consequences? Well, for one, Google's rankings are hotly contested, and rigging the system, like any other, isn't unheard of. The Borker case had its fifteen minutes, but the problem as a whole exists on a deeper level. Near as I can tell there are a few issues that must be overcome. To be fair, Google addresses these, yet as a for-profit company it treats its methodology as strictly proprietary.

  1. Volume vs. Relevance; Relevant to what?
    I find this particularly overwhelming and vexing. There are so many pages out there that certain information, such as current news, is flooded with keywords. This is true for almost any simple phrase, and simple phrases, as everyone knows, are the essence of any search engine.

    This problem is further compounded by the fact that there is monetary and political interest in the "happening now" department. The actors (corporations, webmasters, and individuals) are so caught up in either reporting on, or even "participating" in, the current breaking saga that the search results usually point a user to a corporatized or politicized page that offers little additional information and is quite liberal with speculation.

    In this way Google and webmasters as a group end up reinforcing the status quo, whatever it might be, for their constituency or user base. Thus as it stands now, and as I suspect it will remain into the future, volume trumps relevancy.

    Beyond this generalized relevancy, which in the past was addressed collectively through public interaction with the large news organizations, the Internet has the insurmountable problem of personal relevancy. But that, at the moment, is beyond the scope of this post.
  2. "Mass" Opinion or Sentiment; - Thoughts from the mob.
    Sentiment analysis seems to offer a way to address some of the problems of volume. However, as any art critic will tell you, such search methods, though useful at times, are at the moment far from cleanly addressing relevancy. By necessity words must be translated into numbers, and this leaves much unclear in terms of search. A word often does not have the same meaning or connotation from person to person and culture to culture. In other words, a true algorithm that captures the essence of just one aspect of human nature is surely as hard to derive as the equation linking together the forces of nature.

    Another aspect of this is the volume of opinions. We like to think that the Internet cuts a large swath across our society. Yet simple economics contests this. Those who cannot afford to use the Internet every day lose out volume-wise to the more regular users. Online sentiment is sure to be skewed toward those who spend more time online over those who spend little (the re-tweet factor). It is also important to note that sentiment tracking will never be able to fully gauge the knowledge of the reporting individual. Thus relevancy might be lost as the mob's opinion overrules a more level-headed, less polarized, or "ambiguous" opinion.

    Noteworthy also is the fact that by its very design, such sentiment analysis is a measure of the status quo.  
  3. Now vs. Later; - Lies now often carry more weight than the truth later.
    The facts we know now might change as we learn more later. This has the most bearing on breaking news or trending topics. We all claim to be annoyed by the incessant chatter of the 24 hour news cycle. Yet we have embraced it to the detriment of old-school fact checking. Online desires for currency (both monetary and "relevancy"), far from emphasising truth, have rather opened the floodgates of uninformed opinion.

    Online stories and reporting are a re-tooling of the old cycle. Now the first bit "off the wire" is often reported and rehashed even if certain facts appear to be left out or open to further revision. This is the unfortunate norm in online society and now, by extension, the mainstream 24/7 cycle. This information is often censored, so that the version of the story reported or linked to contains only some of the facts. The obvious recent example of this is the NPR 'Schillergate' scandal.

    Thus lies seem to thrive more readily in the online ecosystem than they did under the old mainstream cycle.
So we have a scary thing here. As with all things human, the 'truth' is a relative thing, and online culture as it is expressed now, through search terms and blogs, is one large mass of ambiguity. Adding measures of sentiment may help, but sentiment shares the largely conceptual problems inherent in both online access and post frequency. Access matters, and somehow we seem to have a system that is designed to flood a user's search with either popular picks or ones placed through monetary means. In the end we have no redress for relevancy. This is why sites like Wikipedia will thrive: it is much easier there for the user to sort out relevancy. Search engines even now are largely hit and miss and often glaringly irrelevant; we use them because we have no other choice. Google and other sites in many ways take their users for granted. It is unfortunate that the company cannot look more closely at the human side of the equation; for Google and the other search engines, the problem is always PEBKAC.