Does the White House have something to hide?
Senator Hiram Johnson famously quipped
(http://www.bartleby.com/73/1925.html) that
“the first casualty when war comes is the truth.” As the war in Iraq
continues, is the White House intentionally using an egregious
robots.txt file to keep search engines from preserving a record of its
statements on the conflict? Or did its staff simply make a technical
mistake?
When search engines “spider” the web in search of documents for their
indices, web site owners sometimes put up a file called robots.txt
(http://www.robotstxt.org/wc/robots.html), which instructs the
“spiders” not to index certain files. This can be for
policy reasons, if an author does not want his or her pages to appear
in search listings, or it can be for technical reasons, for example if
a web site is dynamically generated and cannot or should not be
downloaded in its entirety.
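To illustrate how a well-behaved spider applies these instructions, here is a minimal sketch using the robots.txt parser in Python's standard library. The rules, the example.com URLs, and the "ExampleBot" user agent are all hypothetical, made up for illustration; they are not the White House's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only; these are not
# the rules actually served by any real site.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /search
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/index.html",
        "http://www.example.com/private/notes.html",
    ):
        verdict = "allowed" if is_allowed(EXAMPLE_ROBOTS_TXT, "ExampleBot", url) else "blocked"
        print(url, verdict)
```

A crawler that honors the protocol would run a check like this before fetching each page. Compliance is voluntary: the file is a request to spiders, not an access control.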
According to reports
(http://yro.slashdot.org/article.pl?sid=03/10/27/2052228&mode=nested&tid=103&tid=126&tid=95&tid=99&threshold=2),
though, the White House is requesting that search engines not index
certain pages related to Iraq (http://www.bway.net/~keith/whrobots/).
In addition to keeping those pages out of search results, this prevents
archives like Google’s cache
(http://www.google.com/help/features.html#cached) and the Internet
Archive from storing copies of pages that may later change. 2600
called the White House to investigate the matter.
According to White House spokesman Jimmy Orr, the blocking of search engines is not an attempt to ensure future revisions
will remain undetected. Rather, he explained, they “have an Iraq
section [of the website] with a different template than the main
site.” Thus, for example, a press release on a meeting between
President Bush and “Special Envoy” Bremer is available in the Iraq
template
(http://www.whitehouse.gov/news/releases/2003/10/iraq/20031027-1.html),
which is blocked from being indexed by search engines, or in the
normal White House template
(http://www.whitehouse.gov/news/releases/2003/10/20031027-1.html),
which is available for indexing. The intent, Mr. Orr said, was that
when people search,
they should not get multiple copies of the same information. Most of
the “suspicious” entries in the robots.txt file do, indeed, appear to
have only this effect.
According to the robots.txt of October 24
(http://www.bway.net/~keith/whrobots/robotsWHcurrent10-24-03.txt),
though, the In Focus: Iraq section of the site
(http://www.whitehouse.gov/infocus/iraq/) was blocked from search
engines. Some of the information there
(http://www.whitehouse.gov/infocus/iraq/kay-20031008.html) does not
appear to be available anywhere else on the White House site
(http://www.whitehouse.gov/query.html?col=colpics&qt=%2B%22david+kay%22+%2B%22interim+progress%22&submit.x=0&submit.y=0).
However, it
seems that, in response to inquiries from 2600 and other
sources, the White House web team has recently changed its robots.txt
(http://www.whitehouse.gov/robots.txt) so that these files are no
longer blocked. (The current Last-Modified
(http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.29)
date on the robots.txt is 23:22 GMT,
October 27th, after work on this article had already begun.)
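For readers who want to check a timestamp like this themselves: Last-Modified is sent as an HTTP response header in a standard date format. The sketch below parses such a value and compares it with a reference time. The header string is an assumed rendering of the time quoted above (the seconds are not known and are set to zero, and the year 2003 is taken from the dates in the URLs); the reference time and the function names are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Assumed header value: 23:22 GMT on October 27; seconds are invented (":00").
EXAMPLE_HEADER = "Mon, 27 Oct 2003 23:22:00 GMT"

# Hypothetical reference point: the start of the day of the archived
# October 24 copy of the file.
SNAPSHOT_DATE = datetime(2003, 10, 24, tzinfo=timezone.utc)


def parse_last_modified(header_value: str) -> datetime:
    """Parse an HTTP Last-Modified value into a timezone-aware UTC datetime."""
    parsed = parsedate_to_datetime(header_value)
    if parsed.tzinfo is None:
        # HTTP dates are defined in GMT, so treat a naive result as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def changed_after(header_value: str, reference: datetime) -> bool:
    """Return True if the header's timestamp is later than the reference time."""
    return parse_last_modified(header_value) > reference


if __name__ == "__main__":
    print(parse_last_modified(EXAMPLE_HEADER).isoformat())
    print(changed_after(EXAMPLE_HEADER, SNAPSHOT_DATE))
```

A later Last-Modified time only shows that the file was rewritten after the reference point; it does not say what was changed, which is why archived copies of the earlier file matter.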
Whether the original blocking of the content in question was
malicious or an honest mistake is, of course, open to speculation.
Certainly anyone who maintains a large website has made some
sort of technical mistake at least once, and the promptness with which
the error was fixed after it was pointed out suggests that the White
House had no interest in keeping it in place. The White House, as an
entity responsible to the citizenry and one that has drawn a great
deal of criticism over its handling of the situation in Iraq, ought to
take special care to avoid similar mistakes in the
future. Nonetheless, we are pleased to learn that, at least this time,
the issue seems to have been resolved promptly.
2 Comments
you may also enjoy reading this:
http://condi.topcities.com/whrobots/index.html