<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Indistinguishable from Jesse &#187; Google</title>
	<atom:link href="http://www.squarefree.com/categories/google/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.squarefree.com</link>
	<description>Jesse Ruderman on Firefox, security, and more</description>
	<lastBuildDate>Sun, 05 Feb 2012 17:32:14 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Accidental Googlebomb</title>
		<link>http://www.squarefree.com/2009/01/01/accidental-googlebomb/</link>
		<comments>http://www.squarefree.com/2009/01/01/accidental-googlebomb/#comments</comments>
		<pubDate>Thu, 01 Jan 2009 22:06:04 +0000</pubDate>
		<dc:creator>Jesse Ruderman</dc:creator>
				<category><![CDATA[Google]]></category>

		<guid isPermaLink="false">http://www.squarefree.com/?p=411</guid>
		<description><![CDATA[This Google search now maligns C++. Oops!]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.google.com/search?q=footgun&amp;pws=0">This Google search</a> now maligns C++.  <a href="http://www.squarefree.com/2008/12/23/fuzzing-tracemonkey/">Oops!</a></p>]]></content:encoded>
			<wfw:commentRss>http://www.squarefree.com/2009/01/01/accidental-googlebomb/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Meeting spot</title>
		<link>http://www.squarefree.com/2007/10/08/meeting-spot/</link>
		<comments>http://www.squarefree.com/2007/10/08/meeting-spot/#comments</comments>
		<pubDate>Tue, 09 Oct 2007 05:42:54 +0000</pubDate>
		<dc:creator>Jesse Ruderman</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[Humor]]></category>
		<category><![CDATA[Mozilla]]></category>

		<guid isPermaLink="false">http://www.squarefree.com/2007/10/08/meeting-spot/</guid>
		<description><![CDATA[Google suggests holding tomorrow's leak meeting on a cruise ship. Somehow I don't think that would work very well. Leaks and ships don't get along perfectly.]]></description>
			<content:encoded><![CDATA[<p>Google <a href="http://www.squarefree.com/leak-meeting-ad.png">suggests</a> holding <a href="http://groups.google.com/group/mozilla.dev.planning/browse_thread/thread/a82eefd2240d4302">tomorrow's leak meeting</a> on a cruise ship.</p>

<p>Somehow I don't think that would work very well.  Leaks and ships don't get along perfectly.</p>]]></content:encoded>
			<wfw:commentRss>http://www.squarefree.com/2007/10/08/meeting-spot/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Googlebombing &#8220;leave&#8221;?</title>
		<link>http://www.squarefree.com/2007/05/13/googlebombing-leave/</link>
		<comments>http://www.squarefree.com/2007/05/13/googlebombing-leave/#comments</comments>
		<pubDate>Sun, 13 May 2007 23:39:57 +0000</pubDate>
		<dc:creator>Jesse Ruderman</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[Porn]]></category>

		<guid isPermaLink="false">http://www.squarefree.com/2007/05/13/googlebombing-leave/</guid>
		<description><![CDATA[A Google search for "leave" still reflects the time when most porn sites had "age verification" on their front pages. "Age verification" often took the form of the text "You must be 18 to enter" followed by "Enter" and "Leave" links. The "Leave" link would often lead to a site appropriate for young kids or [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.google.com/search?q=leave">A Google search for "leave"</a> still reflects the time when most porn sites had "age verification" on their front pages.  "Age verification" often took the form of the text "You must be 18 to enter" followed by "Enter" and "Leave" links.  The "Leave" link would often lead to a site appropriate for young kids or to a sex-education site.</p>

<p>Even today, when few new sites follow this practice, "Leave No Trace" and "Leave It To Beaver" are beaten by Yahoo, Google, Scarleteen, and Disney.</p>

<p>I wondered why Google's algorithm continued to make this possible despite tweaks to prevent <a href="http://en.wikipedia.org/wiki/Google_bomb">Googlebombs</a> such as "miserable failure".  I came across <a href="http://www.mattcutts.com/blog/algorithm-to-reduce-googlebomb-impact/#comment-95117">this comment by Google engineer Matt Cutts</a>:</p>

<blockquote><p>[The algorithm change] really does have a very limited scope and doesn’t affect a large fraction of queries. The intent of the algorithm is to minimize the impact of “true” Googlebombs, which occur when someone is causing someone else’s page to rank for stuff that they wouldn’t want to rank for themselves. The algorithm could detect phrases such as [leave] as a Googlebomb in future iterations, but it doesn’t right now and I don’t think that Disney would care much either way.</p></blockquote>

<p>Googlebombs were slightly embarrassing, but I imagine that abandoning link text would have hurt search quality a lot.  I'm impressed that Google was able to come up with an algorithmic way to distinguish Googlebombs from other link text.</p>]]></content:encoded>
			<wfw:commentRss>http://www.squarefree.com/2007/05/13/googlebombing-leave/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Google 411</title>
		<link>http://www.squarefree.com/2007/04/20/google-411/</link>
		<comments>http://www.squarefree.com/2007/04/20/google-411/#comments</comments>
		<pubDate>Fri, 20 Apr 2007 19:00:29 +0000</pubDate>
		<dc:creator>Jesse Ruderman</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[Travel]]></category>

		<guid isPermaLink="false">http://www.squarefree.com/2007/04/20/google-411/</guid>
		<description><![CDATA[Google launched a free 411 service just in time for my move from San Diego to Mountain View. I found it useful, but it could have been even more useful if it: Gave better location information for things in malls. For example, the Goodwill donation spot at 570 Showers Drive would be better described as [...]]]></description>
			<content:encoded><![CDATA[<p>Google launched a <a href="http://labs.google.com/goog411/">free 411 service</a> just in time for my move from San Diego to Mountain View.  I found it useful, but it could have been even more useful if it:</p>

<ul>
<li>Gave better location information for things in malls.  For example, the Goodwill donation spot at 570 Showers Drive would be better described as "in the parking lot near Mervyns and near Showers Drive".</li>
<li>Knew the difference between Goodwill stores and Goodwill donation spots.</li>
<li>Knew how to answer questions like "Where can I find a Denny's along I-5 North within the next hour?", rather than simple city and radius searches.</li>
</ul>

<p>Store hours would be nice too, but the service would also have to know when to say something like "<a href="http://www.yelp.com/biz/ciA8UcivFclYuTv1jOk4fQ">Beach City Grill</a> closes whenever the owner feels like closing, so you are advised to call before driving there."</p>]]></content:encoded>
			<wfw:commentRss>http://www.squarefree.com/2007/04/20/google-411/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Bundled software in security updates</title>
		<link>http://www.squarefree.com/2006/10/28/bundled-software-in-security-updates/</link>
		<comments>http://www.squarefree.com/2006/10/28/bundled-software-in-security-updates/#comments</comments>
		<pubDate>Sat, 28 Oct 2006 07:47:37 +0000</pubDate>
		<dc:creator>Jesse Ruderman</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://www.squarefree.com/2006/10/28/bundled-software-in-security-updates/</guid>
		<description><![CDATA[Today's Java security update includes a checked-by-default "Install Google Toolbar for Internet Explorer" option. Shame on you, Sun and Google. Automatic security updates are no place to push unrelated, bundled software. Making security updates annoying hurts security almost as much as making security updates complicated: users will be less inclined to update next time. This [...]]]></description>
			<content:encoded><![CDATA[<p>Today's Java security update includes a checked-by-default "Install Google Toolbar for Internet Explorer" option.  Shame on you, Sun and Google.  Automatic security updates are no place to push unrelated, bundled software.  Making security updates annoying hurts security almost as much as making security updates complicated: users will be less inclined to update next time.</p>

<p>This is similar to how Flash updates attempt to install the Yahoo Toolbar.  It's certainly not as bad as the frequently updated AOL Instant Messenger, which turns on the "Today window" popup on every AIM account and adds a "Netscape ISP" icon to the desktop with every security update.  But I thought Google was trying to <a href="http://www.google.com/corporate/software_principles.html">set a good example</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://www.squarefree.com/2006/10/28/bundled-software-in-security-updates/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Squarefree succumbs to the Digg effect</title>
		<link>http://www.squarefree.com/2006/09/24/squarefree-succumbs-to-the-digg-effect/</link>
		<comments>http://www.squarefree.com/2006/09/24/squarefree-succumbs-to-the-digg-effect/#comments</comments>
		<pubDate>Sun, 24 Sep 2006 23:38:41 +0000</pubDate>
		<dc:creator>Jesse Ruderman</dc:creator>
				<category><![CDATA[Blogging]]></category>
		<category><![CDATA[DreamHost]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Mozilla]]></category>

		<guid isPermaLink="false">http://www.squarefree.com/2006/09/24/squarefree-succumbs-to-the-digg-effect/</guid>
		<description><![CDATA[Yesterday, at around 4pm, I noticed that the content on squarefree.com was missing, and the main page was an empty directory listing. I ssh'ed to my web server and noticed that the "squarefree.com" directory had been renamed to "squarefree.com_DISABLED_BY_DREAMHOST". Then I checked my email and saw a message from DreamHost support: Hello, I just had [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday, at around 4pm, I noticed that the content on squarefree.com was missing, and the main page was an empty directory listing.  I ssh'ed to my web server and noticed that the "squarefree.com" directory had been renamed to "squarefree.com_DISABLED_BY_DREAMHOST".  Then I checked my email and saw a message from <a href="http://www.dreamhost.com/">DreamHost</a> support:</p>

<blockquote>
<p>Hello,</p>

<p>I just had to disable your site squarefree.com as it's coming under some
load and spawning countless php processes that are crashing the
webserver.  I wasn't able to figure out exactly what's going on, as
leaving it up for more than a minute pretty much toasts the server.
Please don't re-enable it until you've figured out what's going on, or
disabled any possibly problematic php.</p>

<p>Thanks,</p>

<p>James</p>
</blockquote>

<p>I jumped into #dreamhost on irc.freenode.net and started looking through my web server logs for suspicious requests.  I was expecting to find that my blog had been DDoSed, perhaps by someone trying to leave comment spam.  Instead, I found a large number of requests for nonexistent files, falling into two categories:</p>

<ul>

<li>Requests for favicon.ico, a file that does not exist on my site.  Some of these requests are expected: most browsers with tabs request favicon.ico to display it in the tab bar.  But there were also hundreds of IP addresses that requested nothing but favicon.ico for the entire day, and some requested it many times.  About 100 of these IPs were Internet Explorer users with the Google Toolbar, so apparently I was getting <a href="http://thecoder.blog.com/1057698/">DDoS'ed by a bug in the Google Toolbar</a>.  Another 100 were Firefox users; I haven't figured out why Firefox would request nothing but favicon.ico over and over.</li>

<li>Requests due to people using my <a href="http://www.squarefree.com/htmledit/">Real-time HTML Editor</a> to edit pages that used relative URLs for images, iframes, etc.  One user made dozens of requests for a file named "border=0".  Another user made a request for 14 gif files every time the editor refreshed.  I also saw from referrers that <a href="http://digg.com/programming/Real_Time_HTML_editor">the Real-time HTML Editor had been featured on Digg</a>, greatly increasing its traffic.</li>

</ul>
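
<p>The kind of log triage described above can be sketched roughly as follows.  This is a minimal illustration, not the commands actually used: it assumes the logs are in Apache's "combined" format, and the sample lines, IP addresses, and the <tt>summarize</tt> helper are made up for the example.</p>

```python
import re
from collections import Counter, defaultdict

# Assumed Apache "combined" log format; adjust the pattern for other formats.
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD|POST) (\S+)[^"]*" (\d{3}) ')

def summarize(lines):
    """Return (404 counts per path, IPs that requested only /favicon.ico)."""
    not_found = Counter()
    paths_by_ip = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not look like request entries
        ip, path, status = match.groups()
        paths_by_ip[ip].add(path)
        if status == "404":
            not_found[path] += 1
    favicon_only = sorted(
        ip for ip, paths in paths_by_ip.items() if paths == {"/favicon.ico"}
    )
    return not_found, favicon_only

# Hypothetical sample lines, not taken from the real logs.
sample = [
    '10.0.0.1 - - [23/Sep/2006:16:00:01 -0700] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/4.0"',
    '10.0.0.1 - - [23/Sep/2006:16:00:05 -0700] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/4.0"',
    '10.0.0.2 - - [23/Sep/2006:16:00:07 -0700] "GET /htmledit/border=0 HTTP/1.1" 404 209 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [23/Sep/2006:16:00:08 -0700] "GET /htmledit/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]
not_found, favicon_only = summarize(sample)
print(not_found.most_common())  # most frequently requested missing paths first
print(favicon_only)             # clients that asked for nothing but favicon.ico
```

<p>Grouping by client as well as by path is what separates ordinary tab-bar favicon requests from clients that request nothing else all day.</p>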

<p>But why would 404 requests create PHP processes?  Due to a recent change in WordPress, Apache was directing each 404 request to WordPress.  WordPress used to put detailed rules in .htaccess -- for example, it would ask Apache to direct requests for http://www.squarefree.com/2005/ to WordPress using <tt>RewriteRule ^([0-9]{4})/?$</tt>.  But newer versions of WordPress instead ask Apache to send it all requests for nonexistent files.  I imagine this puts less strain on Apache when a site uses lots of <a href="http://codex.wordpress.org/Pages">WordPress Pages</a>, but it hurts when a site gets lots of 404 requests.  Several months ago, I had instructed WordPress to serve <a href="http://www.squarefree.com/2004/08/22/custom-404-page/">my custom 404 page</a> for these requests, but WordPress still had to do a lot of work to determine that the requests should be treated as 404s.</p>
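
<p>The difference between the two routing schemes can be modeled in a few lines.  This is a simplified sketch rather than WordPress's or Apache's actual logic: only the one rule quoted above is included, the set of files on disk is hypothetical, and directories are ignored.</p>

```python
import re

# The one rule quoted above; a real .htaccess of that era held many more.
OLD_RULES = [re.compile(r"^([0-9]{4})/?$")]

def old_style_hits_wordpress(path):
    """Older scheme: only paths matching an explicit rule reach WordPress."""
    return any(rule.match(path) for rule in OLD_RULES)

def new_style_hits_wordpress(path, existing_files):
    """Newer scheme: every path that is not an existing file reaches WordPress."""
    return path not in existing_files

existing = {"index.php", "htmledit/index.html"}  # hypothetical files on disk
for path in ["2005/", "favicon.ico", "htmledit/border=0"]:
    print(path, old_style_hits_wordpress(path), new_style_hits_wordpress(path, existing))
```

<p>In this model, a missing <tt>favicon.ico</tt> never reaches WordPress under the older scheme, so Apache can answer with a 404 on its own.  Under the newer scheme every such request starts PHP.</p>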

<p>Once I realized what had happened, and determined that reconfiguring WordPress would be difficult, I did what I could to reduce the number of 404 requests WordPress would have to handle.  I created a tiny favicon.ico file so those requests wouldn't be 404s, and I moved the Real-time HTML Editor onto <a href="http://htmledit.squarefree.com/">its own subdomain</a> so WordPress wouldn't handle the 404s it causes.  My site was down for only 40 minutes, with the Real-time HTML Editor down a little longer while I waited for the new subdomain's DNS to propagate.</p>

<p>Some things DreamHost could have done better:</p>

<ul>
<li>It would have been nice if James had disabled PHP for my domain instead of disabling my site entirely.  <a href="http://www.squarefree.com/pornzilla/">Pornzilla</a> did not need to be down due to PHP problems.</li>
<li>A per-user process limit might have allowed my site to send "503 Service Unavailable" in response to some requests instead of being down entirely.  It would also have prevented my site from causing problems for other sites on the shared server.</li>
<li>Better performance diagnostics would have helped both James and me isolate the problem.  For example, it would have been great to have a list of PHP processes showing the request URL that caused each PHP instance to be triggered, the lifetime of each process, and perhaps some performance information (CPU used, RAM used, number of database requests).</li>
</ul>

<p>Some things DreamHost did right:</p>

<ul>
<li>DreamHost allowed me to restore my site myself once I fixed the problems.  All I had to do was rename "squarefree.com_DISABLED_BY_DREAMHOST" back to "squarefree.com".</li>
<li>Knowing about <a href="http://www.squarefree.com/2005/05/02/snapshots-on-dreamhost/">DreamHost's .snapshot feature</a> kept me from panicking about data loss when my site appeared to have disappeared.</li>
<li>The employees in #dreamhost were helpful.</li>
</ul>

<p>If anyone is wondering: yes, I still <a href="http://www.squarefree.com/2004/12/23/why-i-love-dreamhost/">love DreamHost</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://www.squarefree.com/2006/09/24/squarefree-succumbs-to-the-digg-effect/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
	</channel>
</rss>

