From: Cameron Kaiser
Message-Id: <200510122345.QAA17070@floodgap.com>
Subject: [gopher] Re: New Gopher Wayback Machine Bot
In-Reply-To: <20051012180132.GA19083@complete.org> from John Goerzen at "Oct 12, 5 01:01:32 pm"
To: gopher@complete.org
Date: Wed, 12 Oct 2005 16:45:56 -0700 (PDT)

> Cameron, floodgap.com seems to have some sort of rate limiting and keeps
> giving me a Connection refused error after a certain number of documents
> have been spidered.

I'm a little concerned about your project since I do host a number of large
subparts which are actually proxied services, and I think even a gentle bot
going methodically through them would not be pleasant for the other side
(especially if you mean to regularly update your snapshot).
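
To make "gentle" concrete, here is a minimal sketch of a per-host throttle a
spider could call before each connection. Everything in it is an assumption
for illustration: the HostThrottle name, the wait() method and the 5 second
default are hypothetical, and none of it describes how floodgap.com's rate
limiting or Veronica-2 actually work.

```python
import time


class HostThrottle:
    """Enforce a minimum gap between requests to the same host.

    Hypothetical helper for illustration only; the default delay is an
    arbitrary assumption, not a value any server is known to require.
    """

    def __init__(self, min_delay=5.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay  # seconds between hits on one host
        self._clock = clock         # injectable so the logic can be tested
        self._sleep = sleep
        self._last = {}             # host -> time of the last request

    def wait(self, host):
        """Sleep until `host` may be contacted again; return seconds slept."""
        last = self._last.get(host)
        pause = 0.0
        if last is not None:
            elapsed = self._clock() - last
            pause = max(0.0, self.min_delay - elapsed)
        if pause > 0:
            self._sleep(pause)
        # Record the time after any sleep, i.e. when the request goes out.
        self._last[host] = self._clock()
        return pause
```

A spider would call throttle.wait(host) immediately before opening each
connection. Keying the delay on the host rather than on the whole crawl lets
one slow server be spared without stalling requests to unrelated ones, though
a fixed delay alone does nothing for proxied services behind that host.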

To get around this problem, Veronica-2 doesn't actually download content
other than non-local selectors in a directory, since it doesn't index the
content in any case, just the titles and selector data. I do support
robots.txt; see gopher.floodgap.com/0/v2/help/indexer

-- 
---------------------------------- personal: http://www.armory.com/~spectre/ --
 Cameron Kaiser, Floodgap Systems Ltd * So. Calif., USA * ckaiser@floodgap.com
-- "I'd love to go out with you, but I'm joining my split ends individually." -