
medusa-crawler 1.0.0.pre.1

Medusa: a Ruby crawler framework

Medusa is a Ruby framework for crawling websites and collecting useful information about the pages it visits. It is versatile, allowing you to write your own specialized tasks quickly and easily.

#### Features

  • Choose the links to follow on each page with `focus_crawl()`

  • Multi-threaded design for high performance

  • Tracks 301 HTTP redirects

  • Allows exclusion of URLs based on regular expressions

  • HTTPS support

  • Records response time for each page

  • Obeys robots.txt

  • In-memory or persistent storage of pages during crawl using Moneta adapters.

  • Inherits OpenURI behavior (redirects, automatic charset and encoding detection, proxy configuration options).
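
To make the link-selection features above more concrete, here is a minimal, standalone sketch of the idea behind regex-based URL exclusion combined with a focus block. It does not use Medusa itself: the helper `links_to_follow`, its `skip_patterns` parameter, and the example URLs are all hypothetical names chosen for illustration, and the gem's real API may differ.

```ruby
require 'uri'

# Hypothetical helper (NOT part of Medusa's API): decides which links a
# crawler would follow from a single page.
#
# links         - array of absolute URL strings found on the page
# skip_patterns - regular expressions; a link whose path matches any of
#                 them is excluded
# block         - optional; receives the remaining URIs and returns the
#                 subset to follow, similar in spirit to a focus block
def links_to_follow(links, skip_patterns: [])
  candidates = links.map { |link| URI(link) }
  candidates = candidates.reject do |uri|
    skip_patterns.any? { |pattern| pattern.match?(uri.path) }
  end
  candidates = yield(candidates) if block_given?
  candidates.map(&:to_s).uniq
end

# Example URLs are placeholders, not taken from the gem's documentation.
links = %w[
  https://example.com/articles/1
  https://example.com/articles/2
  https://example.com/login
  https://example.com/assets/logo.png
]

# Exclude images and the login page, then focus on article pages only.
selected = links_to_follow(links, skip_patterns: [/\.png\z/, %r{\A/login}]) do |uris|
  uris.select { |uri| uri.path.start_with?('/articles') }
end

puts selected
```

Under these assumptions only the two `/articles/` URLs remain. Applying exclusion before the focus block keeps the block simple, since it never sees links that were already ruled out.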

Gemfile:
gem 'medusa-crawler', '~> 1.0.0.pre.1'

install:
gem install medusa-crawler -v 1.0.0.pre.1

Versions:

  1. 1.0.0 - August 17, 2020 (23 KB)
  2. 1.0.0.pre.2 - August 14, 2020 (23 KB)
  3. 1.0.0.pre.1 - August 06, 2020 (24 KB)

Runtime Dependencies (3):

moneta ~> 1.3, >= 1.3.0
nokogiri ~> 1.3, >= 1.3.0
robotex ~> 1.0, >= 1.0.0
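
The constraints above combine a pessimistic operator (`~>`) with a lower bound. As a small sketch of how RubyGems interprets them, the snippet below uses the standard `Gem::Requirement` and `Gem::Version` classes; the helper name `satisfies?` and the candidate version numbers are illustrative assumptions, not versions known to exist for these gems.

```ruby
require 'rubygems'

# Hypothetical helper: true when `version` meets every constraint given.
def satisfies?(constraints, version)
  Gem::Requirement.new(*constraints).satisfied_by?(Gem::Version.new(version))
end

# The moneta constraint as listed on this page.
moneta = ['~> 1.3', '>= 1.3.0']

# Candidate versions are made up for illustration.
%w[1.2.9 1.3.0 1.6.0 2.0.0].each do |candidate|
  puts "moneta #{candidate}: #{satisfies?(moneta, candidate)}"
end

# Prerelease versions sort before the final release they lead up to,
# which matches the order of the Versions list.
pre1    = Gem::Version.new('1.0.0.pre.1')
pre2    = Gem::Version.new('1.0.0.pre.2')
release = Gem::Version.new('1.0.0')
puts pre1 < pre2 && pre2 < release
puts pre1.prerelease?
```

Here `~> 1.3` means "at least 1.3 and below 2.0", so the `>= 1.3.0` part does not narrow the range further.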

Authors:

  • Mauro Asprea, Chris Kite

SHA 256 checksum:

36b72004627cc1abf81715777b29c34e67dc3c1f9420311103ea275e6d216733
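
A published checksum like the one above can be compared against a downloaded copy of the gem. The following is a minimal sketch using Ruby's standard `digest` library; the helper `sha256_matches?` and the local file name are assumptions for illustration (the file name follows the usual `name-version.gem` pattern produced by `gem fetch`).

```ruby
require 'digest'

# Checksum as listed on this page for version 1.0.0.pre.1.
EXPECTED_SHA256 = '36b72004627cc1abf81715777b29c34e67dc3c1f9420311103ea275e6d216733'

# Hypothetical helper: true when the file's SHA-256 digest equals the
# expected hex string (compared case-insensitively).
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end

# Assumed local path; adjust to wherever the .gem file was saved.
gem_path = 'medusa-crawler-1.0.0.pre.1.gem'

if File.exist?(gem_path)
  puts sha256_matches?(gem_path, EXPECTED_SHA256) ? 'checksum OK' : 'checksum MISMATCH'
else
  puts "#{gem_path} not found; fetch it first, e.g. gem fetch medusa-crawler -v 1.0.0.pre.1"
end
```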

Total downloads: 4,222

For this version: 1,263

License:

MIT

Required Ruby Version: >= 0

Required Rubygems Version: > 1.3.1
