Gharvest


Google has changed some of its behavior. Older versions of the gharvest script cannot tell where a book ends: they loop back to the beginning and start downloading the book again. If you are experiencing problems, please upgrade to the latest version.


General Installation and Setup Instructions

First, download the Perl libraries and follow their installation instructions. Those instructions are part of a longer narrative on setting up GuiGuts, a line of products for the Post Processors; ignore the GuiGuts references. Questions about setting up Perl are best posted in the DP forum.

This is the Perl script for gharvest.

How to Install gharvest on various Linux distributions

Different Linux distributions use different package managers to install software. Below are short instructions for installing all of the dependencies of the gharvest script.

Ubuntu Linux 6.06-LTS

Ubuntu Linux 6.06 ships a recent enough version of Perl. It is probably installed by default, but if it is not, run the following command:

sudo apt-get install perl

Once Perl is installed, install the following packages, as gharvest is dependent on them:

sudo apt-get install perl-tk
sudo apt-get install libwww-perl
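
To confirm that the two packages above are visible to Perl, a quick check along the following lines may help. This is only a sketch: check_perl_module is a hypothetical helper, and the module names Tk and LWP::UserAgent are assumed from the package names perl-tk and libwww-perl, not taken from the gharvest script itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of gharvest): succeeds if the perl on PATH
# can load the named module, fails otherwise.
check_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Assumption: perl-tk provides Tk and libwww-perl provides LWP::UserAgent.
# Which modules gharvest actually loads is not stated on this page.
for module in Tk LWP::UserAgent; do
    if check_perl_module "$module"; then
        echo "$module: found"
    else
        echo "$module: MISSING"
    fi
done
```

A module reported as MISSING usually means the matching apt-get command above did not complete.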

The gharvest script also needs a recent version of cpan. This can be installed by doing the following:

sudo apt-get install cpan

Once cpan is done installing, it must be configured. Type

cpan

at the command line and accept all the defaults. When prompted to select a mirror, select the one that is geographically closest to you. If you find yourself stuck in a

cpan>

prompt, simply type:

exit

and you will be returned to a normal shell.

At this point, download the gharvest script and save it as gharvest.pl.
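
Before running the script for the first time, it can be worth checking that it compiles. The sketch below assumes the file was saved as gharvest.pl in the current directory; check_script is a hypothetical helper, not something the gharvest script provides. perl -c compiles a script without running its main program, and because it still processes "use" statements, a missing module tends to show up at this stage.

```shell
#!/bin/sh
# Hypothetical helper: returns 2 if the file is missing, otherwise returns
# the exit status of "perl -c" (0 when the script compiles cleanly).
check_script() {
    if [ ! -f "$1" ]; then
        echo "$1: not found"
        return 2
    fi
    perl -c "$1"
}

# Assumption: the script was saved as gharvest.pl in the current directory.
if check_script gharvest.pl; then
    echo "gharvest.pl compiles; start it with: perl gharvest.pl"
else
    echo "gharvest.pl is missing or did not compile; see the messages above"
fi
```

If the compile step complains that a module cannot be located, revisit the package installation steps above.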

Running gharvest

The Perl script itself explains how to run it.