==================================================================
Lingua::Wordnet
Copyright 1999,2000,2001,2004 by Dan Brian.

This program is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.
==================================================================

NOTE: If you are upgrading from a previous version, you may need to
rebuild your Wordnet database files to accommodate new or changed
functions. See Changes for details.


DESCRIPTION

Wordnet is a lexical reference system inspired by current
psycholinguistic theories of human lexical memory. This module
allows access to the Wordnet lexicon from Perl applications, as well
as manipulation and extension of the lexicon.
Lingua::Wordnet::Analysis provides numerous high-level extensions to
the system.

Version 0.1 was a complete rewrite of the module in pure Perl,
whereas the old module embedded the Wordnet C API functions. To use
the module, the database files must first be converted to Berkeley
DB files using the 'scripts/convertdb.pl' script.


REQUIREMENTS

Perl 5.005, Berkeley DB 1.*, and Wordnet 1.6/1.7 are required. The
Wordnet distribution does not need to be installed, but its data
files must be accessible so that the new database files can be
created. Wordnet is available from
http://www.cogsci.princeton.edu/~wn/.


INSTALLATION

To configure and install, type:

    perl Makefile.PL

Next, run the script 'scripts/convertdb.pl' to rewrite the data in
Berkeley DB format. It will also ask where you want the new data
files stored (default is /usr/local/wordnet1.7/lingua-wordnet/). It
writes the following files, and takes quite a while to do so:

    lingua_wordnet.index  - all indexes of all senses
    lingua_wordnet.data   - all data files combined
    lingua_wordnet.morph  - all exception data

The files will be large (about 40MB total), but loading time is
nominal and searches are instant, since all data is mapped for
lookup by key rather than scanned.

The new database is readable with Berkeley DB. It is a hash in which
each synset is stored under a key formed from the synset offset and
the pos (part-of-speech) character. Added synsets are given offsets
that increment sequentially, while the original offsets are retained
for legacy compatibility. Lingua::Wordnet will look for these files
in the directory indicated at the start of the Wordnet.pm file.

Then:

    make
    make test

The test will load the new Wordnet data files and run some tests on
them. If any tests fail, stop and find out why. Then, as root:

    make install

This will install the module among your Perl modules and install the
new data files. Since these are large, you should do a 'make clean'
after the install to delete the local copies.


DOCUMENTATION

You can access the Lingua::Wordnet documentation with:

    perldoc Lingua::Wordnet
    perldoc Lingua::Wordnet::Analysis

There is additional documentation in the 'docs/' directory, and the
scripts in 'scripts/' are fairly good references for examples.


WHAT THEN?

If you are not familiar with Wordnet, you should download and read
the "Five Papers" document at http://www.cogsci.princeton.edu/~wn/.

For the curious, an article on this module appeared in the Summer
2000 issue of The Perl Journal (#18).


EXTRA FILES

    docs/terms.txt          - a brief summary of Wordnet terms
    scripts/LWBrowser.pm    - an Apache/mod_perl module providing an
                              HTML front-end to Lingua::Wordnet
    scripts/report.pl       - generates statistics reports for
                              databases
    scripts/10questions.pl  - demonstrates Analysis.pm with a game of
                              "10 Questions"
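

The key scheme described under INSTALLATION (synset offset plus pos
character, with added synsets taking the next offset in sequence) can
be illustrated with a minimal sketch. It uses a plain in-memory hash
as a stand-in for the Berkeley DB file. The helper names synset_key
and next_offset, the zero-padded eight-digit key layout, the
per-pos numbering, and the record contents are all assumptions made
for illustration; they are not taken from the module's source.

```perl
use strict;
use warnings;

# Hypothetical helper: build a lookup key from a synset offset and a
# part-of-speech character. The zero-padded layout is an assumption.
sub synset_key {
    my ($offset, $pos) = @_;
    return sprintf("%08d%s", $offset, $pos);
}

# Stand-in for the database hash. Offsets and records are invented
# placeholders, not real Wordnet data.
my %data = (
    synset_key(100, 'n') => 'placeholder record A',
    synset_key(250, 'n') => 'placeholder record B',
    synset_key(100, 'v') => 'placeholder record C',
);

# Hypothetical helper: return one past the highest offset already in
# use for the given pos, so existing offsets are never reassigned.
sub next_offset {
    my ($db, $pos) = @_;
    my $max = 0;
    for my $key (keys %$db) {
        my ($offset, $key_pos) = $key =~ /^(\d+)([a-z])$/ or next;
        next unless $key_pos eq $pos;
        $max = $offset if $offset > $max;
    }
    return $max + 1;
}

# Add a synset under a fresh key; the original entries are untouched.
my $new_key = synset_key(next_offset(\%data, 'n'), 'n');
$data{$new_key} = 'placeholder record for an added synset';

print "$new_key\n";    # prints 00000251n
```

In the module itself the hash would be backed by the converted
Berkeley DB files rather than held in memory, so a lookup is a single
keyed fetch instead of a scan through the data files.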