Version 3.0 September, 1996

fasta30t6 now contains a fully functional version of the pvm implementation of the fasta programs. The pvm versions (pvcompfa, pvcompsw) use exactly the same comparison/alignment functions as the fasta3(_t) versions. Most significantly, the program now produces sequence alignments as well as lists of high scoring sequences. Thus pvcompfa and pvcompsw are fully functional replacements for fasta3(_t) which allow large scale searches of one library against a second library.

Sequence libraries:

Unlike fasta and fasta3(_t), pvcompfa and pvcompsw keep the entire library sequence in memory (divided among the various machines). This has two effects: (1) Currently, all database sequences must be in a single file. (2) The maximum number of sequences read by a single worker node is specified in w_mw.h as MAXSQL. If your database contains more than #_nodes * MAXSQL sequences, the remaining sequences will be ignored.

The query sequence library should be in "fasta" format. I have done the most extensive testing with searching libraries in FASTA and PIR format, although GenBank flat files should work as well. I have not tested other library formats with pvcompfa/pvcompsw.

Load balancing:

pvcomplib.c also has some additions that allow the program to be run efficiently on multiprocessor hosts. You can now specify the number of processes you would like to spawn on a single "host" by specifying the "sp=" parameter in the "hostfile" used by pvmd. Thus, the following lines:

    * sp=1004
    alpha0.virginia.edu

will cause four worker processes to be started on alpha0. Likewise, if you have a network of machines with different numbers of CPUs, you can use:

    alpha1.Virginia.EDU
    alpha2.Virginia.EDU
    alpha3.Virginia.EDU
    alpha4.Virginia.EDU
    alpha5.Virginia.EDU
    * sp=2204
    alpha0.Virginia.EDU

In this example, the "* sp=2204" also indicates that each CPU of the alpha0 machine is 2X faster than the alpha1-5 machines, so 2X as many sequences will be loaded on each CPU of alpha0.
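
The effect of the two hostfiles above can be illustrated with a small Python sketch. This is not code from pvcomplib.c: decode_sp and plan_workers are hypothetical names, and treating the leading digits of "sp=" as the relative speed (10 for PVM's default sp=1000) is an assumption chosen only to be consistent with the examples given here. It also assumes a host with no "sp=" setting gets one worker.

```python
PVM_DEFAULT_SP = 1000  # PVM's default "sp=" value for hosts that do not set one


def decode_sp(sp=PVM_DEFAULT_SP):
    """Split an "sp=" value into (speed, cpus). Hypothetical helper."""
    cpus = sp % 100 or 1   # last two digits; assumed: 00 means a single CPU
    speed = sp // 100      # assumed: leading digits, so the default 1000 gives 10
    return speed, cpus


def plan_workers(hostfile_sp, n_sequences):
    """Return one (host, sequences_to_load) pair per worker process.

    hostfile_sp maps a host name to its "sp=" value. Sequences are shared
    in proportion to each worker's speed; remainders from the integer
    division are simply dropped in this sketch.
    """
    workers = []
    for host, sp in hostfile_sp.items():
        speed, cpus = decode_sp(sp)
        workers.extend((host, speed) for _ in range(cpus))
    total_speed = sum(speed for _, speed in workers)
    return [(host, n_sequences * speed // total_speed) for host, speed in workers]


# The second hostfile above: five default hosts plus a faster 4-CPU alpha0.
hosts = {f"alpha{i}.Virginia.EDU": PVM_DEFAULT_SP for i in range(1, 6)}
hosts["alpha0.Virginia.EDU"] = 2204

plan = plan_workers(hosts, 13800)
for host, n_seqs in plan:
    print(host, n_seqs)
```

Under these assumptions the plan holds nine workers, and each alpha0 worker is given about 2.2 times as many sequences as a worker on alpha1-5, which matches the "2X" described above only approximately.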
pvcomplib.c uses the last two digits of the "sp=" specification to specify the number of CPUs.

Program limitations:

At the moment, only fasta (pvcompfa/c.workfa) and ssearch (pvcompsw/c.workgsw) are implemented. Clearly fastx and tfastx need to be included.

Please send bug reports to: wrp@virginia.edu.

Bill Pearson