Obtaining and installing StringTie
----------------------------------

The current version of StringTie can be downloaded from
  http://ccb.jhu.edu/software/stringtie

In order to build StringTie from the source package, the following steps
should be taken:

  * Unpack the downloaded StringTie source archive in a directory of your
    choice, e.g.:

      cd ~/src/
      tar xvfz ~/Downloads/stringtie-V.V.V.tar.gz

    A directory called stringtie-V.V.V (where V.V.V is the current numeric
    version of the program) will be created in the current directory.

  * Change to that directory and build the stringtie executable:

      cd stringtie-V.V.V
      make release

Note that simply running `make` will produce an executable which is more
suitable for debugging and runtime checking but which can be significantly
slower than the optimized version which is obtained by using `make release`.

Running StringTie
-----------------

Run stringtie from the command line like this:

  stringtie [options]

The main input of the program is a SAMtools BAM file with RNA-Seq mappings
sorted by genomic location (for example the accepted_hits.bam file produced
by TopHat).
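
As a minimal sketch of a typical invocation (not an official recipe), the
script below assembles transcripts from a sorted BAM file using the -G, -o
and -p options documented in this file. The file names, the `run` helper and
the DRY_RUN variable are hypothetical. Passing the BAM file as the first
positional argument is an assumption, since the usage line above does not show
where the input goes. By default the script only prints the command, so it can
be checked before anything is executed.

```shell
#!/bin/sh
# Hypothetical file names; replace them with your own.
BAM="accepted_hits.bam"      # sorted RNA-Seq mappings (e.g. from TopHat)
GUIDE="annotation.gtf"       # reference annotation passed to -G
OUT="stringtie_out.gtf"      # assembled transcripts written via -o
THREADS=4                    # number of threads passed to -p

# Print the command; execute it only when DRY_RUN=0.
# DRY_RUN is a helper variable of this sketch, not a stringtie feature.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Assumption: the BAM file is given as a positional argument.
run stringtie "$BAM" -G "$GUIDE" -o "$OUT" -p "$THREADS"
```

To actually run the assembly, set DRY_RUN=0 in the environment when invoking
the script, e.g. `DRY_RUN=0 sh run_stringtie.sh` (the script name is
hypothetical).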

The following optional parameters can be specified (use -h/--help to get the
usage message):

  --version : print current version at stdout
  -h print this usage message
  -G reference annotation to use for guiding the assembly process (GTF/GFF3)
  -l name prefix for output transcripts (default: STRG)
  -f minimum isoform fraction (default: 0.1)
  -m minimum assembled transcript length to report (default: 100bp)
  -o output path/file name for the assembled transcripts GTF
     (default: stdout)
  -a minimum anchor length for junctions (default: 10)
  -j minimum junction coverage (default: 1)
  -t disable trimming of predicted transcripts based on coverage
     (default: coverage trimming is enabled)
  -c minimum reads per bp coverage to consider for transcript assembly
     (default: 2.5)
  -v verbose (log bundle processing details)
  -g gap between read mappings triggering a new bundle (default: 50)
  -C output file with reference transcripts that are covered by reads
  -M fraction of bundle allowed to be covered by multi-hit reads
     (default: 0.95)
  -p number of threads (CPUs) to use (default: 1)
  -A gene abundance estimation output file name
  -B enable output of Ballgown table files which will be created in the
     same directory as the output GTF (requires -G, -o recommended)
  -b enable output of Ballgown table files but these files will be created
     under the directory path given as the argument to -b
  -e only estimate the abundance of the given reference transcripts
     (requires -G)
  -x do not assemble any transcripts on the given reference sequence(s)

Transcript merge usage mode:

  stringtie --merge [Options] { gtf_list | strg1.gtf ...}

With this option StringTie will assemble transcripts from multiple input
files, generating a unified non-redundant set of isoforms.
In this usage mode the following options are available:

  -G reference annotation to include in the merging (GTF/GFF3)
  -o output file name for the merged transcripts GTF (default: stdout)
  -m minimum input transcript length to include in the merge (default: 50)
  -c minimum input transcript coverage to include in the merge (default: 0)
  -F minimum input transcript FPKM to include in the merge (default: 0)
  -T minimum input transcript TPM to include in the merge (default: 0)
  -f minimum isoform fraction (default: 0.01)
  -l