BIRCH - Frequently Asked Questions

Send your own questions to psgendb@cc.umanitoba.ca

About BIRCH

What exactly is BIRCH?
Why should I choose BIRCH, as opposed to EMBOSS, GCG etc...
Can I install BIRCH on my PC?
Why don't BIRCH programs run through the BIRCH web site?
Why is BIRCH available for Unix and not for Windows?
Is there a way to install my own programs?

Setting up your account to use BIRCH

How do I set up my account to use BIRCH?
How do I turn off my BIRCH access?
Does it matter which shell I use?
I ran the newuser script, and now BIRCH works, but none of the global settings work anymore. For example, I just get a generic prompt like '>', rather than the systemwide prompt.

BioLegato

My output keeps coming up blank
My output pops up, but things like sequences and numbers don't line up properly.
Sometimes when I choose a program from the bioLegato pull-down menus, the program doesn't launch. This only happens when I'm running a remote desktop (eg. VNC, SGD).
Launching bioLegato from the BIRCH launcher, versus the command line.

Launchers, browsers, editors

A program (eg. mesquite) has asked me to select a web browser or other application to launch. How do I find it?

How do I

Is there a way to copy sequences from BioLegato and paste into a Web form?

BLAST/FASTA searches

Why is there a lag of several seconds between the appearance of BLAST results in my web browser, and time when the hits pop up in blnfetch or blpfetch?

About BIRCH

What exactly is BIRCH? - The short answer: BIRCH is a system for organizing large numbers of bioinformatics programs and databases.

The longer answer: BIRCH is a conceptual framework for building a bioinformatics system that is user-oriented, powerful, easy to use, and tailored to the needs of the local user community. That being said, what you get when you download BIRCH includes:
- A hierarchical directory structure that organizes programs, data and documentation
- A large set of pre-configured, ready-to-use Open Source programs for analysis of sequences, markers, genetic data, gene array data...
- The bioLegato graphic interface, which greatly simplifies working with large datasets, and is particularly good at using output from one program as input for another program. Think of bioLegato as a program that knows how to run other programs.
- A database of all documentation on the system that is readable through the BIRCH web site
- Tutorials that are task-oriented, showing how to put together several programs to accomplish a task, rather than focusing on the details of how to use each specific program. Tutorials include numerous screen shots with real data. Tutorials cover the critical theoretical points, possible problems, as well as practical details of how to accomplish a complex task.
- A set of system administration tools that simplify the process of making everything work the same way for everyone
Why should I choose BIRCH, as opposed to EMBOSS or other packages? - You shouldn't have to choose. Just about any software package can be integrated into BIRCH. That being said, BIRCH is distinct from other bioinformatics packages in a number of ways:
The bioLegato interface unifies access to all programs in a single graphic interface. Behind the scenes, bioLegato
- interconverts file formats, allowing transparent pipelining of data from one program to the next
- works around numerous bugs and features that can be problematic in many of the well-known programs in common use
- makes it easy to re-do analyses and ask "what if" questions
- automatically runs CPU-intensive programs (eg. phylogeny) at a lower priority, preventing these jobs from interfering with the work of others on the system
- adds information to output that is not always available from the programs themselves, such as the parameters used during the run and execution times for long-running jobs
BIRCH organizes the documentation for many different bioinformatics packages into a well-organized set of HTML pages.
BIRCH is highly flexible and configurable. It lets you add just about any program, documentation, or sample datafiles that you want, so that they appear seamlessly to the user as part of BIRCH and bioLegato.
Can I install BIRCH on my PC? - Well... yes, if your PC runs Linux or MacOSX. Otherwise, you have to use BIRCH on a Unix or Linux server. The great thing about BIRCH, though, is that you can access your Unix account (and therefore, BIRCH) from any computer on the Internet. See Using Unix from Anywhere.
Why don't BIRCH programs run through the BIRCH web site?
- Been there, done that - There are already more web sites than you can count that let you paste data into a window and run a program. BIRCH goes way beyond that.
- Web pages are awkward interfaces - For anything but very simple tasks, Web pages are not a very efficient way to work with data.
  - the web program runs in your browser. It is not well-integrated with the rest of your desktop
  - output is almost always human readable, and not readable by other programs
  - you usually have to go to many different web sites to do a variety of tasks
  - web pages are usually geared to working with one data item (eg. sequence) at a time, rather than large datasets.
  - most web sites have to impose limits on the amount of processing you can do
  - web sites are less secure, because your data goes off-site
Why is BIRCH available for Unix-like systems (Unix, Linux, MacOSX) and not for Windows? - Windows does a lot of things differently from most operating systems. A Windows version of BIRCH, using the Cygwin framework for Unix compatibility, is currently in progress.
Is there a way to install my own programs? - On any Unix or Linux system, each user can have a directory called 'bin' in their $HOME directory, in which they can install their own programs. You just need to make sure that $HOME/bin is in your $PATH. On some systems this is automatic. Just type 'echo $PATH' to see if a directory called 'bin' in your home directory is included. If not, you can add a statement like

PATH=$PATH:$HOME/bin

to your .profile file (if you use a Bourne-type shell like bash) or

setenv PATH ${PATH}:${HOME}/bin

to your .cshrc file (if you use a csh-type shell like tcsh). Once your bin directory is in your $PATH, any program you put into your bin directory will be found at the command line.
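
For Bourne-type shells, the check and the PATH change described above can be combined so that $HOME/bin is added only when it is missing. This is a minimal sketch rather than part of BIRCH; the function name path_contains is a hypothetical helper introduced for illustration.

```shell
# Hypothetical helper: succeed if directory $1 is already listed in $PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Append $HOME/bin only when it is missing, so PATH does not grow
# each time the file is read.
if ! path_contains "$HOME/bin"; then
    PATH="$PATH:$HOME/bin"
    export PATH
fi
```

Placed in .profile, this leaves PATH unchanged on systems where $HOME/bin is already included automatically.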

Setting up your account to use BIRCH

How do I set up my account to use BIRCH? - You need to run the newuser script which is found in the directory $BIRCH/admin. For example, if BIRCH was installed in /home/psgendb, you would type

/home/psgendb/admin/newuser
How do I turn off my BIRCH access? - Run the nobirch script which is found in the directory $BIRCH/admin. For example, if BIRCH was installed in /home/psgendb, you would type

/home/psgendb/admin/nobirch
Does it matter which shell I use? - A lot of work has gone into making BIRCH shell-neutral, so you should be able to use any available shell. BIRCH has been tested using csh, tcsh, ksh, bash and sh. There is an issue with sh and ksh, however. It is impossible to get sh to display the current working directory ($PWD) in the command prompt and to update that prompt each time you change to a new directory. ksh seems to be able to do this on Solaris, but I have not been able to get it to work using ksh on Linux. Therefore, on all systems, I would recommend against using sh, and on Linux, avoid ksh.
I ran the newuser script, and now BIRCH works, but none of the global settings work anymore. For example, I just get a generic prompt like '>', rather than the systemwide prompt. - All shells read a system-wide rc file that contains commands to set up the shell. After executing those commands, the shell then looks for an rc file in the user's $HOME directory, and if it exists, executes those commands as well. Usually the system-wide rc file is read automatically, but on one system of which I am aware, the command to read the system-wide rc file must actually be placed within the user's rc file. Specifically, when using tcsh, the user's $HOME/.tcshrc file must contain the line

source /usr/local/common/common.cshrc

It is probably a bad idea, from a system administration standpoint, to require this. In any case, the only alternative was for the newuser script not to create a .tcshrc file, which would mean that BIRCH wouldn't work if the user was running tcsh as their primary shell. The decision was to document this problem, and, on those rare systems where one must have such a source line, the user must add it manually. Sorry about that. I usually manage to come up with a solution that works everywhere, but I don't think there is a clean solution to this one. As I said, I have only seen this done on one system, so I suspect the problem will be quite rare.

BioLegato

My output keeps coming up blank - There can be many causes for this, but the most common is that you forgot to select one or more sequences before opening a menu to launch a program. Another possibility is that you have run out of disk space, or exceeded your disk quota.
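
To rule out the disk-space explanation, you can check how much room is left on the filesystem that holds your home directory. The sketch below uses only the standard df and awk commands; the function name free_kb is a hypothetical helper, not a BIRCH command.

```shell
# Hypothetical helper: print the free space, in kilobytes, on the
# filesystem that holds directory $1. The -P option asks df for the
# portable one-line-per-filesystem layout; column 4 is "Available".
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Report free space for your home directory (falls back to / if HOME is unset).
dir="${HOME:-/}"
echo "Free space under $dir: $(free_kb "$dir") KB"
```

A value near zero suggests a full disk. If your site enforces per-user quotas, the quota command, where it is installed, reports your usage and limits; its options differ between systems, so check its manual page.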

Special note: For menu items that include a chooser such as DNAML --> Evaluate user-supplied tree, there is an error in bioLegato that will prevent bioLegato from reading files whose names contain 'in1', eg. 'globin12.treefile'. This is a "documented bug", and will be fixed in the next release of BIRCH.
My output pops up, but things like sequences and numbers don't line up properly - Many bioinformatics programs are written with the expectation that output will be displayed in a fixed font. A fixed font, such as Courier, is a character set in which each character takes up exactly the same width. However, some text editors will display characters using proportional fonts such as Helvetica or Times, in which narrow characters such as l or i are not as wide as characters such as O or R. The solution is to set the default font for your text editor (eg. jedit, nedit, gedit, kate, leafpad) to use a fixed font. Names for fixed fonts will usually include terms such as "mono", "monospaced", or "fixed", or perhaps "typewriter".
Sometimes when I try to run a program from the bioLegato pull-down menus, the program doesn't launch. This only happens when I'm running a remote desktop (eg. Thinlinc, VNC). Explanation: This is an artifact of network latency. When you move the pointer over the name of a menu (eg. Edit) and hold down the left mouse button, there is a brief time lag while the mouse-click event is transmitted to the remote bioLegato job, and a further lag before the result appears back on your local screen. Thus, if you try to drag down to the name of a program and release the mouse button, the remote bioLegato job may not register that you released the button over a program name. Solution: For remote sessions, click once on the name of the menu to open the menu. The menu will stay open. Now, click a second time on the name of the program.
Launching bioLegato from the BIRCH launcher, versus the command line. - For reasons that are not entirely clear, when you launch an initial bioLegato instance (eg. bldna, blprotein) from the BIRCH launcher, after a few bioLegato jobs have run, you can't launch any more of them, or run programs from them. This doesn't seem to be a problem if you launch bldna or blprotein etc. directly from the command line. The problem seems to have something to do with how Java utilizes machine memory resources. We are looking into it.

Launchers, browsers, editors

A program (eg. mesquite) has asked me to select a web browser or other application to launch. How do I find it? - You can find the location of any command at the command line. For example, to find the path to Firefox, type 'which firefox'. If the output is '/usr/bin/firefox', then either type in that path, or, if the program gives you a file chooser, traverse the directory tree to the file.
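
If you are not sure which browser is installed, the same lookup can be tried for several names in turn. The sketch below uses 'command -v', the POSIX shell equivalent of 'which'. The function name first_available and the list of browser names are assumptions chosen for illustration.

```shell
# Hypothetical helper: print the location of the first candidate command
# that is installed, or return non-zero if none of them is found.
first_available() {
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            command -v "$name"
            return 0
        fi
    done
    return 1
}

# Example: look for a web browser; the candidate names are only illustrative.
first_available firefox chromium google-chrome || echo "no browser found" >&2
```

The path that is printed is what you would type in, or browse to in the file chooser.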

How do I

Is there a way to copy sequences from BioLegato and paste into a Web form? - Yes. In bldna, blprotein, blnalign or blpalign, select the sequences, and choose File --> Export Foreign Format. Select export to Text Editor. The sequences will pop up in a textedit window, and you can copy and paste your sequences from there.

BLAST/FASTA searches

Why is there a lag of several seconds between the appearance of BLAST results in my web browser, and time when the hits pop up in blnfetch or blpfetch?
BLAST searches run through BioLegato comprise three steps: 1) Run the search, and write results in NCBI archive format. 2) Run blast_formatter and generate HTML output to display in the web browser. 3) Run blast_formatter and generate output as a TAB-separated value file (.tsv) to display in blnfetch or blpfetch. Steps 2 and 3 each require several seconds to go back to the database files and collect the information needed for each output file. Consequently, when you see the HTML report pop up in your browser, blast_formatter is still working on getting the report information for blnfetch or blpfetch. The more hits BLAST finds, the longer it will take for blast_formatter to get the data in step 3. For this reason, the Web report is programmed to pop up first, so that you can be examining the results while waiting for blnfetch or blpfetch to appear for retrieval of hits.
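
For readers who want to see what the three steps look like at the command line, here is a rough sketch using the NCBI BLAST+ programs blastn and blast_formatter. It is not the script BioLegato actually runs: the function name, the file naming, and the choice of blastn and tabular format 6 are assumptions made for illustration.

```shell
# Hypothetical wrapper illustrating the three steps; not BIRCH's actual code.
# Usage: blast_three_steps QUERY.fasta DATABASE OUTPUT_PREFIX
blast_three_steps() {
    query=$1
    db=$2
    prefix=$3

    # Step 1: run the search once and save the results in BLAST archive
    # format (-outfmt 11).
    blastn -query "$query" -db "$db" -outfmt 11 -out "$prefix.asn" || return 1

    # Step 2: reformat the archive as HTML for the web browser.
    blast_formatter -archive "$prefix.asn" -html -out "$prefix.html" || return 1

    # Step 3: reformat the same archive as a tab-separated file for
    # blnfetch or blpfetch.
    blast_formatter -archive "$prefix.asn" -outfmt 6 -out "$prefix.tsv" || return 1
}
```

Because steps 2 and 3 are separate blast_formatter runs over the same archive, the HTML report is finished before the tab-separated file is started, which matches the order in which the two windows appear.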

Please send suggestions or comments regarding this page to psgendb@cc.umanitoba.ca