Trace Editor for cDNA
The trace display has been substantially improved in connection with Y. Kohara's cDNA mapping project in Japan, which aims to integrate previously unreleased data on almost 10,000 genes of C. elegans. The goal was to align 100,000 reads on the genomic sequence, allowing for intron splicing, and to edit them by comparison with the genomic sequence (75 megabases out of 100 already published) so as to maximize the readable length of each read. With these optimizations, a gene can be completely edited (i.e. to the end of each read) in a few minutes.
Left mouse button to accept, middle button to move, right button for details.
Use 'up arrow' and 'down arrow' to move to the next problem, and 'enter' or 'space' to edit as suggested.
Tools
Control Bar
Display map
The display map is in the left section of the window and is limited on the right by the scale bar.
It shows the genomic sequence on the left, then the reads as compared to the genomic sequence. A read appears as a vertical row of dots wherever it does not differ from the genomic sequence. Differences are marked as follows: a lowercase letter for a point error, a capital letter for a deletion, a star for an insertion, an X for multiple inserts, a # for multiple deletes, and a dot for same as genomic.
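As an illustration only, the symbol legend above can be written as a small lookup. The function name, the kind labels, and the choice of which base supplies the letter (the read's base for a point error, the genomic base for a deletion) are assumptions made for this sketch; they are not taken from the program.

```python
def map_symbol(kind, base=""):
    """Return the display-map character for one aligned position.

    kind and base are hypothetical inputs used only for this sketch.
    """
    if kind == "match":
        return "."            # same as genomic
    if kind == "point_error":
        return base.lower()   # assumed: the read's base, in lowercase
    if kind == "deletion":
        return base.upper()   # assumed: the genomic base, in capitals
    if kind == "insertion":
        return "*"
    if kind == "multi_insert":
        return "X"
    if kind == "multi_delete":
        return "#"
    raise ValueError("unknown difference kind: " + kind)


# One column of the map for a short, made-up stretch of a read.
column = [
    map_symbol("match"),
    map_symbol("point_error", "A"),
    map_symbol("deletion", "g"),
    map_symbol("insertion"),
]
print("".join(column))
```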
The reads are displayed in the same order (left to right) as in the trace display, up reads on the left and down reads on the right, separated by a white space if they cross.
The traces that are currently displayed are highlighted in green. You can select the traces you want to display by clicking on them. Holding shift while you do this defers the redrawing.
Note that there are usually more bases shown in this map display than in the trace display, so you can foresee clips, errors or jumps.
Clips are shown in blue for start of insert, orange for end of insert, yellow for quality clip (these can be opened with the double arrow in the trace display), pink for a jump to another position, and yellow for a transplice to a motif (SL1, SL2). It is useful to refer often to this display in jump zones, as the relative positions of the splice site on the different clones are more readily visible there. Note also that a trace won't be displayed if the clip extends beyond the base you are centered on.
Scale bar
This is the black line and green rectangle just to the right of the display map. The green rectangle marks the exact position of the traces displayed on the right. Left-clicking on this rectangle recalculates the cDNA Gene map and brings it to the foreground. This is not a fix; it is a way of knowing where you are on the gene.
Select the exact position and zoom of the trace display by middle-dragging on the display map: left zooms in, right zooms out, and up and down choose the base you center on.
The black line represents the length of sequence visible in the display map.
The Trace Display
This is the section of the window where the traces are graphically displayed. The traces are displayed according to the display map and the Show... options, and in the same order as in the display map. Errors are shown as the suggestion made by the error tracker.
At the top of each trace is written the coordinate, in the sequence, of the base at the top of the screen. Just below is a triangular indicator of direction, up or down. This is also a button and a pull-down menu: clicking on it brings up the EST for this sequence, and right-clicking offers the menu:
- Who am i: opens a more detailed information display on the sequence.
- Search Vector: tries to find a vector signal in this read (ggaattcggcacgag).
- Discard from here: use this if the read does not seem to fit in this gene.
The circle with the e means edit this sequence. It edits the visible part of this sequence according to the suggestions.
The traces themselves are displayed vertically. The called bases are aligned along the trace by a transparent internal zoom algorithm. The colored bars represent the Ace base call; their length gives an idea of the certainty evaluated by the program. Differences with the genomic sequence are displayed in colored boxes. A purple X means multiple inserts, a # multiple deletes. Clicking on a suggestion causes it to be accepted.
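The idea behind Search Vector can be sketched as a plain string search for the vector signal quoted above. This is a minimal illustration, not the program's implementation: the function name is hypothetical, and exact, case-insensitive matching is an assumption (the real search may well tolerate sequencing errors). The position returned is the first base after the signal, which is where the clip start insert tag described later in this document is placed.

```python
VECTOR_SIGNAL = "ggaattcggcacgag"  # vector signal quoted in this document


def find_insert_start(read, signal=VECTOR_SIGNAL):
    """Return the 0-based index of the first base after the vector signal.

    Hypothetical helper: exact match only, ignoring case.
    Returns None when the signal is not present in the read.
    """
    pos = read.lower().find(signal.lower())
    if pos < 0:
        return None
    return pos + len(signal)


# Made-up read: three bases of vector, the signal, then the insert.
read = "ttt" + VECTOR_SIGNAL + "ATGCATGC"
start = find_insert_start(read)
print(start, read[start:])
```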
Edit as suggested
In this version of the trace editor you have access to very powerful tools for accepting suggestions.
- Edit as suggested ('e', 'space' or 'enter'): accepts all the suggestions on screen, except those at the top and bottom edges, to ensure that there isn't a clip a few bases further along.
- The Magic Square: by dragging with the left mouse button you create a square over the traces; any errors inside the square when you release the mouse button will be edited as suggested.
- Left-click: as said earlier, left-clicking on a suggestion accepts the change.
Note that as errors disappear, in best+error mode the traces won't be displayed anymore. You can still display them by picking them in the display map, or by changing the Show... option to SHOW ALL.
Edit as IS NOT suggested
When you feel the suggestion is false, or is based on the wrong genomic sequence (e.g. at a splice site), you edit 'by hand'. Each base is a pull-down menu, and between any two bases there is a pull-down menu.
- Base menu options:
- You can change the letter to any other (a, t, g, c, or n), or delete it, from this menu.
- The various CLIPS available are also accessed through this menu. See the associated section.
- Insert menu options: between two bases is a button for inserts. You can insert a, t, c, g, or n. NB: it is sometimes difficult to click when the bases are very near each other; zoom in if this is the case.
Clips, jumps, transplice sites
These sites are specific to cDNA sequencing and should be tagged as they are found in the traces. These tags all start on the base you select and extend either forward (from here to the end of the trace) or back (from the start to here).
- forward clips:
- clip end insert: tag the end of a read that exits the insert. There are two cases: either you clip a region whose quality is too low to be improved (3 or 5 prime), or you tag the g of gagcacggcttaag on an ascending 5 prime read that reaches the start vector of the clone.
- clip end quality: try clip end insert instead.
- polyA: place on the first A of a polyadenylation site at the end of a 3 prime read.
- from start clips:
- clip start insert: place this on the first base of the insert, i.e. the first base after ggaattcggcacgag.
- Motif: placed on the 3 prime side (top) after the vector
- SL1:GAG TTT GAA CCCA
- SL2:GAA CTC ATT GACC
- gcc gtg ctc: this nine-base motif has been found so often that we made a button to tag it.
- other motif: tag any other motif that seems to be transpliced. One of the motifs found often was GAA CCA AAT Tg.
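To make the motif list concrete, here is a minimal sketch that looks for the three named motifs in a read. The sequences are copied from the list above with the spaces removed; the function and table names are hypothetical, and exact, case-insensitive matching of the first occurrence is an assumption. In the editor itself the tagging is done by hand through the base menu.

```python
# Motif sequences as listed in this document (spaces are only for readability).
MOTIFS = {
    "SL1": "GAG TTT GAA CCCA",
    "SL2": "GAA CTC ATT GACC",
    "gcc gtg ctc": "gcc gtg ctc",
}


def find_motifs(read):
    """Return (name, 0-based position) for each listed motif found in read.

    Hypothetical helper: reports only the first exact occurrence of each motif.
    """
    seq = read.lower()
    hits = []
    for name, motif in MOTIFS.items():
        pattern = motif.replace(" ", "").lower()
        pos = seq.find(pattern)
        if pos >= 0:
            hits.append((name, pos))
    return hits


# Made-up read carrying the SL1 motif after four leading bases.
print(find_motifs("acgtGAGTTTGAACCCAttt"))
```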
Note that you should check carefully whether you are clipping a 3 prime or a 5 prime read, as you always clip in the same direction as the trace.
If your clip seems exaggerated, a confirmation message will ask either "Do you really want to clip -67 Bases", meaning you would unclip 67 bases, or "Do you really want to clip 580 Bases", meaning you are clipping the wrong end.
If you clipped wrongly despite this, the right-button menu offers the option "undo last vector clipping".
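The confirmation step can be pictured with the short sketch below. It is an illustration under stated assumptions only: the function name is hypothetical, and the limit of 50 bases is an invented placeholder, since this document does not say when a clip counts as exaggerated. A negative count stands for unclipping, as in the first message above.

```python
def confirm_message(n_bases, limit=50):
    """Return the confirmation question for a suspicious clip, else None.

    n_bases is the signed number of bases the clip would remove
    (negative means the clip would unclip bases).
    limit is a hypothetical threshold, not a value from the program.
    """
    if abs(n_bases) > limit:
        return "Do you really want to clip %d Bases" % n_bases
    return None


for n in (-67, 580, 5):
    print(n, confirm_message(n))
```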
SCROOMING: scrolling and zooming