BIRCH

Tutorial: Designing PCR primers to amplify a gene from genomic DNA


March 22, 2018

PrimerBLAST publication: http://www.ncbi.nlm.nih.gov/pubmed/?term=22708584


Rationale: In many cases the quickest way to clone a gene is to amplify it by PCR from genomic DNA and clone the PCR product. This approach is especially useful if the target is a complete copy of the gene, including the flanking regions, which are present in the genomic sequence and may contain important regulatory sequences.

Goal:
To demonstrate the process of finding a gene in a genomic sequence, designing PCR primers for the gene, and retrieving the expected PCR product that would be generated using those primers.

In practice, if the goal were to clone a PCR fragment, one could either add suitable restriction sites to these primer sequences before the primers are synthesized or, after PCR, ligate cloning adaptors to the blunt-ended PCR fragments.

Overview:
For this tutorial, we will use the example of the Brassica napus gene LepR3 for resistance to the blackleg fungus, Leptosphaeria maculans.

Larkan, N. J., Lydiate, D. J., Parkin, I. A. P., Nelson, M. N., Epp, D. J., Cowling, W. A., Rimmer, S. R. and Borhan, M. H. (2013), The Brassica napus blackleg resistance gene LepR3 encodes a receptor-like protein triggered by the Leptosphaeria maculans effector AVRLM1. New Phytol, 197: 595-605. doi:10.1111/nph.12043

1. Create a working directory

I can't repeat this often enough. ALWAYS create a new directory for each project.

cd tutorials          # go into the tutorials directory
mkdir primerblast     # create a directory called primerblast
cd primerblast        # go into the primerblast directory

This directory will be used for all files associated with this tutorial.

2. Locate the Brassica napus LepR3 gene

Since we'll need to use NCBI Primer-BLAST, it's best to start by finding the gene on the NCBI web site at https://www.ncbi.nlm.nih.gov. Using the search panel at the top of the page, set the database to Gene, and type in the search term

LepR3 [Text Word] AND Brassica napus [ORGN]

 as shown:


The top part of the results is presented at right. Note that the formal annotation for this gene lists it as a receptor-like protein. This makes sense, since many plant disease resistance genes encode cell-surface receptor-like proteins or receptor-like kinases.

The most recent release of the genome annotation tells us that this gene is found on Chromosome A10 (i.e., in the A genome of B. napus).
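The same Gene query from step 2 can also be written as an NCBI E-utilities ESearch URL, which is handy for keeping a record of exactly what was searched. The following is a minimal sketch that only builds the URL and sends no request; the helper name gene_search_url and the retmax value are illustrative choices, not part of the tutorial.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def gene_search_url(term: str, retmax: int = 5) -> str:
    """Build an ESearch URL for the Gene database (hypothetical helper; no request is sent)."""
    params = {"db": "gene", "term": term, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)

# The search term used in the web interface above.
url = gene_search_url("LepR3 [Text Word] AND Brassica napus [ORGN]")
print(url)
```

Pasting the printed URL into a browser should return an XML list of matching Gene IDs, assuming the field tags behave the same way as in the web search box.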



The gene can be viewed in the context of the chromosome scaffold further down the page:


If you mouse over the gene, represented by the green line with arrows, you'll see that the gene is annotated as spanning nucleotides 17,537,985..17,541,448. Note that the view is shown with respect to the orientation of the LepR3 gene, which is encoded on the reverse strand. Therefore, the coordinates in the genome viewer go from high numbers to low numbers, going left to right.


3. Design PCR primers to amplify the gene and its flanking regions

Since the goal is to clone the genomic copy of the gene for use in plant transformation, we want not only the transcribed region, but also the flanking regions that are likely to contain important regulatory sequences. If you were to click on the GenBank link, you would see a GenBank report encompassing 3464 bp, covering ONLY the mRNA coding region and a bit of the downstream flanking region. (See LepR3mRNA.gen). Therefore, if we want a larger PCR product that includes the flanking regions, we have to choose a larger region to be used for primer design.

To see the gene in the larger context of flanking sequences, click once on the Zoom out button (-)



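While it is impossible to predict the actual extent of the promoter and other regulatory sequences, let's assume that we need 1000 bp upstream, and 500 bp downstream of the gene. Because the gene is on the reverse strand, upstream means higher coordinates (17,541,448 + 1000) and downstream means lower coordinates (17,537,985 - 500). That means that we want the PCR product to encompass nucleotides 17,542,448 to 17,537,485.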
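The flank arithmetic is easy to get backwards for a reverse-strand gene, so here is a small sketch of it. The function name product_region and its parameters are made up for illustration; the gene coordinates are the ones reported by the genome viewer.

```python
def product_region(gene_start, gene_end, strand, upstream=1000, downstream=500):
    """Return (low, high) genomic coordinates covering the gene plus its flanks.

    gene_start <= gene_end are genomic coordinates; strand is '+' or '-'.
    """
    if strand == "-":
        # Reverse strand: the upstream (promoter) side lies at higher coordinates.
        return gene_start - downstream, gene_end + upstream
    return gene_start - upstream, gene_end + downstream

low, high = product_region(17_537_985, 17_541_448, "-")
print(f"{high:,} to {low:,}")   # 17,542,448 to 17,537,485
print(high - low + 1)           # 4964 bp that must be inside the PCR product
```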

Keep in mind that these coordinates delimit the sequence that we want to guarantee will be found in the PCR product. Therefore, the primers must be located outside this region. When designing primers, we need to specify a large enough area upstream and downstream from the desired product to ensure that the program can find suitable primers. For simplicity, let's say that the region to be searched by Primer-BLAST should include 1000 bp flanking the desired product on each side. That would put the coordinates for the search at 17,543,448 to 17,536,485. For convenience we could round these numbers to 17,543,500..17,536,500.
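As a quick check of the search window, the sketch below widens the desired product by 1000 bp on each side and then rounds both ends up to the next multiple of 100, which reproduces the rounded figures. The helper names and the choice of "round up to 100" are assumptions made for this example; any nearby round numbers would do.

```python
def search_window(low, high, margin=1000):
    """Widen the desired product (low, high) by margin bp on each side."""
    return low - margin, high + margin

def round_up(x, step=100):
    """Round x up to the next multiple of step."""
    return -(-x // step) * step

low, high = search_window(17_537_485, 17_542_448)
print(f"{high:,} to {low:,}")                     # 17,543,448 to 17,536,485
print(f"{round_up(high):,}..{round_up(low):,}")   # 17,543,500..17,536,500
```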

Using the ruler at the top of the chromosome view, drag across from 17,543,500 to 17,536,500. The numbers you get won't be precisely those coordinates, but as long as they're close we can proceed.



We are now ready to run Primer-BLAST. You can launch Primer-BLAST from the popup menu, or from the Tools menu. Be sure to choose Primer BLAST (Selection).

Primer-BLAST lists the accession number of the genomic scaffold, and the coordinates of the selected region are shown in the Forward primer and Reverse Primer boxes.



In the Primer Parameters section, we set the Minimum PCR product size to 5000 (i.e., a 3464 bp gene plus 1000 bases upstream and 500 bases downstream, which comes to 4964, rounded to 5000).


We also have to explicitly tell Primer-BLAST not to choose primers within the 5000 bp target region that we want included in the PCR product. Therefore, go to Advanced settings at the bottom of the page, and find the box reading "Excluded regions" in the Advanced Primer Parameters.

Primer-BLAST specifies an Excluded region using a starting point and a length. Therefore, type in

17537500,5000

(No commas are allowed within large numbers, so 17,537,500 must be typed as 17537500.)
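The start,length format can be generated and checked with a few lines of code. This is only a sketch: excluded_region and region_end are hypothetical helper names, and the second function simply shows where a region of that length ends.

```python
def excluded_region(start: str, length: int) -> str:
    """Format 'start,length', stripping thousands separators from start."""
    start_int = int(start.replace(",", ""))
    return f"{start_int},{length}"

def region_end(start: int, length: int) -> int:
    """Last coordinate covered by a region beginning at start."""
    return start + length - 1

print(excluded_region("17,537,500", 5000))   # 17537500,5000
print(region_end(17_537_500, 5000))          # 17542499
```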


Click on the Get Primers button at the bottom to run the search. It will take several minutes before the results are returned.

How does Primer-BLAST work?

The search for primers is essentially a 2-step process:

1. Use the Primer3 program to design candidate primer pairs for the target sequence. Almost all of the parameters to Primer-BLAST are actually parameters for Primer3.
2. Use MegaBLAST to search an NCBI database for matches to the primers. Any good matches to genes in the database other than the target sequence will cause candidate primers to be discarded. This step is critical to ensure that the final primer pairs will amplify the desired target, and no other sequences in the genome. By default, the RefSeq mRNA database is searched for unwanted matches.
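The second step can be pictured with a toy filter. This is not how Primer3 or BLAST work internally: it uses exact substring matching on made-up sequences, and it discards a pair if either primer matches a background sequence, which is stricter than a real specificity check. All names and sequences here are invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def hits(primer: str, sequence: str) -> bool:
    """True if the primer matches either strand of sequence exactly."""
    return primer in sequence or revcomp(primer) in sequence

def specific_pairs(pairs, background):
    """Keep primer pairs in which neither primer matches any background sequence."""
    return [
        (fwd, rev)
        for fwd, rev in pairs
        if not any(hits(fwd, s) or hits(rev, s) for s in background)
    ]

# Toy data: the first pair's forward primer also occurs in an off-target sequence.
candidates = [("ACGTTGCA", "GGATCCAA"), ("TTTTCCCC", "GGATCCAA")]
background = ["NNNNACGTTGCANNNN"]
kept = specific_pairs(candidates, background)
print(kept)   # [('TTTTCCCC', 'GGATCCAA')]
```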


The graphical view shows the products produced by the 10 best PCR primer pairs, superimposed on the genome map. You can mouse over any product to see the length of the product, along with links for downloading the sequence of the product.


In the example, the shortest of the products, 5294 bp, was produced using primer pair 6.

If you scroll down to the Detailed Primer Reports, you will see the sequences of the primers, as well as other data on the PCR primers. For comparison, results for Primer pairs 5 and 6 are shown.


Further examination of the map shows that all of the reverse primers start at about the same place within the promoter region (~17,542,648), and the length differences are primarily due to choice of the forward primer, in the 3' downstream region of the gene. Since we are mostly interested in maximizing the promoter region, we'll choose Primer Pair 6 for further work. To save the sequence of the PCR product, along with its gene annotation, mouse over the map for Primer Pair 6 to bring up the menu shown above, and click on GenBank view. The GenBank view will appear in a new tab or window.



Note that the accession number is the same as that of the scaffold, and the region shown in this file is the 5294 bp fragment going from 17537355 to 17542648. Save the file using the Send menu in the upper right corner, and choose Complete record, File, and GenBank (full).

Click on Create File to save, and save as LepR3-PCR.gen.
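Before moving on, it is worth confirming that the saved region really contains the flanks we asked for. The sketch below uses the gene coordinates from the genome viewer and the product coordinates from the GenBank view; the variable names are illustrative.

```python
gene_low, gene_high = 17_537_985, 17_541_448   # LepR3, encoded on the reverse strand
prod_low, prod_high = 17_537_355, 17_542_648   # PCR product for Primer Pair 6

length = prod_high - prod_low + 1
upstream = prod_high - gene_high     # promoter side (higher coordinates)
downstream = gene_low - prod_low     # 3' side (lower coordinates)

print(length, upstream, downstream)  # 5294 1200 630
```

Both flanks exceed the 1000 bp upstream and 500 bp downstream that we set out to capture.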

Note: The GenBank entry does NOT list the sequences of the primers! While it is true that the sequence begins and ends with the forward and reverse primer sequences, there is no annotation to tell precisely what those primers are. Therefore, you need to make a special note of the primer data from the Primer-BLAST results.
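Once the primer sequences have been recorded, they can be checked against the saved product: the product should begin with the forward primer and end with the reverse complement of the reverse primer. The following is a minimal sketch using toy sequences, not the actual LepR3 primers; the function name primers_flank_product is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def primers_flank_product(product: str, forward: str, reverse: str) -> bool:
    """True if product starts with forward and ends with revcomp(reverse)."""
    product = product.upper()
    return (product.startswith(forward.upper())
            and product.endswith(revcomp(reverse.upper())))

# Toy example (made-up sequences, for illustration only).
product = "ATGGCCTTAGGCATTCGA"
forward = "ATGGCC"
reverse = "TCGAAT"   # its reverse complement, ATTCGA, ends the product
ok = primers_flank_product(product, forward, reverse)
print(ok)   # True
```

In practice the product string would be the sequence read from LepR3-PCR.gen, and the primers would be copied from the Detailed Primer Reports.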