Designing gene fragments compatible with the eGene™ Prep Kit

This linked document is confidential to Nuclera and should not be further shared. If you are unable to use the eProtein Discovery Software to design DNA constructs, or are unwilling to disclose your protein sequence information, follow these guidelines to design your gene fragment.

If you are starting with an amino acid sequence, refer to Section A. If you are starting with nucleotide sequences, proceed to Section B.

Section A: Starting with amino acid sequence

Check that your sequence does not contain any of the following amino acid sequences. This avoids duplicating elements encoded into the eGene that are essential for function. Duplicated elements can introduce ambiguity and errors into data generated on the eProtein Discovery system.

You can use the “find” function to search for overlapping sequences.

Amino acid sequences to be avoided
GGGSEGGGSEGGGSE
EAAAKEAAAKEAAAK
GGGGSGGGGSGGGGS
LEVLFQGP
ENLYFQS

Table 1. Duplication of amino acid sequences to be avoided.
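The search described above can also be scripted. The following is a minimal sketch in Python, not a Nuclera tool: the function name find_forbidden_motifs and the example sequence are made up for illustration, and the motif list is copied from Table 1.

```python
# Amino acid sequences listed in Table 1.
FORBIDDEN_MOTIFS = [
    "GGGSEGGGSEGGGSE",
    "EAAAKEAAAKEAAAK",
    "GGGGSGGGGSGGGGS",
    "LEVLFQGP",
    "ENLYFQS",
]


def find_forbidden_motifs(protein):
    """Return (motif, 1-based position) for every occurrence, overlapping ones included."""
    # Drop whitespace and line breaks, and normalise to upper case.
    seq = "".join(protein.split()).upper()
    hits = []
    for motif in FORBIDDEN_MOTIFS:
        start = seq.find(motif)
        while start != -1:
            hits.append((motif, start + 1))
            # Resume one residue later so overlapping matches are not missed.
            start = seq.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    example = "MKTAYLEVLFQGPAAENLYFQSGG"  # hypothetical sequence, illustration only
    for motif, position in find_forbidden_motifs(example):
        print(f"{motif} found at residue {position}")
```

An empty result means none of the listed sequences is present; any hit needs to be resolved before continuing.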

If the N-terminus of your sequence begins with the translation start amino acid, methionine, remove it.
If your sequence is a length variant that begins with a methionine that is not the translation start amino acid, retain it.
Remove any ‘*’ at the C-terminus, which indicates a stop codon.
Once verified, upload the amino acid sequence to the IDT codon optimization web tool (https://eu.idtdna.com/CodonOpt) to generate the corresponding nucleotide sequence. In the organism field, select E. coli. If you are not using the IDT service, codon optimize with your gene fragment provider. We have validated IDT gBlocks and recommend that you use IDT services.
Once the corresponding DNA sequence is generated, copy it to the clipboard.
Paste the sequence into a sequence viewer such as SnapGene or a similar DNA analysis program.
Inspect the DNA sequence for the following:
a. Ensure that your DNA sequence does not contain any of the following sequences, to avoid introducing ambiguous priming sites. Ambiguous priming sites can result in multiple bands in your final eGene PCR product.

Nucleotide sequences to be avoided
GCACCGCCTACATACCTC
GGTTGTATTGATGTTGGACG
Table 2. Duplication of nucleotide sequences to be avoided.
b. Ensure that the length of your DNA sequence does not exceed 2955 bp. This keeps the fragment within the gene fragment synthesis limit of 3000 bp after the Nuclera adaptors (3C, 24 bp, and TEV, 21 bp) are appended.
c. Upload the sequence to the IDT website (https://www.idtdna.com/site/order/gblockentry) for synthesis complexity analysis and confirm that the nucleotide sequence is synthesizable. Alternatively, perform the synthesis complexity evaluation with your gene fragment provider.
d. If the sequence passes the IDT-defined synthesis criteria, proceed to adding the 3C sequence as described next. Otherwise, re-optimize the sequence for synthesis based on the advice given by IDT and repeat from step (a).
Essential: Add the 3C nucleotide sequence (CTCGAGGTTCTGTTCCAAGGACCT) to the 5’-end of the gene of interest. This is the priming site to which the left megaprimer anneals during PCR.
Essential: Add the TEV nucleotide sequence (GAGAACCTGTACTTCCAGAGC) to the 3’-end of the gene of interest. This is the priming site to which the right megaprimer anneals during PCR.
Do not perform any codon optimization after adding the 3C and TEV sequences.
Recheck the final sequence (with 3C and TEV added) in SnapGene or a similar program. Make sure that it starts with the 3C sequence and ends with the TEV sequence.
Make sure that the translation of the final sequence is 3C-Protein-of-interest-TEV.
If everything is correct, submit the nucleotide sequence for DNA gene fragment synthesis. We recommend that you order your DNA through IDT.
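The nucleotide checks and final assembly in Section A can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not a validated tool: the names translate and build_fragment are hypothetical, translation uses the standard genetic code in frame 1, and the priming-site search looks only at the strand as written. It does not replace the synthesis complexity analysis performed by IDT or your provider.

```python
ADAPTOR_3C = "CTCGAGGTTCTGTTCCAAGGACCT"  # added to the 5'-end, 24 nt
ADAPTOR_TEV = "GAGAACCTGTACTTCCAGAGC"    # added to the 3'-end, 21 nt
FORBIDDEN_NT = ["GCACCGCCTACATACCTC", "GGTTGTATTGATGTTGGACG"]  # Table 2
MAX_GENE_BP = 2955  # leaves room for 24 + 21 = 45 nt of adaptors within 3000 bp

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate frame 1 of an upper-case A/C/G/T string; '*' marks a stop codon.

    Any other character raises KeyError; trailing bases short of a codon are ignored.
    """
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def build_fragment(gene_dna, expected_protein):
    """Check the gene, append 3C and TEV, and confirm the translation."""
    gene = "".join(gene_dna.split()).upper()
    for site in FORBIDDEN_NT:
        if site in gene:
            raise ValueError(f"forbidden priming site present: {site}")
    if len(gene) > MAX_GENE_BP:
        raise ValueError(f"gene is {len(gene)} bp; the limit is {MAX_GENE_BP} bp")
    fragment = ADAPTOR_3C + gene + ADAPTOR_TEV
    # The final translation should read 3C-Protein-of-interest-TEV.
    expected = "LEVLFQGP" + "".join(expected_protein.split()).upper() + "ENLYFQS"
    if translate(fragment) != expected:
        raise ValueError("translation does not read 3C-Protein-of-interest-TEV")
    return fragment
```

Here expected_protein is the amino acid sequence after the N-terminal methionine and trailing ‘*’ have been handled as described above. Because the adaptors are concatenated unchanged, the returned fragment always starts with 3C and ends with TEV.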


Section B: Starting with nucleotide sequence

Translate your nucleotide sequence into an amino acid sequence and check that it does not contain any of the following amino acid sequences. This avoids duplicating elements encoded into the eGene that are essential for function. Duplicated elements can introduce ambiguity and errors into data generated on the eProtein Discovery system.
You can use the “find” function to search for overlapping sequences.

Amino acid sequences to be avoided
GGGSEGGGSEGGGSE
EAAAKEAAAKEAAAK
GGGGSGGGGSGGGGS
LEVLFQGP
ENLYFQS

Table 3. Duplication of amino acid sequences to be avoided.

Once verified, continue with your original nucleotide sequence.
Inspect the DNA sequence for the following using SnapGene or a similar DNA analysis program:
a. Check for the presence of a START codon (ATG) at the 5’-end of the DNA. If it is present, remove it. The START codon is encoded in the eGene Megaprimers; duplicating it within the protein sequence will cause expression errors.
b. Check for the presence of STOP codons at the 3’-end of the DNA. If there is a STOP codon, remove it. The STOP codon is encoded in the eGene Megaprimer; duplicating it within the protein sequence will cause premature truncation and failure to detect or purify the protein.
c. Ensure that your DNA sequence does not contain any of the following sequences, to avoid introducing ambiguous priming sites. Ambiguous priming sites can result in multiple bands in your final eGene PCR product:

Nucleotide sequences to be avoided
GCACCGCCTACATACCTC
GGTTGTATTGATGTTGGACG
Table 4. Duplication of nucleotide sequences to be avoided.
d. Ensure that the length of your DNA sequence does not exceed 2955 bp. This keeps the fragment within the gene fragment synthesis limit of 3000 bp after the Nuclera adaptors (3C, 24 bp, and TEV, 21 bp) are appended.
e. Upload the sequence to the IDT website (https://www.idtdna.com/site/order/gblockentry) for synthesis complexity analysis and confirm that the nucleotide sequence is synthesizable. Alternatively, perform the synthesis complexity evaluation with your gene fragment provider.
f. If the sequence passes the IDT-defined synthesis criteria, proceed to adding the 3C sequence as described next. Otherwise, re-optimize the sequence for synthesis based on the advice given by IDT or your preferred gene fragment provider and repeat steps (a) to (e).
Essential: Add the 3C nucleotide sequence (CTCGAGGTTCTGTTCCAAGGACCT) to the 5’-end of the gene of interest. This is the priming site to which the left megaprimer anneals during PCR.
Essential: Add the TEV nucleotide sequence (GAGAACCTGTACTTCCAGAGC) to the 3’-end of the gene of interest. This is the priming site to which the right megaprimer anneals during PCR.
Do not perform any codon optimization after adding the 3C and TEV sequences.
Recheck the final sequence (with 3C and TEV added) in SnapGene or a similar program. Make sure that it starts with the 3C sequence and ends with the TEV sequence.
Make sure that the translation of the final sequence is the 3C-Protein-of-interest-TEV amino acid sequence.
If everything is correct, submit the nucleotide sequence for DNA gene fragment synthesis. We recommend that you order your DNA through IDT.
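Steps (a) and (b) of the Section B inspection, removing a 5’ START codon and any 3’ STOP codons, can be sketched in Python as follows. The function name trim_start_and_stop is hypothetical, and the check that the length is a multiple of three is an added assumption (that the input is a complete in-frame coding sequence), not a requirement stated in this guide.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def trim_start_and_stop(dna):
    """Remove a leading ATG and any trailing stop codons from a coding sequence."""
    # Drop whitespace and line breaks, and normalise to upper case.
    seq = "".join(dna.split()).upper()
    # Assumption: the input is a whole number of codons in frame 1.
    if len(seq) % 3 != 0:
        raise ValueError("length is not a multiple of 3; check the reading frame")
    if seq.startswith("ATG"):
        seq = seq[3:]
    # Remove every stop codon at the 3'-end, in case more than one is present.
    while seq[-3:] in STOP_CODONS:
        seq = seq[:-3]
    return seq


if __name__ == "__main__":
    print(trim_start_and_stop("ATGAAAACCTAA"))  # hypothetical input, illustration only
```

The trimmed sequence is the one to carry into the priming-site and length checks before the 3C and TEV sequences are added.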


Nuclera Technical Support:
UK Phone: +44 1223 942 761
US Phone: +1 508-306-1297
Email: techsupport@nuclera.com

Offices:
Nuclera UK (HQ):
One Vision Park, Station Road, Cambridge, CB24 9NP, UK

Nuclera USA:
1000 Technology Park Drive, Suite B, Billerica MA 01821, USA

www.nuclera.com
