ID GYPSY34-I_AG repbase; DNA; ANG; 4442 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY34-I_AG is an internal portion of retrotransposon GYPSY34_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; GYPSY34-I_AG; GYPSY34-LTR_AG; Gypsy clade; KW MDG3 lineage; RNase-H; reverse transcriptase; KW integrase GYPSY34_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4442 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY34_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 60-60 (2004). XX DR [1] (Consensus) XX CC GYPSY34_AG is a family of gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase, is CC phylogenetically grouped with representatives of the MDG3 CC lineage of other organisms. CC GYPSY29_AG, GYPSY30_AG, GYPSY31_AG, GYPSY32_AG, GYPSY33_AG, CC GYPSY35_AG, CC GYPSY36_AG, GYPSY37_AG and GYPSY38_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY34-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. CC The consensus encodes the 1428-aa GYPSY34_AGp gag-pol-like CC polyprotein CC (pos. 119-4402). CC The sequence of the LTRs flanking GYPSY34-I_AG is deposited as CC GYPSY34-LTR_AG.
XX FH Key Location/Qualifiers FT CDS 119..4402 FT /product="GYPSY34_AGp" FT /translation="MLRKKVLRRALIETGVDVPDSATVTQFRQLYASHEPV FT ARSPRAAPPTTSATTPAPACANHQDAAILCLPHYNGDDDFAHHENVANAAA FT SMQNTTDAVSALPSAHGMAAALPRGPDDIEAQFEKLRQQQQLAELRQKVHQ FT LETPQPVALCVKDFEAFIEPLDVDKNPNVIRWFRDLERLFALYRVRDADKF FT FFTLRLLTGTAANVAKELVVTTYDELKKELIDNLHVVATPESVYRQLRNRR FT LRPQESALHYLFDMQRIADQASIADSELIPIVIDGLGSPSITSSLHFMPLT FT MNDFRKKLKLFESCRHLCTTQPPSADARATTNSCMERPRPSQEPIRCFNCS FT RFGHLQNACPRPKRPPGGCFRCFQTGHVYRNCPERRANATVEGNTSSDEAL FT ATNQEVSLTFFHPSAKRTTLPCVRSLLDTGSPVSFISDTIVPVKMLGPLSA FT TEYCTMIKGPLYSRGKIDCTIRFKNHSVRHSFIILPGIAWPVIIGRDLLNS FT LNIFLTYSSPTTSCITKPLSTELKEVDTILPEKLDDAIRSICALDVAEADN FT ELDLGKTLSLEQRSIVNSIVENSYLNYTSDVIPLKHPMKINLTHDTPIFTK FT PRRLSYGERQQVKQIVDKLLAENIIRPSNSPYASALVLVRKKSGEVRMCVD FT YRPLNKITVRDNYPLPLIETCLEHLCGKKFFSLLDLKSGFHQVPMSEESIP FT YTSFVTPDGQFEYLKMPFGLRNAPSEFQRFINSILREFIDDGRIVVYLDDI FT IIASTDLSSHFSTLRSVLEKIKQNNLELRLDKCKFVHEEIEYLGYKANFSG FT IQPSDRHIKALTNYPMPTNLKQLRRCLGLFSYFRRFVPSFSCIAKPMTKLL FT QKDEVFNFDSNCVHAFETLRDKLVHSPILSIFDPKRETELHCDASSFGFGA FT ILLQKQDDNKLHPVAYFSKTTSKDESKLHSYELETLSIIYALKRFHTYVHG FT LPIKIVTDCNSLVETLKNRNASAKIARWSLFLENYDYTICHRSGTSMPHVD FT ALSRTEAVGAIGEIDLDFQLQVAQTRDPSIEALKHRLESEEVDGFLLQDGL FT VYRDIPDGQPQLYVPSEMVDNVIRHTHERIGHLGINKTFSKISQHYWFPHM FT KPTIDKFIKNCLKCIVYSAPHHTNARNMYSIPKEPLPFDTIHIDHLGPLPS FT SPLRKKYILVVIDAFTKLTKLYPTSSTNAKEVCSALSQYMSYYSRPRRIVS FT DRATCFTSTLFEDFLESHNISHVLNATGSPQANGQVERVNRVLRPILSKLS FT DAPDQTDWVSKLRSAEYALNNTVHTSTNFCPSVLLFGVEQRGKVPDELAEY FT LDEKFDRASRDLEAIRAKALENIEESQRKNEEYFSKKHKPPQCYKEGDLVA FT IRYSDTTDSGNKKLNPKFRGPYVIHKVLPHDRYVVRDVEGCQLTQLPYDGV FT LEANKLRRWTESSD" XX SQ Sequence 4442 BP; 1178 A; 1178 C; 898 G; 1188 T; 0 other; tctcagaagt gggattacca acaaaaatcg cctgcaaaac cgcctgcaag ccgcctaacc 60 agcctgtatc cgtgtgtacc tgtgtacgtg tgtgtatgac acaacgccat tttccatcat 120 gctgaggaaa aaagtacttc gtcgggcgct tattgagacc ggtgtcgacg tgccggattc 180 cgctaccgtc acacaattcc gacagcttta tgcctcccac gagccggtcg ctcgttcccc 240 ccgtgcggcg ccgcccacta 
cctcagcgac gacaccggct cccgcttgtg caaaccacca 300 agatgccgcc attttgtgcc ttccacatta caatggcgac gacgattttg cgcatcacga 360 aaatgttgcg aacgctgctg cttcgatgca aaataccact gatgccgttt ccgcccttcc 420 ttctgcccat ggtatggccg ccgcccttcc ccgcggccct gacgacatcg aggcccaatt 480 tgagaagctg cgacagcagc agcagctagc tgaattacgc caaaaggtgc accaacttga 540 aacgccgcag ccagtcgccc tttgcgtaaa ggactttgaa gcttttatcg agccactcga 600 cgtcgataag aaccccaatg tcatccgatg gttccgcgat ttggagcgtc tctttgcact 660 ttaccgagtg cgcgatgcag ataaattttt cttcaccctt cggctcctca ccggcacagc 720 cgctaacgtc gcaaaagaac ttgttgtaac cacttatgat gagttgaaga aagagttgat 780 cgacaatctt cacgtcgttg ctacgcccga atctgtttat cgccaactcc gtaaccgtcg 840 attgcggccc caggaatccg ccctgcacta cttgtttgac atgcagcgca tcgcagacca 900 agccagtatc gccgattcag aactgatccc gatcgtcatc gacggcttgg gaagcccgtc 960 aattacgtcg agtctgcatt tcatgcctct tacgatgaac gacttccgga agaaattgaa 1020 acttttcgaa tcttgccgtc atctttgcac cacccagccc ccttccgctg atgcccgggc 1080 cacaacgaac agctgtatgg aacggccccg cccatcgcag gaacccatcc gctgcttcaa 1140 ctgctcccga ttcggacacc ttcagaacgc gtgcccgcga cctaagcgcc cacccggcgg 1200 atgttttcgt tgtttccaga ctggacacgt ctaccgtaac tgccctgaac gtcgggccaa 1260 cgccactgtc gagggcaata ctagttcgga cgaagctctc gccacaaatc aagaggtgag 1320 tttgacattt ttccaccctt ctgctaagcg taccaccctt ccctgcgttc gttcccttct 1380 cgacacagga agtcctgtga gcttcattag cgacacgata gtaccagtta agatgctagg 1440 acctctttcc gctaccgaat actgcactat gattaaggga ccactttact ctcgaggaaa 1500 aatcgattgt actatccgat ttaagaatca ttccgttcga cactctttta ttatattacc 1560 tggaattgcg tggccagtca ttatcggtcg cgatttactg aactcactta atatttttct 1620 tacgtattca tctcctacaa cttcatgtat tactaaacct ctatcgacgg aacttaaaga 1680 agtagatacg attcttccag aaaaattaga cgatgctatt aggagtattt gtgcgctcga 1740 tgtggctgaa gccgataatg aattggattt aggaaaaaca ctatctttgg aacaacgttc 1800 aatagtcaat tctattgttg aaaactcata cctcaactat acttcagatg ttataccgct 1860 caaacaccct atgaaaatca atctgactca tgatacacca atatttacta agccgcgaag 1920 actctcttat ggtgaaagac agcaggttaa gcaaattgtt 
gataaactgt tagcagaaaa 1980 catcatccgg cccagtaatt ctccttatgc ttctgcgctt gtcctcgtta ggaaaaagag 2040 tggcgaggtt cgtatgtgtg tggattaccg gcccctcaac aaaattacag ttcgggacaa 2100 ttacccccta ccccttatcg aaacttgttt ggagcatctg tgtggaaaaa aattcttcag 2160 tttgctggat ttgaaaagcg gattccatca agtcccaatg agtgaggagt ctatccccta 2220 cacttctttt gtgaccccag atggtcaatt tgaatatctg aaaatgccat tcggtcttcg 2280 taacgcccct tccgaattcc aacgttttat taattctatc ttaagggaat tcattgatga 2340 tggcagaata gtagtgtacc tcgatgacat catcatcgct tctaccgatc ttagctctca 2400 cttcagtacc cttcggtccg tattagaaaa gattaagcag aataatttag aacttcgtct 2460 tgacaagtgc aaatttgtcc atgaagaaat agaatacttg ggctacaaag ctaacttttc 2520 tggaattcag cctagtgata ggcacattaa agcacttact aattacccta tgcccactaa 2580 tttaaagcaa ctcagacgtt gtcttggtct gttttcatac ttccgacggt ttgttccatc 2640 tttctcttgc atcgctaaac ctatgacaaa acttcttcag aaggacgaag tatttaactt 2700 cgattcaaat tgcgtgcatg cttttgaaac cttacgtgac aaacttgtgc attctcctat 2760 cctttccata ttcgacccaa aacgggaaac cgaattacac tgtgacgcaa gttcctttgg 2820 ttttggcgct attctccttc agaaacagga tgacaataaa ttgcaccctg ttgcttactt 2880 ttccaaaacc acttcaaaag acgagtccaa gttacacagt tatgagcttg aaactctttc 2940 catcatttac gctcttaagc gcttccacac ttatgttcat gggctcccca ttaagatagt 3000 tactgactgc aactctctgg tcgagaccct taagaaccgt aatgcttccg ctaagattgc 3060 caggtggtcc ttgtttctgg aaaattacga ttataccatc tgtcatcgct caggcacttc 3120 tatgcctcat gtcgacgcac tgagtcgcac cgaagctgtg ggtgccatcg gtgagattga 3180 ccttgacttc cagcttcaag tagctcagac gcgtgaccca tctatcgaag ctcttaaaca 3240 tcggttagaa tcagaagaag ttgacggatt cttacttcaa gatgggcttg tctatcgcga 3300 catacctgat ggtcaacctc aattgtatgt cccttcggaa atggtcgaca acgtaattag 3360 acacactcac gagcgaattg gccacctggg cataaacaaa accttcagca aaatcagtca 3420 gcattactgg ttcccccaca tgaagcccac tatcgacaaa ttcattaaga actgcctcaa 3480 gtgcattgtt tattctgcac ctcatcatac taatgcccgg aatatgtaca gcatccctaa 3540 agagccctta cccttcgata ccatccatat tgaccattta ggtccgctcc ctagttctcc 3600 cttacgcaag aagtatatac ttgttgttat cgatgctttc actaaattaa 
ccaaacttta 3660 cccaacctcc tcaactaatg cgaaggaagt gtgttctgcc ctttcccaat atatgtctta 3720 ctatagccgc cctaggcgga ttgttagcga tcgagctact tgtttcacct caaccttgtt 3780 tgaggacttc ttggaatcgc ataacattag ccatgtcctc aacgccaccg gatccccaca 3840 agccaatgga caggtagaac gggtgaaccg tgtgttgcgt cctatcctta gcaaactatc 3900 tgatgctcca gaccagaccg attgggtatc caagttgcgg tcagccgaat acgctttaaa 3960 caataccgtc cacacatcta cgaacttctg cccctctgtc ctactctttg gtgtcgagca 4020 acgcggtaaa gttccagacg agttagccga atacctggat gagaaatttg atcgagcctc 4080 tagggactta gaagccattc gggctaaagc gttagaaaac atagaagagt ctcaacggaa 4140 gaatgaggaa tactttagca aaaagcacaa accaccacag tgctataagg aaggtgactt 4200 agtggctata cgttactctg atacgaccga tagcggtaat aagaagctca atcctaaatt 4260 caggggacct tacgtcatcc ataaagtgtt gccccatgat aggtacgtgg tacgcgatgt 4320 agaaggatgt caactcacac aactacccta cgatggggtt ctagaagcga ataagttgcg 4380 acgttggacc gagtccagtg attaggaaat tgagggcaat ttattgttca ggatagccga 4440 gc 4442 // ID GYPSY51-I_AG repbase; DNA; ANG; 5176 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY51-I_AG is an internal portion of retrotransposon GYPSY51_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD gag; KW AP protease; CsRn1 lineage; GYPSY51-I_AG; GYPSY51-LTR_AG; KW Gypsy clade; RNase-H; reverse transcriptase; KW integrase GYPSY51_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5176 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY51_AG, a member of the CsRn1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 94-94 (2004). 
XX DR [1] (Consensus) XX CC GYPSY51_AG is a family of gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase, is CC phylogenetically grouped with representatives of the CsRn1 CC lineage of other organisms. CC GYPSY48_AG, GYPSY49_AG, GYPSY50_AG, GYPSY52_AG and GYPSY53_AG CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY51-I_AG consensus was reconstructed after multiple CC alignment of 3 copies. CC The consensus encodes the 325-aa GYPSY51_AG1p gag-like CC polyprotein (pos. 844-1818) and the 1110-aa GYPSY51_AG2p CC pol-like polyprotein (pos. 1822-5151). CC The sequence of the LTRs flanking GYPSY51-I_AG is deposited as CC GYPSY51-LTR_AG. XX FH Key Location/Qualifiers FT CDS 844..1818 FT /product="GYPSY51_AG1p" FT /translation="MLHSPPVRDVSTPDGVTPSADPAASGSKSPHVPTPPV FT PNTPRVPGPSACDAMFMPPESQIDTLNAMQLKPPEMDTTDIQTFFFALENW FT FDAWNITTNQHIRRFNILRTRIPLRVLPELRPLLENIRQYATDRYEVAKRA FT IIEHFEESQRSRLHRLLAEMNLGDRKPSQLLAEMRRAANGAMTDSMLVDLW FT IGRLPPYVQSAVIATNTDTNDRAKVADSVMDSFALYHRTGPYQTIHEVRNE FT DFERLSRHVTELGQRLDAVLSKLNERERARPRSRTRQRQPNQDAVTPSGHC FT YYHTQYGQAARNCRAPCSFNNRRQGSNSATASD" FT CDS 1822..5151 FT /product="GYPSY51_AG2p" FT /translation="RLTRGQPQQIHVLSTHSYRLVITDPKTNIKFLIDTGA FT DVSVIPRQHSSVPSKPSTMKLFAANSTPIQVYGESLYTLDLGLRRSFLWNF FT IIADVGTAIIGADFLQHFHLLVDLRKKCLVDALTNVRSTGVPSQNPSEPTV FT KVCDSTSPIATLLKEFPGLTALSTPGTLLQSEVTHRIETTGQPTFARPRRL FT PPEKYAAARKEFESLVQLGVCRPSNSSWASPLHMTKKADGTWRPCGDYRAL FT NAKTVPDRYPLPFLQDFTMHLQDKIIFSKVDLHKAYHQIPIHPDDIAKTAI FT TTPFGLYEFTTMPFGLRNAAQTFQRLIHDVLRGLEFVFPYIDDMIVASTSE FT AEHHEHLRQLFERLEKHQLAINPAKCEFYRNEISFLGHLVNASGIRPLPDR FT VQAISELPQPTTIMELKKFLAMINYYRRFLPHALETQGILLEMTPGNKKKD FT RTPLTWSLEASEAFAQCKEQLKRATLLAHPVKNAELSLWTDASDFAAGAVL FT HQRTNEDLQPLGFFSKRLEKAQQKYSTYDRELTAIYLAIRHFRYQLEGREF FT CIYTDHKPLTFAFRQTHDNASPRRARQLDFIGQFSTDIRHIAGKDNVTADL FT LSRIETVHATPTIDYERLAEEQERDPELSDILSGKIQTDLFLQKTPIPGSP FT
KSLYADCPGGIIRPYITRSFRTQLLHAVHDLSHPGARATARLITERFVWLN FT ARKESQDFARNCLACQRAKVGRHVKSPLIPYPATTARFSHINVDIIGPFPI FT SNGNRYCLTIIDRFTRWPEAIPISDITASTVVSALLFHWIARFGVPAHVTT FT DQGRQFESSLFKELTKALGTKHIRTTAYHPQANGIIERWHRTLKAAITCKD FT TARWSEHLPLILLGLRTTFKNDINASPAELVYGTTLTIPAEFFIAKPQNAL FT ADQSDFAKTLEETMSSIRPQSTAWHTNRTPFVHSDLNKCTHVFIRDDTVRP FT ALTTPYHGPYKVLTRNPKSFQILLRGQPTLVSIDRLKPAYGAEEEATPAPQ FT CSWEGLTTNLLPPTTDHSETLPLPDVQANSDRRDATAASKPTSREQPVRNQ FT TTPAPPSHPTTSRQTDRAAVDAPPPSILRRNDQTVSTGVTRSQRKVIIPLR FT YR" XX SQ Sequence 5176 BP; 1353 A; 1630 C; 1169 G; 1024 T; 0 other; actggtgacc ccgacgtgat cgcgtgcgcg agtgagtgag tggtaacctg acgaacatcg 60 tgtccagccg gaaaaaacgt gtttccattg ttccacggtc cggaccgacg gcaacgttcc 120 cccccatcat cgaggagcgg ccgaccacga aggaggcacc acgcaagcgc agccagcgaa 180 aaaaaccccc gtgcacaaac cccgaaccca cgtgagtgca aatcgacacc gaaggtggcc 240 gacagtgagg aacactgttc aggaacattt ttccccgacg gagcgaccga tcctagcgga 300 aaagttcctt ttcggtgctg agcgatcgcc gaacattttg ctgacacacc ccgcgccgtg 360 tcgcacaccc gccgatcatt ttggtacccg tacgtgtttg cgcacccgcc gatcataacc 420 tcacacgtac cgccgagcgc gctccagacc cacgcggttt ttgtgtgtgc accgtgtgtg 480 tgtgtgtgtg tgtgggtgaa tgtgcgcagg ccgacaccga gcggattgcg tcagaatttt 540 gctcgagcta cgttcgtcat ttttttcgac cgtgcaccga agacgtcgtc agcgcacgca 600 gccatcgttc tcttctcgcc gacaccaccg accgaacgcc accgaagatc atcgcccctc 660 gtttctcaca ccaccggcgt catcgacgaa cgcagccaac gagcgactaa tcctaacacg 720 atcgaccgcg tgtgcggatt tttcgtcgcc gaaggatcga cctagccaac ctcagctgga 780 cttgcttgcg cccccgccac taaggtaaga tccacccttt tttaactaac cttagtcgta 840 aggatgttgc acagtccgcc ggtccgcgac gtatcgactc ccgatggcgt aaccccgagt 900 gccgatccag ccgcgagtgg atccaaatcg cctcacgtac caacaccgcc cgttccgaat 960 accccgcgcg taccagggcc gtccgcctgc gacgccatgt ttatgccgcc cgaatcgcag 1020 attgacactt tgaatgccat gcagctgaaa ccaccggaga tggacaccac tgacattcaa 1080 acctttttct tcgcattgga aaactggttc gatgcgtgga atatcaccac gaaccaacat 1140 attcgccgtt ttaacattct tagaacgcgt ataccgcttc gtgtccttcc tgagcttcgc 1200 cccctgttgg agaacattcg acagtacgct 
acggaccgtt acgaggtagc aaagcgtgca 1260 ataattgagc actttgaaga gtcgcaacga agccgcttgc atcgtctgct tgccgaaatg 1320 aacctcgggg accgaaaacc atcgcagcta ttagcggaga tgcgccgcgc cgcaaatgga 1380 gcaatgacgg actctatgct ggtagatttg tggatcggcc gtctcccgcc atacgtccag 1440 tccgccgtta ttgccactaa cacggatacc aacgatcgag ctaaagtagc agactctgtt 1500 atggattcgt tcgcgttata ccaccgaacg ggcccgtacc aaaccatcca cgaagtacgc 1560 aacgaggact tcgaacgtct ttctcggcac gtaacggaat taggtcagcg cttggacgcc 1620 gtactgagca agctcaacga acgagaacgc gcgcgaccac gctcacgtac ccggcaacgt 1680 caaccgaacc aggatgcggt aacacccagc ggacactgct attaccacac gcagtacggg 1740 caagcagcgc ggaactgtcg tgccccctgc tctttcaaca accgacggca gggtagtaac 1800 tcagccactg cttccgattg acgcttaacc agaggccaac ctcaacagat acacgtactt 1860 tcgacccata gctatcgtct cgtaataacc gatccaaaaa ctaacatcaa attcttaatc 1920 gataccggtg cagacgtttc agtaatccct cgacaacaca gttccgtccc gagtaaaccc 1980 tccaccatga agctgttcgc cgctaattct acaccaatcc aggtttacgg agagtcgctc 2040 tatactctcg atttgggact tcgccgatct ttcctttgga acttcatcat cgcagacgtg 2100 gggacagcga ttattggagc cgattttctc caacatttcc atctgctcgt ggacttgcgc 2160 aaaaaatgtc ttgtcgacgc cttaacgaac gtacgttcta ccggagtgcc gagccaaaac 2220 ccgtcggaac caaccgtaaa agtatgtgat tccacctcac cgatcgccac tctcctaaag 2280 gaatttcccg ggttaactgc actatccact cctggcacct tactgcagtc cgaagtgaca 2340 caccgaatcg aaacgacggg gcaaccaaca ttcgcaagac ctcgccgatt accacccgaa 2400 aagtacgcag ctgcccgcaa agagttcgaa tcactcgtcc agctcggagt gtgccgcccc 2460 tcgaatagca gctgggccag cccgctacat atgacaaaaa aggccgacgg cacctggcgc 2520 ccttgtggtg attaccgcgc cctaaatgca aaaaccgtac ccgaccgtta tccactaccg 2580 tttttacagg acttcacgat gcatttgcaa gacaagatca tattttccaa ggtcgatttg 2640 cacaaagcat accaccagat accaattcat ccggatgata tagcgaagac agccatcacg 2700 acaccctttg gactttacga gttcactacc atgcctttcg gattgaggaa cgcagcgcaa 2760 acattccaac gccttatcca tgatgtccta cgaggactcg agtttgtttt cccgtatatc 2820 gacgatatga tcgtagcatc aacgtccgag gcagaacacc acgaacactt acgccaactt 2880 ttcgaacgat tggagaagca ccaactagcc atcaatccag 
ccaagtgcga gttctaccgg 2940 aacgagattt cctttctggg ccatctggtc aacgcttctg gtattcgtcc tctccccgat 3000 cgagtccaag ccatcagcga gctgccacag ccaacgacga ttatggagtt gaagaagttc 3060 ctcgccatga taaactacta ccgacgtttt ctgccgcacg ccctggaaac gcaaggtata 3120 cttctcgaga tgactccagg taacaaaaag aaggacagaa cgccattaac ctggtcgcta 3180 gaagcttccg aagcattcgc ccaatgcaaa gagcaactga aacgtgcaac gttattggca 3240 catcccgtga agaacgccga actttctcta tggaccgacg cttcagattt cgcagccgga 3300 gccgtacttc accaacgcac caacgaagac ctgcaaccac taggcttctt ctcgaaacgt 3360 ctcgaaaagg cacagcaaaa gtactcgacc tatgaccgag aacttaccgc catctatctc 3420 gccatacgac acttccgata ccagctagag ggtcgggaat tctgtattta tacagaccac 3480 aagcctctaa ccttcgcctt ccgacaaacg cacgacaatg cctcacctcg acgagcccgg 3540 cagttagact tcattggcca gttttccacc gacatccgtc acatcgccgg aaaagacaac 3600 gttacagccg atctgctctc ccgcatagag acagtgcacg cgacaccgac catcgattat 3660 gagcgattag cagaagaaca agagcgcgac cctgaacttt ccgacattct cagtgggaaa 3720 attcagacgg acttgttcct gcagaagaca ccaataccgg gaagccccaa gtcactctac 3780 gccgactgcc ctggaggtat catcagaccg tacatcaccc gatcgtttcg aacacaactt 3840 ctccacgccg tacatgatct cagtcatccc ggagcccgcg ccacagctag actaataaca 3900 gagcgtttcg tgtggctcaa tgcaaggaag gaatcccagg acttcgctcg gaactgctta 3960 gcctgccagc gcgctaaggt aggaaggcac gtcaaaagcc ccttgatacc gtaccctgca 4020 acaacagcga ggttcagtca tatcaacgta gacatcattg gaccatttcc catcagtaac 4080 ggtaaccgat actgccttac gataatcgac cgatttactc gctggccaga agcaataccg 4140 atctcggata tcaccgcatc taccgtcgta tcagcactac tattccactg gatcgcccga 4200 ttcggagttc cggcgcacgt aacaacggac caagggagac aattcgaatc ctccttgttc 4260 aaagagttga cgaaagccct aggaacgaaa cacatccgta cgacagccta tcacccgcag 4320 gcaaatggaa taatcgagag gtggcaccgc actcttaaag cagcaatcac ctgcaaagac 4380 accgcaagat ggagcgaaca cctaccgcta atactgcttg ggctacgaac cacgttcaaa 4440 aatgacatca acgcctcgcc agccgaactt gtgtatggaa cgacgttgac catcccggca 4500 gaattcttca tcgcgaaacc gcaaaatgcc ctcgccgacc aatccgactt cgccaaaacg 4560 ttagaggaga cgatgagcag cattcgacca cagagcaccg cttggcatac 
caaccgcaca 4620 ccgttcgtgc attccgatct gaacaagtgt actcacgtgt tcatacgcga cgacaccgtc 4680 cgacctgcac taactacacc ttaccacggt ccatataagg ttcttacacg caatcctaag 4740 tcttttcaga tactcctacg tggacagcca acgctggttt cgatcgaccg cttaaaacca 4800 gcgtatggcg cagaagagga agccaccccg gccccgcagt gctcgtggga agggctaacg 4860 acaaacctgc tgccgccaac aaccgaccac tcggaaactc tgccgttacc ggacgtccag 4920 gcaaattcgg accgcagaga cgccaccgca gcctccaaac cgacgtcgcg cgaacaacca 4980 gtgcgtaatc agacgacacc cgcaccacca tcgcacccga cgacatcgag acaaaccgac 5040 cgagccgccg tcgacgcccc accaccctcc atcctacgcc gcaacgacca gacggtatcg 5100 accggcgtca ccaggtctca gcggaaggtc atcatacctc tacgttaccg gtgacaccgc 5160 tctaggaggg gagtac 5176 // ID QUETZAL repbase; DNA; ANG; 1680 BP. XX AC L76231; XX DT 21-AUG-1997 (Rel. 2.07, Created) DT 21-AUG-1997 (Rel. 2.07, Last updated, Version 1) XX DE Quetzal, a DNA transposon of the Tc1 superfamily. XX KW Harbinger; DNA transposon; Transposable Element; Quetzal; TIR; KW Tc1 superfamily. XX OS Anopheles albimanus OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1680 RA Ke Z., Grossman L.G., Cornel J.A. and Collins H.F.; RT "Quetzal: a transposon of the Tc1 family in the mosquito RT Anopheles albimanus."; RL Genetica 0-0 (1996) In press. XX DR GenBank; L76231; Positions 1 1680. XX CC TA target-site duplication. CC The closest known element to Quetzal is the Uhu transposon CC from D. heteroneura. CC It has 236-bp TIRs (positions 1-236 and 1445-1680). CC The transposase is encoded by the sequence at positions 373-1398.
XX SQ Sequence 1680 BP; 561 A; 320 C; 355 G; 444 T; 0 other; cacttctcca caaaagtgaa tacacagcaa acagttttag gaataaatgc ctctagttgt 60 gcatagaccg aattaaaaat atgggaaaat tatcaattat gctttgacct atttgcaatc 120 gattgacgct tggtacgatg tccgtccgat gttgaagttc cgagaaattc tcggaaaact 180 gctgggatgc ctcaaaatcg ttccacaaaa gtaagtacac agcacaggtg ttcgttttgt 240 tttgaatcgt agataaactt ttaattgatg cgttaatgat cgattgggtg caatatgctc 300 agtcgttcag atagtttcga ggtgaacagt ttttagcttc caaatcgact gcagtttacc 360 ataattccaa aaatgaccag agaagaactt tctgtctcta aaagacaaga tattataaga 420 ttgcacggcg ctcagggcaa aagctacaca gaaattgcaa tgttaacaaa cattaataga 480 aatactgtcg ctagggtcat ccagcggtac aaatacgagg gccgtgtatc taatttacct 540 agaaagggtc ggccctcggt gtgcactgat cgtatgcgac gggcgataaa acgattggtg 600 gatgctgaac cagaaatcag tgctcaatct gtagctatag tacttaacga aaggcacggt 660 attgccattt catgtgagac agtgcggcgg tacattcata aatttggcta caaggcttac 720 aacaggcgca aaaaacctca gatcagccct atcaatcgga aacggcgatt agaatttgcg 780 aaaaaatacg ttaaccaccc acccgagttt tggaaaaaag ttttatttac agacgagagt 840 aaatttaaca ttttcgggtg ggatggcaca ataaaggttt ggcggccacc cggagaaggc 900 ctgaacccta aatacacagc caagacggta aaacataacg gagggggtgt gctagtttgg 960 gggtgtatgg cggcaaatgg tgttggaaat ttgcaagtta tagatggaat tatggaccaa 1020 tatgtttata tcaacatttt aaagcaaaat ttaggaccaa gtttggaaaa attagggatg 1080 tctcaagatt attggttcca acaagacaat gatccaaaac acacggcatt caattcacgg 1140 ctatttttgt tgtacaacac tccccaccag ctaaaatcac cgccccaaag tcccgacttg 1200 aacccaatag aacatgcttg ggaattactt gaacgaaaaa ttcgtcaaac acgaattaaa 1260 aaccgtgtcg atctagaaaa caaattaaaa gaagcgtgga tcacaatttc tgaagattat 1320 acgcaaaatt tggtaaattc aatgccacga aggttggcag aagttataaa aatgaaaggg 1380 tatgctaccc gatattgaaa acgttacaaa atgtcgaagg acacgaaata aaaacgaaat 1440 acccaacgaa cacctgcgct gtgtacttac ttttgtggaa cgattttgag gcatctcagc 1500 agttttccaa gaatttctcg gaacttcaac atcggacggc atcgtaccaa gcgtcaatcg 1560 attgcaaata ggtcaaagca taattgataa ttttcccata tttttaattc ggttctatgc 1620 acaactagag gcatttattc ctaaaactgt 
ttgctgtgta ttcacttttg tggagaagtg 1680 // ID GYPSY71-LTR_AG repbase; DNA; ANG; 205 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY71-LTR_AG is an LTR of retrotransposon GYPSY71_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY71_AG; GYPSY71-I_AG; GYPSY71-LTR_AG; Gypsy clade; KW MDG3 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-205 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY71_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 182-182 (2004). XX DR [1] (Consensus) XX CC GYPSY71-LTR is a long terminal repeat of GYPSY71_AG (its CC internal portion is deposited as GYPSY71-I_AG). XX SQ Sequence 205 BP; 62 A; 34 C; 64 G; 45 T; 0 other; tgtgaagtag ggaaaagaga tagcgatatc gtggcgagcg ccgagtgagc gaaggagggt 60 atttaagcgg cgcgcgctgc tgggtcgaga ggtcagttga acgcgaaatg tcgaaggtga 120 aggacgtgcg aattggagaa taaataaagt gaactaaacg aaagccaccc atcttcttct 180 tttgatgttt aactgaattc tcaca 205 // ID TRANSIB2_AG repbase; DNA; ANG; 2542 BP. XX AC . XX DT 29-JAN-2002 (Rel. 7, Created) DT 29-JAN-2002 (Rel. 7, Last updated, Version 1) XX DE TRANSIB2_AG is a TRANSIB-like DNA transposon - a partial DE consensus. XX KW Transib; DNA transposon; Transposable Element; KW TRANSIB superfamily; TRANSIB2_AG; transposase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-2542 RA Kapitonov V.V. and Jurka J.; RT "TRANSIB2_AG."; RL Direct Submission to Repbase Update (27-DEC-2002). XX DR [1] (Consensus) XX CC TRANSIB2_AG is a young Transib-like DNA transposon. 
CC Its copies are only ~1% divergent from the consensus sequence. CC The consensus is incomplete at its termini and encodes a 662-aa CC TRANSIB2_AGp transposase. XX FH Key Location/Qualifiers FT CDS 0..0 FT /product="TRANSIB2_AGp" FT /translation="TFYYKDDFLQIQRYRKTIRIQKPTDYNVNKVLQYISS FT VININEIEKQRLVGNLKYLYYRFRTLFAKSSRNMLRFENKNRPWLALELKV FT SNVNLSDSSRGRPKKPFAELSIRNKRRFVANEQKSSVDIEQELYRVRLLAY FT REKNCNLMAVIDKLLSHPENVFESIKNCGKSVSLEESLALFIDNRWSKAQY FT INMYQKTKNMFPSYTALSNFKKTCSPCEDFINVSETKASVGLQAVLNHTAS FT RIINMKKDKIIQNFDSENVSFKNINLLCSWGIDGSTGHSNYQQKFDGVNES FT MVTDSELLVTAFSPIRLAQSENDGNIFWLNLLPQSTRFCRPLAIEYVKESK FT EKVLESINFIKTEISNLIPFKIDLSETKYVIITYSFYMSMIDGKVLAYVTN FT TSSMQCCCICGAAPNEMNSKDNLENGFLAREESLHYGISPLHCWMRFFECL FT LHISYRLEFKQWKVTKNFKDIFTQRKKSIQQKIYEEFGLRVDEPRPMGANS FT TTGNVCRRAFSDVTKLSRILEIDEQLISRXKNILIAINCSQPIKPHALSLY FT CKDTYSIYLNNYSWFKMPSTVHRVLAHIGEVILRAPAPIGALGEEAAEGRH FT KLYRQDREIHARKNSRINNLKDIFMQALYSSDPYISSISLDKRLQKTSKNQ FT YPDEVKQFFPEFL" XX SQ Sequence 2542 BP; 926 A; 397 C; 463 G; 756 T; 0 other; gaatgtcaac aaatgataat tagtaaatat tttaattatg aatgaacatg aacttcatta 60 aatttatgaa ttatatacgt ttaaaatggt tatctaaaca ccaaaacaaa tatttgctgc 120 aaatttattg attcattaac tgccgattta aactccataa tgattggaat cgggactgac 180 attatgacag atcaaatgtc agactggtgt atcccacgag accccccccc cctccgaaga 240 ggggaacaga aaccgccatg tttacagttt ggtatgtgca agcttatcta gtagtttgtt 300 ggtgcataat ataaactttt tattataaag atgacttcct tcaaatacaa cgatatcgta 360 aaacaattcg aatacaaaaa cctactgatt ataacgtcaa taaagtgcta cagtatatat 420 caagtgtgat taacattaat gaaattgaaa aacaacgact agtgggcaat ttaaaatatc 480 tttactatcg gtttcgtact ctttttgcca aaagctctag aaatatgtta cgttttgaga 540 ataagaatag gccgtggctc gctttagaac ttaaggtatc caatgtaaat ttaagtgata 600 gttctcgagg tagaccaaaa aaaccattcg ctgaactttc catacgaaat aagcgacgtt 660 ttgttgctaa tgagcaaaaa agcagtgtag atatagaaca agagctgtat cgtgtccgtc 720 ttttagcata tagggaaaag aattgcaatt tgatggccgt tattgataag ttactcagtc 780 atccagaaaa tgtttttgag agcattaaga attgtggcaa aagtgtatct ctagaagaaa 
840 gtttagcgtt attcatagat aatagatggt caaaggcaca atacataaat atgtaccaaa 900 aaacaaaaaa tatgttccct tcttacacgg ctttaagcaa ttttaagaaa acttgttcac 960 catgtgaaga ctttattaat gttagtgaaa ccaaagccag cgtaggactg caagcagttt 1020 tgaaccatac ggcatctaga attatcaaca tgaaaaaaga taaaattatt caaaattttg 1080 atagtgaaaa cgtaagcttc aaaaacataa atttattatg ttcttgggga atcgacgggt 1140 caactggtca cagtaattat cagcagaaat ttgacggggt gaacgaaagc atggtaacgg 1200 acagtgaact gctagtaact gctttcagtc caataagact agcacaaagt gaaaatgatg 1260 gaaatatttt ttggttaaat ttgctgccac agagcactag attttgcagg ccgttagcga 1320 tagagtatgt aaaggaatct aaagaaaaag tattagaaag tattaacttc ataaaaaccg 1380 aaatttcgaa tttgattcca ttcaaaattg atttgagtga aacaaaatat gtaattatta 1440 catattcttt ttatatgagc atgatagatg gaaaagtatt ggcctatgta acaaatacaa 1500 gttcgatgca atgttgttgt atttgtggag ctgctcctaa cgagatgaat agtaaggata 1560 acttagaaaa cggattttta gctagggaag aatcgcttca ttacggaata tcacctttgc 1620 attgttggat gcgctttttt gaatgcctat tacacatttc gtacagactt gaatttaagc 1680 aatggaaagt aacaaaaaat ttcaaagata tatttactca gcgtaagaaa agcatccaac 1740 agaagatata tgaagagttt ggtttgcgag tggatgaacc aaggcctatg ggtgctaaca 1800 gcacaactgg aaatgtatgc cgtcgtgcat tttctgacgt gactaaactt agtcgtatat 1860 tagaaattga tgaacaactt atcagtagat aaaaaaatat tttgattgcc atcaattgtt 1920 cgcagccaat aaaaccgcac gccttaagtt tatactgcaa agacacatat tcaatttatt 1980 taaacaatta cagttggttt aaaatgcctt caacagtgca ccgcgtactt gcacatattg 2040 gagaagttat attacgagcc ccagcaccaa taggcgctct aggagaagaa gctgctgagg 2100 gtcgacataa actgtataga caagatcgtg aaattcacgc gagaaaaaac tcaagaatca 2160 ataatctaaa agatattttt atgcaagccc tttattcttc agatccctac attagttcca 2220 tttccttaga taaacgcttg caaaaaacat caaaaaatca atatccagat gaagtaaaac 2280 agttttttcc agagttttta taacactagt tggtctagtt ggcacatatt gagtatgcaa 2340 atgatgtttt agagtgatat caatgaagga gaagaagaga atcaagaatt tcaaaatact 2400 agcactacta aatactactg cacgacaatg ttgaagaatt atgaaagctt ttaaactatg 2460 cagaactatt actattaatt taaagaccct gaagtacttg ttgaattgtt gaattgagaa 2520 
agtctaccga tgttgtgcat tg 2542 // ID COPIA1-I_AG repbase; DNA; ANG; 4129 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE COPIA1-I_AG is an internal portion of the COPIA1_AG LTR DE retrotransposon - a consensus sequence. XX KW LTR Retrotransposon; Transposable Element; 5-bp TSD; COPIA1-I_AG; KW COPIA1-LTR_AG; COPIA1_AG; Copia clade; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4129 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "COPIA1_AG, a family of copia-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 49-49 (2003). XX DR [1] (Consensus) XX CC COPIA1_AG is a young family of copia-like LTR retrotransposons. CC COPIA1-I_AG, an internal portion of COPIA1_AG, is flanked by 99% CC identical COPIA1-LTR_AG LTRs. The COPIA1-I_AG consensus sequence CC was reconstructed based on multiple alignment of 4 copies. CC The consensus sequence encodes the 1323-aa COPIA1-I_AGp protein CC (positions 151-4119).
CC MSESHVTIEKLNDQNYAIWKFKMELLLAREKVLTVVKDSKPASPDAAWIANDERARALIGLSLDDSQLIH CC VMQTSSSKDMWDALKGYHERSSLSSKIHVMRKMFATKMTEGGDISNHLKELCSLRLRLIALGEEMKDPSF CC VALMLSSLPKSFDGLIVALESRPDEDLTVDYVKGKLLDEGRRRADGADEDKALLSGGKNxTKFWKDRKLT CC TNKEKQCHYCKKNGHIRKDCRKWAADKRSKLDGESVNVANEDNREEVCLFIGExNETGPWCFDSGATSHM CC TNDTSILKLIDKSKQSSISLANGDSIKSAGVGxCKLFSMDGNGKRKKVSLDxVCHVPSLTTNLLSVSKIT CC DNGFExxFDRxGCRVLKGKQVLLIGERKGGLYYLKQTEQAMLVDKNHEASCIHLWHRRFGHRDIEAIMKI CC ARNNLGSGLNINRCHVKSICGSCCEGKMSRDPFPNSSSSRTSGVGELIHTDLGGPFEVSTARGSRYFMTM CC VDDFSRYTIIYLxQNKCETENRIREYCxMMKTQFGHYPKVIRSDGGGEYRSNSLKEFFVDHGIVHQQTAP CC YSPQQNGVAERKNRYLVEMMRCMLAESNMDKVFWGEAITTANYLQNRLPSSLLESTPYEMWHGKNPSYEH CC LRVFGSEAFVHIPKEKRRKLDKKAEKLVFVGYADNQKAYRFVNLETKTITISRDAKFLEQCEIEKIGTKP CC KPTTSGGVVVLPLGSTPSLCRAEETTTRENIVQMEASAESSCIRESNMNDTVDxLDVTPYNSASDGELSD CC EPGAIEMHQSVRRSxRTTKGIAPVRFREESYMAGSSEQNEEPRNLKEVFVCAAREKWISAMENELKSHEE CC NGTWDALVELPAGRKVVGCRWIFKLKRNAAGQVIKHKARLVAQGYSQQFGEDYDQVFAPVTSHTTFRLML CC AIASKTQMKLRHLDIKTAYLYGDLDQELFMRQPPGYExKGKEHLVCRLKKSIYGLKQSARCWNQKLHGVL CC LEIGFQQSAADQCLYIKTEDGKRVYILVYVDDMIVGCVDETLIDSVYHALTEHFEMTDLGPVSYFLGMEV CC KxEKGNYSVSLEGYIEKLIRKFGLSEAKTAKTPMDEGFLKQQDSSSILKDSTQYRSLVGALLYISVCTRP CC DIAVSTGILGRNVSNPTESCWVAAKRVVRYLKATKHFKLTFNKAGSDLIGYSDADWAGDTITRKSTSGYV CC FFYASGAVSWASRKQTSIALSSMESEYISLSEATQEQMWLTRLMKDLGEHIENPVKIFEDNQSCICFVNS CC DRTNRRSKHIETKEHFVKQQCESRKMMLEYCPTEEMVADILTKPLGATKQRKFTEMLGLHGTR. 
XX SQ Sequence 4129 BP; 1357 A; 731 C; 960 G; 1064 T; 17 other; ggttatgggc ccagaagtac tgtcgaacaa gtgttaagtt tgtgttatat tgaagtaata 60 atctatcgaa aattttaatc ttgaagaaat actgaaaact tatcgaaaat tttaatcttg 120 aagaaacact gaaaacttaa atctagaaac atgtctgaat cacacgtcac cattgaaaaa 180 ttaaacgatc aaaattacgc aatatggaaa ttcaagatgg aacttttgtt agcaagggaa 240 aaggtgctga ctgtcgtgaa agattcgaaa ccagcaagtc ccgacgctgc atggattgcg 300 aatgatgaac gtgctagggc actgatcggt ctgtcgttgg acgacagcca actcatccat 360 gtcatgcaaa cgagttcatc gaaagatatg tgggatgccc taaaaggcta tcatgagcgt 420 tcatctttgt ccagcaaaat acacgtcatg cgaaaaatgt ttgccacaaa aatgactgaa 480 ggtggagaca tttctaacca tctcaaagaa ctatgttctc tgcgacttcg tttaattgcg 540 ctgggagaag aaatgaaaga tccatccttt gtcgcgttaa tgttgtccag tttgccaaaa 600 tcctttgatg gtttgatcgt ggctttggaa agtaggcctg atgaagatct tacggtggat 660 tatgtaaaag gcaaattgtt ggatgaagga agacgtcgag cagatggtgc agatgaagat 720 aaagcgttac tatctggagg aaagaayawc acgaaatttt ggaaggacag gaaactaaca 780 accaacaagg aaaaacagtg ccattattgc aagaagaatg ggcacataag aaaagactgt 840 agaaaatggg ctgcagacaa aagaagtaaa ctagatggtg aaagcgtcaa cgttgctaat 900 gaagacaatc gagaggaagt atgtttgttc attggagaar gaaacgaaac tggaccatgg 960 tgtttcgatt ctggtgcaac ttctcatatg acgaacgata cgtctatttt gaaattaata 1020 gataaatcga agcaatcctc gatttcatta gcgaacggag attccatcaa gtcagctggt 1080 gtcggaarct gcaaattgtt ttccatggat ggaaacggaa aacgcaagaa agtttccttg 1140 gacaakgtgt gtcatgtacc atctttgacg acaaacctat tatctgtaag taaaattacc 1200 gataatggat tcgaartgyt tttcgatagg twtggatgtc gtgtcctgaa aggaaaacaa 1260 gtattgctga ttggtgaacg taaaggtggt ctgtattatc taaaacagac tgaacaagcc 1320 atgttggtag ataaaaacca tgaagcttcc tgtatacacc tatggcatcg tcgatttggc 1380 catcgtgaca tagaagccat aatgaagatt gcgcggaaca atttgggaag cggcttgaac 1440 atcaaccgat gtcatgtgaa atccatttgt ggatcatgct gtgaagggaa gatgagccgt 1500 gatcctttcc caaattcttc atcttcaagg acatctggtg ttggcgaact gatacatacg 1560 gacttgggag gaccgtttga agtatcaaca gcccgaggaa gccgatattt tatgactatg 1620 gttgatgatt ttagtcggta tacaattatc 
tacctgktgc aaaacaagtg tgagacagaa 1680 aaccggatca gagaatattg crccatgatg aaaacacagt ttggacacta tccgaaagtc 1740 atcagatcgg atggtggtgg tgaatacagg agcaattctt taaaggaatt ttttgtagat 1800 cacggsatcg tgcatcaaca aactgcccca tattctccac aacaaaacgg cgtggctgaa 1860 cgtaagaacc ggtacctcgt tgaaatgatg agatgcatgt tggcagaatc gaacatggac 1920 aaggtgttct ggggtgaagc gatcaccact gccaattatt tacaaaatcg cttgccatcc 1980 tccttactgg aatcgacacc ttacgaaatg tggcacggaa agaatccttc gtatgaacat 2040 cttcgagtat ttggttcaga agctttcgta cacattccta aagaaaaacg gcgtaagttg 2100 gataaaaagg ctgaaaagtt ggtattcgtc ggatacgcgg acaatcagaa ggcctatcga 2160 ttcgtaaacc tggagacgaa aacaattacc attagtcgtg acgcaaaatt tttagaacaa 2220 tgcgagattg agaaaattgg aacaaaaccg aaaccaacga catcaggagg agtagtggta 2280 ctgccacttg gatcaactcc ttcattatgt cgcgcagaag aaactaccac gagagaaaac 2340 atcgttcaaa tggaggcttc tgctgaatcc tcctgcatta gggaatcgaa catgaacgat 2400 acygtagatg awcttgatgt tacaccatac aacagtgcat ctgatggcga actatcagat 2460 gaaccagggg ctattgaaat gcatcaaagt gtacgtaggt ccaygcgaac aacaaaaggc 2520 atcgcacctg ttcgattcag ggaggaaagt tatatggcgg gatcttctga acaaaacgaa 2580 gaacccagaa atttgaaaga agttttygtc tgtgcagcgc gcgaaaaatg gatatcggca 2640 atggaaaatg aactgaaatc acacgaagaa aacggaacat gggatgcatt ggtagagcta 2700 cctgctggca gaaaggttgt tggttgccgc tggattttta agttgaaaag aaatgcagct 2760 ggacaagtaa tcaaacacaa agctcgtcta gtggcgcaag gctattccca gcaatttggc 2820 gaagactatg atcaagtatt tgctccggtc acaagccata caacatttcg tttgatgctc 2880 gctatagctt ctaaaacaca gatgaaatta cggcatttag atattaaaac ggcctattta 2940 tatggtgatc tagatcagga gctttttatg cgacaaccac ctggatacga gayaaaaggc 3000 aaagagcatt tggtttgtcg attgaaaaag agtatttatg gtctgaaaca atcagctcga 3060 tgttggaacc agaaactgca cggtgttctg ctagagattg gcttccaaca aagtgctgct 3120 gatcagtgtc tgtacattaa aactgaagat ggaaaaagag tctacatttt agtgtacgtg 3180 gatgatatga tagtcggttg tgtggacgag actctcattg attctgtgta tcacgcttta 3240 accgaacatt tcgaaatgac ggacctggga ccagttagtt actttctggg aatggaggtt 3300 aaatrtgaaa aaggtaacta cagcgttagc ctcgaaggtt 
acattgaaaa attgattcgt 3360 aagttcggat tgagcgaagc aaaaactgcg aaaacaccga tggatgaagg atttttgaag 3420 cagcaagact caagctctat tttgaaagac tctactcaat atagaagtct agttggtgct 3480 cttctataca tatcggtgtg tacgcgacca gatattgctg taagtacggg aatacttggt 3540 cgtaatgtta gtaatcctac tgaatcatgc tgggttgcgg ctaagcgtgt tgtaagatat 3600 ttaaaagcaa ctaaacattt taagctcact ttcaacaaag ctggtagcga tttgattggt 3660 tattctgatg ctgactgggc aggtgatact ataacaagaa aatcgacttc cggatatgtg 3720 tttttctatg ctagtggagc tgtgtcatgg gccagtcgca aacaaaccag cattgcatta 3780 tcatcgatgg aatcagaata tatttcctta agtgaagcta ctcaagaaca aatgtggctt 3840 actcgattga tgaaagactt aggagaacat attgaaaacc ccgttaaaat ctttgaggat 3900 aaccagagtt gcatttgttt cgtcaactct gatagaacca atcgtcgatc gaaacacatt 3960 gaaacaaaag aacactttgt caaacaacag tgtgaatcta gaaaaatgat gcttgaatat 4020 tgtcccacgg aagagatggt tgcggacatt ctaacgaaac cactaggagc aacaaaacaa 4080 agaaaattta cggagatgtt agggcttcat ggcacacgtt gaggaggag 4129 // ID RETRO49_AG_LTR repbase; DNA; ANG; 164 BP. XX AC . XX DT 06-FEB-2003 (Rel. 8.01, Created) DT 06-FEB-2003 (Rel. 8.01, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO49_AG DE retrotransposon - a consensus. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW Long terminal repeat; retrotransposon; BLASTOPIA; INVADER2; KW RETRO49_AG_I; RETRO49_AG_LTR. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-164 RA Jurka J. and Drazkiewicz A.; RT "RETRO49_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 11-11 (2002). XX DR [1] (Consensus) XX CC Related to INVADER2 and BLASTOPIA from Drosophila melanogaster. CC 4 bp target site duplication. 
XX SQ Sequence 164 BP; 64 A; 24 C; 47 G; 29 T; 0 other; tgtaaactcg ccttaagtgc gaaacgataa gcgcaaacga taaagatgca ggcagataag 60 gatgagagcg atcattatcg aaagagagag agatagagca ttagatgagc gacagtgagt 120 gaagacggac gtttatggaa gcaagcgaat aaagtaccgt taca 164 // ID CR1-5_AG repbase; DNA; ANG; 4525 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE CR1-5_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW AP endonuclease; CR1 clade; CR1-5_AG; DNA/RNA-binding; PHD finger; KW Non-LTR retrotransposon; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4525 RA Kapitonov V.V. and Jurka J.; RT "CR1-5_AG, a family of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 16-16 (2003). XX DR [1] (Consensus) XX CC CR1-5_AG is a young family of CR1-like non-LTR retrotransposons. CC The CR1-5_AG consensus sequence was reconstructed based on CC multiple alignment of ~50 copies identified in the CC sequenced portion of the genome. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC CR1-5_AG occurred less than 1 million years ago. CC The 3' terminus of CR1-5_AG is composed of the ATAAAC CC microsatellite. CC CR1-5_AG encodes two protein sequences: a 378-aa CR1-5_AG-ORF1p CC (positions 228-1361) and a 968-aa CR1-5_AG-ORF2p (positions CC 1492-4395). CR1-5_AG-ORF1p is a DNA/RNA-binding protein composed CC of the PHD domain (aa positions 5-60). CR1-5_AG-ORF2p is composed CC of the AP endonuclease (aa positions 1-250) and reverse CC transcriptase (aa positions 520-750) domains. 
XX SQ Sequence 4525 BP; 1158 A; 1175 C; 947 G; 1245 T; 0 other; tcagctttga cagttgtcaa catagcgtgt ctactgctat cgctccgcat ttatccgcta 60 ttttattggt gtaatctggt tttgctataa tcaacacaag ttttagtgtg tccataagtg 120 tagtctcgtg accctgccat cgtgttcctg acgaaaacca ccgtgttcgc gatatatttg 180 gcttttattt gtggctcctt cgacgttcat ctgcagcggc atctaaaatg gaggatgtgt 240 gttatgcatg ttctgaagca ctgggcccgg ttgaagggtc aattacgtgc ctatattgtg 300 agggtacata ccatctggcg tgcactaagg tgcctatatc ggtcatcgac gaagtgaaac 360 ggatggcttc cctgcactgg agttgcattg ggtgtactaa cgccatcggg aatccgcgta 420 gcaaggcgat caagggcatg ggtatgcagg ttgggtttca ggcagccctc acagctgctg 480 ttgatgccat gaaggctagc ttggtcccac cggtaataca ggagattcgg gatggctttg 540 ccggaatcgc cacggcccat tcggcatcct tgcaacatag caatcaattg ctcccagatg 600 cactaccaaa cggaaaaaga cgaagattat ttcgtgatgc agttgcatcc tcagatgccg 660 tgtttgttga cagtgcagca atctcaaacg tcacaaatac aaacaacact caccggatcg 720 atcgtccttc actaccacca ataatcacag gtactaatac aacctccgct tcaataccaa 780 ctgtttcaca gacaccaaga accgattatt tatggctgca tctttcaagg ttggctgcgt 840 ccgtcaccat agagcaagtt gtctcgttgg tgtgtttgca gttggacact gcagacgcaa 900 ttgcttttag tctgctaaaa gcaggaacgg ttcctagccc attgagtgct gtatcgttca 960 aggtcagaat ccctgctgcc cttcgtagta aggcacttaa cgcagcctcc tggccggtcg 1020 ggctcggtgt gcgtgagttc atctcgctcc cgccgcgatc tctacattca cgaacacaac 1080 acacgaacac ttacacaccg ataccgcaac aaccccgtac tcagcagaca ccggcaccac 1140 tacagcctcg cactcaacac tcaccggcac cacagcagcc ttgtacccaa cacaataatc 1200 cccacaataa tccatactca cctgcacata tccatgaacc cgcacgaaca caaaccgaac 1260 aatcaccaac cgatatgaac tgcactctcc cttcacccac tcttccgtcc acgtcacgtt 1320 caaaaacact tcaaactaca ctggttcaat tttttccgaa gtaaccgcac tacagcagcg 1380 cacccatatt ttgcgcatac ggctcgagca aatccttcga caacaaacac aatacactca 1440 ccttctacga taagctcgct caattgcaac accacagcgc ctactaacac tatgcaacaa 1500 gcacccactg atcgagcggc ttgtatcact gtgtactatc agaatgtgcg tggcctgcgt 1560 tctaaagccg atgagttccg tttgtcggtt cttgagacgg aatacgatgt attggtactc 1620 actgaaactt ggctcgatcc ctgtattccc 
accgctctcc ttctaggaga tgagtatcgt 1680 gtttaccgat gcgaccggga tgccactaac agctctcact ctcgtggggg tggtgtttta 1740 attgcatgca acgtttctct tctctcttat gtgcttccta cgttgccgca actgttagaa 1800 ctcacttgcg tatgtattca gttatgcgac caccgtttgt ttattgccgc tgcctacctg 1860 cctcctaacc atagcatgaa tgaggagaaa ataagtgcat tgatcgactt cgttgctaat 1920 atctgcacta ctcttggtcc gagagacagg ttcatgctta ttggtgattt caaccagtct 1980 acgctatcct ggacatcggc gtcagaggaa aatggagctg cttttgattt ctatgagcct 2040 catgcacgct ccgcgcgcag tgtccagttc gttgatggtt tgcatcagag tggactatac 2100 cagcttaact tcaatactaa ctcatcgggg cgaatacttg acctaattta cgcaaattgg 2160 ccagctgcat ctacatgttc ttcaatacgt gtctgtgaat atccccttac tactatcgac 2220 gaacaccacc ctccgcttga ctttgatttg gacaacgtag ccccagttac gatcgctaca 2280 gccattgatg ctgataaaag gctgaattat gctcgtgttg accttccaaa attggagcga 2340 ttgattttat cttttgacaa ttcttttaac tgttcggact attcaaccat tgacgatgcc 2400 acagaggctt tttgtgagtt catgagatct gctatatgtg aatgcactcc tgttaaaatt 2460 cctcgtcgtg gaccaccatg ggctgatcgc acactaaatg cattaaaaaa agaaaaaaga 2520 aaagcctatt ctgatcaacg agcagaacga aacagcacgt cacgtcttta ttacaacaga 2580 gttcattctc tctatcgccg atataataga tcctgtcacc gaagttattt gaaacaaact 2640 gcacgttccc tctgcaagta ccctaggcgc ttttggagtt atatggacaa aaaacgaaaa 2700 tctgctgggc taccaagcat tatacgttat gacggtgaca gtgcctgttc cctgccagaa 2760 atgtgtaaat tgtttgcctt gcgcttcaag gacaacttcg cttctcaaac tactggtcca 2820 gaagatgtgg ccgatgctct ctctaacaca cctgttggtg cactgctccc tatgctacca 2880 gtcatcactg tcgatacgat cacctctgcc atcaaacgtg tcaaatcctc atatacccct 2940 ggcccagatg gtataccggc cgttatccta aagtggtgtg cttctgcgct tgctccctcg 3000 ctgatgaaga tttttaagga gtccctgaga tgtggaacct ttcctgccac ttggaaatct 3060 tcctggatga cacccatcta caagaagggc tgtaagaatg atgctgtaaa ctatcgtggc 3120 ataacctcac tcagcgtgtg tgcaaaggtg ttcgaaatgc taatctacga gccattgttg 3180 gcctcggcct gcaactatat tagtgtgaat caacatggtt ttgtacccag gcgatctaca 3240 acgactaatc tactagaatt tgttagcaaa tgccataaat ctatcgataa cggtttacaa 3300 ctagatgcta tttatacgga tatcaaggca gcattcgaca 
gtgtatcgca ctccatcttg 3360 ctagcgaaac tcgacttact tggtctccca aaccctatga taatgtggct tagatcgtac 3420 cttacagatc gtcaatattc tgtgaagtta ggcccgtaca tgtcaagtcc agtgcatgca 3480 tcttctggag tcccgcaggg cagcaacctg ggtcctttgc tgttccttct tttcatcaac 3540 gacgcgacat tgatccttcc ggctgataat catctactgt acgcagatga cgcgaaaatt 3600 tttcgtgtta ttcgtgaacc agaagaccac gcccgactgc aaacttcctt gcatgagttc 3660 cagtgttggt gcaaccgtaa tgctttatcc ctatgcacac ataaatgtga agtcatcact 3720 ttcagtcgtt ctcgctgccc atcattgtat gaatatgcgc ttgatggaca gtccttggcg 3780 cgaaaacaat gtgttaagga tctaggtgtt ttgctcgata caaaactatc atttaaggat 3840 cagctggatc acgtagtagc caccagcaat agaatgctag gactagttat caatatgact 3900 cgcgagctta atgatatacc ttgcaccaag gcgctctatt gctcacttat ccggtcgttg 3960 atggaatatg caaatatcgt atggtggcca actgcagcgc gtccgttagc tcgattggaa 4020 tcaatccagc gcaaaatttc acgatttgca cttcgctcat ggaaccaaag gctcgattac 4080 cggactaggt gtttactact cgggctaccc accctaagtg agcgaatacg aaaagccagg 4140 ttgtcgttca tcacgggact tctcgacggc cgtattgact ctccgtcact actggctgcc 4200 atcaacctgt acgttccggc caggccgctc cggactcggg caatgttggc cctcgacgac 4260 cgtagaacgc aatttggctc ctctgacccg ttcctactca tgtgccgtgc tttcaacgca 4320 gtcagcgacg cttttgagcc gaggatttcg ccaactgagt ttaatgatcg tgtttctgtg 4380 ttaaatttgg ttccatagtg cacattttta tgttctatgt tattgtaatc cattgtaaag 4440 aactcattgt aaaacattgc taaaatggtt cgagagggca ttattgtcca tcgatagaca 4500 aataaacata aacataaaca taaac 4525 // ID AARA8_AG repbase; DNA; ANG; 4188 BP. XX AC . XX DT 09-DEC-2004 (Rel. 9.11, Created) DT 09-DEC-2004 (Rel. 9.11, Last updated, Version 1) XX DE Mosquito putative non-LTR retrotransposon, partial sequence - a DE consensus. XX KW Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; putative non-LTR retrotransposon; KW AARA8_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RA Cook M.J., Martin J., Lewin A., Sinden E.R. and Tristem M.; RT "Systematic screening of Anopheles mosquito genomes yields RT evidence for a major clade of Pao-like retrotransposons."; RL Insect Mol.Biol 9(1), 109-117 (2000). XX RN [2] RP 1-4188 RA Gentles A. and Jurka J.; RT "Anopheles putative non-LTR retrotransposon."; RL Direct Submission to Repbase Update (30-NOV-2004). XX DR [2] (Consensus) XX SQ Sequence 4188 BP; 1371 A; 1046 C; 677 G; 1090 T; 4 other; ttcactgtaa actgtcgcct acctgcacta actgtggtac ccctgcgcat ggagagtgac 60 cgtagatccc gtgtgcataa ctgcaaggac aaccacagca ccacgtctcg tacctgtccc 120 cctacctcca agaagaacag gtcatcaata aagatcgaca acaactgttc ttacggtgag 180 gcacgtaaag ctctgcttaa tgttaccaat ccacaaattc tagtcccatt caaaaccgca 240 tcacatacgc acaacagaca cacacataga tgaaaaagac acccaaatta agaactcaaa 300 aaaagaatga atctctagaa aaaatgataa ttttgcttac acaaacagta actaatctaa 360 caacaaaaat cgaagatcag gataaaacta tgaaaagaat gtgcaagaat tttgaaaaga 420 gtttagaaaa aaagaaaata gaatctcaaa tggaaattga agaggatgaa agtgaagaag 480 aattagacga ttttagcacg gaagaaagca cggaagaaag cactgaagaa gaaactgaag 540 aagataatga agacaaaaaa gtagaaaaag taaaacatgt aaaacaaaat gagtagagaa 600 caaaataaca ccactaacaa acggaaacgt tccaccaaaa gacaccccct cctcggaaga 660 ggaatcccaa ccccgaaaaa tcatccccat ctctctcatc ccaatccccc aaccctccca 720 catccaatct tctctctgtc gcctctaaaa aaactctaga tctaacacga aatctgtaga 780 ttaatttata atttcaatat taacaatatc cttctgatta taagcattaa cttttatcca 840 tttttccttt actacataat tcaattcatc atatcaatct aataactatt catccatatt 900 tgcattatcc tggaatatca gaagcatgaa ccaaaatata gaagaactaa aactactaat 960 caatagacac gatccaacaa taatcaccct tcaagaagta atgcagttcc ctctctcctt 1020 agctctcctc ttaatagata caattggtac actaacatac acccaaccat acctcatcat 1080 agtgtagctg taggcatttt aaaacacata tcttcccgca gtagctgtag aaatttctct 1140 cccgtgcaat gctccatagt atgtttaatc tttctcattc aatcaaccct tttacaataa 1200 tttcgattca ctagtacttt cagcaaatta ataataattt aattgtttta ggggatttta 1260 atgcccaaca taccacctgg ggtgctcgac ctcatgtaac 
agagggaaca gcattgcttg 1320 tgtttttgaa acaattggtc tggaagtaat ttcaaaccaa tccgctcacg tatttctccc 1380 ataaacggca aaggctccat acttgactac tgcgctgtgt catcttctat ggcacagcag 1440 ttcacagtta cagtttctaa tgacactctg gtagtgatca ttttccactt ttaatccaca 1500 gtagtttaaa tccccagcgg ccacttctcc gccccaggtg gaaatatgag gaggccaatt 1560 gggcagcata tcaaagagaa ataatgttca atcttccact tgatgaaaat ccccctcttc 1620 tcgctttact gcatgcattg aatacgccgc ttctcggagt attccccgta caactggaaa 1680 acctggaaaa agatgtgaac catggtggaa cgcaacagtt gctgcagcta ttaaagctcg 1740 cagagctgct ttgcgcaaat tccgccgagc tagcaaaaat ccggataatt ttttcacacc 1800 catttttgct gaggaatatc gtacagccaa tcgacttgcc aaggaagcag ttcgtcttgc 1860 caaaaagaat aactgggata atttcattaa tgaaattaat cctcagttat ctagcaagga 1920 agtgtggagg cgagtcggtt gtttgaatgg taaaaatcaa caaagctcca caatagttct 1980 caaaacccag gacactatca ttgcccctgc tgaagtacct gaagcctttg ccatccactt 2040 ttctgatgtt tctgccacac acaattatcc gagtaatttt caaacccata aacttaacac 2100 tgaatctgtt ccaatttctt tccctgatgc cactgaacat agatacaata gtcttttcac 2160 aataactgaa ctccattggg cactaaggaa atgtaaaggt agatctgctg gtcccgataa 2220 cataggatat cccttactgc aaaacctcct gtagaatcta aatccaccct tcttaacatt 2280 tacaatagta tttggtcttc yggtaatatt cctaatgatt ggaaaagtag tttaactatc 2340 cccataccca aacctaacaa acctaacmac aacgttgata gctaccgtcc aatctccctc 2400 cttagctgca tgggtaaagt tttagaacgc atggttaatc gccgcctctc gcaagaatta 2460 gaagacagaa acctgcttag ctctgaccaa cacgcttttc ggagcgggtt gggcacggaa 2520 acatattttg ccaaattaga tgatactatt caaaaatcaa tagaccagga tcatcacata 2580 gactttgcca tcatagatat ttccaaagct tttgatcgca cttggcgcca ttccattctt 2640 tctcaactag ctttttgggg ctttggtggc cggctcacta atttcataga caattttttt 2700 acagacagat catttagagt cctaatcggt aatagtgttt ctaatccata tcctctagaa 2760 aacggtgtcc ctcagggcgc catcctttct cctacacttt tccttatcag tatagaatct 2820 ctgttttgct ccattccata tgaaatcaaa ccttttgtct atgccgatga tatcattctg 2880 gtctcttcat cgagatctgt tagcacgtct cgtcagcttc tccaaaaagg agttgacaag 2940 ataaggcaat ggtctcggtg gacaggtcat gaaatatccc attccaaatc 
tcaaatcctc 3000 catatctgca aaatgggctt tcatcgcaaa ctaccaatca aacttcacga ccacatcata 3060 ccaaatgtaa actcagcgaa aatattaggt gttacatttg acacaaaact tactttcatt 3120 ccccactcca accggataag aaaagaagca aaaaccagac tgaacctctt taggatgtta 3180 ggagctggac aacatcgtgc atcacgtcaa acacttctgc agatcctcaa tagctggctg 3240 cttcccaaaa ttctctaygg tattgagatt gtctctcgcc aacgagaaaa ctttgagaag 3300 cgtattgctc ccacatacca tacagccatc cgcctttcaa ctggagcctt ctgtaccagt 3360 ccaatccact ctctactttg tgagagcgga cttcttcctt ttgattacat tatcaccaac 3420 agattaacag cagcagcagg acgcatccta gagaaggaca tcaaagccga atcatttatc 3480 aacagagtca atgcacaatt tcataccctt accaacaacc agcttcccca tatcagcaag 3540 cttaccggac gcggaggacg cccttggtat cgtcaaccgc ctacaataga ctggtcttta 3600 aaagataaac tacgtgcagg aaattgcagc catatcgccg gccatcattt taccagtctc 3660 atccaaagca aataccagaa tcaccatcat atctttaccg acggctctgt cctcaatgaa 3720 tccgccggtt tcggtatttt ctcctccaac aacagttgtg ccatcaaact accagaccat 3780 acctctatat tctcagctga agcaatagcc atgatagtcg ccgcgcaaga aggtatttgc 3840 cttaataaac caaatattat tttcaccgac agtgccagtg ttctagccgc cctagaacac 3900 ggtaacatcc gagatccgca catcyaacta ttagatcaac tagataactc cccggttata 3960 acattctgct ggatacctgg tcactcgggt atcagcggaa acgagaaagc tgaccaactt 4020 gcaaaccagg gtcggctccg tcctccagaa aacaacacaa ccatagcccg cagagacttc 4080 aacaaatgga gcaaaacaat agttgacgaa aaatggaacc tcacctggca ccggaaacaa 4140 aattctttct tcgaaccatc aaaccatcac aacagcctgg aaagacac 4188 // ID WaldoAg2 repbase; DNA; ANG; 4895 BP. XX AC AB090815; XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE Anopheles gambiae retrotransposon WaldoAg2 DNA, complete DE sequence. XX KW Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; gag-like domain; WaldoAg2. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Kojima K.K. 
and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol. Biol. Evol 20(3), 351-361 (2003). XX DR Genbank; AB090815; Positions 1 4895. XX SQ Sequence 4895 BP; 1364 A; 1112 C; 1279 G; 1140 T; 0 other; ctggtattat cagtgttgcg tgagttacag tgcaaaggcc cttcattcgt ttgtgttgtg 60 tgagtgtctg caaattaagt gattatacac taggctcgct aggtgagttc ctcgtaaatt 120 gattttcttc atgtgggatt cggttgagca aaaccaccaa tgagcgggaa cgaatcgcta 180 cctccccgtc ctctggggtc tgccttgaag gatataggcg cgttttttgg ccgtagtagc 240 aagacgccta gatctcctcc gtcggatctc ggagagtgtt cagcttctcc tacggttgaa 300 gtggttgcca gtacgtcggt cgattctgct gttgttgaag cggtaaccgg cgaggaaaat 360 gtgatggcag cagaggattc cgtgacctca caaccagtgg aatcgttttc ttcatcgaaa 420 gagccagcgc ttgttgtcgg tggaagcaag ctgcaggaag ctttaaaggt ggctggagaa 480 cttcatgcct atacgaaaga tcggaataac gttcaccatc cgatcaagaa gatggcagtg 540 agcatcttat cggcgttagc ctgtgttgaa cgtgagctca tgacgacacg gctgcgagcg 600 gaaaggactg aaaagtccct aaaagaggcg ctagaaggct gctcccaaac cgagacgcca 660 gtgaatggta aacggggcag aaatttgagg tcgactgagg aagcggacga tgctaagaga 720 gcaaaaaatg atgccccctc tggcagctcg ctggtcgcga gtgccggttg tggccccgaa 780 atgaacgaga caaaggggtc gtggagcacg gtcgtgcgga aaaatcgccg gaagccgaag 840 gaaagtgtca ttccggataa caccggcgaa aagcaagttc accctgcatc tactcgggag 900 gcgcttccgc cgaggcgtcc aaaaaccgaa gccgtactag tggcgcccgg cgaaaatatt 960 acccatgtgg aaatcctccg caagctaaaa gcagatcctg agcttcaagc tttcggtaag 1020 aaagtagtgc gaattcgagg aacaaaaaat ggaggcttgc tattcgaact gggcaaaagc 1080 gatgatgatt gcggagtcga ctacgccaag gtggttcaga attccattgg caacaatggg 1140 acagtaaaga ccttaggcca aatggaaacg gtggaaattc gatacttcga cgcagagacg 1200 cagaccagcg acgttgaaaa agatcttcgg gatttgttca ccgagctgga cggggtaact 1260 ttcgagacca agatgaccaa atccttcaac ggaatgcaga ccgcttcagt gaagctcccg 1320 acgaaactag caacgctagt ggcagcacgt ggcaagatta ggattggatg gtcaatctgc 1380 ccggtaaaaa tacaaatacc gaaaaggagg tgtttcaaat gctgggagac gggccatttc 1440 tcccgcgatt gcaaaggccc agataggacc gactgcgatg cggtgaaaca gggcattttg 
1500 ccaaaacatg cgtcaaacct cccagatgtg tcctttgtcc accgggtaca aacatgtatc 1560 ataccagcgg cgttttctgt ccagcgagca agcagaaatc gacatggaga tagttcagtt 1620 gaacctgaac cactgtgaag aggtccaaga catgcttggg caattgctta tagaggagaa 1680 gggagatgtg gcaatgctgt ccgagccgta tcgctgtcct agcggcgtaa acaattgggt 1740 ttctgactct acagggaccg ctgctatttg ggcttctggt agatttccga tccagcagat 1800 catttcaagg cacggcgagg ggtatgtgat tgcagtcatc aacaaaataa ccttctgcag 1860 ctgctacgct ccgccaagat gggatctaga gaaattcgaa gaaatgctta agaggatttc 1920 agatgaggtg tacgatgtta atcctatcat catttcagga gattttaacg cttgggccac 1980 ggaatggggc agtaaaagca caaacgccag aggaaacgcc gtgttggagc acttttctag 2040 actgaactta gtactggtaa atgttggctt ctgtcccaca tttgtaagga atagcagaac 2100 ttccattata gaccttacgt tctgtagtcc agcattggct tcttccatga actggagggt 2160 aagcaacgcc tacaccctca gcgaccaccg agtgatacgc tacacggcag gaagcaagtg 2220 ccacagggtt gcccagggct ccggttttcc agcctggaaa acacaatgct tcaacgaaga 2280 actgtttatt gaggcattga ggtttggtga tttctcaaat acctcgtcag cattgaagct 2340 agcgtcagcg atcgccaacg cctgtgacac ctctttgcca cgaaggaaag ggggacctta 2400 cccacgacgg agagcttact ggtggaccac cgaaatagcc cagtgccgaa gccattgcat 2460 cgaagcacgc agaaagatga atcgagccaa atcctcggaa caaagggagg atctgagacg 2520 tttgtacatc ctggcacgat caaatctaaa acggaagatc aaggcaagta aaaggagatg 2580 ctttttagcc ctatgcgatg aggttgaaaa caacccgttt ggtgcttacc gaacgctaat 2640 gggtaagatg gtcggccaag acctacctag ggaaaggaac ccaaccgcgc tcaagactat 2700 cattgaacag ttgtttccta accatgagcc acaaactcca cgtgatatat cccgcaatcc 2760 tgatgttgaa cctgtgtcaa tatccgctga tgaaatacag aaagcagcag accatctcaa 2820 actggggaag gcacctggtc cggacggtat tcccattgaa gcgatcaaag cagctatcaa 2880 agcgtacctg gaagcttttt tatcagtgtt ccagaactgc ttcgatactg gcttttttcc 2940 gataccctgg aagcgacaaa aactcgttct tctaccaaag cctggaaagc cccccgacga 3000 cgcatcagca ctaaggcctt tagcattgat agataatttc gcaaagatac tagaaatcct 3060 aatactcaac cgattagttg tctatacgga aggcgaacac ggactgtcgg acagacagtt 3120 tggctttaga aagggacggt ctactggtga agcgatcgca gctgttctaa agaaacgacg 3180 
aagcgccttg ctaaaaaaga ggacaggaaa cagatactgc gcgatcgtta cgattgacgt 3240 aaagaatgct ttcaacagtg ccaactggga ggccatacat gcagctcttt ccaagatgat 3300 gattccaccg tacctctgta ggctgctgag aagctattta gatcaccgtg tgtttttata 3360 cgacacagct ctgggaatca aaaaaatgag cctcaccgct ggcgttcctc aaggttcgat 3420 tttgggtcca actctctgga atgttatgta caacggagtt ttgaccctag ggcttcctcc 3480 aggacctgaa gtgataggtt ttgctgatga tattgctctt actgtccttg gtgagtcaat 3540 agaagagatt gaactcctca catccgactc cgtaagcaga atcgagtcgt ggatgcaaca 3600 aatgaggcta gaaatcgccc ataaaaagac ggaattcctc atcataagta gtcacaagac 3660 agtgcagtca ggtagcatcc gggtcggtga tgaacgtatt gagtcgatac gtcacctaaa 3720 atatcttggt gtgatcatcg atgaccgctt aagtttccgg aagcatgtcg agtacgcctg 3780 taacaacgtc tttaaggcag caatctcgtt gatacaaata atgccgaaca ttggagggcc 3840 caagagtagc tggaggcgac ttctagctga cgtggccttc tctcggttgc gttataacgc 3900 tgcgatctgg gcacatgttt tggtgctaaa ggaaaaccga cagttggcga acagagtgca 3960 ccgattgcta gctatgagag ttgttcgtgc ctacaaaaca atatcgcacg tggcagtttg 4020 tgttatcgca agcatggttc ccatctgcct catattagca gaggattctg agtgttgcag 4080 tttctccgga gtttcaaacg cgggattatc cagatcctca gcaaagcagc tctctatgag 4140 aaagtggcag tctgaatggg attgttcaac aaaggggcgt acgacacatg cactgatccc 4200 caacatcgct gcatggacga gcagaaaaca cggcgaagtg aacttctaca tgactcagtt 4260 cctctccgac catgggtgtt tccggagtta tcttcacaag taccgtcacg caagctcgcc 4320 agactgccca gcgtgtgtga gcatcgtgga gtcaacagaa cacgtgcttt ttcattgtcc 4380 tcgttttgct gaggaacgtc atgaaatcac cgtgaagtgc ggaacaacaa tcaacggaac 4440 aaacctgacc gagttgatgt taaagaatgc gggaacatgg gaagtcatag caaacggcat 4500 gcgatcaata ctgttgaagc tgtaagcctt atggaaagcg gatcaacgac ttgggcgttc 4560 gtaaatagtg ttgtgcgtat tgtgaaagtg tttagtgtat ctatgcacga attgaagtga 4620 gcctgtaact caagtgaatg cgtactttct gtgcttttgc gtggtattat gagtgtctcc 4680 aggtttatca ttctgtctcg cgaatgtggt ccaagttgca atagcgagat gtaatgctaa 4740 tgcatagccc tgccccaaga agcataccga aaggtgaacc catggggaag ggtagatggc 4800 ccaaggaggg ggtttactgg gtaaaaatca tatgtcaaca cccgtgcgac aacgggagtc 4860 tttcgaagat 
tccccctcct tgtaaaacaa aaaaa 4895 // ID P4_AG repbase; DNA; ANG; 7669 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE P4_AG, a P-like DNA transposon - a consensus sequence. XX KW hAT; DNA transposon; Transposable Element; HATN2_AG; KW P superfamily; P4_AG; composite transposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-7669 RA Kapitonov V.V. and Jurka J.; RT "P4_AG: a family of P-like DNA transposons from African malaria RT mosquito."; RL Repbase Reports 3(2), 26-26 (2003). XX DR [1] (Consensus) XX CC The A. gambiae genome harbors many divergent families of P-like CC DNA transposons. One of those families is P4_AG. CC P4_AG elements are flanked by 8-bp target site duplications. CC Terminal inverted repeats are 27 bp long (4 mismatches). CC Subterminal inverted repeats are 28 bp long (4 mismatches); their CC positions are 64-91 and 7606-7580. CC The P4_AG consensus sequence was reconstructed from CC 6 copies that are only ~2% divergent from each other. CC Presumably, P4_AG copies have multiplied in the genome CC during the last 2 million years. P4_AG elements carry a copy of the CC hAT-like HATN2_AG transposon that was inserted into an ancestral CC form of P4_AG. CC P4_AG encodes an 877-aa P-like DNA transposase called P4_AGp. CC Putative exons (based on FGENESH): 310-570, 2573-4146, 4211-5011. 
XX FH Key Location/Qualifiers FT CDS 0..0 FT /product="P4_AGp" FT /translation="MSTCAASFCQHSRYIVKKMGLDVIFHKFPTDPTLLRK FT WVEFCQREEAWVPSISSILCSAHFNKTDYQLINSPSKANRKILKKLKPSAF FT PSVIKSQAREPSNNVVQQCSTNNVVQQIETDEGRIDTSVEHHDDDLPSDNI FT THAKCQNCVQNETEIELLNQTLKKTQDKCNNLLEVNTFLSKQLEIVSKELT FT QSQKEIELLKTNHNKFKDVAISPNEFTTRMKNVLKDTLTSNQIDLITEERK FT RVRWTKAELSKFFTLRYLGKRAYQYLRDDLNFPAASISTLQRYGRTLNLKQ FT GILDDVINLLKNITVDLPECHRECVLSFDEMKVNRILEYDPASDEVLGPHN FT YLQVVMARGLFKNWKQPVFIGFDQQMTKEILFELIKRLYAIKINVVAIVSD FT NCQSNIGCWKDLGAHDYCHPFFSHPITKCNIYVFPDAPHLLKLIRNWLIDT FT GFEYNNKLIKADKLFELVAYRNAAELTPVHKLTQNHLVMTPQERQNVRRAA FT EVLSRTTAIALQRYFPDDCDAQELASFIEKVDMWFSVANSYSPCAKLHYKK FT SFNANENQLAALSDMFELMSNITALGKKSMQVFQKSLLMHITSLKMLYEDM FT RKKHSIVYISTYKLNQDVLENFFSQLRQIGGVHDHPSPLHCMYRIRMMILG FT KSPTTLKNHTELKNDDVENSHEHHEEFLSATVFSVADIPQSVPDISVMEKT FT NQICQAIEECSQESDLISTVSSTCNVQSAQESDGLQYVMGYIANKYNTKYP FT ELDLGVQTFKLTTDHCYSQPPTFVQHLSAGGLFEPSPTFLLLGNRMEKIFL FT KMHPDGTFSKTKKIVAKIAKNIQNQISELPVEIIRTFAKQRMIVRMRFLNL FT KSSTENLMKSKRKHVNQHGKGAKK" XX SQ Sequence 7669 BP; 2520 A; 1306 C; 1374 G; 2469 T; 0 other; caaggttagt tgaggataga ggttgaagca tccagtgtaa attgacaagc agctgtggga 60 aaattttgac agttcggtaa gaggttgttt atgttctcag acctacggca tgtggatgtg 120 gactcgtgta gaaacaaaca ttgtaaatct gtgtcgcgca atagagttgt gaaaaatcac 180 taaatttgta caaagcatat cgtttatatt gatctaagta catcacacgg ctttctaaaa 240 aggtaaattt ttgattatta tgtgttttac tgcaacaagc ctaatttatg tttgctattt 300 cagtgtgcat catgtcaacc tgcgctgcat ccttttgcca acacagccgc tacatagtga 360 aaaaaatggg tctggatgta atttttcata aatttccaac agacccaacg ctgttacgca 420 aatgggtaga attttgtcaa cgagaggaag cttgggtgcc atcgataagc agcattttat 480 gttcagcaca tttcaacaaa actgactacc agctaatcaa ttcgccttcg aaggctaata 540 gaaaaatttt aaaaaagctt aagccgtctg gtaagttttc atgtaatgtc ccacactaat 600 ctatactaat ataagcatac atttgataaa gttactaaat ttgaagggtt catacatgta 660 gctgctgttt cttcgacgtt tcgcccaacg tcgatgtttt actccgaatc gagttcatgt 720 tttccttcgt agaccataag gttactccga cttttaagta ttgaacgagt tattaggaac 
780 gcaatgataa gcaaccagaa agtaaataat caagttcatt attaaaacca ggactaatta 840 tcgtgaacga ttcgcaaaga ctaatatgat ctgcatacat catctacatc agtggtctcc 900 aacctgtggt ccgttgcgtt aacagacgtg gtccacgaaa gaattatttt tgaaaaagaa 960 aagcttaata ctatggatta tagtacacaa tcattataaa attttcaaca agcatatcta 1020 aaatgagatt taaacgtatt tttgatcaaa aacattaagt aaagttaaaa aaaagtatgc 1080 actcattaga tgtgagccgc ggcaaatgta ttttcaagcc aagtggtccg cgatcctaaa 1140 aaggttggag agcactgatc tacatcattg cacacatctt gctttctcaa acttcctcct 1200 caaattgtgc cattaaccct taccgtttta aattttgttt gcttcgttgt tcccagttgc 1260 ttcattagaa tgtgcgcaac atatattctt acaggtcatc caccacatcc gaagttttgt 1320 tttactttat gtgattttag tactattttc agttccaaca tccgttcaac gcaggtttta 1380 caggaaagag cttcttttga aaatcctgcg attgatttgt tcgtaaaatc ctaccgtgta 1440 tgcttccgcc gtgaatcgct ccggtgcgat acttttcgat gttcatatag atctgcacat 1500 acattcattt atttttggta atcgatggtt tgtccgccag ttcaccatcg aatgaattgt 1560 cttatgcagt aatacaacat acatgacggt atcgttgctg ttcgattcca ttgtagcaga 1620 aggcttagca gaaggtccat ttgaaacgtt gttttcgttt ttgttcgatt tttctgcaag 1680 agaaaatgtg gaagttgatt caagtggcag tcccactact cctcgcgaag ccgttggctt 1740 tgggtttgat tcaattttct tgtcttcttt tggtacttga ttcagttttc ctgtgtgtat 1800 cagattattt attgactcat tatgaagcac actttgttac gcaccattgt gtagtatatt 1860 tgcaaataaa aaaatcatat ttaaactaca taaattcttg cctttcctaa tgccatcgaa 1920 gtaactttga ttttggatac tgtcgccatg tagtagctgt gaactcgtgc tcgccagagt 1980 ggccttcaaa catagttaaa gaaggtaatc gttgcgaaag aaggttacaa gttacaagtt 2040 ataaacagtc gctttttgcg tgattagttg ctatgcattc tcgattcacc tacttttcta 2100 taagtgtgta tcaattgctt ttgtatgaat ggtgtggcct gctttattcc ttaactggta 2160 ctatacaaat tattatacaa ataataataa taataataat aataataata ataataataa 2220 taataataag ttattataag tcaaatataa ggtaaaaatt ataataattt aagactcttt 2280 tcttcaatga tgcactaaac aaaagcttag attgaaatat tactattcta agctcgaaag 2340 tgataaataa cggatgggaa ttaacgtaga taagtgaaac ggtaatattt tgctgaatca 2400 tctatttgtc tgcaaagact attatacata actttaagcg cttaattgta acaataacag 2460 ttattcacat 
ataataaaat gtaacaaata tttcttaaat aaaaaataca taaatatgag 2520 gcaaatctat ttcaattcgt taattatttc attcttacct ttttctattt agctttcccg 2580 tcagtgataa aatcacaagc tcgtgagcct tctaacaatg tagtacaaca atgtagtact 2640 aacaatgtag tacaacaaat tgaaacggac gagggtcgca tcgatacatc tgtagaacat 2700 catgatgacg atctaccatc agataatatc acacatgcga aatgccaaaa ttgtgtacaa 2760 aatgaaacag aaattgaact tttaaatcaa actcttaaaa aaacacaaga taaatgtaac 2820 aacttattag aggttaatac atttttatca aaacagcttg aaatcgtcag caaagaactt 2880 acacaatccc aaaaagaaat tgaactttta aagactaatc ataataaatt taaagatgtt 2940 gctatatcac caaacgaatt cacaaccaga atgaaaaatg ttttaaaaga tacgcttaca 3000 tcaaatcaaa tagatctgat tactgaggaa cgtaaaagag ttagatggac taaagcagaa 3060 ttaagtaaat tttttacact tcgttattta ggaaaaagag catatcagta tttaagagat 3120 gatttaaatt tccctgcagc atcaatttca acactacaac gatacggaag aacattgaac 3180 ctcaagcaag gaattttaga tgatgtaatt aatttgctaa aaaacattac cgttgatctg 3240 ccggagtgtc atcgggaatg tgttttgtca ttcgatgaaa tgaaagtgaa tagaatttta 3300 gagtatgatc cggcctctga tgaagttctc ggtcctcaca attatttaca agtagtaatg 3360 gcaagaggat tgttcaaaaa ttggaagcag ccagttttta ttggttttga ccaacaaatg 3420 accaaagaaa ttttatttga attaatcaaa cgtctttatg ctataaagat aaacgttgtt 3480 gcaatagtta gtgacaattg ccaatctaat attggatgct ggaaagattt aggtgctcat 3540 gactactgtc atcctttttt cagtcatcca ataacgaagt gcaatatata tgtttttcct 3600 gatgctcctc atctgttaaa actaataaga aattggttga tagatactgg ttttgaatat 3660 aacaataagt tgataaaggc agataaattg tttgagctgg tagcttatag aaatgcagct 3720 gaattaactc ccgttcataa actaacacaa aatcatttgg ttatgactcc tcaagaacgt 3780 caaaatgttc gaagagcggc cgaagtttta tccagaacca ctgctattgc attacaaagg 3840 tactttcctg atgattgtga tgcacaagag ttagcctcat ttatagagaa agtcgatatg 3900 tggtttagtg tagctaactc atattctcca tgtgctaaac ttcattacaa aaaatctttt 3960 aatgcaaatg aaaaccaatt agcagcatta agtgacatgt ttgaactgat gtcaaacatt 4020 acagcattgg ggaaaaaatc tatgcaagtt tttcaaaagt ctcttttgat gcacataaca 4080 tcgctaaaaa tgttgtacga agatatgaga aaaaaacaca gcatcgtata tatctctact 4140 tataaggtta gtattgaaca 
atagtgaaag atttcagtca actaatgtta tgttttattt 4200 ttgtttacag cttaatcagg atgttttgga gaacttcttc tctcaacttc gccaaatcgg 4260 aggcgtacac gaccatccat ctccattaca ttgcatgtat agaatccgaa tgatgattct 4320 tggtaagtca ccgactacat tgaaaaatca tacagagcta aaaaatgatg atgtagagaa 4380 tagtcatgag catcacgagg agtttctatc tgcaacagtt ttttctgtag cggatatccc 4440 tcaatctgtt ccagatattt cggtcatgga aaaaacaaat caaatctgcc aagcaattga 4500 agagtgtagt caagagtcgg acttaataag tacagtcagt agtacctgca atgtgcaatc 4560 agctcaagaa agtgatggac tgcaatatgt aatgggatac atagctaata aatataatac 4620 taagtatcca gaattagatt taggcgtcca aacttttaaa ttaacaactg accattgtta 4680 tagtcaacct cctacctttg tgcaacattt gtccgcagga ggattgtttg aaccatcgcc 4740 tacattttta ttattaggta atcgaatgga aaaaattttc ttaaaaatgc atccggatgg 4800 tacctttagt aagactaaaa aaattgttgc caaaattgcg aaaaatattc aaaaccagat 4860 aagcgaacta ccagtcgaaa taatacggac ctttgccaaa caaagaatga ttgttagaat 4920 gcgcttcctt aatttaaaaa gtagtaccga aaatctcatg aaaagcaagc gaaaacatgt 4980 aaatcaacat ggaaaaggcg caaaaaaatg agaaaaatat tgaactagtc tatacattat 5040 ctattctcat gtaaatcaat taatgagaca tgtaattaat ttgttttatt tttgttgtgt 5100 taatgccgat agtatttatt taattattta ttggactatt tactaagata tttgctagtt 5160 agtttatgta aattatcttt ttatctatta ttttatttat ttattcattt atttatttat 5220 ttatttattt atttatttat ttgtttgtta gtataatcaa attgttggta gtcgtataca 5280 tagattttcc ttgtgtttta tccattaaac agacttcatt tttttattct gaaggctacc 5340 atatgtgatg cttatttatt aatctgtgtt atgtagtaat ctgcatccat ttataatatt 5400 ttctttggtt atttgtaatt ggctaccggt ttttgattca atgtttcagt cctaatactt 5460 gtaaatgttt gaatgagtga tttatcgtgc attttttggg acagtctggt ggtacagtca 5520 agaacatacc cgtcatgtgt ttaagcccgt atctatcgta ctccccaaaa gattagaaga 5580 tgtacaagaa gattcataac gccgaataga tctagtttgc tttcatttca tcgtctgcaa 5640 tgccccagaa ttaattacat cacactacaa tggtcttcat ggggttctca attaggtata 5700 tttaaataga tcatgatgaa tcgttgtagt aagaaaatac tgatatgctc ttttatattt 5760 aataagaacg gacctgccgt attccacgta ttattttttt ttattctgta ttattgtaat 5820 gaactattga ttttttgctt taataacccg 
taatattttg aaatgtatta tagcaaatat 5880 gtaaaatagt ccacattacc tcattttcat gtataaataa ccattttgta acaacaatat 5940 gatctataat cgtaaatcac ttgttaaggt ttgtcagtat attttaggat ttcacactga 6000 tgtaaaacac tgaattaaaa catgcaaaaa cgctagcacg taaggtattc cgctgtatta 6060 gaaattattt agaataatac ggtaataaaa catgagaact ttttaaattg ttatttaatg 6120 catgaaagtg cattgtgtta ccaatacact gaaatgtaca aataaagcaa aaacaatcaa 6180 gaaactttac ccgttcacca actgtatttt ctcactcact catccgaaaa attaccattt 6240 tgttaaagag taagtaagta agtaatggaa gatccaaaat taaggtgtta cgatcagtgg 6300 ctcggattaa gggtgtcggg ggccctaggc ggtaagacta gttgaggccc cctgtcaatt 6360 gtaaatggtg ttctaggggg tcagtcgata atcagtcaca ggctctaaaa tttgctggca 6420 gtggaggggg gggggggggg ggagggtgaa aatgatttgc cagccttggg gccacaacgt 6480 ccatccttcc gggggccatt gccgctatta ctctgtccat acggcatact atagacttca 6540 gaactacggg tcccctaaat cggcggggcc ccaggcgacc gcctagtccg tctaccggta 6600 gatccgccac tggttacgat gagtatctca tagaaatgca ttattttgca tacttcttgt 6660 ttcctattct caagatgagc taaaacgcac cagtttttgc caatatatct aagcagtttg 6720 ttccttcctg ggcaaatctt cctaacgtgg ccacgtatgg aatcataaat tagaaaataa 6780 ttaaaaccgg tataaactta cagctacgcg cagttgttaa ccattatacg ctgtcagtaa 6840 aattttcgag gttaatgata agcatatggc acaattcacc gcttgactgt aagtggttcc 6900 tgacactctt gcggaaggta aaccattatc ctgattacaa actgcttcca atttttaggt 6960 ataactaatg ccatacaaat caacaatgct aagctttctg aggtgtacgg tgaatcggat 7020 atcctaacgg tagtcagggc ctatgatacg atggctgggg catgtaatga aaatgccgta 7080 gtaatgccct ccaagaaagt ggtcggtcct gtgtttggta ctaggtggat ccggaatcag 7140 gctaagcagg gtctgtcggt gatcggatgc ctgcattgat ggacgctgaa ggcatcgagc 7200 ctcctgataa aataatagta gtcccggcca ttcaacggcg ctgggctcaa tcgtaagcag 7260 gccaatgaga aagaaaaagt gagataagag caggatttga ggattggagc ttttggaagt 7320 tattggagtt agtttccgtg acctgtgatt acatggagct agatatgtac acgaagctgc 7380 atatgaagtt atcagcacga gtaacaagga actggaaaga ttgcgataaa taaaagttta 7440 gtttgtaaca cctaaagcaa taaaccacat agtcagtttt acatatacgc acgatttgta 7500 ttatttcaca tgttttttct gtatttgaac taaaaaatgc 
gagctaaaag aacagcaccc 7560 atctgctgca attatttgta aacactttct tactgaactg tcaaaacagg attcaaaatg 7620 tcagaccaga aacaagctat aattcgacct ctatcctcaa tagaccttg 7669 // ID AGAM2 repbase; DNA; ANG; 1751 BP. XX AC . XX DT 14-SEP-2004 (Rel. 9.08, Created) DT 14-SEP-2004 (Rel. 9.08, Last updated, Version 1) XX DE Anopheles gambiae Agam 2 non-LTR retrotransposon - a consensus. XX KW Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; endonuclease; pol-like domain; AGAM2. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1751 RA Cook M.J., Martin J., Lewin A., Sinden E.R. and Tristem M.; RT "Systematic screening of Anopheles mosquito genomes yields RT evidence for a major clade of Pao-like retrotransposons."; RL Insect Mol. Biol 9(1), 109-117 (2000). XX RN [2] RP 1-1751 RA Gentles A. and Jurka J.; RT "Anopheles gambiae Agam 2 non-LTR retrotransposon - a RT consensus."; RL Direct Submission to Repbase Update (30-JUN-2004). XX DR [2] (Consensus) XX CC 99% average similarity to consensus. 
XX SQ Sequence 1751 BP; 447 A; 428 C; 353 G; 523 T; 0 other; ccaactttat ccgtctagct gaactgcttc aatctactga ctggtctttc ctggaaggta 60 atccggatgt taactcagca ttggaactat ttaacacaac actactggcg cttatatctg 120 attgctgtcc tctaatgcca ccgcgtcgca gtccaccatg gtctgatgca cgtttacgcc 180 gtctaaagca aaacaaagcg tcatgctttc gcttttacag tacaaatggt acacaacact 240 ccaagttaag gttcattgct gcccataatg tatatagaag ctataacaga caacttcata 300 ctcgttatct cattcgagtg aaattctcgc tcatcagaca cccaacacga ttctggaggt 360 atgttgaatc aaaacgcggt aacagctcgc tccccgatgt tctcacatat aacaacgtct 420 ccacaagcag taaggaaggc atgtgcaact tatttgctga tcgtttcaag gactgcttta 480 ctacggatga tgcaagctta tcattagaag cagctttaaa taatgtgccc cgtgatgttg 540 ttgacattga tgtacgcgat attattatct cggtagatac tgtactacgt gctctgcagc 600 aggtcaaaac ctcatacaac cctggacccg atggtattcc tacggcaatc cttgccaaat 660 gccgcgaatt tttagcagag cctctgtctc aaatctacca actctctttt gcacaaagta 720 ctgtacccac ggcctggaaa tcctctgtga tgtttccggt gtacaaaaaa ggagataaaa 780 actctgctga gaactaccgc ggtataacca ccttgccttc ttgtgccaag gtgtttgaga 840 tcgtcataca aaactcgcta atgtatcact gtcgttctta tatttctaca cgccagcatg 900 gtttctttcc tcgacgcagt gttaccacaa acctggtgga attcgtctcc aactgccatg 960 cagcctttac ttccggagct cagatggatg cagtatacac tgatcttaag gctgcgtttg 1020 atcgtgtgaa ccatcgcttg ctgttggcta agctcgcccg gatcggtctc tctactccgc 1080 tggtgaattg gttcaggtcc tatatctctg aacgtagcta ctacgtacaa atcgatggtg 1140 tctcctctaa cgttttcgag agttcatctg gcgtccctca gggcagcaac ttgggaccac 1200 tgctgttctc gctttttatt aacgacgtca cactggccat tacggaagca gattgtctgc 1260 tttatgcgga tgatgttagg ctgtttcgta tcgtacggaa cacttccgac tctctttctc 1320 tgcaaagatc gattgatgtt ttctctgact ggtgtatcaa caacgacctg ctaatttctg 1380 ttgataagtg tacgtcaatg tctttcttta gaatagctag tccgataagg tatatctata 1440 gcatatcagg gacacaacta ccgcggtgca atacggtcag ggatttagga gtcaccctgg 1500 atcgcaaact agatttccga caacattact gtgatatttt agacaaagct aacaaaatgc 1560 taggatttat tcgtcgacat tcgagagaac ttaatgaccc acactgcctg ttaactctgt 1620 ataagtccta tgttcgttcc atcctcgagt 
ttagttctac agtttggtgt ccgttctcta 1680 gtgtttggtc caatagaata gaagctgtcc aaaagagagt tactcgtatt gtcctacact 1740 tcactccgtg g 1751 // ID P3_AG repbase; DNA; ANG; 4394 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 13-DEC-2002 (Rel. 7.11, Last updated, Version 1) XX DE P3_AG, a P-like DNA transposon - a consensus sequence. XX KW P; DNA transposon; Transposable Element; P superfamily; P3_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4394 RA Kapitonov V.V. and Jurka J.; RT "P3_AG, a young family of P-like DNA transposons from African RT malaria mosquito."; RL Repbase Reports 2(11), 23-22 (2002). XX DR [1] (Consensus) XX CC The A. gambiae genome harbors many divergent families of P-like CC DNA transposons. One of the youngest families is P3_AG. CC Some P3_AG copies are identical to each other. It is possible CC that P3_AG is an active DNA transposon. CC P3_AG elements are flanked by 8-bp target site duplications. CC Terminal inverted repeats are 29 bp long. CC P3_AG encodes an 879-aa P-like DNA transposase, called P3_AGp. CC Predicted exon/intron structure (based on FGENESH and GENSCAN): CC 356-614, 679-2198, 2260-3120. The P3_AGp transposase is CC 43% identical to the P1_AGp transposase encoded by P1_AG.
XX FH Key Location/Qualifiers FT CDS 0..0 FT /product="P3_AGp" FT /translation="MPRSCAAAFCKNNAENVKKRGLNITFHSFPSDDSLPK FT WIDFCKRDEHWKPTKISTVCSLHFKPDDYQMAKSSLPQTLPVLKRLKPYAI FT PSLIQPADFIQNEPSNMTAPLKECNQPNVEFQTDSEYFSDVENVPSQTIID FT MKRELDQVKEDNRKLIEVNTNLRDKLHSYFNENKRLKAEIDNLQKHISKKD FT AGIDEAALVTAMKERLKPTLSENQIDIILKKKKRVVWTKEEIGSALTLKYF FT GLRCYKYLAKDRKFPLPADATLKRYTKNLVVKEGILDDVLKLISNLTSTFT FT EKDRLCALSFDEMKVNRIIELDKASDEIIGPHNYLQVVMARGLCNKWKQPV FT YIGFDKKMTKEILLKIIEKLSEININVVAIISDNCSTNVSCWKELGAKDYE FT RPYFQHPTTLNNVYVIPDAPHLLKLLRNWFLDSGFTYNGKHIKADLLFDMI FT ASRNETEITPLYKLSKTHLVMTPQERQNVRRAAQLLSHTTAISLRRYFKNN FT AEATDLANFIEKVDLWFSISNSYSPFAKLDYKKSYTASDDQIKALDEMFEI FT VSNMTVIGKHSLQIFQKSLLMQITSLKLLYDDLHKRHNISFISTHKLNQDV FT LENFFSQLRQIGGVYDHPSPMSCIHRIKMIILGKAPTFLKNQTDLEPSTFS FT CTDEYISSQIRTSIEIENEGSEQANDGDIISASIITSALNQPPVKQSIQSD FT TLSSRSSEMLSSVSSSAIELPEQDSDGLEYIMGYIGRQCFEKFPHLNLGNL FT SLNLNSDHSYSHPPSFVKHLSVAGLFVPSEAFLKQGYKMEKIFQKLHPNGN FT FNKKRYISKRLVKRLQKEFPELPLIVVQQFAKHRINIRIKFLNMKIANEKR FT VNNKRKAPSHSKTAKKCEKLQISCFV" XX SQ Sequence 4394 BP; 1513 A; 768 C; 847 G; 1266 T; 0 other; caaagtgaat gaaagggagg tgagcttatg tagaacattt ggcttgcggg attttagaaa 60 aatgaaataa agtttgacag tttgtaaccg ggtcggctgg acggtttgtt tacatttgat 120 agctattgga ctggtccatg ttgttgatgc aacggcgaga agtgtgtgac taaaattcgt 180 gtaaaagaaa atagtacgct ttcgggtgtt aatattttat tataccaggt aaaatataaa 240 ttgtacaagt tgcagggaat gtattaaatg gatttatttt ctatttttag agaaaaagca 300 aatctggaag cagcgttgtt tgtagctgtg tccacagaag tttgtaggaa gcaacatgcc 360 tcgctcatgt gcagctgcat tctgcaaaaa taatgcagaa aatgtaaaga agcggggttt 420 gaacattact ttccactcgt ttccatcaga cgattctttg cctaagtgga ttgatttctg 480 taagcgggat gaacattgga aaccaaccaa aatatctact gtgtgctctc tccatttcaa 540 acccgacgac tatcaaatgg caaaatcatc tttaccacaa accttgccag tactgaagag 600 attgaaacca tatggtaagg aaagatatga gctttatgca gtaattatca tttcatcaca 660 ttctcatctc tttattagct attccatcat tgatacaacc agccgatttt attcaaaacg 720 agccatcgaa tatgacagcc ccattgaaag agtgtaacca gccaaatgta gaatttcaaa 
780 cagattcaga atatttcagc gatgtagaga acgtgcccag ccaaacaatt atagacatga 840 aaagggaact tgatcaagtg aaagaagata atcgaaaact gatcgaagtg aatacaaatt 900 taagagataa actgcattca tacttcaatg aaaataagcg actaaaggca gaaattgata 960 acttacagaa acatatttca aaaaaggatg caggtataga tgaagctgca cttgtcacag 1020 caatgaaaga aagattgaag ccaacattat ctgaaaacca gatagatatt attttgaaaa 1080 agaaaaaacg tgtagtttgg acgaaagagg aaattggctc cgctttgaca ctcaaatatt 1140 ttggattgcg atgctacaaa tatttggcta aagatagaaa gtttccttta cctgcagacg 1200 caactttaaa acgatacaca aagaacctcg ttgtaaagga aggaattttg gatgacgttc 1260 ttaaattaat aagcaattta accagcactt ttactgaaaa agatcgcctt tgtgctctgt 1320 ctttcgatga aatgaaagtt aacagaataa ttgaactgga caaagcatcg gatgagataa 1380 ttggaccaca taactatctg caggtcgtga tggctcgagg actgtgtaac aaatggaaac 1440 aacctgtgta cataggattt gataagaaaa tgacaaaaga aatactcttg aagataattg 1500 aaaaactaag tgaaataaat attaacgttg tagctatcat cagtgacaac tgctctacaa 1560 atgtaagttg ctggaaagaa ctgggagcta aagactacga aaggccatat ttccaacatc 1620 ccacaacttt aaataacgtg tacgtaatcc ctgatgcacc tcatttatta aagctactaa 1680 ggaattggtt tttggatagt ggatttacgt acaacggaaa acatataaag gcagacctac 1740 tttttgacat gatagccagt agaaatgaaa cagagattac acctttatat aagttgagta 1800 aaactcattt agttatgacg ccacaagagc gtcagaatgt tcgacgtgcc gcacagctgc 1860 tctcacatac tactgctatt tccttgcgtc gttattttaa aaataatgct gaagctacgg 1920 acttggcgaa tttcattgaa aaagttgact tatggttcag catatcgaac tcctatagtc 1980 ctttcgccaa attagactat aaaaaatctt atacagcaag cgacgatcag ataaaagcat 2040 tagatgaaat gttcgaaata gtttcaaata tgaccgtgat tggtaagcat agcttgcaaa 2100 tttttcaaaa gtcgttgctg atgcagataa cgtctcttaa attgctttat gatgatcttc 2160 ataaaagaca caacatttcc ttcatatcca ctcacaaggt aattaatgac ataaaaaaca 2220 tcatatattc atattgatct atcatccttt ttaatttagc tcaatcaaga tgtactggag 2280 aattttttct cacagctaag gcagatagga ggggtatatg atcatccctc accaatgagc 2340 tgcattcatc gtattaagat gattatatta ggaaaagcac ctacgttcct taaaaatcaa 2400 acagacttgg agccatctac attttcttgt acagatgaat atatctcatc gcaaattcgg 2460 acatcgattg 
agattgaaaa tgaaggttct gaacaggcta atgacggtga tataatttca 2520 gcatcaatca ttacctcggc cttaaatcaa cccccagtta aacaaagcat acaatccgat 2580 accctgagtt caagaagcag tgaaatgtta agctccgtca gtagttccgc tatcgagctt 2640 cctgaacaag acagcgatgg actcgagtat attatgggtt atattgggcg tcaatgcttt 2700 gaaaagtttc cgcatttaaa tttgggtaat cttagtctga atttgaatag cgaccattcg 2760 tatagccatc caccttcatt tgtaaagcat ttgtcggttg ctggtttgtt tgttccttca 2820 gaagcttttt tgaaacaagg atacaaaatg gagaaaatct tccaaaaatt gcacccaaat 2880 ggaaatttta acaaaaaacg ttacatatca aaaagattag ttaagcgact tcaaaaagaa 2940 ttccccgagt taccgctaat agttgtacaa caatttgcta aacatcgcat aaatatacgt 3000 atcaaatttc ttaatatgaa aatagcgaat gaaaaaaggg ttaataacaa acgaaaagca 3060 ccatcacatt ccaaaactgc gaaaaaatgc gaaaaattac aaattagttg cttcgtttaa 3120 cggatttata aaacagacag aaatatgaat tacataaatt tacttataaa ttttgattat 3180 gtataaatta taaatgtata tattagtgaa taagcatttt ttttaataat atcatgatgt 3240 attataaacg gtgaaattaa ccaggttgat tctaatgata atattcattt agagactaac 3300 tattccgact agaacatata aatgaaataa atgaagttgt tcgcaatgtt gtttgtttcc 3360 ctagcaaaat accaatagca ggctccaggg cctgccaccg gcatgcggga aagttaactg 3420 gcaggccctg ggacctgcca tcgggctgaa cgtgttaaac gattggcacg gcgaccaatt 3480 cagaaaaaaa gatgttgagt atatttgtaa cagtcatgca tactgttgag catactattt 3540 tgcttctgat gattcttttg agtagcgcta gcatgtacct atgcacttgt tttttcaaca 3600 aataacgtat gtaatttcag ccgtcgtctg gtatcgtagc attctaagag tttggttagg 3660 ctatcttatt cggttctaat gctccagtgt tagatttcgc taatatttac gttgcctaag 3720 ctgtttagtt gtaaattgtt ggcctggaaa aataccaaaa atgagttata tagaaatgta 3780 gcaaatgaag aacttcaagt atcatcaagc tcgtaagatt acagtcaagc ttctgtgctg 3840 catatgagcc ggaatatttc atcgcctttg atgcgggact tcctttctgc agccgaggtg 3900 gaatgtaccg tcaccgtgat gcgatgcaga cgggaccata tacgccgagg ccgatgcaga 3960 aggtgatgcg cttaagggag ggagaatgac ttcctccctt gatgtgcggc tggtgtgaag 4020 cctggccaat ggagtcgcca acgatgaaac ggcagcaggg gccaatctct ttagggttca 4080 caaagcagca cgttcaactg taacgtaagc acgtaatccg aattccgtaa tgaattgttt 4140 ccaatatctg aaaataatac 
aaatcagtaa ctattccatg aagataagga attacattga 4200 tgagggactt acattagaat actgaaataa tgaaacaatc catacaatgc ctgttgaaga 4260 aataaaactg attgtgcgac gatgttatgc acaggaagaa aaggtaaaca ataccggtta 4320 tatagtgtca aatttgacaa tcgtcatggc gacctaaatc tttcgataag ctcacctccc 4380 tttcattcac tttg 4394 // ID Loner repbase; DNA; ANG; 6343 BP. XX AC AAAB01008849; XX DT 21-JUL-2009 (Rel. 14.07, Created) DT 21-JUL-2009 (Rel. 14.07, Last updated, Version 1) XX DE Loner non-LTR retrotransposon, a fossilized genomic copy. XX KW I; Non-LTR Retrotransposon; Transposable Element; Loner. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6343 RA Biedler J. and Tu Z.; RT "Non-LTR retrotransposons in the African malaria mosquito, RT Anopheles gambiae: unprecedented diversity and evidence of recent RT activity."; RL Mol. Biol. Evol 20(11), 1811-1825 (2003). XX DR Genbank; AAAB01008849; Positions 2639407 2633065. XX CC ORF2 is corrupted by a frame shift. XX FH Key Location/Qualifiers FT CDS 401..1777 FT /product="Loner_1p" FT /note="ORF1." 
FT /translation="MEAEPMDESDEGFITVKRKSSDPGQMAKKKVLHEPLA FT STSRDDSAKANLKSRAKTYHASFNGAHNVFFMPRTKPLDVRSITASIYKKY FT PGVLDVIRLHPKKLRVTAKDRVQANAIVADPDYTEDYRVYIPGGLVEVIGV FT VDDFEYPLEDILQYGQGAFLYQSSPRVKVLEVKKLYASKMVDGKKVFRETS FT SLRITFEGTVLPNTLYFEGLRVPIRRPFVTKVATCSKCSQIGHSEPYCTNS FT VRCGKCKGPHSTAECKADTQKCFHCKQQWHEVSTCSVYRKTQAEHKKAVLF FT NSKRSFADVLKKTSEVNPFDTLSLLEGNEPLPNLAPLTGVKRSITPRVSIK FT GIRKNKTVPPKKAPRRHVDATQRQSQGPMPPRDVRTRPPSRNPPNPSNLAF FT PSEPRSFSRQSLPSLTEILQSLLDSLSLPEPLSGLIPLAFPFLRGMVRQWL FT SKWPCLSMMVRLDD" XX SQ Sequence 6343 BP; 1717 A; 1499 C; 1456 G; 1671 T; 0 other; cattctgtcg ccaaccacga agcgaatcgg acgttatcga tacgcgctcc aggtggatga 60 ttacggccaa agtcaagcca ctgaccccga ccatacacac cacgaagatt tgcgtacgct 120 tggttgaacg tgaagcgatc tgtgcttgct gtgcgtaaaa gttgtacagt gttaagtgtg 180 tgtagtgtgt ttaattgtct gtgcgttacc cgttagagag agttgagcgc atcgacgata 240 caagtgcgta cggctcgcat ctcttgtgcg ctcgcatctc gtgcgctcgc atcttatcgt 300 gtgcgctctc ctctcatttg tcccgtgctt ggtaagtgtt ccgctcttat cctgttatcc 360 ttgttcccca tacagttaat tcgttggcta gggctctccc atggaggcgg agcccatgga 420 tgaatccgat gaaggcttca ttacagtgaa gcgcaaaagt tctgacccag ggcagatggc 480 caaaaaaaaa gtgttacatg aacctttggc ttcaacatcg cgagatgatt ctgctaaagc 540 gaatttaaaa tcaagggcca aaacatatca tgcttctttt aacggggcac ataatgtttt 600 ttttatgcct cgtaccaagc cgcttgacgt tagatcgatc acagcatcga tctacaaaaa 660 atacccgggt gttttggacg tcatccgact gcatccaaag aagttgcgtg tcaccgcaaa 720 ggaccgggtg caggccaatg caattgtggc cgatccggac tatacggagg attaccgggt 780 ttatattccc ggtggattgg tggaagtcat cggtgtggta gatgactttg aatacccctt 840 agaggatatc cttcaatacg gacaaggggc attcctttac caatcctcgc ctcgtgtcaa 900 ggtcctcgaa gtgaaaaaac tgtacgcttc gaaaatggtt gacggaaaga aggtgttccg 960 tgagacttcc tcactccgga tcactttcga gggtacagta ctgccgaaca cgctgtactt 1020 cgagggtctg cgggtaccca tccgtcgacc ctttgttaca aaggtggcta cctgttctaa 1080 atgcagccag attggacatt ccgagcctta ctgtaccaac tctgttaggt gcggtaagtg 1140 caaaggaccg cattctacgg cagaatgcaa ggctgacacc caaaagtgct ttcattgcaa 1200 acagcaatgg 
cacgaagtgt ctacttgctc agtgtaccgc aaaacgcagg ccgagcataa 1260 aaaagccgtc ctttttaatt caaagaggtc attcgccgat gttttgaaga aaacctcaga 1320 agtgaatccg ttcgatacac tctcacttct tgagggtaat gaaccgttgc ccaatctggc 1380 ccctctaaca ggagtgaaaa gatcgattac accgagggtg tcaatcaaag ggatccggaa 1440 aaacaaaact gttcccccta aaaaggcacc tcgacgtcac gttgatgcga cccaacgcca 1500 atcccagggt cccatgcccc cacgtgatgt tcgtactcgt ccaccttccc gtaaccctcc 1560 caatcctagc aatctggctt tccccagtga acccagatcc ttctccaggc aatccctccc 1620 aagcctgaca gaaatcttac aatccctcct agactccctt tccctcccgg aacccctatc 1680 cgggctaatc cctttggcat ttcctttcct aagaggaatg gtcaggcaat ggttatcaaa 1740 atggccatgt ctcagtatga tggtccgctt ggatgattaa tgcctccaat gaccactata 1800 ctacaatgga actgtagaag ttttttggga aaaattgact cttttaaagt attgataggg 1860 caacacaatt gtaacgcatt tgccctaagc gaaacttggc tctcacccga caaaaatatt 1920 accttcccgg gatataatat tattcgtcaa gatcgacatg atccagctag tgataggcgt 1980 ggtgggggag tgttaattgg tattcgaagt agtcacagct tctacagaat acccctcccc 2040 acacccgagg gaatcgaata cgtcgctata cagacaaaac taggggatct tgacgtttct 2100 attgcttcaa tttatatccc accgggagcc aatctggacc caaagaagat caacaaggat 2160 cttgaaaccc tagtaacggt actaccaaaa ccgtttttca tcctgggcga ctttaacgcc 2220 catggatcag attggggttg tacgcatgac gataatcgtg caccaatcat tagggatatc 2280 tgcgacacat acagtttgac gattttaaac tctggcgaag caactagggt gccctcacct 2340 atggcacgac ccagcgcaat agacctatct ctttgttcat cgtctttagg gctggattct 2400 atgtggaagg taatccaaga cccgcttggc agcgaccatc tgccaataaa gatctcgatt 2460 atcaagagga gtcgcacagt cgatcaagtc cccgttaatt gtgacttaac gaggaacatc 2520 gactggacga aatacggcaa ccaaatgacc gctttgctaa gccgcgtaga acccagtttc 2580 tccgtaaatg aggagtacac aaatctcgtt atggcgataa acgagtgcgc cctcggtgcc 2640 caaacgaggc cgccccccca ggcaaggatt tttaaaagac cacccactcc ttggtgggac 2700 gcggattgta aagcggcatt ctcagcgaag cggaaggctt ttgccaggta tagagacacc 2760 ggctccatgg atctatacat ccattataga ggcctggagc gtaggtgcaa aaacttgctt 2820 aaagcaaaaa aaaaggtcat actggccgag gtatgtcaaa aacctcaagc cttccacttc 2880 gctaacggag cttcagagca 
tggcgaaaag catgcgcaac agcaaagcaa caaacgagag 2940 cgaaagggtt tctggggcat ggctagagcc atttgcacaa aaagtctgcc ctgatttcgc 3000 tcaagcacca ccatttgaac agagtgctca tgggagtgat ccgcaaatgg attcaccgtt 3060 cacaatggtt gagctatcgc ttgctctgta ctccagcaat aattcgtctc caggactgga 3120 tcagattcgg aacaaattgc tccataatct gccagatctg gcgaggaaac ggctgttgag 3180 attattcaac attatgttgg agcttaacac cgtcccgttg gagtggagag aggtgaaagt 3240 agtcaccttg ttgaaacctg gcaaaccggc atcagactat aattcttatc gaccgatagc 3300 aatgctatct tgcttgcgaa aactatttga gaaaatgatt ctttttagac tggacaattg 3360 gcttgaatcc aaaggcctct tgtcaagtac ccaatttggc tttcgcaaag gcaagggtac 3420 caacgattgc ttggcgctgc ttgtgtccga aatcgagatg gctcattctc gtaaagaaat 3480 gatggcatct gtatttcttg acatcaaggg ggcttttgac tcagtatcag tcaatgttct 3540 gtgtcaaaag ttatcatctg cgggcttaac cccaagactg aataacgtct tattcaacct 3600 cctttcggaa aaggcaatga atttcgacaa tggccacatg aaaattcgga gagtcagtta 3660 ctatggacta ccacaagggt cctgtttgag ccccttgttg tataattttt atgtgaacga 3720 tattgatgca tgccttgcac ctggctgtaa cctaaggcaa ttggcggacg acggtgttgt 3780 atcagtcgcc agcaacaaca tcgccgacct tcaaagtcct ttgcaaacca cacttaacaa 3840 tttagaagtg tgggccacaa acctaggtat cgagttttct ccagagaaaa cggaaatgct 3900 gatattctct ttcttttaca acactgagag taatagacgc ttgaatttgg ttgacccaaa 3960 agtcgatatc tttttatatg gtaaaaagat atccattgcc agatcttttc gatacctagg 4020 ggtttggttt gatagcaaaa atgtatggag gacacatatt gactatctgg tacagaaatg 4080 tacaagaaga atcaattttc tcagaacgat taccggactc tggtggggtg cacatcccaa 4140 agacgtcctt aacctatata agacaacgat actgtccgtc ttggagtatg gttgcatatg 4200 cttccactgg gcagccaagt cgcatctaat ccgacttgag aggattcagt atcgttgtct 4260 tagaatcgca ctaggtagca tgaagtcgac tcacaacatg tcactcgaag tgatgtccgg 4320 agtgatgccg ctaaagcttc gttttgagct actatcgctt cgccttttcg tccgctctac 4380 agtatcaaat cccttgataa tcgagaactt tgaaacactg caagagatag gttctaaatg 4440 caaaatcatg aaggtctatc gtgattttgt ttctctacag gtgcacccta gtagttttca 4500 aaacattagt agtgccagct taccagagtc ctacagttcc attttaagtg tagatacctc 4560 gttgagggag gcaattaata ctatccctga 
taatcttcgc tggatgacaa ttcccaacat 4620 ttttgtggaa agacatggcc aaccaaatgt aaaccatttt tacactgacg gatcctcatc 4680 agagcagggc attggttttg gtgtatataa cacgtgtacc gaagcatatt ttaaattacg 4740 ccaaccatgc tcagtttatg tagcggagct cgccgcaatc ttttatgcct tactgttgat 4800 aagtgcatgc cctccggatc agtatgtaat tttttcagac agcctaagtg ctcttgaagc 4860 attaaaatcc gtgaaggcta ttaagagccc agactatttt gtaaaagaaa ttctaaaagt 4920 cctaagctcc ttgtttgaaa aatcgtttcg aatatccctg gtatggttgc ctgctcattg 4980 cggcatctta ggaaacgaga aagcagacca tttggccaag aagggtgctt cggaagggtc 5040 tttttacgat agacctatcc tccctcacga gttcttacga gccccacaag ctttctgcat 5100 tgcacgctgg cagggcttgt gggacacgga tgaacttggg aggttcctgt actcgatctc 5160 tcctagagtt tctttgaaac cttggttccg cgacatctct ggagagcgtg cattcattcg 5220 aatgatgtct agacttaggt ctaatcattt cgcattgggc gcacatctcc agcgtatagg 5280 acaggtcgac acaaaagcat gtggctgcgg ccttggattt cacgacatag accatcttct 5340 atggtcttgc gtggaatacg aggctgcacg acccactttg ttggatgcag tcgaaaaact 5400 gggaagatct cctggtgttc ccatccggga tatactagct gggtcagact ggagccttct 5460 aaggatcatc tttgagttct gcagggctaa cggattaact gtgtaatatt cctataacag 5520 agctattgtg gctggctgtg tattggttgg tcgttgtcgt tgtcatgcac tctcacgccg 5580 atggcatgga tcgcggggtg ctatggaggg acttgctccc tcgtagcatc ttcgcggttt 5640 ttggcagcgg ggtgcggtca tgtatcatcg tcgtgcggct tgctcttgca tggctggctg 5700 aatgagttat gtatcccact cccactccat ctgttcgtct ttgtgctaca tgttgtaaat 5760 cgctagtttg cttgtgatga atgactgtac gtatgaatga atgaatgaat gagcgtggtt 5820 tgtatggatc ttgggtcagc tacagcttga gacagaaggt ctttgcatgg acgaagtcct 5880 tgtaaagata aactcttcaa aaagagttcc gcgagagttc tcaaagcacc tcaactctgc 5940 taccagaggc aagaatgcag aatgatgcaa actaaccatc ccaaagagcc ctggaatatg 6000 gattggatta agacccctgc aatgcccacg gacaagagcg gactacttca acggacaaga 6060 acgaaccaga agaattcgag caagccttgg gaaatttctt ttccacagga taatgacgaa 6120 gcatatgatc tcgacctctg ccccctctta ttggaccact cgacaaggac aaggatcgga 6180 ttttgacagc aacgtccgag cacacccggg atatggtgcc ctcctaaagc tgcgagaagg 6240 aaatgtctcg caacagtcgt gcattaaaga ttaaactgaa 
tttatatgta ctctaatttt 6300 aagcaatacg gcaaagccgt ccctactgaa taaaaaaaaa aaa 6343 // ID CR1-3_AG repbase; DNA; ANG; 5485 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 19-MAY-2005 (Rel. 8.02, Last updated, Version 2) XX DE CR1-3_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; CR1 clade; DNA/RNA-binding; PHD finger; KW AP endonuclease; CR1-3_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5485 RA Kapitonov V.V. and Jurka J.; RT "CR1-3_AG, a family of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 14-14 (2003). XX DR [1] (Consensus) XX CC CR1-3_AG is a family of CR1-like non-LTR retrotransposons. CC The CR1-3_AG consensus sequence was reconstructed based on CC multiple alignment of ~50 copies identified in the CC sequenced portion of the genome. Given the ~3% divergence CC of these copies from the consensus sequence, transposition of CC CR1-3_AG occurred less than 2 million years ago. CC The 3' terminus of CR1-3_AG is composed of the ATAA CC microsatellite. CC CR1-3_AG encodes two proteins: a 418-aa CR1-3_AG-ORF1p CC (positions 1057-2310) and a 910-aa CR1-3_AG-ORF2p (positions CC 2314-5043). CR1-3_AG-ORF1p is a DNA/RNA-binding protein composed CC of the PHD domain (positions 5-40). CR1-3_AG-ORF2p is composed CC of the AP endonuclease and reverse transcriptase domains. CC Putatively, the last protein is translated through a ribosomal CC frameshift.
XX FH Key Location/Qualifiers FT CDS 1057..2310 FT /product="CR1-3_AG-ORF1p" FT /translation="MAGICSACANDIVAADRIVKCQGWCNSEFHFSCSGLS FT EELSATIESCAQLFWACKACVKFHKDPRTAVLRSSTPYTHTSVDLLSSIAD FT LKAGLRSELSQHTTAIKLELLEVLKAEIRSCLRSTHAATDLPSQRPIHHNS FT APKLFNSVVKNIPGTITPTLQQYPSLAASLSVSANEPSSFTNMPPXYNPQT FT PLLRGSGSPLDSDTLDTIPHTDTRMWLFFTRFSPSVTTEQISLMVQVRLAL FT DKRDVFVHRLTKLGADTSTLSFISFKVGIPATLRNKALSPKTWPSALTYRE FT FRDYRTNNYNTNTCATTETNSMLDQNVTLATDTYHNPLSHSGGSAAMPTTT FT TDTLAQPVTISSPAMTTTDTALFTNEPHLDLNERMLVTTTPTRSPPECLAA FT PKKRPKRGNAKRTDESADAGPSDE" FT CDS 2314..5043 FT /product="CR1-3_AG-ORF2p" FT /translation="QSCTASGDAPRSAHPSLSIYYQNVRGLRTKTTNLRLA FT LSESEYDFIILTETWLTQSIPSSLLTDDHYHIYRCDRNLSNSALSRGGGVL FT IACSSSIPTCEIASPNTILEQLWIKTLLPGVSVYIGVVYIPPSHANDPAVM FT NALHDSVREISSRIKESDLLYVFGDFNKPDIRWELTNTSEATDCSPCYSVM FT HYAPLCNSVANTDFVDGLHSTGLFQLSGIANQSGRQLDLVFANLAATNILC FT DSITPLHSVNGTSALENSLPYVTHCSIPLLSEDFHHPSLDMMIYYPVQLSH FT TTNSHTRSTVNRNFFKTNVERMNSLIVSFDRNFDCSNFATIDEATDFFSVF FT MRSAINSCVPVAQRKSGPDWSNASLRRLKKIKSKAYADYSRTRSSLHRRIF FT FDALNNYRRXNRVLYRSFIRRTERQLFSKPTRFWSFWNKRRNIRSIPPSMS FT YNGXTSIDTSDICNTFANRFADAFTLPVHNPNTLAEATRNTPSDAIDFIIP FT TIDEALIARTLNDIKPSTSSGPDNIPAYILKHCRQSLAPILAKIFNDSLMR FT GTYPASWKHARMVPIHKKGSRLHASNYRGIVSLCACAKVFELILYNPLLTA FT VQNYMSPSQHGFLPRRSSTTNLAEFVGYCFDNMDRGTQVDAVYIDFRAAFD FT SISHDILLSKLKKLGFLDWHITWLRSYLTGRSYYISIGSHRSHSFTSSSGV FT PQGSNLGPLLFLIYINDLSFVLPPGQHLMYADDVKIFAPVRNDSDCVRLQT FT ILENLDSWCSRNALQVCADKCQCISFSRARHPITFTYTMLNTALARTTCIR FT DLGVLLDQKMSFRPHIDSVVAKGNQLLGLITRTCSEFTDPMCVKSIFCAIV FT RSCLEYCCPIWCPLGVGDINRLEAIQRRLTRYAVRLLPWQSHHARPTYHQR FT CLLLGH" XX SQ Sequence 5485 BP; 1459 A; 1439 C; 1091 G; 1489 T; 7 other; tcagctatca tgtagcgcat cctcatccct agcgccgggc ggggtgggga atcgcgacat 60 gatagagtga acggccactt tccccgcgct gtgtgtgtgc gtgtgtgtga cgagacccga 120 aaacgcgaga caaatttcgg cacataaaaa aaaaaatatt gtcactatac actgtgctac 180 atgatgctta gaaggtcgtt gaagcatgaa acatgttcaa attaaaaaat attagattag 240 tgttagacgt tctgatacga ttgttttaaa taggcttcta caccctcaac 
tggcaggggc 300 atcgaatggt ctcatcagca gcggcacgaa ataaaatgaa ccatcaatac accgataaca 360 ctttgaatcc caatccagct tctcccacag cgtctcaagg tcacacacaa attaataaat 420 gctaacaccc accaataaca atacacaacc gcaatgcaca tatgagtatc aacgcataca 480 ccgcaacagc caatcagctg gcggcggaag gcacgggcag aagcgcaagt aaggcagggc 540 agggtgaggg agagggagaa gaagcggacg aaaaaataaa tgggcgccaa tatttttaaa 600 ttttttgacc gtttgttttg ctccaccgcc ataaatagtt cgtccatttc ttaaacagat 660 ggcgctaacc ctgcgcaggc ctgcgcagga atcgctgaag tcccgcgcgg ggaacgggcc 720 aaaatctgcg ctggccagcg cagaatttgg tacatttcct gcgcaggaaa ctgctcgaat 780 ttgcgcagcg gggatctgtg atgtacatat tataatgtac acactctcat tccaaacgtc 840 actgtgactg gtcgaagatt ctctgcgctc cgactatttt caatattatt ccgagaaaac 900 ctgtatcttt gctgaacctg ttgctggtgt aggtgacttc attgcctggt tttgtttaac 960 aactggatta ttcaccgtca atcacctgga aacgttttcc ggcccacaat cattacgtcg 1020 cttcaacatc acaatcaatc gctcatccac aaagcaatgg ctggtatttg ttccgcctgc 1080 gctaacgaca tcgtagctgc tgatcgcatt gtgaaatgcc agggttggtg caactctgag 1140 tttcacttct catgcagcgg actttctgag gaactgtccg ctactataga gtcctgtgca 1200 caactttttt gggcctgtaa agcctgtgta aagtttcaca aggatccgcg tacggccgtg 1260 ttgaggtcat ccaccccgta cactcacact tctgtcgacc tactgtccag catagccgac 1320 cttaaagcgg gcctccgtag cgagctgtca cagcatacca cagctattaa gttagagctt 1380 ctggaagttt taaaggcgga gatccgttcc tgcttgcgat cgacgcatgc cgccaccgat 1440 ttaccgtctc aacggcccat tcatcacaat tcagcgccta aattgtttaa ttcagtggtc 1500 aaaaatatcc ctggtactat tacacccaca ctacaacagt atccatcatt agccgcttct 1560 cttagtgtga gtgcgaacga accgtcatcc ttcaccaata tgccaccast ttacaacccg 1620 caaacaccac tactcagagg atcgggatcg ccgctcgatt ctgacacact agacaccatc 1680 ccacacactg atacgcgaat gtggctattc tttacgcgtt tctccccatc ggttaccact 1740 gagcagattt ctctcatggt gcaagtacgt ctagcactcg ataagcggga tgtgtttgta 1800 caccgtctga cgaagcttgg tgccgacact agtacactct catttatctc atttaaggtg 1860 ggcataccag ccactctacg caacaaggct ctctcaccta agacatggcc ctctgctctt 1920 acctaccgag agttccgtga ctatcggacc aataattata acactaatac ctgtgcaaca 1980 
actgaaacaa attcgatgct cgatcaaaac gttactctcg ctaccgacac ttatcataat 2040 cctctctcac attctggagg gagtgccgcg atgccaacta ctacaacaga tacgctcgca 2100 cagcccgtta cgatttcctc acctgccatg accactacag acacagcttt gtttacaaac 2160 gaaccacatc ttgaccttaa tgagcgaatg cttgttacca ccacacctac caggtcacct 2220 cccgaatgct tagccgctcc taaaaaacga ccgaagcgcg gaaacgctaa acggactgat 2280 gaatctgctg acgctggccc gtcggatgaa tagcaatctt gcactgcatc cggcgatgca 2340 cctcgctcag ctcatcccag tctctctatc tactaccaaa atgtacgtgg tctacgaacg 2400 aaaactacaa atcttcgcct ggcgctatca gaatcagaat atgattttat cattctcacc 2460 gagacttggc ttactcagtc cataccttct tcgctcctca ctgacgatca ttatcatatc 2520 tacaggtgcg ataggaatct ttccaacagt gccctctcac gcggtggggg tgttttaatt 2580 gcatgttcct cttcaatacc gacatgtgaa atcgcatcgc ctaataccat actggaacaa 2640 ctttggatca aaacattgct gccaggtgtc tctgtttaca tcggcgttgt ttacattccg 2700 cctagtcatg cgaatgaccc cgcagtgatg aacgctttac atgatagtgt acgtgaaatt 2760 tcaagccgca ttaaagagag cgatttatta tacgtcttcg gagatttcaa taaacctgat 2820 atcagatggg agctgactaa tacatcagaa gccaccgatt gctctccatg ttattctgtc 2880 atgcattatg cacctttatg caattccgtg gctaataccg atttcgttga tgggttgcat 2940 agtaccggat tatttcagtt gagtggtatt gcaaatcaat ctgggcgtca attggatctg 3000 gtcttcgcaa accttgccgc aaccaatatt ttgtgcgact caatcacacc tctgcactct 3060 gtgaatggta cttcggctct agagaactcc ctcccatacg taacacactg tagtattcca 3120 cttctcagtg aggactttca tcatccttca ttggatatga tgatttatta tcccgtacaa 3180 ctatcccaca ccaccaacag tcacactcgc agtacagtca atagaaattt cttcaaaacg 3240 aatgtggaac gtatgaattc tcttattgtg tcgtttgacc gcaattttga ctgctccaac 3300 tttgccacta tcgacgaagc caccgatttc tttagcgttt ttatgcgctc agcgattaat 3360 tcctgcgttc ctgttgctca acgaaagtct ggccccgatt ggtctaatgc atctttaaga 3420 cggttgaaaa aaataaaatc aaaagcctac gcggattaca gtagaacgag atcatcgctg 3480 cataggagaa tttttttcga tgcactgaac aattatcgtc gacawaatcg tgtgctctac 3540 cgctccttca ttcgccgtac tgaaaggcag ctgttttcta agccgacacg gttctggagc 3600 ttctggaaca aacggcgcaa tataagaagt atccctccgt caatgagcta caatggcsaa 3660 actagtatcg 
atacatccga tatttgcaac actttcgcca atcgtttcgc tgatgcattc 3720 acccttcctg ttcacaatcc taacacacta gcagaggcca cycgcaatac tccatcggat 3780 gctatcgatt ttattatacc cacaattgac gaagcattaa ttgcgcgcac actcaacgat 3840 ataaaaccat ctacatcatc tggacctgac aatattcccg catacatttt gaagcactgc 3900 cgtcaatcac tcgcacccat tcttgccaaa atatttaatg attcccttat gcgtggcacg 3960 tatcctgcgt cctggaaaca cgcgcgaatg gttcctatcc ataaaaaagg cagtcgactt 4020 catgctagta attatcgtgg cattgtttcc ctatgcgctt gtgcaaaggt gtttgagctc 4080 attctataca atccgctact cacagcagtt caaaactata tgagccctag tcagcatgga 4140 tttctcccaa ggagatcttc caccacaaat cttgctgaat ttgttggyta ctgcttcgac 4200 aacatggatc gtggtactca agttgatgca gtatatatcg acttcagggc tgcgttcgat 4260 agtatytctc atgatattct actctcgaag ctaaaaaaac tcggtttcct cgactggcac 4320 atcacctggc tgcgttcata tttaactggt cgttcgtact acataagcat aggatctcat 4380 cgttctcact ccttcaccag ctcctccggt gtgcctcaag ggagtaattt gggaccgcta 4440 ctcttcctca tctatataaa tgatctatct ttcgttttac cgccaggcca acacctaatg 4500 tacgccgacg atgtaaaaat attcgctcca gttagaaacg acagtgactg tgtacgcctt 4560 caaacgatcc ttgagaatct ggatagctgg tgcagcagaa acgccctcca agtgtgtgct 4620 gataaatgcc agtgtatatc attcagcaga gcccgtcacc ccatcacgtt tacatacact 4680 atgctcaaca cggctttggc tcgcacgaca tgtatccgtg atctgggggt gctactcgat 4740 cagaagatgt catttcgccc tcacattgat agcgttgttg cgaagggaaa tcagctactt 4800 ggtttaatta cgcggacctg tagcgagttt accgatccca tgtgcgtcaa gtcgatcttc 4860 tgtgccatcg taaggtcgtg cctggagtac tgctgtccga tctggtgccc gcttggcgtt 4920 ggtgacatca atcgcctcga agccattcaa cggagactca ccaggtacgc ggttcgactc 4980 cttccatggc aatcccacca cgctcggccc acctaccatc agcggtgtct gcttctcgga 5040 cattgaacca ctctgctctc gacgtaaaat atgacgccca atgccttttc atattccggc 5100 tccttaaach cggagagatc gattccccgg catcgaatac tagccagcat caatttgttc 5160 gctccctgtc ggattcttag atccaatttc catctccgtg taccgcgtac ccgcaacaac 5220 cattagccag ggacacccta ttatacgtat gtccctcgag ttcaatgaag tgttagattt 5280 gttcgatttt agtatgtcta cttctacgtt caaggagaaa ttgcgtctac gtcacattta 5340 atgtttgctt ataactagag 
gtctatttat atgctataaa tgatcattgt tatacatacc 5400 ttaatttaat tataaggttg ccgattagac acgatggtcc gtcggtttat atatgaaata 5460 aataaataaa taaataaata aataa 5485 // ID GYPSY31-LTR_AG repbase; DNA; ANG; 181 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY31-LTR_AG is an LTR of retrotransposon GYPSY31_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY31_AG; GYPSY31-I_AG; GYPSY31-LTR_AG; Gypsy clade; KW MDG3 lineage. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-181 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY31_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 55-55 (2004). XX DR [1] (Consensus) XX CC GYPSY31-LTR is a long terminal repeat of GYPSY31_AG (its internal CC portion is deposited as GYPSY31-I_AG). XX SQ Sequence 181 BP; 77 A; 33 C; 30 G; 41 T; 0 other; tgtaaacaat ttaaaattag gttacttata cttatctcgg gtgcaataat ctaatctaca 60 agaactgcaa taaaatagga atgacagcga gcagaacggc acttcctacg acacacgaaa 120 aaacgataga aagccctcgt gtacgaaaaa agcaacaatt gtaaaaataa agctgtttac 180 a 181 // ID BEL13-I_AG repbase; DNA; ANG; 5752 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL13-I_AG is an internal portion of the BEL13_AG LTR DE retrotransposon - a consensus sequence. XX KW 5-bp TSD; BEL13-I_AG; BEL13-LTR_AG; BEL13_AG; Bel clade; KW LTR retrotransposon; RING Zn-finger; integrase; peptidase; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5752 RA Kapitonov V.V., Pavlicek A. 
and Jurka J.; RT "BEL13_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 33-33 (2003). XX DR [1] (Consensus) XX CC BEL13_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL13-I_AG, an internal portion of BEL13_AG, is flanked by CC BEL13-LTR_AG CC LTRs. The BEL13-I_AG consensus sequence was reconstructed based CC on CC multiple alignment of 6 copies; they are less than ~1% divergent CC from CC the consensus sequence. CC The consensus sequence encodes one 1762-aa BEL13_AGp Bel-like CC protein CC (pos. 429-5714). CC BEL13_AGp is composed of the peptidase A16 (pos. 150-290), RING CC Zn-finger (pos. 385-422), reverse transcriptase (pos. 800-930) and CC integrase (pos. 1480-1620) domains. XX FH Key Location/Qualifiers FT CDS 0..0 FT /product="BEL13_AGp" FT /translation="MSVRRSDVHKTPAESEEWKIAPIDMEEAGPSSEIPAG FT LDEAKEVHRPIDKNQCIRKHFLNKLQRIEEALSGTSLGDTTFLKGCANRLT FT SLASEYEKWHQTVLETADMENFEDGEEEYARFEKRHFNLLLRIERGMSITT FT NVSQSRVKLPELRLPTFDGSLEAWLPFRDSFSSLIDANASLSDVDKLRYLK FT GALTKEANKLIADIEITSANYIVAWELLKARYENKKLAVKRHIDALFLIPV FT MKKDSYESLIHILDSFERSVNITKQLGVATEGWSVLLAHMLHSRLDSATQM FT HWEAHHRSTDVPEYYELLTFLKSHALVLQAMLSPGQKKEQYTSSWKQRSKS FT EVHVVNSSMEICSFCKKGSHSPFKCDMFSGWTVQERYDKVKEKKLCINCLL FT PGHIMKNCTSSVCRVCNKKHHTMLHKPVQTSNSAEASPNDREIVTQPDPPA FT DQNVVTYCGNALLTNSIETPSTILLPTALVKIELPDGSLHWARALLDGGSQ FT INLVTERLCQRLQVIKKRENHPIGGVGQSKHVSSHSTQLTIKSHCTSFKAN FT WKFHVMRYITWNLPAEKVNKTRYCIPNTCTLADPKFYEPSSIDLLIGRESY FT DELMLEGILKLVPEKVMLQNTELGWIVSGRVELERRPTSSIVNLVCTNQDL FT ENQLTKFWEIESCNTNSTMSIEETSCEKVFSETTTRDDQGRFIVTLPTKKD FT IVPQLGNSFEIAKRRLNSLNRRLASNKDLKAAYIAFLEEYVQLGHMEEITE FT QHTNIDTPIYYLPHHCILRPDSLTTKLRVVFDASCATDSGLSLNDALMVGP FT VVQDDLVAIMIRFRLPKFAIVADIEKMYRQVWIKKEDRSLQRILWQNCPEN FT KLRIYELKTITYGTASAPYLATKCLQMLSVHGTSTHPEASRVLANEFYMDD FT LLTGVETQTEGEELCHQLTDLLSSAGFTLRKWASNSSQILQSIPVDQRDTS FT GLCSLDINSSIKTLGLKWIPATDELGFCVPIWTEDEQITKRIALSDASRLY FT DPLGLIGPTIMIAKCFMQNLWALQKAWDEPLEKELHKQWNQFRQQLSIVKD FT
MRIPRRVVGSTHRIEIHGFSDASMKAYGACLYMKSVSEDGKVSVNLLCSKS FT RVAPLANSKRQKNVTLPRLELSAALLLCHLWQKVKDSLKHEYSCFYWVDST FT IVLHWINSSPSRWKPFVANRVSEIQHLTEPRHWNHVPGDQNPADIISRGMM FT PSQLQESCLWWHGPEWLSQPSNTWKLHHPILDCPPSEFEERKTVLIINKQS FT NIHHPIFSLKSTFSGLVRLMAYMQRFSYNCKPVNRNNRRQGYLQTFELHAA FT RENLVRIAQNESFADDIRSLETAGEVKTSSSLRSLTPMLVNGVLRIGGRLR FT NAPVAYDRKHPMILPYKHPLTRLVMDFYHLKTLHAGQQLLIASVREKYWPL FT RVRNLARQVVHECIQCFRCKPSTMEQIMGDLPAERVTPTFPFLNTGVDFCG FT PLFYRSASRKSAPVKCYVAVFVCLATKAIHLELVADLSSDAFISTLKRFVA FT RRGKPSLLQCDNAKNFRGAERKLKVFHQQLQQQQFQQSISSYCGPEGIEFR FT FIPPRSPHFGGIWEAAVKSFKHHFRATIGTSILRRDDLETIIAQVESCLNS FT RPLTPISTEPEDLEVLTPGHFLIHRPLVAVPEPSYEEVPSNRLDRYQQNQE FT FVRRIWNRWSTDYLSGLQPRTKWTKQRDNIHIGTLVLMKEDNLPPLKWSYG FT RVTQIYRGDDGNIRVVTVKTKDGEYKRAITKICVLPIHSNTE" XX SQ Sequence 5752 BP; 1667 A; 1200 C; 1313 G; 1572 T; 0 other; ttttggtcct tcaaatccgg atatcgtttg aaaatcgtga tctatggtga tctttcgtga 60 acattcgtac gcctttgtga acagtcgagc attatcgtat gcgttgttca taggtgttcc 120 tttcgttgta tgcgccttgt tcgtgtggag ttttatcttg ttgttattat ctatggtgat 180 aatttgaaga cagttcctgt tagaggatac tttgttgaat atcatacaat tgagtgtgat 240 attgtgcaaa ttcagcttgc tatcgggtat gctgtattag acgcggacaa tttttaggaa 300 aagatacatt ggtgtgcgtg tacgtgtcgt caggctgtac ctgtttggaa gtgttggtga 360 gatcgcttcg cagccatttt gaagagcagc atcttaaaaa gtattgtcgg cgcaaatagc 420 tcaacaaaat gtcggtcagg agaagtgatg tgcacaaaac gccagcggaa tccgaggaat 480 ggaaaattgc acccattgac atggaagaag ctggaccgtc atcggaaatc cctgcgggtt 540 tggatgaagc aaaggaagtg catcggccta ttgataaaaa ccaatgtata cgcaagcatt 600 ttcttaacaa attacagcga atcgaagaag cattgagtgg tacatcgttg ggtgacacga 660 cgtttctgaa gggatgtgca aatcggctta cctcactggc gtcggagtat gaaaaatggc 720 atcaaaccgt cttggaaacg gccgatatgg agaattttga ggacggtgag gaggaatatg 780 cacgattcga aaaacgtcac ttcaacttat tgctacggat tgaacgtggt atgtcgatca 840 ctacaaatgt ttcgcaatct cgtgtaaagc ttcccgaatt acggctgcca acgtttgatg 900 gctctctcga agcttggctg ccttttcgcg attcttttag tagcctgatt gatgctaatg 960 caagcctgtc agatgtggac aaattgcggt atttgaaggg agcattgaca 
aaggaagcaa 1020 acaagctaat tgcggacatt gagattactt cggctaatta catagtggca tgggagcttc 1080 ttaaggctcg ttatgaaaac aaaaagctag cggtgaagcg acatattgat gctttgtttt 1140 tgataccggt aatgaagaag gattcgtacg agtcgctcat tcacatttta gatagtttcg 1200 agagaagcgt caacataacg aagcagcttg gtgtggcaac agagggttgg agtgtgcttc 1260 tggcgcacat gctacattct cggcttgatt cggctactca aatgcattgg gaggctcacc 1320 atcgcagtac tgacgttcca gaatactatg aacttctaac gttcttgaag agtcatgcgt 1380 tagtgttgca ggcaatgctg tctccaggcc aaaagaaaga acaatataca tcgtcgtgga 1440 agcagcgatc gaaaagtgag gtgcacgtgg taaattcttc tatggagata tgctcgtttt 1500 gtaaaaaagg ttcacattcg ccctttaagt gtgacatgtt tagtggctgg acagtacaag 1560 agcgctatga caaggtcaaa gaaaagaagc tttgtatcaa ctgtttgttg cctgggcata 1620 ttatgaagaa ctgtacatct agtgtatgcc gagtttgcaa caaaaaacat cacactatgt 1680 tgcacaaacc agtacaaacc agtaattcag ctgaagcatc tcctaatgac cgggagatag 1740 tgacgcaacc ggatccacct gcggatcaga acgtggtcac atactgtggc aatgctttgc 1800 tgacaaattc gatagaaaca ccatctacta tcctattacc cacagctttg gtgaaaatag 1860 agttaccaga cggatcgttg cattgggcac gagctctttt agatggagga tcgcagatta 1920 atcttgtaac tgagcgtttg tgtcagcgat tgcaggtcat taaaaagaga gagaaccatc 1980 caattggcgg agttggacaa agcaagcatg tatcatcgca ctcgacacag cttaccataa 2040 aatctcattg cactagcttc aaggcaaatt ggaaatttca tgtaatgcgt tacattactt 2100 ggaatttacc tgcagagaaa gtgaacaaaa cacgttactg cattcccaac acttgtactc 2160 tagcagatcc caaattctat gaaccatcat ccatcgattt attgatcggc agagagagct 2220 atgatgagct gatgctagaa ggtattctta aactggtacc cgagaaagta atgctacaga 2280 acaccgagtt aggctggatc gtttctggca gggttgagct cgaacgtcgt cctacatcct 2340 ctatagtaaa tctggtatgt actaatcaag acctagaaaa tcaactaact aaattttggg 2400 aaatcgaatc ttgtaataca aatagcacca tgtctattga agaaacatca tgtgagaagg 2460 ttttctctga aacaaccacg cgtgatgatc aaggaaggtt tattgtaacg cttcctacga 2520 aaaaggatat cgttccacaa ctcggaaact cgtttgaaat cgccaaacgt agattgaact 2580 cgctaaatcg tcgtcttgca tcaaacaaag accttaaggc tgcttatata gcctttttgg 2640 aggaatatgt tcaattggga catatggaag aaatcacgga acaacatacg aacattgaca 
2700 cacccattta ttacttaccg catcactgta ttttacgccc tgacagttta actacaaagc 2760 ttagagtcgt gttcgatgcg tcttgtgcta ccgactccgg cctctcgttg aatgatgcgt 2820 taatggtggg tccagttgtt caggatgatt tggtcgcaat tatgattcgt tttcgtctgc 2880 ccaaatttgc aatcgtagcg gacatcgaaa agatgtaccg acaagtatgg attaaaaaag 2940 aggatcgctc gttacaaaga atcctttggc agaattgtcc cgaaaataag cttcgaatat 3000 atgagctaaa aacaattacg tatggcactg catctgctcc ttacttagca accaaatgtt 3060 tgcaaatgct ttccgttcat ggaacttcta ctcatccgga agcctctaga gtactggcaa 3120 atgaatttta tatggacgac ttgcttaccg gagtagaaac tcaaacagaa ggagaagaat 3180 tatgccatca actaactgac ttgttgtcga gtgctggttt tactttgaga aaatgggcgt 3240 ctaactcttc ccaaattctg caaagtattc ccgtagacca acgcgatact tctggattgt 3300 gcagccttga cataaatagt tccattaaga cattgggtct taaatggatc ccagcgaccg 3360 atgagttggg attctgcgta ccgatttgga cagaagatga acaaataaca aagcggatag 3420 ccttatcaga tgcatcgcgt ctatacgatc cattgggact cataggtcca acaataatga 3480 tagccaaatg ctttatgcaa aatctttggg cactacaaaa ggcatgggat gagccgctag 3540 aaaaagaatt gcataaacaa tggaatcagt ttcgccaaca gctctcaatt gtgaaagaca 3600 tgcgtatccc acgtagagtg gtaggaagta cccatcgcat cgaaatccat gggtttagtg 3660 atgcgtctat gaaggcttat ggagcctgtt tatatatgaa atcggtctct gaggatggaa 3720 aagtttctgt taaccttttg tgctctaaat ccagagttgc tccactcgca aatagcaaac 3780 gacagaagaa cgtcacttta cctcggctgg aattatctgc agctttattg ttatgccatc 3840 tctggcaaaa ggttaaagat agtcttaaac acgagtattc gtgtttctac tgggtggatt 3900 cgaccatcgt gcttcactgg ataaacagta gcccgtcacg ctggaagcca tttgttgcta 3960 atcgggtgtc tgaaatccag catctgacgg aacctagaca ttggaatcat gttcctgggg 4020 atcaaaatcc agctgatatc atctctagag gaatgatgcc gagtcaattg caagaatcat 4080 gcctatggtg gcatggacca gagtggttaa gccaaccatc caatacttgg aagctacatc 4140 atccaatact cgattgccca ccctctgagt ttgaagaacg gaagactgtt ctgatcatta 4200 acaaacagtc taatattcat catccaatat ttagcttaaa atcgacattc tccggactcg 4260 ttagactgat ggcatatatg caacggttta gctacaactg taaacctgtt aatagaaaca 4320 atcgtcgtca aggctacctt caaacatttg aactacatgc agcaagagag aatttggtac 4380 
gcattgcaca gaacgaatct tttgctgatg atattcggtc tctcgaaact gctggagaag 4440 tcaaaacatc gtcatcttta agatcattga caccaatgct tgtaaacggt gtactacgca 4500 ttgggggacg tcttcgaaat gctcctgttg cttacgatcg gaaacaccca atgatactgc 4560 cctacaaaca tccattgaca cgtcttgtca tggatttcta tcatcttaaa accttacatg 4620 ctggacaaca actgttgatt gcttctgttc gagagaaata ctggccttta cgcgttcgaa 4680 accttgctcg gcaagtagta cacgagtgta tccagtgctt ccgttgtaaa ccatcgacaa 4740 tggaacagat aatgggagat ttacccgcag aacgagttac tccaactttt ccgttcctga 4800 acactggtgt cgatttctgt ggaccgctgt tttatcgttc ggcgtccagg aaatctgctc 4860 cggtgaagtg ctacgttgca gtatttgtgt gcctggccac gaaggctatc catctagaat 4920 tggtagctga tttgtcgtcg gatgcgttca tatcaacact gaaacgattt gtcgctcgtc 4980 ggggaaaacc atctcttctt cagtgtgata acgctaaaaa ttttcgtgga gcggaacgaa 5040 aattgaaagt gtttcatcaa caactgcagc aacaacaatt tcaacaatca atttcgtcgt 5100 attgcggtcc agaaggtata gagtttcgtt ttatccctcc taggtctccc cactttggtg 5160 gaatctggga ggccgccgtc aagtctttta agcatcattt tagagctact attggaactt 5220 cgatcctgcg tcgagacgac ttagaaacga tcatcgccca ggtggaaagc tgcttgaatt 5280 cgcgcccttt gaccccaatc agcacggaac ccgaggattt ggaggtgctt actccagggc 5340 atttcctgat ccatcgtcct ctggttgctg ttcctgaacc ttcatacgag gaagtgccat 5400 ctaatcgcct ggatagatat caacagaatc aggaattcgt gagacgcatt tggaaccgat 5460 ggagtacaga ctacctgtct ggcttgcagc cccgcacgaa gtggacgaaa caacgggaca 5520 acattcatat cggaactctc gtgttgatga aggaagacaa cttgccaccg ttgaaatgga 5580 gttatggacg agtaactcaa atctaccgag gagacgacgg caacatccgc gtggtcacgg 5640 tgaagacaaa ggacggcgaa tacaaacgag caattacgaa gatctgcgtt ttgcccatcc 5700 attccaacac ggaataagtg gtggaaattt caatttccac gggggccggc ta 5752 // ID TransibN2_AG repbase; DNA; ANG; 1750 BP. XX AC . XX DT 21-MAR-2005 (Rel. 10.03, Created) DT 12-APR-2005 (Rel. 10.03, Last updated, Version 1) XX DE TransibN2_AG is a family of nonautonomous DNA transposons - a DE consensus sequence. XX KW Transib; DNA transposon; Transposable Element; Nonautonomous; KW TransibN2_AG. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1750 RA Kapitonov V.V. and Jurka J.; RT "RAG1 core and V(D)J recombination signal sequences were derived RT from Transib transposons."; RL PLoS Biology0-0 (2005). XX DR [1] (Consensus) XX CC TransibN2_AG is a family of nonautonomous DNA transposons that CC belongs to the Transib superfamily. TransibN2_AG elements are CC characterized by 14-bp terminal inverted repeats (1 mismatch) and CC 5-bp target site duplications. This family was active less than 1 CC million years ago (some copies are less than 1% divergent from CC the consensus sequence). XX SQ Sequence 1750 BP; 633 A; 266 C; 261 G; 589 T; 1 other; cacaatgggc atttgccggg atgaaattca aaaaatcaac tttgtaatag cgcaattcca 60 aataataggt tttaatacta agggttaaac taagaaaaat accaaatatg agctctttat 120 ctgttctggt tctctagaaa acacctctca aagtcgagat ttgttaaaaa aacgcagaaa 180 aatcttcact tttttggaaa actttaaacc tttgtaactt tttcaaatat gaaccgattt 240 tgataaattt agacatttta taaaggatat ttagtcagct ttataaacac atgaaaaatt 300 ttgagctaag ttaattttat gcaaaaatat acgataataa ctgaagaaac ctcctgaaaa 360 ttttcatmat tttaattttt gacttgatgc cattataaaa gtggcgtaga gtggcgttgt 420 ctcatatatg tcacatcata gtgtaatata tatactatca atctgtgaag aaatattctg 480 cgaaagtttg cgaacgcgaa ttttattgaa gttttttcac atcaactttt tctaaaaaca 540 gacacatgcg taactaggcg gctccatagg tgcctagtgt agcgtatttc gtgtattcta 600 caaaaatata ttctacgtgc ttttgatcaa ctttcatttc aattacgaag ctgtgtatct 660 tagtttcagc tcatgtactt atcttatatg taaatttcta ttctaaccta actttatcat 720 taaatataat gaagcaaatc agaaccaaat accttttggt cctactgcag attatagaag 780 gatataagga aagtaattct taaattatac gcataacagg aaaacgttat gaatgcagta 840 gcatctattg gcgttttggg agaagaaact cccaaaactc atgacaaact aaatataagg 900 atagagggga gatgctacaa agaatatttt gcaagaaaga aaatgcatcc agaaggacat 960 tttgtataat gagctgtatt cctctgattc gttaaccagt tcaatatcag tagctacacg 
1020 cactaaaaca aagacagggt ttaactttac aaatgaagta aatacttagt ttgtagctat 1080 cgtgcaattt tgattttttc agatagttgt aaaaccaaga aggattttta gttttattac 1140 cccttctgaa ggcagtacag cacttggaaa ggtattggat gaatattcaa agacaaacaa 1200 tatgtgctgc ttttcattac gaatttggga agtactttat ttcaatgcac ctgctctatc 1260 taattattta aatgccttta ttcatgtctt tagaaaagtg aactagacgt caggagcttg 1320 taatagcatc cacctaagtt gtgtatgtaa atttttaata tttgtactac ctgtgagtaa 1380 aagtgacgag aattagctac aattgtccta tacaaagcag aaaaaattat cttattcaaa 1440 attattatgt tttgtttata tattattatt attattattt atttaatctt cattatggaa 1500 cattacagta tattttaaaa tggtttatac tattagcaac aaataataca agaaaaaaaa 1560 cacacaaaat taataaaact aagcaaagat cccttaaaac actattttat gtagcgccac 1620 ctggtggtat ggatgcgaac tacaaataaa atgttattta ctatcataat actttactac 1680 aaacgataaa aactaacaat ttctgcaaaa ataattttat cccgcaattt gttctaaact 1740 gcccattgtg 1750 // ID GYPSY67-I_AG repbase; DNA; ANG; 4715 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY67-I_AG is an internal portion of retrotransposon GYPSY67_AG DE - a consensus sequence. XX KW LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY67-I_AG; GYPSY67-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY67_AG; mag lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4715 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY67_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 173-173 (2004). 
XX DR [1] (Consensus) XX CC GYPSY67_AG is a family of gypsy-like LTR retrotransposons that, CC according to the amino acid sequences of its reverse CC transcriptase, RNase H and integrase domains, is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, CC GYPSY59_AG, GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, CC GYPSY64_AG, GYPSY65_AG, GYPSY66_AG, GYPSY68_AG and GYPSY69_AG CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY67-I_AG consensus was reconstructed after multiple CC alignment of 5 copies. The consensus encodes the 1529-aa CC GYPSY67_AGP gag-pol-like polyprotein (pos. 117-4703). The CC sequence of the LTRs flanking GYPSY67-I_AG is deposited as CC GYPSY67-LTR_AG. CC GYPSY67_AGP: CC MNSGDGAMEVPMDEDQRRSSLGRIDYRAMTAQQTGEVPSVYDPPGHVQPGSRGYVSQQGQLTPR CC PVQLGPSPSQPAPSQSAPLPSVQNGQMGQQPLDNAVLQQTLHLLQQQLKQQQQLISQMLQQQQS CC APPAQLQPAQQYQPAGPSNPELILDALANSIAEFRYEAESGVTFEAWFTRYEDLFAKDASRLGD CC EAKVRLLVRKLGTPEHARYISYILPRSPRDLSFEETVDKLTALFGCRESLLSKRYRCLQICKKR CC TEDLIAFSCRVNRACVEFQFASMNEETFKCLMLVCGLKDEADNDLRTRLLARIEERNDVTLEQL CC SAECQRITSVKGDSAMIAGETSERVFAVHSGEKRSHEKAAQQTNYKRFTPYRTKRPFRAKYAVC CC SSTSKPAKPCWLCGDMHWVRECTYRSHKCLDCARYGHREGHCNTASRKKRFNVRQRNINTRVVT CC VNVRSIRERRRFVSIALNGTAVRLQLDTASDISVIDRRTWRKIGSPPLTPSSVTAKTASGATLV CC LDGEFSCAVSVGSQTRQATLSVCGAANLLLLGADLIDVFSLWSVPMDAFCNHVTVAGQQSFQQL CC FPKVFTGTGLCTKASIKFTLRDNVRPVFRPSRPVAYAMEETVSRELDRLEELNVITPVTTAEWA CC APVVVVRKANGLVRICGDYSTGLNAALFPHDYPLPVPEDIFARLANCKVFSKIDLSDAFLQVEI CC DPEYRHFLTINTHRGLYTYNRLPPGIKIAPTAFQQLMDIMLSGIQGVSVYLDDIIIGGPSEAEH CC DATVVEVLNRIQNYGFTLRAEKCHFRVNQIKYLGHIIDSHGLRPDAQKVEAIRKLPEPTNLTEV CC RSFLGAINYYGRFVPNMRNLRYPLDDLLKAGVEFRWTSECRKAFESFKTILASDLLLTHYDPKQ CC AIIVSADASSVGLGATISHKYPDGSIRVVQHASRALTAAEKAYSQIDREGLAIIFAIKKFHKMI CC
FGRKFLLQTDHRPLLRIFGSKKGIPNFTANRLQRFALTLLAYDFEIEYVRTDQFGNADLLSRLI CC HTHAKPDEDSVIACVTLEAEVKSLVTSAIQHVPVNFIDISRETIADRLLSKVLQYVQHGWPNNA CC AYEGELSRFHDRKDSLTAIDGCLLFRERVVIPKVLQRTCLKQVHLGHPGINRMKAMARSYFYWP CC SMDHDIVDWVNSCHSCQIAAKSPTHIKSTSWPEAPGPWYRIHVDYAGPLNGEWYLVIVDSFSKW CC PEVIPTSSTTTTATISILRNIFARFGNPVTLVSDNGPQFTSTDFESFCRQNGVEHIRTAPYHPQ CC SNGLAERFVDTFKCALRKMSADGLTLREALDTFLQTYRATPNAQLNNSKTPAEIMLGRRPRTLL CC ELLLPPRQASQPSAGLRVRELNPGESIYAKEYRLNDWKWVTATVVDRQGRYMYLVRTAEGKTFR CC RHINQLRRRISDGSFKDTTRAHSPLPLDLLFDAWHFNPSPVPASSGQLEADKPSSPPAPVSSCV CC VPSSNFNIESSRPTLIKCPRRRATSRRTDQTIDSVHEPRRSSRNRRPPSRFDVYRTF. XX SQ Sequence 4715 BP; 1280 A; 1152 C; 1170 G; 1113 T; 0 other; gtggcgacga gaatttgttt taaaagttta gtgtctgtaa agtaaaagtc tgtgttttaa 60 cgtgtaagag tgtttaaaaa aaaaaaaact ccagcatcac gtcgaagcgg cgcaggatga 120 acagcggaga tggagcgatg gaagtcccga tggatgaaga ccagcgacga tcatcgttag 180 gccgtatcga ttaccgtgca atgacggccc aacaaaccgg tgaggttcct tccgtgtacg 240 acccgccggg ccatgttcaa cctggatcgc gcggctacgt gagtcagcaa gggcagttaa 300 caccgaggcc agtgcagcta ggaccatcgc catcgcaacc agcgccatca cagtcagcac 360 cattgccatc cgtacaaaat ggccagatgg gtcaacaacc tctggataac gccgtcctgc 420 agcaaacctt gcacttgctg caacagcaat tgaagcaaca gcaacagtta atttcgcaaa 480 tgttgcaaca gcagcaatcc gcgccgccag cccagctaca acccgcacag cagtaccagc 540 ccgccggtcc tagtaaccca gaacttatac tcgatgcttt ggccaatagc atagcagagt 600 tccggtatga agctgaatct ggtgttacgt tcgaagcttg gttcacacgc tacgaggacc 660 tgtttgccaa agacgcttcg cggctaggcg atgaggctaa ggtaaggctt ttagtgcgta 720 agttgggaac accagaacac gcccgttata taagctatat tttaccccgc tcaccacgtg 780 atttatcgtt tgaggaaacc gtcgataagc tgacagcact ttttgggtgt agagaatctc 840 tccttagtaa gcgctacaga tgcctccaga tttgtaaaaa gcgcacggaa gatttgatcg 900 cgttctcttg tcgggtaaac cgagcatgcg tcgagttcca gtttgcgagc atgaacgaag 960 agaccttcaa gtgcctaatg ctggtgtgtg ggctcaaaga tgaggctgac aatgacctgc 1020 gaactagact ccttgcgcgc atagaggagc ggaacgacgt tacgcttgaa caattatccg 1080 cagagtgtca gcgtattaca agtgtgaaag gagacagcgc tatgattgcc ggagagacga 1140 
gtgaacgtgt tttcgcagta cacagcggag agaagagatc gcacgagaaa gcagcgcagc 1200 aaactaatta caagcggttc acgccgtacc gtaccaaacg accgttccgt gcaaagtatg 1260 ctgtgtgttc atcaacaagc aagccagcga agccctgttg gttgtgtggt gacatgcact 1320 gggtgcgtga gtgcacttac cgttcgcaca agtgtctcga ctgtgcaaga tatggtcatc 1380 gtgagggaca ctgcaacaca gcgagcagga agaagcggtt caatgttcgg caacggaata 1440 tcaacactcg tgtagtaacg gtcaacgtcc gaagcatacg agagcgccgc agattcgtgt 1500 ccatcgctct caacggaaca gctgttcgat tgcagctaga cacagcttcg gatatcagtg 1560 tcatcgaccg ccgtacgtgg agaaaaattg gcagcccgcc tttaacaccg tcatccgtca 1620 cagctaaaac tgcatcggga gctacactgg tattggatgg tgaattcagc tgtgctgtta 1680 gtgttggtag ccagacgagg caggcgacac tcagcgtatg cggagcagca aacctactac 1740 tactgggagc cgatttgatt gatgttttct ccctctggtc ggtgccgatg gatgcgttct 1800 gcaatcacgt tacagtagca ggacagcaat cgttccagca gctttttccc aaggtgttta 1860 cgggaacagg actctgtaca aaggcgagta taaaatttac cttgcgagat aatgttcgtc 1920 ctgtttttag acctagccgc ccagtagctt acgcgatgga ggaaaccgtg agtcgtgagc 1980 tcgaccgtct tgaggagttg aatgtcatca ctcctgtaac tactgcagaa tgggcagctc 2040 cagtggtcgt cgtgcgcaaa gccaacggac ttgttcgtat ttgcggcgac tattcgacgg 2100 gacttaatgc tgcgttattc ccccacgact acccgttacc tgtaccagag gacatttttg 2160 caaggctggc aaattgcaag gtctttagca aaatagacct gtcagatgca tttttgcaag 2220 tggaaattga tccagaatac cgtcatttct tgaccataaa tacgcatcga gggctctata 2280 cgtacaaccg acttccacct ggcattaaga tagctcctac agcctttcag caattgatgg 2340 acataatgct atccggtatc caaggcgttt cagtgtactt ggatgacatc atcatcggag 2400 gtccatccga agcggagcac gacgcaaccg tagtagaagt tttgaatcga attcagaact 2460 acgggttcac actgcgagcg gaaaaatgcc acttcagggt taaccaaatt aaatatttgg 2520 gccacattat cgacagccat ggattacggc ctgatgcgca gaaggttgaa gctattcgca 2580 agttaccgga gccaaccaat ttaaccgaag taagatcgtt tttaggggcc ataaactatt 2640 acggtcggtt tgtacccaac atgagaaatt tgcgctatcc attggacgat ttgttaaagg 2700 ccggagtcga atttcgctgg acgtcggaat gcagaaaagc ttttgagagc ttcaagacga 2760 tattggcttc cgatttacta ctaacccatt acgatcctaa acaagcaatc atcgtatctg 2820 ccgacgcatc 
atcagttggc ctcggggcga caataagcca taagtatcct gacggatcta 2880 ttcgggtggt ccagcatgcc tcgcgcgccc tcacagcggc agaaaaagca tacagccaga 2940 ttgatcgcga gggcttggcc ataatttttg cgataaaaaa atttcataag atgatattcg 3000 gcagaaagtt tttattacaa acagatcatc ggccgctttt gcgaatattt gggtcgaaaa 3060 aagggatccc caattttaca gccaatcggt tgcaacgttt cgctcttacc ctactggcat 3120 atgattttga aatagaatac gttcgcaccg atcagtttgg caacgctgac cttctttccc 3180 gcctgataca cacacacgcc aagccagacg aagattcagt gatagcatgt gttacattag 3240 aggcggaagt aaaatcattg gtaactagcg ctattcagca tgttccagtt aattttatcg 3300 acataagtag agaaaccata gccgacagat tattatccaa agtccttcaa tacgttcaac 3360 atggatggcc aaataatgca gcttatgaag gagaactgtc gcgcttccat gacagaaaag 3420 attcgcttac agcaatcgac gggtgtctcc tatttagaga gagggtggta atccccaaag 3480 tattgcaacg cacctgcttg aaacaggttc atctcgggca cccaggaata aataggatga 3540 aagccatggc cagaagttat ttttattggc catcaatgga ccatgacatt gtcgattggg 3600 tgaactcgtg tcattcatgc cagatagcag ctaagtctcc aactcacatc aagtcgacca 3660 gctggccaga agcaccaggt ccgtggtacc gcattcatgt cgactacgcg ggaccactca 3720 acggggaatg gtacctggtg attgtcgatt cgttctccaa gtggccagaa gtgattccaa 3780 cgagcagtac aacaacaacg gcaacgataa gtatactgcg taacatattt gctcgttttg 3840 gcaacccagt aacacttgtg tcggataatg gtccacaatt cactagcaca gattttgaat 3900 ctttttgtag gcaaaatggg gttgagcaca tcaggacagc gccgtatcac ccgcaatcca 3960 acgggctggc tgaaaggttt gtcgacacct ttaagtgcgc tttgaggaag atgtcagccg 4020 atggtctcac gttacgagaa gctctggaca ccttcttgca gacctatcgg gccaccccga 4080 acgctcagct gaataattcg aaaactcctg ccgagattat gctaggcagg cgtcccagga 4140 ccttgctaga gctgttattg ccgccacgtc aagcttcaca accgagtgct ggtctacgag 4200 ttcgtgagct gaatcccggt gagtcgatct acgctaagga gtaccgcctg aacgattgga 4260 aatgggtcac tgctacggtg gtcgatcggc aaggcagata catgtacctg gttcggactg 4320 ctgaggggaa gacgttccgt cgacacatta atcagctacg tcggcgtatc agcgacggtt 4380 cgtttaagga cacaacacgg gcccattcac cactaccttt ggatctgtta ttcgatgctt 4440 ggcactttaa tccgtcacca gtacctgcat catcgggcca gctagaagct gataaaccgt 4500 cctcaccgcc agcgcctgtg 
tcgtcctgtg ttgtaccatc tagtaatttt aatattgaga 4560 gcagccgtcc aacactcatc aaatgtccac gaagaagagc tacatcccgc agaacggatc 4620 aaacaatcga ttcggtacat gaaccacgtc gctcttctcg caacagaaga ccgcccagtc 4680 ggttcgatgt gtaccgaacg ttttaaggag ggaga 4715 // ID R7Ag2 repbase; DNA; ANG; 6590 BP. XX AC AB090821; XX DT 14-SEP-2005 (Rel. 10.09, Created) DT 14-SEP-2005 (Rel. 10.09, Last updated, Version 1) XX DE Anopheles gambiae retrotransposon R7Ag2 DNA, complete sequence. XX KW Non-LTR Retrotransposon; Transposable Element; R7Ag2. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6590 RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol Biol Evol 20(3), 351-361 (2003). XX DR EMBL/GenBank/DDBJ; AB090821; Positions 1 6590. XX FH Key Location/Qualifiers FT CDS 1762..2610 FT /product="R7Ag2_1p" FT /translation="MLQQQQQQQRQPQRQAVVGTQQQQQRRQQQQHQQRSN FT ATQAQRREQLRNEQRRPARLRQDQIIFEPAEGTSYKVLYEKIRLNPRLQEE FT NKGVHQGYRTTRDFLRLELKKDTDAASLLQRIQQEVGDLAAGRIVTEMAEV FT LITGIDMLAKKEDVERGLQRALERTAVAATTSLWERRDGTQRARVRLPRRD FT TDLLLDKRIVVGHSVCLVRSAPKQQQSAVRCFRCLERGHTTADCAGEDRSS FT LCLHCGAADHRAASCTSDPKCIVCGGPHRIAAPMCKGPPSQC" FT CDS 2526..6110 FT /product="R7Ag2_2p" FT /translation="MHFGPEVHCLRRPTSHRRPHVQRTAITMLNILQFNAN FT HCENAQDLALHVITTESLDVLLLSEPYCVPRNNGNWVTDESKTVAIVVNGN FT RLPIQRIRHRQTLGVVAADVGGTTIVSCYVSPQTGVPEFRSIMEKIDLIVR FT GCSRVLLAGDFNAMHLDWGSSRTCPKGLELLQLADNLGLVLLNKADCLPTF FT KGNRADTNRFPPSRPDVTFASSVISRLDPRDDSARGWRVPDVATLSDHRYV FT QYEVGESSPPTRDRAARRGQRPARVSKAGTRWKTSQFDSQLFGKALAMTGF FT ARQVNSVESLVESLTSVCDETMSRVFPTQDHTGRPAYWWTPAIQAMIDNLS FT RKEQMTMRTIPPEEQLQTELLAARESLRKAIRLSKNEAFDRFLRSIREDVT FT GIFFRKVFHWFQGARSAPERDPAELRRIVDALFPVHPPVEWPDLGVGNMAP FT LRSIGLTELDQIAASMHPRKAPGLDGVPNAALTVALRQQPEPFRRVFQECL FT DMSCFPQPWKKQRLVLLPKPGKSPGEPSSFRPICLLDNTGKALERLLLNRL FT 
NEYIEDPESPQLSEQQFGFRRGRSTLQAIQQVVDAGRRALSLGRTNNRDRR FT CLMVVALDVRNAFNTASWQSIAEALQAKGVPVQLCRILQDYFADRELEYDT FT ADGPVTRRVSAGVTQGSILGPTLWNIMYDGVLAVELPEGASIVGFADDLAI FT LAAGTIPEHAAAIAEEAVAAVNNWMVQHKLSLAPEKTELLMISSKRSGYRN FT IPVNICGVEVRSKRSIRYLGVMLHDHLSWRPHVEMVADKALRVVRALRGIM FT RNHSGPQVSKRKLLAAVAASIIRYGAPVWTEATDLQWCRRILDRVQRLLAQ FT GITSAFHSTSCEVAVVLAGELPYHLLAKEDARCYNRQQSSPDSSREAIRQE FT EKETSLQLWQQQWDDVAANNTSRYLRWAHRQVPDVRLWTGRKHGEVDFYLS FT QVLSGHAFVHEFLHVFGFAPSPDCPRCAGSVESVAHVMFECPRFADVRAEF FT LQGVGEHNLGSRLLESAEWWDRIQQAARRILSVLQEDWREEQQTLAAAEAA FT QPDPASSLPEDMAEAERRLLRRREVRNRSAQRRRQQQRQQRLGDFELVPAV FT LARAAANEAAEPTGEVEEEEVSPPVPPIPPRSRRLPPSPRTTEMRRRRRNY FT MQLQYRRRRRDGELGDVPQGRQRRGRIPTSAAELER" XX SQ Sequence 6590 BP; 1616 A; 1777 C; 2009 G; 1188 T; 0 other; agttgcaact gagagttcaa accgaacaga cgtgcctcca cgcagttccg tgagcaagtg 60 aaaaaattgc aagtaaaatg caataaaacc gtatgaattg tgcatattcg tgtagatgaa 120 tgcaaagtga aacactgtgt gtgaaatttg gatcaaaaac agtggataga aacgttagaa 180 aagtgtttgt aatattctgt gacgtcacac cataagtttg tatggagaaa actttgtgcg 240 tcaaataaac aggttactac atatatttcc ggaatcaaaa gtgatcaggc gatagatctc 300 gccgggatct ttacagtgac ggttttcctg tggaaaaata ttaaatggaa agtggtaaaa 360 agtgcatcag aaagttattt aattcggaat ttatcgcgtg ggtaaaatgg cgtctgcccg 420 tgaaaatttt tattttcggg ttgggccagt gtgtcaaaag tgacaaggct tggtaaccgc 480 ctgacaccgt tcggccccgc cattttcgcc gtcgttgaag tgtcaaactg aattagtgaa 540 acattacaag aacaaattca aatggtccgt tgaaagtgca gtaaaagaag aaagttttcg 600 tgaaagtttt caaaaaaatt caaaaaaaaa ggcgttgcga gtgaaattgt ggtgaagtgt 660 gtgaaatatt tccgaccgtt aatttcaaaa ccaaccgttg gtgaactttt cggggccagt 720 tttgaaatta acggccaagt gcgcggaaag tttcacggcc aagacggaaa gggccgttta 780 tagtgtggtg tgtgagttga actatacacg gacatacgtt ccggccgaaa tatagtgagt 840 ggttgaagtg ttaaagcggt aggcacacgt gctgaaacaa atttgaaacg acggttggaa 900 gcagttcgac agcgttacat agcgagcgga tttgcagcgg tacgcccaag ttgcagcggt 960 acggcggagg tgatatgggc aacgacatct gttggcaata acaacagccg agacagaaac 1020 cagctccccg ttcggttggc taacgcggtg ccttctattg 
cccgcaagca gaaaaataaa 1080 ctgctgccac ctgcggtaga cgagtaaaat ggttgcaaaa ctctaaaata aatggatcgc 1140 aacttaaggc ctcggcagcc gtctgtcgat tgtagcttgg aagcgattcc aaagaagacc 1200 acagtttctg cgaaagtcga cactggacgc acgtccatga cggcgaagat gaacgaggag 1260 aaagaaatcc gcctgcacct catgctgcag gcggagaagg ccgaaaaggc ggaacttatg 1320 aagacggtgg caagtcttca agccacaata gagggccttc agaagcagct ggcggaggaa 1380 cagcaggcca ggctttccgc cgatgcggag ctaagaagga acgggatgca atgctggccg 1440 agatccgaga cctggggcag taaatgcgcc gggagctatc ctggaaactc ggccaacagc 1500 aacagttgca gctgcaatcg caacctggtc cgtcagggac agcggcggta aagctcccgg 1560 accaccaggc gccaacggca gaaggcattc ggcaagcgga gcagctcagc ttcgctgagg 1620 tcgtccgccg caaatttcgc ggcatggcta aggggaaact ccggcagccc cctcagcagc 1680 atcagcagca gcagcaggaa cagcagcagc tgcaacaaca gcagcagcag cggtacccac 1740 agcgacaggc cggtcgctgg catgctgcag caacaacagc agcagcagcg gcagccacag 1800 cgacaggcgg tcgtcggcac gcagcagcag caacagcgtc ggcagcagca acaacaccag 1860 cagcggtcaa atgccacgca ggcgcagcgc cgagaacagc tgcggaacga acaacgacgt 1920 ccagcgcgcc ttcggcagga ccaaatcatc ttcgagccag ccgaaggtac atcctacaag 1980 gtgctgtacg agaagatacg cttgaacccg cgcctacagg aggaaaacaa gggagtccac 2040 cagggctacc gtacgactcg ggacttcctc cgactcgagt tgaagaagga tacagacgcc 2100 gcttcgttgt tgcagcggat ccaacaggaa gttggcgact tggccgctgg ccggatcgta 2160 acagagatgg ccgaagtctt gatcacgggt atcgacatgc tggccaagaa ggaggatgtg 2220 gaacgcggtc tgcagcgggc gctggagcgc acagcagttg cagcaactac atctttgtgg 2280 gagcgtcgcg acgggacgca gcgtgcccgt gtccgactgc cacggaggga cactgacctc 2340 ctactggata agcgcatcgt ggtcgggcat tcggtgtgcc tggtgcgcag tgccccgaaa 2400 cagcagcaaa gcgcggttcg ctgtttccgc tgcttggagc gcggccatac cacagcagac 2460 tgcgctggcg aggatcgatc cagcttgtgc ctgcactgcg gagccgcgga tcatcgcgcg 2520 gcgtcatgca cttcggaccc gaagtgcatt gtctgcggcg gcccacatcg catcgccgcc 2580 cccatgtgca aaggaccgcc atcacaatgt tgaacatcct gcagttcaac gcaaatcact 2640 gtgagaacgc ccaggacctg gcgttgcacg tgataaccac cgagagtctc gacgttctgc 2700 tcttgtccga gccctattgc gtaccgcgca ataacggcaa ttgggtgacg 
gacgagagta 2760 agacggttgc catcgtcgtc aacggcaacc ggctgccgat acagcgtatc aggcaccgtc 2820 agaccctcgg tgtggttgct gccgacgtgg gaggaacaac aatcgtgagc tgctacgtgt 2880 cgccgcagac gggagtcccg gagttccgga gcattatgga gaagattgac ttgatcgtcc 2940 gcgggtgcag ccgggttctc ctggccggcg atttcaacgc catgcacctt gactggggaa 3000 gcagcaggac ttgcccgaag ggcttggagc ttctccagtt ggcggacaac ctcgggctgg 3060 tgctcctgaa caaggcggac tgcctaccta cgttcaaggg gaaccgagcg gacacaaatc 3120 ggttcccgcc cagccgaccg gacgttacat tcgccagcag tgtgatcagc cgcctcgacc 3180 cgcgcgatga cagcgcccga ggctggcgcg tgcctgacgt ggcaacgctg agtgatcacc 3240 gatacgtcca gtacgaggtt ggcgagagtt caccaccaac gagggatcgg gcagcacggc 3300 gtggacagcg cccggcccgt gtaagcaaag ccggtacgcg ctggaagacc agtcagtttg 3360 actcccagct tttcgggaaa gcgctggcta tgactgggtt cgcccgtcaa gttaacagcg 3420 tcgaaagctt ggtcgagtcg ctgaccagcg tctgcgacga gacgatgtcg cgggtcttcc 3480 caacgcagga ccacacaggt cggccagctt actggtggac tccggcgatc caggcaatga 3540 tagacaacct ctccagaaag gagcagatga cgatgaggac aatcccgccg gaagagcagc 3600 tccaaacgga actcttagct gccagagaga gccttcggaa ggctattcgt ttgagtaaga 3660 acgaggcgtt cgaccggttc ttgcggtcga tcagggagga cgtaactgga atttttttcc 3720 ggaaagtctt ccactggttc cagggagccc gctcagcccc ggaacgtgat ccagcagaac 3780 tgcggcgaat tgtcgacgcc ttgttcccgg ttcatccgcc ggtcgagtgg cctgatctcg 3840 gagttggcaa catggcgccg cttcgatcga tcggcctgac cgagctggac caaatagcag 3900 ccagcatgca cccgcggaag gctcctgggc ttgatggagt gccgaacgcc gctcttacag 3960 tcgctctcag gcagcaaccg gagcccttcc ggcgggtgtt tcaggagtgc ctggacatgt 4020 cctgctttcc acagccgtgg aaaaagcagc gactggtgct cctgccaaag ccaggcaaat 4080 caccgggcga gccgtcgtcc ttccgcccga tttgcctgct ggacaacacg ggcaaggctt 4140 tggagcggct gctattaaac cggctcaacg agtacatcga agatcctgag agtccgcaac 4200 tgtctgagca gcagttcgga ttccggcgag ggcgatcgac tctgcaggcc atccagcagg 4260 tcgtggacgc gggacggagg gcgttgtcct taggcagaac caacaaccgt gaccggcgct 4320 gcctcatggt tgttgccttg gacgtccgca acgcgttcaa tacggccagc tggcagagca 4380 tcgccgaggc gctccaggcg aagggtgtcc cagtgcagtt gtgccgcata ctgcaggact 
4440 acttcgccga cagggagctc gagtacgaca cggcggacgg gccggttacc cgccgagtat 4500 cggcaggtgt tacacagggg tccatattgg gccccacact gtggaacatc atgtacgacg 4560 gcgtgctagc cgtagagctc cctgagggcg cctccatcgt gggattcgcc gacgatcttg 4620 cgattctggc agcgggaaca atcccagaac acgccgctgc aatagcggag gaagcagtag 4680 cagcggtcaa caactggatg gtgcagcata aactttctct ggcgccggag aagacggagc 4740 tgctgatgat ctccagtaag cgcagcggat atcgtaacat cccggttaac atctgcgggg 4800 tggaagtgcg ctcgaagcga tcgatccgtt acttgggggt catgctacac gaccacctat 4860 cgtggcgccc acacgtcgag atggtcgcgg acaaggccct ccgtgtggtg cgagcattgc 4920 gcggtatcat gcgcaaccac agcggccccc aagtgagcaa gcggaagctg ctcgccgcag 4980 tggccgcatc cattatccgc tacggcgcac ccgtctggac ggaagccacg gacttgcagt 5040 ggtgcaggcg gatattggat cgcgtgcagc gcctcctggc ccagggcatc acgagcgcct 5100 tccactccac gagctgcgag gtagcggtag ttttagccgg agaactgccc taccacctcc 5160 tggcgaagga agacgcccgc tgctacaatc gacagcagtc cagcccggac agcagtcgag 5220 aggcgattcg ccaggaggag aaggagacat cgctgcagct gtggcagcaa cagtgggacg 5280 acgtggcggc aaacaacacc agccgctact tacgttgggc ccaccgacaa gtgccagacg 5340 tgcgcctgtg gacgggacgg aagcacggag aggtagattt ctacctctcg caggtactta 5400 gtggacacgc cttcgtccac gagttcctgc acgtgttcgg gttcgctccg tccccggatt 5460 gtcccaggtg cgcagggtcg gtcgagtcgg tggcccacgt aatgttcgag tgtccacgtt 5520 tcgcggatgt ccgggcggag ttcctgcaag gcgtcggcga acacaacctc ggcagccgcc 5580 tgttggagag tgcggagtgg tgggaccgca tccagcaagc ggctcggcgg atcctctccg 5640 ttctgcagga ggactggcgg gaggagcagc aaaccttggc agcagctgag gctgctcagc 5700 cagatcctgc atccagcctg cccgaggaca tggcagaggc agaacggaga ctgctgcgac 5760 gtcgcgaggt gcgcaaccgc agcgcccagc ggaggcggca acagcaacgg cagcagcgac 5820 tcggcgactt tgagctggta ccagccgtgt tggcacgcgc agcggccaac gaagcagcgg 5880 agccgacagg ggaggtcgag gaggaggagg tcagccctcc agtaccacca atccctccgc 5940 gcagccggcg attgccaccc tccccacgaa caacggagat gcggcgacgt cggcgcaact 6000 atatgcagct gcagtatcgg aggaggcgtc gggacggaga gctcggagat gtcccgcagg 6060 gtcgccagcg acgtgggcga ataccaacat ccgcagcgga gctggagcga tgacggctca 6120 
acaggcgaca gagggagaga caacggcggc tggagcaacg gcaagcggag gtcgaggctc 6180 agcctccccc gtggcgagct gccgaatgag ctgccgagct ctcagctggc cgaaagaaga 6240 agcagcctca cagacgtcga gagagcggcc ataacatcgg caaatacttc cgcacgttga 6300 ggaatattgg aatatgattt ggagacacct gctaggaaac ggaaaacgtt aaaggcttac 6360 ggaacgacat ttttgttttg cgaaaaagga tatcttcctc tgatcatttg gggaagtaca 6420 aaaactaagt gaactaaaat actatagttt aaataaagag aaaggcccaa atgagcgaaa 6480 ttcccggcgg ggtggagtcc attagtggta accccccgct ggtaggagcg ggttttcttg 6540 gaagcagtta gtaacaccaa taaagataac ccaatgaatt aaaaaaaaaa 6590 // ID GYPSY4-LTR_AG repbase; DNA; ANG; 270 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE GYPSY4-LTR_AG is an LTR of the GYPSY4_AG LTR retrotransposon - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW GYPSY4-I_AG; GYPSY4-LTR_AG; GYPSY4_AG; Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-270 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "GYPSY4_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 78-78 (2003). XX DR [1] (Consensus) XX CC GYPSY4-LTR is a long terminal repeat of GYPSY4_AG (its internal CC portion is deposited as GYPSY4-I_AG). XX SQ Sequence 270 BP; 74 A; 73 C; 53 G; 70 T; 0 other; tgtaacgtcc ggactaatat cgcccactgt cactcaaccc gaaccccgaa cgcagcggta 60 taaacgcatt ataccttgac cgctggagca cccggtgtgc tagatgaact gtcatagaat 120 aaagctctct tcttggcgcg acattgaact gaacagacgt aagccactga cttctgcgta 180 taattatttg tgtgctcttc cgaattgtgc taaatcttat taaaacggcc aattaacctt 240 ccgccaaccg taaaacgctt ggtcgttaca 270 // ID BEL1-I_AG repbase; DNA; ANG; 7959 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel.
8.02, Last updated, Version 1) XX DE BEL1-I_AG is an internal portion of the BEL1_AG LTR DE retrotransposon - a consensus sequence. XX KW 5-bp TSD; BEL1-I_AG; BEL1-LTR_AG; BEL1_AG; Bel clade; KW LTR retrotransposon; PHD zinc finger; env; integrase; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-7959 RA Kapitonov V.V. and Jurka J.; RT "BEL1_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 8-8 (2003). XX DR [1] (Consensus) XX CC BEL1_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL1-I_AG, an internal portion of BEL1_AG, is flanked by BEL1-LTR_AG CC LTRs. The BEL1-I_AG consensus sequence was reconstructed based on CC multiple alignment of ~20 copies. CC The consensus sequence encodes two proteins: a 1771-aa BEL1-I_AG-ORF1p CC (positions 1204-6516) and a 502-aa BEL1-I_AG-ORF2p (positions CC 6452-7957). BEL1-I_AG-ORF1p is composed of the PHD domain (aa positions CC 13-54), reverse transcriptase (aa positions 745-900) and integrase CC (aa positions 1421-1569). CC BEL1-I_AG-ORF2p is a putative env-like protein. It is distantly CC similar to the env-like proteins encoded by Tom and Ted retrotransposons CC from Drosophila ananassae and Trichoplusia ni, respectively. CC Some copies of BEL1_AG are nearly identical to each other. Therefore, CC BEL1_AG may still be active.
XX SQ Sequence 7959 BP; 2873 A; 1648 C; 1673 G; 1764 T; 1 other; tggtggctcc agagaggact agtagagacc tcgcaacgta caagtcggag gttaacttcc 60 gatcttgaaa cccagagtcg gcaggaacat acaagtcgga ggttagcttc cgatcttgac 120 gaacgtattc ctgaccgata gagaaaaaca accagcgctc caacggagga aggttcgtaa 180 aagcggaaaa catcgcgaat gtataatagt atccaaaaac gtaaagtgca acaaaccaaa 240 cgagtgcaaa aaacgcctga aacagtgctt aagaagtcca aggtacagag acagaattca 300 gtaaggcgcc gtacgatcac cgatatattg tgattgaata aagacaccgg tggcaaaaaa 360 aaaaaacgac gttcgctata tcgtttcggt cgtaccacag gtttgtaaac aaaacaaaaa 420 gggtgtgtgt gtgcagtgaa aaaaaaaaaa aaaaaaaaaa aacgacgttc gctatatcgt 480 ttcggtcgta ccacaggttt gtaaacaaaa caaaaagggt gtgtgtgtgc agtgaaacgc 540 aacctcagaa agggcgttta aaacggacga cctgagaaca tctgatcgaa aatcattacg 600 tagcgtacta gtaacaaagt gcagtgtaaa caattacgcg tgtggcacac attcgcagtg 660 agaacgtaat aaaacccaca agacgctaac gaccgcggtt tacggacgtt cccgtaaagc 720 accggcagtg tactccaatt acaacaacac aagtaaatgt gcggatcaag aaaagatcaa 780 cagttgggga gtgtaacaat acgggacgtg catacaaaaa agaaatcttc gagctaaacc 840 gtatccggct ttagcatttc cgggaaacca caaacataaa ggtgttgata agagtgcgaa 900 ctagtggagt agtacgtgtc cagcttattg acgtgtgttg atacggtgat cggtagccaa 960 tggtcatcgc cccggaagac ccgattatcc aacgaaaaag cgtggcggca atactaccca 1020 agcagcggat ctgaaggcga ccagaagaca ggggtactga cgacaccgcg ggggatcacc 1080 tgaagacgac ctcgaggagg acaagcaggg gcgttgtaag agcagcaacg gcacgacgac 1140 acccatccag atcggcagcg acagcccagt tactttatcc accggttacg caaagtgagt 1200 agtatggata tcatatttaa aagcaacccg aaaggcaatt gcaagctatg caagaacccc 1260 gatgagtggg acacacaagt taattgtatt gagtgtgaca gatggttaca tctcaaatgt 1320 ctgaagctag aaggtcccgt taaaaaatat gtgtgtccaa aatgctacac aatagctgag 1380 gaacgcaagg gaaataggga ggccttaatg caaacagaga ggctactaaa agaaaaaact 1440 gaagcggaaa aaagaactag agaagaaaac gaaaggtgtg agaaggaaat cgaaagacta 1500 gaagacatat taagaaatga agaaatacat aaccaatccg acacaactca cctacaagac 1560 gacctacaga cacttacaac aaacgtaaat aaaatggcaa acttgggttt tgctccacac 1620 aagaagacag ttttaaaact tccggatttc 
tatggcaatt atagaacatg gcctcgtttt 1680 aaactactgt tcgaagaaac cactcgaaca gaaaaatttt cgaatttgga aaaccttaca 1740 cggctccaaa ttcaccttaa gggagatgca ttgcgatccg ttagcgggtt gatgttgaac 1800 ccaagtaacg tggatgcaat cttggaaaga ttggggaggt tatatggcaa cccagttagc 1860 atttttaacg ccttactaaa agaccttatg gtggttaaac gggcatcttt ggaaaatcca 1920 tcttcgatta ttgagttctg taacgcactg aataacatgg tggaaaatat gaccatgttg 1980 aaccaaacgg agtacttgat ggaccaaagg ctccttacag atctggtcgc aaaactctct 2040 ccggacctta aaaccaggtg gctcagagat tcacttaacg aggaaggtga caaaatcaaa 2100 accttgaaag atttcagcaa atggttgaaa ccaacagaag acgtggcgat cacacttctt 2160 gctatggaag gtggtcaaag agacagaccg gcgaggctga atactcacta ttcagccagc 2220 catcaaattt caaataaagg ctgtctaatt tgcagccgtc ctcatgaaac catatcttgt 2280 tacaagctaa agaatgcctc agtaaacgaa agatggaaaa tgctgaagga gaaaaacgtt 2340 tgcactaact gctgcaaatt ctctaaccat gcggccatta actgtcgctc aaggccgcag 2400 tgtacagtgg atggttgtgg acgacgacac aacaccatat tgcatgaaga aaaattyaac 2460 tcaatgggcg cggcgtcaaa agcacattta aattttcatc aaaactcgga acaataccta 2520 tttcaagttc tgccaataac tgtctataac gaaaacaact ccatcgaaac atttgcattg 2580 atagacccag gatcctcaac gagcctcatg acagaaagcc taagacagaa actaaatctg 2640 catggcccaa ggaagccgtt aacactctcg tggacaaatg gatgcaacca ggtagaggat 2700 acaagcacgt cggtatctct gaaactcaga ggtccaaacg gcaggctgct ttatgtcaag 2760 gacattagga cagtaaaaga actggaccta cccactcaaa gcatcaatgc aaacgtgttg 2820 aagagaaaat tttcccactt aaagacggta aatatttcaa gctacaaaaa tgctaaaccc 2880 accattctgt taggacttcc acatgcttat tacacgcaag ctgtggagtc caaatcagga 2940 gcgcccaatg aaccagtggc acacaaaaca cgcattggtt gggtcgtatt tggaaagtgc 3000 agagatggtg atgcaaaaga aaatcaacat cttttcacaa tacaggataa gaaagaggag 3060 gaagaaaagt caatgaggga cctgatgaaa aggttttttt caacagaaga atttggcgta 3120 agggaaacca aattcacccc aaaatcaaag gaccatgaaa gagccctaag tgtaatgaat 3180 gacacactga aatatacaaa taatcagtat gagattggcc tactttggaa agatcccaat 3240 gtatccttac caagcagcta cgcacaggcg ctaagaaggc tcgaaagtca agaacggaaa 3300 atgaaaggta atgacgagat gaaaacctgg tataaaaatc 
aaattactga ttatgttcag 3360 aaaggttacg ctcgtaaact aacaccattt gaattgctga atagagatcc aaagatcaat 3420 tacattcccc attttatggt catcaatcca aataagccaa ctccaaaacc aagactggtt 3480 ttcgatgcag ctgcaaagaa cgaagggatt tcacttaact ctactctctt gtccggacca 3540 gacgccacta cgtcaatttt tggagtatta atccgctttc gcgaataccc tatcgcctgt 3600 tcaggggaca tcaaggagat gttccatcag atacggatcc gcaaagaaga tcaagtggct 3660 caacgatttt tattcaggga taatccacgc aacgaacccc aagtatacgt tatgaacgtc 3720 atgaccttcg gtgccacatg ctctcctgct tgtgcccagt tcgttaaaaa tgaaaacgcc 3780 ttaaaatata aagacaaata ccccactgca gtggaagcaa tagtaaaaaa ccactatgtt 3840 gatgattatc ttgatagttt tcggactatc aatgatgcaa tcaagactat taacgaggtt 3900 tgcctcatac atgatagagc gcatttcttt atgagaaatt tcgtttctaa ctgtcaggag 3960 gtaataagaa gcatcccaga tgatagatcc tcacaacaag agctgctgca catctctaat 4020 aaagatatga attttgagaa aattttgggg caatactggg acaaaacaaa cgatgtgtta 4080 aggtataagc ttaagcatac cccgtgttcc ataatttcaa aaagagaaat gctagcctac 4140 ttgatgaaaa tatatgatcc attggggcta gcggcaaact atactacgca agcgaaggtc 4200 atcatccaag aaatttggaa aacagaactg gattgggata gcccagtacc agaacgcata 4260 atggaacaat ggcaaagatg gaaggaaaga ataaaggaac tagaacacat acaaatacct 4320 agatgctact cggtggctag caatatcgaa gtaactgagt tacacacttt cgttgacgct 4380 tcggagaaag cgttcgcagc agtagtgtac ttaagaacat taacagaaaa ggggattgac 4440 gtaaacatag tggcggcaaa aacgagagtg gcaccaataa aaccactctc aattcctaag 4500 ctggaacttc aagcagcagt actcggagtc agactcgccg agactgttaa agaggaatta 4560 agaattacca ctgataggga ctattattgg tcagattcca aaaccgtcct aggatggatc 4620 aatgccgatc cacaaaagta caaacaattt gtggcggtaa gaattggaga gattttagat 4680 actaccaatg ctagtcaatg gaagtgggtc tcctccgaaa gtaatcctgc agacgaagca 4740 accaaggtag ttacaagaaa atctatatgg ctgaatggcc cagtatttct taaacaaagg 4800 gaaattgaat atagggaccc caagctaatc attactcatg aagaaatccg tccaaatctt 4860 atgattaaaa ccatagagaa gagaacattc aactttataa aaaccgaatg gtgttcaaat 4920 tggctaagac tgaagagatc actggcaatc aatttaaaat atatagaatt tttgaaaagc 4980 aaggtcaagc gattagcatt ttccccgata gtagaaaagg aaaacctgga 
taaagcagaa 5040 aaactcctat tgcaaaaggc acaatgggag atatacgaag atgatttagt tcagctttca 5100 ctcaatggac aagtctctaa aaacagcaca ataaagaatc tcaatccaca agtaatagaa 5160 ggactactac gagcaagggg acgattagca aatatatgct acctctccga tgacgtgaaa 5220 caacccataa tattacctaa gaggcatcac gtgacagaat tgatcataca gcattatcat 5280 gaacgctata tgcataaaaa aatggaagca gttattgcgg caatccggca aaggttttgg 5340 gtaatcgacc ttagggccgt ggtaagaagc gtgatcagca aatgccagcg ttgcaaaaat 5400 gaacgcgcac gtcccattgc cccgatgatg gccccccttc cagaaagccg agccgctgtg 5460 ttcaaaaaac ccttcaccca tacaggtgta gattactttg gacccatgac agtgtcaatc 5520 ggaagaaggg tagaaaaaag atggggagcg atattcacgt gtatgacaac gcgcgctata 5580 catttagaaa tcgctaaaga cttaagtaca aattccttca taatgtgcct aaaaaatgtg 5640 cagcataggc gtggaaagat ttgtcacata tacagtgaca atggtacaaa cttcgttggg 5700 gcaaacaggc aaataacgga actcgtcgaa agatgtgcaa ccaacggtat caaatggcac 5760 ttcaatcccc cggccgcgcc tcactttgga ggtgtatggg agagaatggt ccgagaggtc 5820 aagagcttgc tgccaaataa tgataatatg ccagaagaag tattaagatc ggcctttatc 5880 gagatcgaat ttattctcaa taatagacct cttactcaca tccccctcga aactgaagac 5940 gacgaacctc tcacaccgtt tcacttcttg atagggtgtt ccggagaggc cgaacctacg 6000 ccagccggga tttcagcagc tgaagctagc agaaacaact ggaagaaggc acaagttatc 6060 acccaaaact attgggaacg ttggttgaag gagtacctcc caacattagc caaacgagaa 6120 aagtggatag aacgctcaga cccaatacaa cctgatgaca tagtcgtctt cccagacgaa 6180 caacgcgtgg gtaggtggtt aaagggccgg gtagtagaag tttatcccgc taaagacggg 6240 caagttagat ccgcaaagat taaagttgaa aacggcgaat acaaacgccc tgttatcaac 6300 ttatcagtac tagaagtaaa gggcaagaaa attgcagacg taccttcgtg gggagttaaa 6360 agaccggtca acatcgccta cgtcaagaaa ttagctgaac aattaaaaac tcctcctgca 6420 aaaaggagga agcatctcgt aaaaccttat aatggcccgg tatctatgca ttataagcct 6480 gttagccgca tggaaactaa cagacaatct ttcagttaaa ccagttgaag aagctggtat 6540 attcttcgac cacgaaggaa cgcttctctt gaaaaggggt gtgtgggaaa caaccttcca 6600 cacgaaaata caccccgaaa atgacacaga gactttactg acaatggaga aagaggtgac 6660 cacagtattc aaggcactga gtgacatgga cactaatctt ctaaatttga aattgacatt 
6720 acaacaaaac attcgacacg cacttcaact ctcacagact gcagtaaaaa gacgaaccaa 6780 acgatctagc ggcatatttg gatttttgaa aggtattcta tttggagaag acgatattga 6840 tgaacagtta gccctcttta gagcttctga agaccagaaa ttgaaacata tatcggaaga 6900 catgactcat aaaatcaagc agggtgacag acttagaaac aaactaaata tgaaaataga 6960 ccacatgaac gaaggtatta agagtcttaa taaaagcttc aatgaaaaca aaaaaaacgt 7020 acttataaag catgtcacag aaacaatcat gctagctgaa gacatagtac aatacattac 7080 gacaaggtat ctagagttag aaatccagcc ccttagcata ttcgactcga cgaaaatttc 7140 cgagaagata caatcaaggt tacccgatgg gtatacaatt ctagaccacc cccgaatttc 7200 tagcaaagag ttatttaggg gagaaataat agtacatatt gaaaacgtca tcgtttcgca 7260 agagagattc gaaatattcc atatcactgt aataccaaac ctaaaaaact ttacaactct 7320 ggacttagat gaaaatgtaa tagctataaa cgatatacac tatatatacc ctacagatat 7380 tacgcgatac aatagcactc atcacgtgtc ctccgatgta gccgttagaa gagatttgga 7440 ttgtatatcc tcatctttca gacatataaa aacagaatgc ttgtgtggta taaaaccaat 7500 taaaaactca ataaccaaat ttgtaaagct ctcccagccg aacaaaatcc tatactactc 7560 ttctcatcct aacgaaatat acctcaaatg taacaaaaca ctgacacatc ccgcgtatca 7620 agctggggta ataacactta gccaagactg taaaatacag accaagaaca tagaaatcca 7680 gccaaccatg aaaattgaag ctgtagaaac aaaaatgtat ttcaagccgc tggccaaaat 7740 actgaattta agcgcagagc aaaaagaaga aacaaatatg gatcagctct acctcataat 7800 aattacaagt acaatagcct gcgtcgccac tttaatatta ggaataacca tagcatttat 7860 tatcaaacaa gtacgagcta aaatgtacac tttacgcccc ccgccgttta aaccgtcacc 7920 aaatagtaac ccctctacga ggaattacgg gggtcagga 7959 // ID GYPSY51-LTR_AG repbase; DNA; ANG; 368 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY51-LTR_AG is an LTR of retrotransposon GYPSY51_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY51_AG; CsRn1 lineage; GYPSY51-I_AG; GYPSY51-LTR_AG; KW Gypsy clade. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-368 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY51_AG, a member of the CsRn1 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 95-95 (2004). XX DR [1] (Consensus) XX CC GYPSY51-LTR is a long terminal repeat of GYPSY51_AG (its internal CC portion is deposited as GYPSY51-I_AG). XX SQ Sequence 368 BP; 97 A; 115 C; 81 G; 75 T; 0 other; tgtagcgacc agaccgccat ctggcgtgag aatcgtgagc gatcgtgaca tccagggaca 60 aggacacgga tctccgtgaa gcgcactcac caggcacacg aatgtgtcaa agtgacgtgc 120 gccgcgtgtc cagcgctaca cttcctgctg tcagcgagca cattctctct cttgcgaccc 180 aacctcgaaa gcgaacagac ctcttccttc gctccgcgct cgaaacttcc tcaacgtgaa 240 accctctcgc acgaacgtgc gcaaccgttc ataatatagt gtaaaataaa gttccgtatt 300 acctactcac gaaacccaac gcgttcgcga cataaaaaat tagggaccac ttttgtggcg 360 cccctaaa 368 // ID T1 repbase; DNA; ANG; 4634 BP. XX AC M93689; XX DT 03-DEC-2002 (Rel. 7.11, Created) DT 03-DEC-2002 (Rel. 7.11, Last updated, Version 1) XX DE T1 is a non-LTR retrotransposon. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; AGT1; KW CR1 clade; ORF1; ORF2; T1; T1_AG; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4634 RA Besansky J.N., Bedell A.J. and Mukabayire O.; RT "Evolution of the T1 retroposon family in the Anopheles gambiae RT complex."; RL Mol. Biol. Evol 7(3), 229-246 (1990). XX RN [2] RP 1-4634 RA Besansky J.N., Bedell A.J. and Mukabayire O.; RT "A retrotransposable element from the mosquito Anopheles RT gambiae."; RL Mol. Cell. Biol 10(3), 863-871 (1990). XX DR Genbank; M93689; Positions 1 4634. XX CC T1 is a CR1-like non-LTR retrotransposon. 
It encodes a CC DNA/RNA-binding CC protein (T1-ORF1p) and the reverse transcriptase (T2-ORF2p). CC T1-ORF1p (positions 169-1497): CC ILAPSLLLFRQFCRDIVWLRSCSCHSSVCAVSFVMQCSTCNAPTDSANSVSCAGVCGSKHHTHCTGLSRD CC STRELGRNNQLLWLCKNCNEFRNGTNSLLTSEIAALLELVKAEILTTIDSSLSSLRSAIKSDLLAEILAL CC ADKLTPVLAKPSVSQPSRTHTSTNASSLNATNTRTTKTASTRRTFTNSMELTADIQQAANDTNTVEASDS CC CNHYTHRTKVTSDISAGPCRTNTKSSSDPVLNHDTTNTGIAEKVWLYFTNIKSHVSADDMRVWLKAVLPT CC DNIDVYRLTKKGANLDLMSFISFKVSIPKSLKDLALQSTIWPVSLTVREFVDRGLPKQRIHERARFDPSA CC LTSHRSSSANCSSAAPKSTAHPDHFLDHRSPSPQRGNQSLSQMTEILEAIQPEFPPTPPQLSPGVGLQSQ CC NNLSNTNRSPQISPFAKRIAHN CC T2-ORF2p (positions 1487-4414): CC LTINRPFVIYYQNVRGLRTKYNELRLSANESGFEMLALTETWLNESIPSNMVLDSDSYNIYRCDRSRLNN CC ERSRGGGVLLACSSRYPSVALNMNQPTLEALCIRVSFPKFRLYVGIVYVPPYLSSDRNYFESLSAFIXDA CC YMHMKPNDHLILLGDFNQPALGWSPAAAVRSDSSLPMRHYVPHISLNSSSSCFLDVLNLHELYQLNGVHN CC HSNHYLDLVLSNSAAAACSSVYPASSLLLPQDAHHPALEIALPSSLFRASRVRNELPSAPNSLSVRYNFR CC LTDYRKLNSILSRADWSFFYQCTSVDEAVQSFNALLTSALLSCTPIFRSPPNPPWSNRTLRNLKKDRMKY CC LRRYRLNRSAFNFRLFKYAASAHRLYNRARFEAYSSRLQSRFRSDPASFWQFVRIRRGCNTLPNEMVLDS CC RTASTPVEICELFSAHFSQMFEPPVSDPNLIEGGLLYTPENLINLSDISVSSETVVQVLFGLKRSFTPGP CC DGIPASVLINCKDVLAPHLAKIFNLSLSLGVFPALWKSCWLFPVHKKGCRSIVSNYRGITQTCATAKTFE CC LCIFPTILHSCSSAISPKQHGFMPGRSTSTNLMSFVTNIFRSFEAGTQLDAIYTDFHAAFDSLPHSLLLA CC KLSKLGFGDGIISWLSSYLSNRSCRVKTGSYLSEEFFCTSGVPQGCVLSPLLFSLFINDVCNVLPPDGHL CC LYADDIKIFLPVSSSSDCMSLQHYLNAFVHWCSSNLLRLCPDKCSVISFSHSLSPISFNYTLSNSSLSRV CC LSIRDLGIILDSRLNFKLQLDEVLLKANRTLGFILRFTSIFRDQSFLRNLYYALVRPLLEYASIIWNPPT CC IDGCSRIESIQRLFTRVAFRRLFGAASLPPYETRLQLFNLHSLSFRRQVSQACFIGGLLLSDTDAPDLLS CC SISLYVPSRSLRPRDPLSIETRHTLYTFNDPILSCFRLFNHFYYLFDFDSSLNSFRNRIFSSNSL. 
XX SQ Sequence 4634 BP; 1059 A; 1101 C; 880 G; 1593 T; 1 other; agagagagat cgactgtcaa accgagtggt tgtgccttcc ttgtgctcct gatttttgat 60 gctgttttgc cgtgttactg cctgattttc aacatttgca acattggtgc tgctgagttg 120 cttgtactgg ctagtgcgtg ccttcgattg ttatcaagtt gcacttgaat tcttgcacct 180 agtctcctgt tatttcgtca gttttgccga gatattgttt ggcttcgttc ctgctcctgt 240 cactcgtcag tttgtgctgt gtcattcgtt atgcagtgct caacctgcaa tgcacccacc 300 gatagtgcaa attcggtgtc ctgcgccggt gtgtgtggct ccaagcatca tacccattgc 360 acgggtttgt cccgtgattc tactcgagag cttgggcgga ataatcaatt gttgtggttg 420 tgcaaaaatt gtaacgagtt tcgcaatggc acaaactcac ttctcacaag tgagatagca 480 gccctactcg agttggtgaa agcagaaatt ctcaccacga ttgactcatc tctctcttct 540 cttagatcgg ctatcaagag cgatttgctt gctgagatcc tcgctctcgc tgataagcta 600 acacccgtat tagctaagcc gtctgtttct cagccatcgc gaacgcacac gtccactaat 660 gcatcgtcac tcaacgccac taataccaga acgactaaaa cagcatccac tcgccgtaca 720 tttaccaact caatggagct cactgcagat atccaacaag cagcgaacga taccaacact 780 gtggaagctt ctgatagctg caaccactac actcatcgta ctaaggtgac tagtgatatt 840 agtgctgggc catgccgaac aaatacaaaa tcatcttctg atcctgtttt gaaccatgat 900 accacgaaca ctggcatagc agaaaaagta tggttatact tcacgaacat caaatcgcat 960 gtctccgctg atgatatgcg tgtgtggctt aaagctgtgc tgccaaccga caacatagat 1020 gtttaccgtc tcacgaaaaa gggtgcaaac cttgatttga tgtcctttat atcgtttaaa 1080 gtgagtattc ctaaatcact taaggatctg gcgcttcaat ctactatttg gccagtttca 1140 cttactgttc gtgagtttgt tgatcgtggc ctaccaaagc aacgtataca tgaaagggcc 1200 cgatttgacc cttctgcgct tacttcgcat cgttcaagca gtgcaaattg ctcttcagct 1260 gcgccaaaaa gcaccgctca tccggatcat tttttggatc atcgatcgcc atccccacag 1320 cgcgggaatc aatcactatc ccagatgacc gagatcctag aggctatcca accggagttt 1380 cctcccacac cccctcagtt atcaccgggg gtggggcttc aatcacagaa caatctcagc 1440 aacacgaatc gctcaccaca gatcagcccg tttgccaaac ggatagctca caattaatag 1500 acccttcgtg atctattacc aaaatgttcg aggccttcgc accaaatata atgaattgcg 1560 cctttctgcg aatgaatcag ggtttgaaat gcttgccctt actgaaacct ggttaaatga 1620 atcgattcca tccaatatgg tcctggatag 
tgattcttac aatatatacc gttgcgatcg 1680 cagcaggtta aacaatgaac gatcgcgtgg gggtggtgtg ctgcttgcat gttccagtcg 1740 ttatccgtct gtggcactta acatgaatca acctacgctt gaagctttat gtattcgtgt 1800 ttcttttcct aagtttcgtc tttatgtggg gattgtttat gtgccaccgt atttgagcag 1860 cgaccgcaac tatttcgaat ccctttctgc tttcatcagn gatgcataca tgcatatgaa 1920 accgaatgat catcttatcc ttcttggcga cttcaatcaa ccggcgttag ggtggtcgcc 1980 tgcagccgca gtaaggtcag attcatcttt acctatgaga cattatgtgc cacatatctc 2040 tttgaattca tccagttcct gctttttgga tgtgttaaat ttgcatgaac tctatcagct 2100 gaacggggtg cataaccatt caaatcatta tctggacctg gtgctctcta actctgctgc 2160 tgctgcttgt tcttctgtgt atcctgcttc gtcactgctc ctgccccagg atgcccatca 2220 tcctgctctg gaaattgcgt taccgtcttc tttatttagg gctagtaggg ttaggaatga 2280 attgccttct gctcctaatt cattgagtgt tcgttataat tttcgtctta cagactatcg 2340 taaacttaat tctattctat ctcgtgccga ctggtctttt ttttatcaat gtacatcggt 2400 cgacgaggct gtccaatcgt ttaatgcttt gttaacctct gcactccttt catgtacacc 2460 tatttttcgt tcccctccta atcctccctg gtccaatcgt actcttcgca acctgaaaaa 2520 ggatagaatg aaatatctta ggaggtatcg tctgaaccga tctgctttca actttcgttt 2580 atttaagtac gctgcctctg cgcatcgact atacaacagg gctcgttttg aggcctattc 2640 gagtagactg caatcgcgtt tccgttctga tccagcatcc ttctggcaat ttgttaggat 2700 tcgaagaggg tgcaatacgt tacctaatga aatggtactt gattctcgaa ctgcctctac 2760 gcctgttgag atctgtgagc tattctctgc acatttttcc caaatgtttg agccaccggt 2820 tagtgaccct aaccttattg agggtgggct actctacacg ccagagaact taattaatct 2880 ctccgatatt tcggttagct ctgaaacagt tgtacaggtg ttatttgggt tgaaacgttc 2940 ttttactcct ggtccagatg gcattcctgc ctcagtttta ataaactgta aggacgtgct 3000 tgctccacac cttgctaaaa ttttcaacct ttcactttct ctcggggtct ttcctgctct 3060 ttggaaatcc tgttggcttt ttccggtaca caaaaaggga tgccgtagca ttgtctctaa 3120 ttatcgtggg ataactcaaa catgtgccac agccaaaact tttgagctat gtatctttcc 3180 aaccatactt catagttgta gttccgctat tagccctaaa cagcatgggt ttatgcctgg 3240 taggtctact tctactaatc tcatgtcttt tgttaccaat attttcagat cttttgaggc 3300 aggtacccaa cttgatgcaa tatacactga ctttcatgct 
gcatttgata gtttgcccca 3360 ctctttacta ttagctaaac tatctaaact tggttttggt gatggcatta ttagctggct 3420 gtcctcatac ttaagtaatc gatcttgcag ggttaaaacc gggtcgtact tatctgagga 3480 gtttttttgt acgtcaggtg tccctcaggg ttgtgtgcta agtccacttc tgttttcttt 3540 gttcatcaat gatgtctgta atgttttacc tcctgatggt catctccttt atgcggatga 3600 tatcaaaatc tttttacctg tgtcctcttc ttctgattgt atgagtcttc agcattacct 3660 taatgcattt gttcattggt gttcatccaa cttacttcgc ttgtgccctg ataaatgttc 3720 tgttatttct ttctctcact ctctttctcc tatttcattt aactatactc tctctaactc 3780 gtctctctct cgtgttttgt ccatccgtga ccttggtatt atactcgaca gtcgtcttaa 3840 ctttaaactg cagcttgatg aggttctact aaaagctaat cgaactcttg ggtttatttt 3900 acgttttacc tctattttta gagatcaaag cttcttaaga aacctttatt atgctctggt 3960 aaggcctctt cttgaatatg ctagcatcat ctggaatcct cctactattg atggctgttc 4020 gagaattgaa agcattcagc gcctttttac cagggttgct tttcgtcgtt tgttcggtgc 4080 tgcctcacta cctccctatg aaacgcgatt gcagttattc aatcttcact ctttaagctt 4140 ccgccgccaa gtgtctcagg catgttttat tggtggctta ttactttctg atactgatgc 4200 tcctgattta ctctcgtcca tctcgttgta tgttccctct cgttcccttc gtcctcgtga 4260 tcctctgtca attgaaacac gtcatactct ttatactttc aatgatccta ttctatcctg 4320 tttcaggttg tttaaccact tttactatct ctttgatttc gactcctctc tcaactcttt 4380 ccgtaaccgt attttttctt ctaattctct ttaattattc ttctaagttt cattaagttt 4440 tgatagtctc tacgctctac ctatgttttt ttttctttaa tttttttgct aggtctagac 4500 tagtttagtt aggcttagtt tttcatgaat tattttgttt atttgttagg gtttatttag 4560 ttttaagtct gccttattta gccttgatgg cggatattgt attaataaat gaaatgaaat 4620 gaaatgaaat gaaa 4634 // ID RETRO935_AG_LTR repbase; DNA; ANG; 350 BP. XX AC . XX DT 06-FEB-2003 (Rel. 4.1, Created) DT 06-FEB-2003 (Rel. 4.1, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO935_AG DE retrotransposon - a consensus. XX KW Long terminal repeat; retrotransposon; NINJA; RETRO935_AG_I; KW RETRO935_AG_LTR. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-350 RA Jurka J. and Drazkiewicz A.; RT "RETRO935_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 21-21 (2002). XX DR [1] (Consensus) XX CC Related to NINJA from Drosophila simulans. 5 bp target site CC duplication. XX SQ Sequence 350 BP; 88 A; 72 C; 93 G; 97 T; 0 other; tgtcacttat ggtgacgacg cgatcgttcg aggatcgaca aggagcgtcc gtattttagc 60 tagcggccgc tagaggtctc ggactggacg ataaacgttc ggactgaacg aaagggatcg 120 accgagcgtt tgacggatag ccggtgaccg catggactgc gaaatagggc tttctttttg 180 atcccggacc tgcgtggaag cggacgaatt tttagttttt tcttcactct ctcggctact 240 aaaattttgt gctgttaaag gcaaaataaa ataatcctga attacttaaa aaagccgttt 300 gtttggtaaa ttatttatct gcaggccggg cgttgtgcta acgcgcaaca 350 // ID RETRO937_AG_LTR repbase; DNA; ANG; 366 BP. XX AC . XX DT 06-FEB-2003 (Rel. 4.1, Created) DT 06-FEB-2003 (Rel. 4.1, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO937_AG DE retrotransposon - a consensus. XX KW Long terminal repeat; retrotransposon; NINJA; RETRO937_AG_I; KW RETRO937_AG_LTR. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-366 RA Jurka J. and Drazkiewicz A.; RT "RETRO937_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 22-22 (2002). XX DR [1] (Consensus) XX CC Related to NINJA from Drosophila simulans. 5 bp target site CC duplication. 
XX SQ Sequence 366 BP; 119 A; 74 C; 79 G; 94 T; 0 other; tgttattgtt taacacagga aaacatcgaa tgtcagcgca agtgtcaagc agtgacattt 60 gacccacgtt actgctcgac tgtgggactg ctcgactaac gatgaggatc gcaagcccgc 120 gaatttgcaa tggtggaaaa gccataaaag ggaaatttgg gaaagcactt tctctttttc 180 tcgtcatcac cgcggaacca gaagcagtgg aattttttta accactttgt tataagtaaa 240 tttacagtac caccgttacg tactgcgtta atagaattca ataaagtgaa gttaagagaa 300 cctaactaac gtcgcgagtg aatcatttcg ggaaaaaaaa tatcaccgga cgttgctaac 360 gcaaca 366 // ID BEL5-I_AG repbase; DNA; ANG; 5282 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE BEL5-I_AG is an internal portion of the BEL5_AG LTR DE retrotransposon - a consensus sequence. XX KW BEL; LTR Retrotransposon; Transposable Element; 5-bp TSD; KW BEL5-I_AG; BEL5-LTR_AG; BEL5_AG; Bel clade; RING Zn-finger; KW integrase; peptidase; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5282 RA Kapitonov V.V., Pavlicek A., Drazkiewicz A. and Jurka J.; RT "BEL5_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(4), 69-69 (2003). XX DR [1] (Consensus) XX CC BEL5_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL5-I_AG, the internal portion of BEL5_AG, is flanked by BEL5-LTR_AG CC LTRs. The BEL5-I_AG consensus sequence was reconstructed from a CC multiple alignment of 20 copies; they are ~1% divergent from CC the consensus sequence. CC The consensus sequence encodes one 1723-aa BEL5_AGp Bel-like protein CC (pos. 93-5261). CC BEL5_AGp is composed of the peptidase (pos. 118-200), RING Zn-finger CC (pos. 300-375), reverse transcriptase, and integrase (pos. 1414-1600) CC domains. 
XX FH Key Location/Qualifiers FT CDS 0..0 FT /product="BEL5_AGp" FT /translation="MAEARKVKSLRIQQTATQLSIQAIHTFSKSYNKETQR FT AEAIIRKQNLQKQYAKFIDLQDELLPLDEGNEAENLALRQTVESAYYDAEA FT NLASAVEANDKKPCVPASNIKLPDVKLPVFDGKPQNWSSFHSIFVAMIDSA FT ELYSGVQKLYYLRTSLSGPALQLIQSVPISEENYSVAWNLLLNHYNHPKRL FT KQLHVEALFEDAALKKECANELRKLIENFEANVNALTQLGEPTAQWDTLLI FT QMLSRQLDPSTLRSWKEHSAEKKIDSYSDLVAFLYRRVGVLEVLPSTSSGK FT PPKQRVFATTTMPNNNTKGCACCNRDHPVYTCDEFKKLSLNAKQKVITQHK FT LCYNCLRPGHHLRDCKSASTCKSCHKRHHTQLCSLPQPSPTVPSSEEDQRD FT PPTTLASTSVVESITCASAGQHKTVLLATATVIIVDDEGHKHNARVLLDSG FT SESCFITENLAQQMKSTRERSNLCISGISSTNTTAKQSIRATLRSRVGRYF FT ANLQFFILPRVTGNLPSSSIDTTGWNLPDNIFLADPHFDCIGRIDVLIGAE FT VFFDIMRPAGRILLGKDQPVLVNSELGWIVSGPAVSTLTPTSQASSITVNH FT ASTTVQDVHKLMERFWAIEEGEIVNAQSNEHAACEEHFRRTVSRNSSGRYV FT VRLPLKEHLLTKLVDNRKAAVRRFHFLQSRLSSNNEFRNSYSSFIDEYAEL FT GHMKRISEEEYNNTNQHHYYLPHHAVTRQESLTTKLRVVFDASSKTSSGIS FT LNDVLMVGPTIQDDIRSIIMRARKHSIMIVADIKMMYRQVLVDARDTSLQL FT IVWKPTPSESLQTYKLCTVTYGTASAPYLATRVLSQLADDEGSSYPIAAKV FT LKKDFYVDDLLTGTSTAAEASEVITQLTALVSKGGFTLRKWATNDEQVRQT FT ISKDKLSEDESFCFDRDQIIKTLGLHWHPLNDSMTYRIKPFEEKLITKRST FT LSGIARLFDPIGLIGPVVTKAKIFMQSLWTLKASDGSIWNWDTELPEQLQK FT QWLSFKNELNLLNTIQIQRCVLLNEATSVQLHIFADASQTAYGACAYLRST FT NKAGQIKTSLVASRSRVAPLKSQSIPRLELCSALVASELYKSIQQAMQLDA FT DIYFWLDSTVALYWIQASPSKWNTFVGNRVSKIQQATSSCTWSHINGQENP FT ADHISRGLTASELVNCDLWWNGPPWLQLDQEQWPKPQLSSQLPSEVSIEGR FT SITTTAAATKSPPCDAILLVNELLSKFSDYHKLLRVIAYCFRMRTKRDSSQ FT ETVVISTDELWNAEIRILQVVQRDIFEKEWTQLRQNSPVSNKSRLKWFHPF FT LCDDNLIRIGGRLSKSNQPFESKHQILLSAGHPLAEMLIRHLHKKHHHAAP FT QLLITILRQKYWVIGARSLAKRICHECVPCCRARPRLLEQFMAELPTSRIT FT PSRPFSIVGVDYWGPIGLSPIHRRASSGKAFVAVFICFATKAVHLELVANL FT TTAKFIQAFRRFVARRGLCSDIHSDNGRNFLGASRELRALVTSKQHRIEII FT QECTSQGMRWHFNPPKASHFGGLWESAIHSAQKHFFRVLGKTTLPQDDMET FT LLCQIECCLNSRPLVPLSDDPSDLEPLTPGHFLVGSNLKAVPDNKLEDIPS FT NRLKHYQLVQKLLQQIWTRWSAEYLATLQPRSKWLKPPVKIDVGQLVLVKD FT ESTTPLHWPLGRIIKTHPGDDGVARVVTLKTASGEYTRPIAKLCLLPVTSM FT VQN" XX SQ Sequence 5282 BP; 1604 A; 1294 C; 1086 G; 1296 T; 
2 other; ttttggtcct tcgaatcgcg gatcgacgat tgatcatcat ttagttcttc gagcaacgaa 60 ttgtggtcat tcgaagagtg taaacaaacg aaatggcaga ggcacgtaag gtaaagtcgc 120 tgcgtatcca gcaaacggcc acgcaactct ctatacaggc aattcataca ttttccaagt 180 cttataacaa agaaacgcaa agagcagagg caattattcg caaacaaaac ttacaaaaac 240 aatatgcaaa attcattgac ctgcaagatg aactgttgcc cctagatgaa ggcaatgaag 300 cggagaatct tgcgcttcga caaaccgtgg aatcagcgta ctatgatgct gaagctaatt 360 tagctagtgc tgttgaagca aatgataaaa aaccatgtgt accagcatct aacatcaagc 420 taccggacgt gaagcttccc gtcttcgatg gcaagccaca aaattggtct agctttcatt 480 cgatctttgt cgcgatgatc gatagtgcgg agctgtattc aggcgtacaa aagttgtatt 540 acctgcgtac atcgctatcc ggcccagcac tccagctaat acaaagcgtt ccaatcagcg 600 aagaaaacta ttccgtggca tggaatctgt tgctcaatca ctacaaccac ccaaagagat 660 tgaagcagtt gcatgtggaa gcattatttg aagatgctgc gctgaaaaag gaatgtgcaa 720 atgaactacg caaactgata gaaaatttcg aagccaatgt aaatgcatta acccaattag 780 gcgaaccaac tgctcaatgg gatacgcttc taatacaaat gcttagccgt cagctcgatc 840 cgtcaacact acgaagctgg aaagaacatt cggcagaaaa gaaaatcgat tcgtatagtg 900 atttggttgc gtttctgtac cgtcgagtag gagtgttaga agtgttgcca tcaacgtcat 960 caggtaaacc acccaagcaa cgtgtatttg caacaaccac aatgccgaac aacaacacca 1020 agggttgtgc ttgttgcaac agagaccatc ctgtgtacac gtgcgatgag tttaaaaaac 1080 tatccttaaa tgcaaagcaa aaggtcataa cacagcacaa attatgttat aattgtcttc 1140 gtcctggtca tcatctacgt gactgcaaat ctgccagcac ctgtaagagt tgtcacaagc 1200 gccatcatac acaattgtgt tctttaccac aaccctcgcc tactgtaccg tcatcagaag 1260 aagatcaacg agatcctccg accacgttag catcaacatc ggtcgtcgag tcgatcacat 1320 gtgcttcagc aggtcaacat aagacagtcc tcctggccac tgccactgtc ataatcgttg 1380 acgatgaagg ccacaaacac aacgcacgag tgctgctaga ttcaggaagt gagagttgtt 1440 ttatcactga aaatctagcc cagcaaatga aatcaacaag ggagagaagc aatctatgca 1500 tctctggaat cagttccacc aacacaaccg caaagcagag catccgagcg acacttcgct 1560 cgcgggttgg gcgatacttt gccaacctgc agttcttcat actaccaaga gtcacgggga 1620 atcttccatc gtcatcgatc gacaccacgg gatggaacct gcctgacaac atctttcttg 1680 cggaccctca 
cttcgattgc atcggccgaa ttgatgtctt gatcggcgcg gaggtctttt 1740 tcgacattat gagaccagct ggacgaatac ttctcgggaa ggatcaacca gtccttgtca 1800 actcggagct cggatggatc gtatcggggc cagccgtaag tacactcaca cctacttcac 1860 aagcttcttc tattacagtc aaccatgcat ctacaacggt tcaagatgtt cataaactta 1920 tggaacggtt ctgggmaata gaggaaggtg aaatagtcaa cgcacaatct aatgaacatg 1980 cagcgtgcga agaacacttt cgtcgcaccg tttcacgaaa ttcttccgga cgctacgtcg 2040 tgcgtcttcc actgaaagaa catcttctca caaaactagt tgacaatcgc aaggcagcag 2100 ttcgtagatt tcattttttg caatcccggc tcagttctaa caacgaattt agaaacagct 2160 acagctcatt tatcgatgaa tacgctgaac tggggcacat gaagcgcatt tcggaggaag 2220 aatataacaa cacaaaccaa catcattatt accttccaca ccatgcggtg acgcgtcaag 2280 aatcactaac aaccaagttg cgcgttgtct ttgacgcctc cagcaagaca tctagcggca 2340 tatcgttgaa cgatgttttg atggtagggc caacaattca ggacgatatc cgatctatca 2400 tcatgagagc acgtaaacat tcgatcatga tagtcgctga tattaaaatg atgtaccgtc 2460 aagtgctcgt agatgmtcgt gatacatcgt tacaactcat cgtatggaag ccaacaccgt 2520 ccgaatcact gcaaacatac aaactgtgta ccgtcacata cggtacagct agtgcaccct 2580 atttagcaac acgagtattg tcacaacttg ccgatgacga aggcagtagc tatcctattg 2640 cagccaaggt attgaaaaaa gatttttacg tcgatgattt acttaccggc acttccaccg 2700 cagccgaagc ttccgaagta attacacaac taactgcgct tgtttccaaa ggtggtttta 2760 ctctacgtaa atgggccacg aatgatgaac aagttcgcca aaccatctca aaagacaaac 2820 tttcagaaga cgagtcattt tgtttcgatc gcgaccagat tatcaaaact cttggtttgc 2880 attggcatcc attgaatgac tccatgacgt atcgcatcaa accatttgaa gaaaaactga 2940 tcacaaaacg ctcaacgtta tcgggaattg cacgattatt cgatccaatc ggccttatcg 3000 gaccagtagt cacgaaggca aaaatattca tgcaatctct ctggacactt aaggccagcg 3060 atggctcgat atggaactgg gatactgagc ttccagagca gctccagaaa caatggctat 3120 catttaaaaa tgagctcaac ttacttaaca caatacaaat acaacgatgc gttctactaa 3180 atgaagctac tagcgtccaa ttacacattt ttgccgacgc atcccaaaca gcatacggcg 3240 cttgtgccta cttgcgctca actaacaagg caggccaaat caaaacatca ctagtagcat 3300 ctcgatcgcg ggtcgcgcct cttaaatcac aaagcattcc cagactagaa ctatgcagcg 3360 ccctcgtagc aagtgagcta 
tacaaatcta tccagcaagc tatgcagcta gacgccgata 3420 tctatttttg gcttgacagt acagtcgctc tttactggat tcaagcatca ccgtcgaaat 3480 ggaacacctt tgtcggcaat cgtgtgtcca aaatacaaca agccactagc agttgtacat 3540 ggagccacat aaacggccaa gaaaatcctg cagatcatat ttcacgggga ttaactgcaa 3600 gcgaactcgt caactgtgac ctttggtgga atggcccgcc atggctccaa ctagatcaag 3660 agcaatggcc caaaccccaa ctgtcatctc aactaccatc agaggtttca atagaaggtc 3720 ggtcgatcac tactacagct gctgccacca aaagtccccc atgcgatgca atactgttgg 3780 taaacgagtt gctatccaag ttttctgact accacaaatt gctacgagta attgcatatt 3840 gctttcgaat gagaactaaa cgtgattctt cgcaagaaac tgtcgttatt agcacagacg 3900 aattatggaa tgctgaaatc agaatcttgc aagtggttca aagagatatc tttgaaaaag 3960 aatggactca gctccgtcaa aacagtcctg tttccaacaa atccagacta aaatggttcc 4020 atccgtttct ttgcgatgat aatcttatcc gcattggtgg acgtttatca aaatctaatc 4080 aaccatttga aagcaaacat caaatattgt tgtcagcagg acatcctctg gcagaaatgc 4140 tgattagaca cttgcataaa aaacatcatc atgccgcacc gcaactattg atcaccatcc 4200 ttcgtcaaaa gtattgggtc ataggggcca gatccctagc taaacgcatt tgccatgaat 4260 gcgtaccttg ttgtcgtgct cgtcctcggc tgctagaaca atttatggca gaactaccaa 4320 cttcacgcat aacaccaagc cgaccattct caatagtggg agtggattat tggggtccca 4380 tcggtctatc acccattcat cgccgtgcat catctggtaa agcatttgta gctgtcttta 4440 tctgtttcgc tacaaaggct gttcatctcg agctcgtcgc aaaccttacc acagccaaat 4500 tcatccaagc atttcgtcgt ttcgttgctc gtcgtgggtt atgcagtgac attcacagcg 4560 acaacggacg gaacttccta ggcgcctcca gagaactgcg agcattggtg acgagcaaac 4620 aacatcgaat tgaaatcatc caagaatgca cgtcacaagg aatgcgctgg catttcaatc 4680 caccgaaagc gtcacacttt ggtgggctat gggagtcagc aattcattct gctcagaagc 4740 attttttcag ggtccttggc aaaacaactc tgcctcaaga tgacatggaa acgttgcttt 4800 gtcaaattga atgctgcctc aactcacgtc cactggttcc tctcagtgac gatccgtctg 4860 acttagagcc actaacacca ggccattttc tggtcggaag taatctaaag gcagtccctg 4920 ataacaaatt agaggatata ccatccaatc gtcttaaaca ttaccaactg gtacaaaagc 4980 ttctacagca gatttggaca agatggagcg cagaatattt ggcaactctt cagcctagga 5040 gcaaatggct caaaccacca gtaaaaatag 
acgttggcca acttgttttg gtcaaggacg 5100 agtcaactac cccattgcat tggccgctag gacgcatcat caaaacacac ccaggcgatg 5160 atggagtagc acgagttgtg acattgaaga cagcttctgg cgaatacact cggccaattg 5220 cgaagttgtg tcttcttcca gtaacttcaa tggttcagaa ctaacgtctg aaggggccag 5280 ta 5282 // ID NotoAg1 repbase; DNA; ANG; 5540 BP. XX AC AB090823; XX DT 14-SEP-2005 (Rel. 10.09, Created) DT 14-SEP-2005 (Rel. 10.09, Last updated, Version 1) XX DE Anopheles gambiae retrotransposon NotoAg1 DNA, complete sequence. XX KW Non-LTR Retrotransposon; Transposable Element; NotoAg1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5540 RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol Biol Evol 20(3), 351-361 (2003). XX DR EMBL/GenBank/DDBJ; AB090823; Positions 1 5540. XX FH Key Location/Qualifiers FT CDS 552..1838 FT /product="NotoAg1_1p" FT /translation="MREAIRQLTNSLAEANARNERINEELTQMRILMTKQQ FT EYTERRELIAREEMEKMRAAHERDRTALNKLLMQGAGTSSHRAAATPTTPT FT PQPRRMQQHQEKQRQPPQQQHQQIGPSTSAAPPQLLVSGASFDPEGDDGQG FT SFAEVVRHKWGRNTGKPRGYQQQQSQSHRQVVIGTQQECLQPEQQHQRQQQ FT HTVRRHNVDKVEVIPGENQTWETVYQMVKDAIKFDPAHKDLADHVVIGRRT FT HAARLRIQLSCTADSTLMLQEVQQIIGNAGIARVITEMGEILITHIDPLAS FT EEDLKEAVDRKLQASAGVTKVSMWQLSDGTKRARVRLPAKAAKQLVGQKLT FT VSCCISNIKEAPAINLQQQRCYRCLERGHIARECRSPVDRQKACIRCGAEG FT HLAKDCNAEVKCAVCSGPHRVGHSDCVRPMLRCPH" FT CDS 1829..5290 FT /product="NotoAg1_2p" FT /translation="MSSLRVLQLNVDHCREGQGLALQSAREHRADVLILSD FT MFTPPNNNGRWEFDASRKVAIVATGSYPIQRVWGSTVPGLVAAKVAGIDFI FT SVYAPPSLSPQEYERLLEAVELEASSHSHVVIAGDFNAWHTEWGSRRNNLR FT GEELLQMVEVLGLSILNNGSAPTFIGRGAARPSVIDVTFATPSLVLHDTWE FT VLDFARSDHQLIRFETNSPALAARRVKLSQRNRSQQRSPRRDPPINRQHTP FT CAGRRWKTKQFRENSFLLALKDVNFAEQAVTDADIVEMMTRACDEVMQRAN FT HLSSNPYRDLYSWSPELERLRGICLAARERLRLITDLQERSFVAADHRTAK FT RNLEKAIRASKRQQIDALIDTAEDNEFGGGYRVVMSTLRGSRVPQEKDPIE FT 
LGRIVSDLFPNHPPVPWPYVSDVTQGEASVDDVTPRELQDIAHQMATRKAP FT GLDGIPNAAVKAAIGMYPDVFCRMYQDCLTRGTFPSEWKRQRLVLLSKTGK FT PPGESSSYRPLSMLDALGKVLEQLILNRLNKHLEDPDSPRLSDAQYGFRRG FT RSTFSAIQRVVDAGRRAKSFRRTNHRDKRCLMVVALDICNAFNTASWQSIA FT DALRNKGVPSALLNIIGSYFEERKLIYNTSAGPVERHISAGVPQESILGPT FT LWNVMYDGVLGVELPPGAELIGYADDLVLLAPGTTPAAAAVVAEEAVSAVD FT RWLREHHLELAHAKTEMTVISSLQQPPEDITITVGGTEVPFSRTLKYLGVR FT LHYNLSWVPHVKAVIQKATQIVQAVTQLMPNHRGPKTSRCRLLAAVADSTM FT RYAAPVWHGALTNRECRSLLKRVQRKAAIRVARTFRTVRYETAVLPAGLVP FT ICRAVAEDTRVHSRRGTGVSSSELRKEERQRTIEEWQTTWDADAAADNASR FT YVRWAHLVIPDVGAWQLRNHGEVTFHLSQVLSGHGFLREYLNKMRFTSSPA FT CPRCPGVVEGVEHVMFECPRFAEVRSELLDGVLPETLEAHMLQSPTNWSNV FT CEAAKRITSKLQRCWDDECAILAAQAMLEEPANRLDPEAVRCTRNDLRNVA FT RRTQTVRQREEQCGERPSMPSSSPRTSERRANIRARMARLRQRHRQHQQDE FT RRGVEGGDIERGESVYPELASSPNNRQGGLTSAEKAAAVEADVASR" XX SQ Sequence 5540 BP; 1448 A; 1446 C; 1621 G; 1025 T; 0 other; gggtagggct tgagctgtca aaatagtgct gacgtccaaa agtgcgctcc gtgttaaact 60 aggaaattat agttaaaaac accagtatag tcctaaaaag aaacattggt agatgtgtgc 120 agtgatatag tatgtgcgca gcacgagtac cgagtggaaa gtggagtatt tttcattaaa 180 aagataaaag tgcgagggcc caagtctgga acaacaatac aacaatatgg cgtaaaccgg 240 gtgacattgg aaagacgtca cgcctaactg acaaaaaaaa aacaatccca aagggcaaaa 300 ctacatttaa agggggagct actagtaaaa gtgtcagaca agagtcggat aacgctcgga 360 taagaccaag aggcggacaa atagcaaggt gtgtaaattg aagtagcgtg atttcgacct 420 tcgggacggc gttacgttcg aatgccagca acgcaacagt gccgaagctt ggcaccggtc 480 ccgtggctgc aggacagcgg ctaagtagag gagctacgca gaagcaagca gcttcgccag 540 tgcttacgga gatgagagag gcaatcaggc agctcactaa ctcgttggcc gaggcgaatg 600 caagaaacga acgcatcaat gaggagctaa cacagatgcg catcctcatg actaaacagc 660 aggagtatac ggagcgccgg gaattgattg cccgggaaga aatggagaag atgcgtgcag 720 cacatgagcg cgaccgtact gcgctcaaca agttgcttat gcagggggcc ggtactagca 780 gccatcgggc agcagcaaca ccaacaacac caacacctca gccacgccga atgcagcagc 840 atcaggagaa acagcggcag ccgccacagc agcagcacca gcaaataggc ccatctacct 900 cggctgcacc gcctcaactc ctagtgtcgg gagcatcgtt cgacccggag ggggacgatg 
960 gtcaaggcag tttcgctgag gtggtgaggc acaaatgggg gcggaatacc ggcaaaccgc 1020 gcggctacca gcagcagcaa agccagtctc accgacaggt ggttatcggc acgcagcagg 1080 agtgcctgca accggaacaa caacatcaac gccagcaaca gcatacggtg cgtcggcata 1140 atgtcgataa ggtagaagtc atcccaggcg agaaccagac ctgggaaact gtttaccaaa 1200 tggttaagga tgccattaag ttcgacccgg cgcataagga tctggcggat cacgtagtaa 1260 tcggccgccg cactcatgct gcccgtcttc ggatacagct cagttgcacg gcagactcaa 1320 cgctgatgct acaggaagtg caacaaatta ttggaaatgc tggtattgcg cgagtgataa 1380 cagagatggg cgagattctc atcactcaca tcgatcctct cgcaagcgaa gaagatttaa 1440 aggaggccgt agacagaaag ttgcaagcta gtgccggcgt tactaaggtc agcatgtggc 1500 aactgtccga tggcacaaaa cgagcccgcg ttagactacc ggcaaaggca gccaaacaac 1560 tcgtggggca aaagttgacg gtaagttgct gtataagcaa tatcaaggaa gccccggcca 1620 tcaatctcca acagcagcgc tgttaccgct gcctggagcg tggccatatt gctcgcgaat 1680 gtcgttctcc ggtcgaccga cagaaagcat gcattcggtg tggagcagaa ggccacttgg 1740 ctaaagactg caacgccgag gtgaagtgcg ccgtgtgcag tggtcctcat cgcgtcggtc 1800 acagtgattg tgtacgcccc atgctgcgat gtcctcacta agggtacttc aactcaatgt 1860 ggatcattgt cgggaaggac agggcctagc actgcaatcc gcgcgggaac atcgtgctga 1920 tgtcctgatc ttgtcggaca tgtttacgcc tcccaacaat aacgggcgat gggaattcga 1980 cgcatcgagg aaagtagcta tagtagccac cggctcgtac ccaatacaac gggtatgggg 2040 cagtacagtg ccgggactgg tggctgctaa agtggccggg atcgacttta tcagcgtcta 2100 cgctcctccg agcctatctc cacaggaata cgagcggctt cttgaggccg ttgagctgga 2160 ggcctcatcc cactcccacg tcgtgatcgc tggtgatttc aatgcttggc acacggaatg 2220 gggtagcaga cgcaataacc tgcgtggcga ggaattactg cagatggtgg aggtgctggg 2280 actctccatt ctcaataatg gcagcgcacc gacgttcatc ggcagaggag cagcaaggcc 2340 cagtgtcatt gacgtgacct tcgcaactcc gtcgctagta ctgcatgaca cctgggaggt 2400 actagatttc gccagatccg accaccagct gatccggttc gagaccaaca gccctgcact 2460 ggccgcaagg agagttaagc tttcccagcg gaatcggtcg cagcaacggt ctccccgccg 2520 tgatccacca atcaaccggc agcacactcc atgtgccggt aggaggtgga aaactaaaca 2580 attcagggaa aattctttcc tcctagcact caaagacgtg aacttcgccg agcaagctgt 2640 
gactgatgcg gatatagtcg agatgatgac gagggcatgt gatgaagtga tgcaacgagc 2700 caaccacttg tccagcaacc cttatcgtga cctttactcg tggtctcctg agctggagcg 2760 gctacgtgga atatgtctag ccgcgcgcga gcggcttaga ctcatcaccg atctacaaga 2820 gaggagtttt gttgcagcag accatcgcac ggcgaaacgc aacctggaga aggcgattcg 2880 tgccagcaaa cgtcagcaga ttgacgcact gatcgataca gccgaggata atgagtttgg 2940 tggcgggtac agggtggtga tgtccacgct gcgcggcagt cgagtgccgc aggagaaaga 3000 cccaatcgag ctggggcgga tcgtgtctga cctgtttccc aaccacccgc cggtcccatg 3060 gccgtatgta agtgatgtca cccagggaga ggcatccgtc gacgacgtga ctcccaggga 3120 gctgcaggat atagcccacc agatggcaac aaggaaggca ccaggactag atggaattcc 3180 caacgccgca gtgaaggccg cgatcgggat gtacccggat gttttttgca gaatgtacca 3240 ggactgctta actcgtggca cgttcccgtc cgagtggaag cgccagcgcc tggtactgct 3300 ttcgaagacg ggcaaaccac ccggggaaag cagctcatat cggccgctga gcatgctcga 3360 cgcactcggc aaggtattgg agcaactaat cctgaaccgc ctcaacaagc atctcgagga 3420 cccggattca ccgcggttgt ccgatgccca atacggtttc cgccgaggac gctccacctt 3480 tagtgcgatc cagcgtgttg tagacgcagg gagaagggcc aagtcgttcc gtcgtaccaa 3540 ccatcgcgac aagcgctgtc tgatggtggt cgcattggat atttgcaacg cgtttaacac 3600 cgctagttgg cagtctatag ctgatgcgtt gcggaataag ggggtcccat cagcgcttct 3660 aaatataata ggaagctact tcgaggaacg caagctgata tacaacacca gcgcgggccc 3720 ggtcgagcgt cacatcagcg cgggagttcc acaggagtcc atcttgggcc cgaccctgtg 3780 gaacgtgatg tacgacggag tccttggcgt tgagctacca cctggggcgg aacttatcgg 3840 ctatgccgat gacctcgttt tgctggctcc aggcacaacg ccggcagcag cagcagtagt 3900 agctgaggaa gctgtgtcag cggtagaccg gtggctgcgc gagcatcact tggagctcgc 3960 acatgcgaaa acggagatga cggtgatctc tagcctgcag cagcctccgg aggacatcac 4020 catcaccgta ggaggtacag aggtgccgtt ctcgcgtacc ctcaaatacc tcggggtacg 4080 cttacactac aacctgtcgt gggttcctca tgtgaaggcg gttattcaga aggcaacgca 4140 gatagtacag gcggtcacac aattgatgcc gaaccaccga ggaccaaaga cgtcacgatg 4200 ccgcttgctt gcagcggtcg ccgactcgac aatgcgatac gctgcacccg tctggcacgg 4260 agccttgact aaccgagagt gccgcagtct gctaaagcgc gtgcagcgaa aggcagcgat 4320 ccgcgtggct 
cgaacgttcc ggacggtaag gtatgagacc gccgtgctgc ccgcgggact 4380 ggtgccaatc tgcagagccg tagcggagga cacccgagtt cacagcagac gcgggaccgg 4440 tgtaagtagc agcgagctcc ggaaagagga gcgacagcgg actattgaag agtggcagac 4500 gacttgggac gcagacgccg cagcagacaa cgccagcaga tatgtcaggt gggcacacct 4560 cgtaattccg gacgtgggag cctggcagtt gcggaatcac ggagaggtga cgtttcattt 4620 gtctcaggtg ttgtcaggac acggattttt acgcgaatac ctgaacaaaa tgagattcac 4680 ctcgtctccg gcctgccctc gttgccctgg tgtagtcgag ggagtagaac atgtaatgtt 4740 cgaatgccct cgctttgctg aggtgaggag tgagctattg gatggagttt tgccagaaac 4800 gttggaggcg cacatgcttc aatcacccac caactggagc aacgtgtgcg aggccgccaa 4860 gcgcataacc tcaaaactcc aacgctgctg ggacgacgaa tgcgccattc tcgccgcaca 4920 ggccatgctg gaggaacccg ccaatcggct cgatccagaa gcagtccggt gtacccggaa 4980 tgaccttcga aatgtagcga gaaggacgca gacggtgcgt caaagggagg agcagtgtgg 5040 cgaacggccg tctatgcctt catcgtcacc acgaacgtcg gagcgccggg cgaatatccg 5100 cgctcggatg gcaagacttc ggcagcgaca tcgccagcat caacaggatg aacgacgagg 5160 agttgaagga ggtgatatag agagagggga gtctgtctat cctgagctgg ccagctctcc 5220 caacaaccgg caaggcggtt tgacctcggc agagaaggcg gcggcggtgg aggcggacgt 5280 ggcctcccgc tagtcagcca aaaaatcaca cgaaagcacg cgaaggaatg ccaaaattgg 5340 tgcgaaattg ccccacacaa gggcagagag tgcaaattac cagcgagacc aaggaggata 5400 aaggatgcgc tcctaagagc tataacatcc cctcccccta gacccctcgc ggggcacagg 5460 ggaaggggca ggaagagggt taggtaattt ttaaatatta taaatttact gaaataaact 5520 aacccgattg ttaaaaaaaa 5540 // ID CR1-3_AG repbase; DNA; ANG; 5485 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE CR1-3_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW AP endonuclease; CR1 clade; CR1-3_AG; DNA/RNA-binding; PHD finger; KW Non-LTR retrotransposon; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-5485 RA Kapitonov V.V. and Jurka J.; RT "CR1-3_AG, a family of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 14-14 (2003). XX DR [1] (Consensus) XX CC CR1-3_AG is a family of CR1-like non-LTR retrotransposons. CC The CR1-3_AG consensus sequence was reconstructed based on a CC multiple alignment of ~50 copies identified in the CC sequenced portion of the genome. Given the ~3% divergence CC of these copies from the consensus sequence, transposition of CC CR1-3_AG occurred less than 2 million years ago. CC The 3' terminus of CR1-3_AG is composed of the ATAA CC microsatellite. CC CR1-3_AG encodes two proteins: a 418-aa CR1-3_AG-ORF1p CC (positions 1057-2310) and a 910-aa CR1-3_AG-ORF2p (positions CC 2314-5043). CR1-3_AG-ORF1p is a DNA/RNA-binding protein composed CC of the PHD domain (positions 5-40). CR1-3_AG-ORF2p is composed CC of the AP endonuclease and reverse transcriptase domains. CC Putatively, the latter protein is translated through a ribosomal CC frameshift. CC PKLFNSVVKNIPGTITPTLQQYPSLAASLSVSANEPSSFTNMPPxYNPQTPLLRGSGSPLDSDTLDTIPH CC TDTRMWLFFTRFSPSVTTEQISLMVQVRLALDKRDVFVHRLTKLGADTSTLSFISFKVGIPATLRNKALS CC PKTWPSALTYREFRDYRTNNYNTNTCATTETNSMLDQNVTLATDTYHNPLSHSGGSAAMPTTTTDTLAQP CC VTISSPAMTTTDTALFTNEPHLDLNERMLVTTTPTRSPPECLAAPKKRPKRGNAKRTDESADAGPSDE CC CVPVAQRKSGPDWSNASLRRLKKIKSKAYADYSRTRSSLHRRIFFDALNNYRRxNRVLYRSFIRRTERQL CC FSKPTRFWSFWNKRRNIRSIPPSMSYNGxTSIDTSDICNTFANRFADAFTLPVHNPNTLAEATRNTPSDA CC IDFIIPTIDEALIARTLNDIKPSTSSGPDNIPAYILKHCRQSLAPILAKIFNDSLMRGTYPASWKHARMV CC PIHKKGSRLHASNYRGIVSLCACAKVFELILYNPLLTAVQNYMSPSQHGFLPRRSSTTNLAEFVGYCFDN CC MDRGTQVDAVYIDFRAAFDSISHDILLSKLKKLGFLDWHITWLRSYLTGRSYYISIGSHRSHSFTSSSGV CC PQGSNLGPLLFLIYINDLSFVLPPGQHLMYADDVKIFAPVRNDSDCVRLQTILENLDSWCSRNALQVCAD CC KCQCISFSRARHPITFTYTMLNTALARTTCIRDLGVLLDQKMSFRPHIDSVVAKGNQLLGLITRTCSEFT CC DPMCVKSIFCAIVRSCLEYCCPIWCPLGVGDINRLEAIQRRLTRYAVRLLPWQSHHARPTYHQRCLLLGH. 
XX SQ Sequence 5485 BP; 1459 A; 1439 C; 1091 G; 1489 T; 7 other; tcagctatca tgtagcgcat cctcatccct agcgccgggc ggggtgggga atcgcgacat 60 gatagagtga acggccactt tccccgcgct gtgtgtgtgc gtgtgtgtga cgagacccga 120 aaacgcgaga caaatttcgg cacataaaaa aaaaaatatt gtcactatac actgtgctac 180 atgatgctta gaaggtcgtt gaagcatgaa acatgttcaa attaaaaaat attagattag 240 tgttagacgt tctgatacga ttgttttaaa taggcttcta caccctcaac tggcaggggc 300 atcgaatggt ctcatcagca gcggcacgaa ataaaatgaa ccatcaatac accgataaca 360 ctttgaatcc caatccagct tctcccacag cgtctcaagg tcacacacaa attaataaat 420 gctaacaccc accaataaca atacacaacc gcaatgcaca tatgagtatc aacgcataca 480 ccgcaacagc caatcagctg gcggcggaag gcacgggcag aagcgcaagt aaggcagggc 540 agggtgaggg agagggagaa gaagcggacg aaaaaataaa tgggcgccaa tatttttaaa 600 ttttttgacc gtttgttttg ctccaccgcc ataaatagtt cgtccatttc ttaaacagat 660 ggcgctaacc ctgcgcaggc ctgcgcagga atcgctgaag tcccgcgcgg ggaacgggcc 720 aaaatctgcg ctggccagcg cagaatttgg tacatttcct gcgcaggaaa ctgctcgaat 780 ttgcgcagcg gggatctgtg atgtacatat tataatgtac acactctcat tccaaacgtc 840 actgtgactg gtcgaagatt ctctgcgctc cgactatttt caatattatt ccgagaaaac 900 ctgtatcttt gctgaacctg ttgctggtgt aggtgacttc attgcctggt tttgtttaac 960 aactggatta ttcaccgtca atcacctgga aacgttttcc ggcccacaat cattacgtcg 1020 cttcaacatc acaatcaatc gctcatccac aaagcaatgg ctggtatttg ttccgcctgc 1080 gctaacgaca tcgtagctgc tgatcgcatt gtgaaatgcc agggttggtg caactctgag 1140 tttcacttct catgcagcgg actttctgag gaactgtccg ctactataga gtcctgtgca 1200 caactttttt gggcctgtaa agcctgtgta aagtttcaca aggatccgcg tacggccgtg 1260 ttgaggtcat ccaccccgta cactcacact tctgtcgacc tactgtccag catagccgac 1320 cttaaagcgg gcctccgtag cgagctgtca cagcatacca cagctattaa gttagagctt 1380 ctggaagttt taaaggcgga gatccgttcc tgcttgcgat cgacgcatgc cgccaccgat 1440 ttaccgtctc aacggcccat tcatcacaat tcagcgccta aattgtttaa ttcagtggtc 1500 aaaaatatcc ctggtactat tacacccaca ctacaacagt atccatcatt agccgcttct 1560 cttagtgtga gtgcgaacga accgtcatcc ttcaccaata tgccaccast ttacaacccg 1620 caaacaccac tactcagagg atcgggatcg 
ccgctcgatt ctgacacact agacaccatc 1680 ccacacactg atacgcgaat gtggctattc tttacgcgtt tctccccatc ggttaccact 1740 gagcagattt ctctcatggt gcaagtacgt ctagcactcg ataagcggga tgtgtttgta 1800 caccgtctga cgaagcttgg tgccgacact agtacactct catttatctc atttaaggtg 1860 ggcataccag ccactctacg caacaaggct ctctcaccta agacatggcc ctctgctctt 1920 acctaccgag agttccgtga ctatcggacc aataattata acactaatac ctgtgcaaca 1980 actgaaacaa attcgatgct cgatcaaaac gttactctcg ctaccgacac ttatcataat 2040 cctctctcac attctggagg gagtgccgcg atgccaacta ctacaacaga tacgctcgca 2100 cagcccgtta cgatttcctc acctgccatg accactacag acacagcttt gtttacaaac 2160 gaaccacatc ttgaccttaa tgagcgaatg cttgttacca ccacacctac caggtcacct 2220 cccgaatgct tagccgctcc taaaaaacga ccgaagcgcg gaaacgctaa acggactgat 2280 gaatctgctg acgctggccc gtcggatgaa tagcaatctt gcactgcatc cggcgatgca 2340 cctcgctcag ctcatcccag tctctctatc tactaccaaa atgtacgtgg tctacgaacg 2400 aaaactacaa atcttcgcct ggcgctatca gaatcagaat atgattttat cattctcacc 2460 gagacttggc ttactcagtc cataccttct tcgctcctca ctgacgatca ttatcatatc 2520 tacaggtgcg ataggaatct ttccaacagt gccctctcac gcggtggggg tgttttaatt 2580 gcatgttcct cttcaatacc gacatgtgaa atcgcatcgc ctaataccat actggaacaa 2640 ctttggatca aaacattgct gccaggtgtc tctgtttaca tcggcgttgt ttacattccg 2700 cctagtcatg cgaatgaccc cgcagtgatg aacgctttac atgatagtgt acgtgaaatt 2760 tcaagccgca ttaaagagag cgatttatta tacgtcttcg gagatttcaa taaacctgat 2820 atcagatggg agctgactaa tacatcagaa gccaccgatt gctctccatg ttattctgtc 2880 atgcattatg cacctttatg caattccgtg gctaataccg atttcgttga tgggttgcat 2940 agtaccggat tatttcagtt gagtggtatt gcaaatcaat ctgggcgtca attggatctg 3000 gtcttcgcaa accttgccgc aaccaatatt ttgtgcgact caatcacacc tctgcactct 3060 gtgaatggta cttcggctct agagaactcc ctcccatacg taacacactg tagtattcca 3120 cttctcagtg aggactttca tcatccttca ttggatatga tgatttatta tcccgtacaa 3180 ctatcccaca ccaccaacag tcacactcgc agtacagtca atagaaattt cttcaaaacg 3240 aatgtggaac gtatgaattc tcttattgtg tcgtttgacc gcaattttga ctgctccaac 3300 tttgccacta tcgacgaagc caccgatttc tttagcgttt 
ttatgcgctc agcgattaat 3360 tcctgcgttc ctgttgctca acgaaagtct ggccccgatt ggtctaatgc atctttaaga 3420 cggttgaaaa aaataaaatc aaaagcctac gcggattaca gtagaacgag atcatcgctg 3480 cataggagaa tttttttcga tgcactgaac aattatcgtc gacawaatcg tgtgctctac 3540 cgctccttca ttcgccgtac tgaaaggcag ctgttttcta agccgacacg gttctggagc 3600 ttctggaaca aacggcgcaa tataagaagt atccctccgt caatgagcta caatggcsaa 3660 actagtatcg atacatccga tatttgcaac actttcgcca atcgtttcgc tgatgcattc 3720 acccttcctg ttcacaatcc taacacacta gcagaggcca cycgcaatac tccatcggat 3780 gctatcgatt ttattatacc cacaattgac gaagcattaa ttgcgcgcac actcaacgat 3840 ataaaaccat ctacatcatc tggacctgac aatattcccg catacatttt gaagcactgc 3900 cgtcaatcac tcgcacccat tcttgccaaa atatttaatg attcccttat gcgtggcacg 3960 tatcctgcgt cctggaaaca cgcgcgaatg gttcctatcc ataaaaaagg cagtcgactt 4020 catgctagta attatcgtgg cattgtttcc ctatgcgctt gtgcaaaggt gtttgagctc 4080 attctataca atccgctact cacagcagtt caaaactata tgagccctag tcagcatgga 4140 tttctcccaa ggagatcttc caccacaaat cttgctgaat ttgttggyta ctgcttcgac 4200 aacatggatc gtggtactca agttgatgca gtatatatcg acttcagggc tgcgttcgat 4260 agtatytctc atgatattct actctcgaag ctaaaaaaac tcggtttcct cgactggcac 4320 atcacctggc tgcgttcata tttaactggt cgttcgtact acataagcat aggatctcat 4380 cgttctcact ccttcaccag ctcctccggt gtgcctcaag ggagtaattt gggaccgcta 4440 ctcttcctca tctatataaa tgatctatct ttcgttttac cgccaggcca acacctaatg 4500 tacgccgacg atgtaaaaat attcgctcca gttagaaacg acagtgactg tgtacgcctt 4560 caaacgatcc ttgagaatct ggatagctgg tgcagcagaa acgccctcca agtgtgtgct 4620 gataaatgcc agtgtatatc attcagcaga gcccgtcacc ccatcacgtt tacatacact 4680 atgctcaaca cggctttggc tcgcacgaca tgtatccgtg atctgggggt gctactcgat 4740 cagaagatgt catttcgccc tcacattgat agcgttgttg cgaagggaaa tcagctactt 4800 ggtttaatta cgcggacctg tagcgagttt accgatccca tgtgcgtcaa gtcgatcttc 4860 tgtgccatcg taaggtcgtg cctggagtac tgctgtccga tctggtgccc gcttggcgtt 4920 ggtgacatca atcgcctcga agccattcaa cggagactca ccaggtacgc ggttcgactc 4980 cttccatggc aatcccacca cgctcggccc acctaccatc agcggtgtct 
gcttctcgga 5040 cattgaacca ctctgctctc gacgtaaaat atgacgccca atgccttttc atattccggc 5100 tccttaaach cggagagatc gattccccgg catcgaatac tagccagcat caatttgttc 5160 gctccctgtc ggattcttag atccaatttc catctccgtg taccgcgtac ccgcaacaac 5220 cattagccag ggacacccta ttatacgtat gtccctcgag ttcaatgaag tgttagattt 5280 gttcgatttt agtatgtcta cttctacgtt caaggagaaa ttgcgtctac gtcacattta 5340 atgtttgctt ataactagag gtctatttat atgctataaa tgatcattgt tatacatacc 5400 ttaatttaat tataaggttg ccgattagac acgatggtcc gtcggtttat atatgaaata 5460 aataaataaa taaataaata aataa 5485 // ID RTAg3 repbase; DNA; ANG; 6448 BP. XX AC AB090812; XX DT 14-SEP-2005 (Rel. 10.09, Created) DT 14-SEP-2005 (Rel. 10.09, Last updated, Version 1) XX DE Anopheles gambiae retrotransposon RTAg3 DNA, complete sequence. XX KW Non-LTR Retrotransposon; Transposable Element; RTAg3. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6448 RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol Biol Evol 20(3), 351-361 (2003). XX DR EMBL/GenBank/DDBJ; AB090812; Positions 1 6448. 
XX FH Key Location/Qualifiers FT CDS 1206..2738 FT /product="RTAg3_1p" FT /translation="MMASSGMSTRASARSASVDCRSSLASGSKLFAPEPRV FT ALPRVSVTGINKPTVATKAASTTPELELLKATIQQLEEQNLEMKEQNFRLA FT EQITRMCQLLQEEKEEAKRREEKLKAQMEKLAAAHQRDRNLLNSLLAAKVA FT GGQPSASSRQPPTPLPRRSSAQPQQQQQQQQRNQQEQEQPRASTSHAVMLP FT RSEASTAVRGDVVPELTFSEVVRRRYRGKATGKPRSQQQPQQQQQPQQKQQ FT QLQRRQQQQQQHQGQRYVPPQLRQQAHQQQQRQQQKVRPRPDKIEVVPSAG FT HSWYTLYKTVRDAVKQDPHKGLADHLKMGKRSHAANFRMELSNSANASLVR FT AEGQEIVGDAGLARVITDMADVLITNIDPLATEEDIKKAIEREHQEPIEIV FT RVSVWELQDGTQRARVHLPAKAAAAFEGSKLRLCGCISKIRGVEKAAPERQ FT RCYRCLERGHLAHACRSSTDRQQLCIRCGSEGHKARDCSSYVKCAACGGPH FT RIGHMSCEHPASRST" FT CDS 2669..6187 FT /product="RTAg3_2p" FT /translation="MCGLRWTSSHRAHELRTSGFAFNLKVMQINVDHCQAG FT QDLALQAAREHRADVLLLSDIYRPPANNGRWAYDAAKSVAVVATSSYPIQR FT VLRSAVPGIVAAQIGGIVFISCYARPRRPEEDYEGFLAAVQLEASTHSQVV FT IDGDFNAWHTEWGSARNSQRGEDLLQLIQSVQLQVINSGNEPTFIGRGAAT FT SSVIDVCFATPSIARPETWEVHEFARSDHQLITYSVGEAEQQSRGLSTGGP FT SAGRQRVICAGRRWITTQFHVDSFRSALEDVNFAEQATTHAGLVDAMVDAC FT DIVMQRAPNVLQHQHRDVYWWAPVIEELRNECIAARERMRLTTDLQERSLA FT AAEHRTAKTRLEKAIKVGKRAEFAKLIDIAEENELGVGYQVVLSHLRGSRV FT PPETDPVELGRIVTDLFPTHPPVYWPETDDVDSGASDFDRVTPEELQEIAA FT HMAIRKAPGLDGIPNAAVKVAIEKYPGVFCRVYQDCLNTGTFPQQWKRQRL FT VLLPKPGKAPGESSSYRPLCMLDALGKVLERLILNRHLEDPDSPQLSDAQY FT GFRRGRSTISAIQSLVDAGKASRSFGQTNNRDKRCLMVVALDVRNAFNTAS FT WQSIANALREKGVPSGLLRILQSYFTDRELIFNTSEGPVVRCVSAGVPQGS FT ILGPTLWNVMYDGVLRIPLPDEAKVIGYADDLVVLAPGRTPEESAAVAEAA FT VSAVDQWMQQHHLELAPAKTEMTIISSLKHPPSHISIDVRGTAVPYSRSIK FT HLGVLVHDHLSWIPHVTAVTQRAVQIAQAVGRLMPNHRGPKMSKSRLLAAV FT ADSVMRYAAPVWHEALNTRECRRLLERVQRKSAIAVARTFRTVRYETAVLL FT AGLLPICRAICEDTRVHSRRGTAAGAQLRKEERETTIAEWQATWDSDAAGH FT QASGYVRWAHRLIPDIGAWQSRKHGEVNFHLSQIISGHGFFRKYLADMKFT FT SSPDCPNCPGVRESAEHAMFACLRFAEVRERLMDGVNPDTLLTHMLQSQQN FT WSNVCEAAKQITTILQREWDDFRTSLFEQGVLADNAQLRNADVLRQERLRR FT YNETRNAARRAATQQRQAERLPPPPPSPRTERRREVNRLAVARLRERQRAA FT RAEMHGAYQPAPSNDSDDDDDDVENRQATNAASTSEAARTAAEESRVGLTE FT AEAAAAVEAELSSR" XX SQ Sequence 6448 BP; 1605 A; 
1761 C; 1882 G; 1200 T; 0 other; gtgtacgccc cccccccacc tccggttgcg ctagaattcc gggtgtggaa atttggggaa 60 aaatttgtgt tttcggcccc ggaaacactc aattttagtg aaattggtgg tgctgaatcg 120 ttttatcgtg tttttgagac gatctgagca gtgcaaagtg tcaaaaagtg tttgtgtttt 180 tttccccata cattttgtat gggaaacttt gtatacacat atttcagtga aaactgcacg 240 gattttgtcc ggatgtgtct tgatagttgc gcgtagtgac gttccaagtg ttgtgcgaga 300 aaacggacgc gaaatcggcg gaaaaacgca aaaataaaaa agtgtcgggg gtgctaccgc 360 cggagcacgt gttttcgtaa aaacggaata aatagttaat cgtgaataac tccgtgagtt 420 ttggtccgaa tcgagtacgg ttttcaccgt tgtgctagtt ttaacgcgta caaaaagatc 480 caatcaaaaa actagtgaaa atcattaaaa aaaatttgac atttctgggt gacagttcag 540 cgatacccgt aaagcaaatt tggtttggga agcagttccc gaacagcaaa atttgaattt 600 accgttagtg ctcgaaaact gctcaaatcc tggaaagtgt gcgtgaaatt cagtgaagtg 660 gtgtagcgtc agcgggagct cgaatctgct cgattcgagg cactcaaaaa ctgctcgaat 720 ttttcgaatc tgctcgaaaa attcggaaaa ttcggagaag ctgtagggtg gtagagggcg 780 acataacaaa cacgggaaaa ctattttcca ccctttcccc ctccctccct cccggtgaaa 840 agtccctaaa gcaaaaaact aggtccaaaa aagggtcgaa aaagcgggca gcgcagtttg 900 gacgtaaaaa cgtgccggga aatttcggga ttggtgcgga tccattcctt acgtaagcga 960 cgcaccctct cgtaattttc tccacccggc acccctcccc cccagcccac caggggggtt 1020 gcaagtcgga ccgtagtgga aaaaccggcg tggaagaagg attctacagc agcgagtgag 1080 cggcagaaca tcagctagac ggcgctgcat acctggggtc atcccgaccc cccggggtca 1140 accgaccccc ggggtcacat cgacccccag ggttatctaa cccaccaact tgtacgtcgg 1200 caaagatgat ggcatcatcc gggatgtcaa cccgcgcgag tgcgaggtcg gcctctgtgg 1260 attgccgtag cagcttggca tccggttcga agctgtttgc gcccgagcct cgagtggcac 1320 tgccaagggt tagcgtcaca ggcatcaaca agcccaccgt agcaaccaag gccgcctcga 1380 ccacaccgga gctcgagttg cttaaggcaa caattcagca actggaggag cagaacttgg 1440 aaatgaagga gcaaaatttt cgcctcgcgg agcagataac tcgcatgtgc caactgctgc 1500 aagaggagaa ggaggaggca aaacgtcgag aggagaagtt gaaggcacag atggagaagc 1560 tcgctgccgc acatcagcgc gaccgaaact tgctcaactc gctactggca gcaaaggttg 1620 ccggcggaca accgtcagct agttcgcgtc aacctccaac tccgttgcca cgccgatcct 
1680 ctgcgcagcc gcagcagcaa caacaacagc agcagcggaa ccagcaagag caggagcagc 1740 cccgcgcgtc gacgtcgcat gcagtcatgc tgccgcgtag cgaggcatcg acagccgtcc 1800 gcggagacgt cgtgccggag ctcacattca gtgaggtggt gcgtcgcagg taccgtggca 1860 aggccactgg caagccacgc tcccagcagc agccacaaca gcagcagcag ccgcaacaga 1920 agcagcagca gcttcagcgt agacagcagc agcagcagca acatcaggga cagcggtatg 1980 ttccgccgca actccggcag caagcacatc agcagcagca gcggcagcag caaaaggttc 2040 ggccaaggcc ggacaagata gaggtggtcc cgagtgccgg acactcctgg tacactttgt 2100 acaaaacagt tcgggatgcg gttaaacaag acccgcacaa gggccttgca gaccacctta 2160 aaatgggcaa gcgcagccac gccgcaaatt tccgaatgga gttaagtaat tcggccaacg 2220 ccagcctggt ccgcgcagaa ggtcaggaga tcgtcggtga cgccggactt gcccgggtga 2280 tcaccgacat ggccgatgtc ctgataacga acatcgatcc tctggcaaca gaagaggaca 2340 tcaaaaaggc cattgagaga gaacaccaag agccgatcga aatcgtccga gtgagcgtgt 2400 gggagcttca agatggcact cagcgagctc gagtccacct gcccgctaag gctgccgctg 2460 ctttcgaggg gtcgaagctc cgactgtgtg gctgcattag caaaatcaga ggcgttgaaa 2520 aagcagcacc tgagcgccag cgctgctatc gctgcctgga gcgcggccac cttgcccacg 2580 catgccgttc ctcgacagac cgtcagcaac tctgcatccg gtgtggcagt gaaggtcaca 2640 aagcccggga ctgctccagc tacgttaaat gtgcggcctg cggtggacct catcgcatcg 2700 ggcacatgag ctgcgaacat ccggcttcgc gttcaactta aaagtgatgc aaatcaacgt 2760 ggatcactgt caagcagggc aggacttggc gctccaagca gcgcgtgaac accgtgctga 2820 cgtcctgctc ctgtcggaca tctaccgacc accggcgaac aatgggcgtt gggcgtatga 2880 tgccgccaaa tcggtagcag ttgtggctac aagttcctac ccaatccagc gggtgttgcg 2940 cagtgctgtg cctggaattg tagccgcaca gattggcggt atcgtcttca taagctgcta 3000 cgcgcgaccg agacgcccag aggaggacta tgaaggcttt ctggccgcag ttcagctgga 3060 ggcatcaacc cactcccagg tcgtcatcga cggcgatttt aacgcctggc acacggagtg 3120 gggtagtgcc aggaacagcc agagaggtga agatctgctg cagcttatcc agagcgttca 3180 gctacaggta atcaactccg ggaatgaacc cacattcatt ggcagaggag cggccaccag 3240 cagcgtcatt gacgtctgct tcgccactcc gtccatcgct cggccagaaa cgtgggaggt 3300 gcacgagttt gcccggtccg atcatcaatt gattacatac agcgttgggg aagcggaaca 3360 
acagtcccgc gggttgtcga ctggtggtcc gtcagccggc cggcagcgcg ttatctgcgc 3420 tggtaggcga tggattacca cgcagttcca cgtagacagc ttccgttctg ctctcgagga 3480 cgtgaacttc gcggaacaag cgacgacaca cgctggccta gtcgacgcta tggtcgacgc 3540 gtgcgacatt gtcatgcagc gggcccccaa cgtgttgcag catcaacatc gcgacgtcta 3600 ttggtgggca ccggtaattg aagagctgcg gaatgagtgc attgcggcgc gtgagcggat 3660 gcgcctaacc accgatctgc aagagaggag tctcgccgca gccgagcacc ggactgcgaa 3720 gactcggcta gaaaaagcca tcaaagtagg caaacgtgca gagttcgcca agcttataga 3780 catcgccgag gagaacgagc ttggagtggg gtatcaggtt gtcctgtctc atctgcgcgg 3840 cagtcgtgta ccgcctgaga cagacccggt cgagctggga cggatcgtta ccgatctgtt 3900 ccccacccac ccaccggtct attggccgga aaccgacgat gtcgattccg gagcgtccga 3960 ttttgatcgc gtgacccccg aagagctgca ggagatcgcg gctcatatgg ctatcaggaa 4020 agcgccagga ctggacggga tccccaatgc tgcggtgaag gtcgcgattg agaagtaccc 4080 gggggttttc tgccgcgtgt accaggactg cctcaacact ggtacgtttc cgcaacagtg 4140 gaaacggcag cgcttggtac tgctgcccaa gccaggcaaa gcccccggag aaagcagctc 4200 ctacaggcca ctgtgcatgc tggatgcact aggcaaagtg ttggagcggc ttatcctcaa 4260 cagacacctc gaggacccgg attcaccgca gctctcggac gcgcagtacg gctttcgtcg 4320 cggacgatcc accatcagtg ccatccaaag cctggtggac gcaggcaagg cgtcccgatc 4380 gttcggccag actaataatc gcgacaagcg atgcctgatg gtggttgcgc tggatgtccg 4440 caacgcattt aatactgcca gctggcagtc gatcgccaat gcgttgcgag aaaagggggt 4500 cccttcaggg ctgctgcgga tactgcagtc ctacttcacg gatcgggagc tcatctttaa 4560 caccagcgag ggacccgtcg tacgttgcgt cagcgcggga gttccacaag ggtccatact 4620 gggcccgaca ttgtggaacg taatgtacga cggggtgctg cggattcccc tacccgacga 4680 ggccaaggtc attggctacg ccgatgatct tgtcgtcctg gccccgggta ggacaccgga 4740 ggagtctgca gcagtggcgg aggcagcggt gtcagcagtc gaccagtgga tgcagcagca 4800 ccacttggag ctggcaccag ctaagacgga gatgacgatt atctcaagtc tgaagcatcc 4860 tccaagccac atctccatcg acgtgagagg aactgctgtc ccatattcga ggagcatcaa 4920 gcacttgggt gtactggtac acgaccatct atcgtggata cctcacgtga ctgcagtgac 4980 gcagcgggcg gtccagattg cgcaggcggt tggtcgactc atgccgaacc accgggggcc 5040 gaagatgtca 
aagtcccgac ttttggcagc ggtggctgac tcggtgatgc gttacgccgc 5100 acctgtatgg cacgaggcgc tgaacactcg cgagtgccgc aggctgctag agcgagtcca 5160 gcgcaaatca gctatcgccg tggcccggac gttccggacg gttcggtacg agaccgcagt 5220 gctgctcgcg ggacttctgc cgatctgcag agcaatctgt gaggacacca gggtgcacag 5280 ccgccgtggg actgcagccg gtgcacaact gaggaaggag gaacgtgaga cgaccatcgc 5340 cgagtggcaa gcaacatggg acagcgatgc agctggtcat caagccagtg gttacgtccg 5400 gtgggcgcac cgccttatcc cggacatcgg cgcatggcag tcgcggaagc atggagaggt 5460 gaactttcat ctgtcgcaga tcatctccgg tcatggattc ttccgtaagt atcttgcgga 5520 tatgaaattc acctcgtccc cggactgccc aaattgccct ggcgtaagag agagcgccga 5580 acacgcgatg ttcgcttgtc tgcgcttcgc cgaggttcgc gaaaggctga tggacggcgt 5640 caaccccgac acgctgctga cccacatgct ccagagccag cagaattgga gcaacgtctg 5700 cgaggcagcc aagcagataa caaccatcct gcagcgcgaa tgggacgact tccgcacgtc 5760 gttgtttgag cagggcgtac tagctgacaa cgcccagctc cgcaatgcag atgtccttcg 5820 tcaggaaagg ctgcggcgtt acaacgaaac ccggaatgca gcgagaagag ccgcaacgca 5880 gcagcgacag gcagaacgcc tgcccccacc accgccctca ccaagaactg agcggagacg 5940 ggaggtgaac cgtcttgcgg tggcgagact aagggaacgt cagcgagcag ctcgtgccga 6000 aatgcacggc gcataccaac cagctccatc taacgatagc gatgacgacg atgacgacgt 6060 tgaaaaccgt caagcaacca atgcagcttc aacgtcagaa gcagcgcgaa cggcagctga 6120 agagtcccgc gtagggctga cagaagccga ggccgcagca gcggttgagg ctgagttgtc 6180 ctcccgctag gatgggtgat agaacaagca ggaagccttg agagggagtc cttataaaac 6240 aaaatgggaa gaaatagaac gaggaaggga aaaaaataaa taaaaataaa tgagttaggt 6300 gcgctttgca cggatgtagg ccgctcgaaa gagcagaaaa acccccttta gacccttcgc 6360 ggggcaaaag tgtggcttag gtgagggttg ggtctagaca gtaaaatgaa atgaactgaa 6420 taaacaaccc gaatacttaa aaaaaaaa 6448 // ID GYPSY71-I_AG repbase; DNA; ANG; 4621 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY71-I_AG is an internal portion of retrotransposon GYPSY71_AG DE - a consensus sequence. 
XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; gag; KW AP protease; GYPSY71-I_AG; GYPSY71-LTR_AG; Gypsy clade; KW MDG3 lineage; RNase-H; integrase; GYPSY71_AG; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4621 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY71_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 181-181 (2004). XX DR [1] (Consensus) XX CC GYPSY71_AG is a family of gypsy-like LTR retrotransposons that, CC according to the amino acid sequences of its reverse CC transcriptase, RNase H and integrase, is phylogenetically grouped CC with representatives of the MDG3 lineage of other organisms. CC GYPSY30_AG, GYPSY31_AG, GYPSY32_AG, GYPSY33_AG, GYPSY34_AG, CC GYPSY35_AG, GYPSY36_AG, GYPSY37_AG, GYPSY38_AG and GYPSY72_AG CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY71-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. The consensus encodes the 1361-aa CC GYPSY71_AGP gag-pol-like polyprotein (pos. 489-4571). The CC sequence of the LTRs flanking GYPSY71-I_AG is deposited as CC GYPSY71-LTR_AG. 
CC GYPSY71_AGP: CC MATVKQLCEDFTKIGLVRECERLGLETTGGKVEIANRIVKYRATVASGDNAGPSGSGHRAEPAN CC AIDDVENYAGNQPYESCDDDSEEKADLDLPEEGDSFVAEDDDETEEDPFQTAVRISTPKRPQRV CC YAFRDVEDSIETFGAEDGHDVRIWLAHLDSVSKSAGWNDEQKLIMLRKKMTGIARKFVSSLRNV CC QTYAILKKELIAEFAPFVRSSDVHRILANRKKETAETMREYVYEMQRIAAQIDLDEPSLCEYIV CC NGVTDDDFFKSLLYEAQTIRVLKEKLLNFEKVRMARKKKTTDKEENKRVLSSSSRVDKRAEQRC CC YNCGNKGHQARACAQTQGGPKCFSCREYGHKASECARNKSVVPAKINVTEESVGMVDVVLNKTS CC VKALFDSGSNQNLVTIGCYKRIEGSPLIDTSMWFQGFGGMRTKAIGMFTVDVTVDDNVFSGVRF CC FVVPNESMSYDAVLGRDSLNYFEVTMTTAGVKVRPYGSTDEMFSIVCDNEDNLDVSPRFSERVK CC AVISGYKPAVNVNSRVETKIILHDETPVRSSPRRFAPGEKAVLEKTIDEWLAAGIIRESESDFA CC SPVTLARKKDGSLRVCVDYRELNRKMVKDCFPMRNIEDQIDRLKSARVFTTLDLKNSFFHVPVE CC KSSQRYTGFVTHTGQYEFLRTPFGLVNSPASFSRFVADVFREFIKSERVLVYVDDLIIPSLDEE CC SNFQTLKELLNVASENGVQFNWKKSQFLKDEVEYLGYVIRGGCYRIAPSKLRSVQLFPEPKNVK CC QLQRFLGLTSYFRKFIAGYATISKPLTSLLQKGVEFVFGEEERSSFDELKRCLVTDPVLKIYDE CC SAETELHTDASKYGYGAALMQKSDDDKFHPVAFMSQQTSNAEKNYSAYHLEVLAVVRAVEKFRV CC YLLGIKFKIVTDCAAFGHTLKSKELSARIARWALMLEEYEYEVVHRPGSSMKHVDALSRAPVMI CC VKSDPMIEAIRKMQQSDERAKAIIELLKTQSFEDFVMSDGLLMKVVKGREVIVVPSGMQSDLIR CC RIHEKGHLGARKIEGIIEQEFYIPNASEKIKQTIECCVKCILAERKRGKVDGLLYPIAKGDVPL CC DTYHVDHLGPMDITEKRYKYLFVVVDAFSKYTWIYPTKTTNSFEVIQRLTTQSEVFGNPRRIIS CC DKGAAFTSNDFKRYCEDQDIERVEITTGVPRGNGQVERVNQVIIAMLRKMSVNDPAKWYKHVAN CC IQRWINSSPHQSIGVTPFEAMFGVPMRHEGDLRLGELMEEIRVAQHHDQREQTRAGARVSIEKA CC QEEQRRSYDLRARSATTYREGDLVVIKRTQFGPGRKYAAEYLGPYKVTSVRPHDRYDVEKINGE CC GPKVTSTASSHMKPYRF. 
XX SQ Sequence 4621 BP; 1325 A; 798 C; 1336 G; 1162 T; 0 other; tgggggctca accgggatac ttttttccca aacgagtgag gcttttgcga tgaaagcaat 60 cggtgcgata atgtgtgatg tttcgtgagg ctattttgtg attaaatagc aaaaaactgt 120 gaggcttttt ttataagcaa atagtgagac gttttgtgaa agagaagaac agtgaggctt 180 atgtagcaac gcgtgcggat cagtgaggct ttcgtagcaa tccgtgcgaa tcagtgaggc 240 ttgtgaagca attagtgagg ctttattttt agcaaccagt gagaatccgt gagtgatagc 300 gaggcttttc gaagcacacc gagtgaagaa attaagcgag gcttcacagc aatctgtgtg 360 tgaaaattct gtgtgaattt tacgggtcaa ctagtgtgac attttgtgcg tgagtgcggg 420 aagtcgcgaa ttgcctggta gtgtgtgtcg gctttgtgag tgaaagtacc gcgagagaga 480 gagatagaat ggcaacggtc aagcagcttt gtgaagattt cacgaagatt gggttggtgc 540 gcgagtgtga gaggcttggc ctagaaacca ctggaggaaa agtggaaatc gcgaatcgaa 600 ttgtgaaata ccgagcgacg gttgcgtcag gggacaacgc cggtccgtcc gggagtggac 660 acagagcaga accagctaac gccatagacg acgtcgaaaa ctacgcgggt aaccaaccgt 720 acgagagttg cgatgatgat tccgaggaaa aagcggatct ggatcttccg gaagaaggtg 780 atagctttgt tgccgaggat gacgatgaaa cggaagaaga cccttttcaa actgctgtgc 840 gaatctcgac gcctaaacga ccgcaacgcg tttacgcatt tcgagatgtg gaagatagca 900 ttgagacgtt tggagcagag gatgggcacg atgtgcgtat ttggctggca cacctcgatt 960 ccgtatcaaa gtcagcagga tggaatgacg aacaaaagct aatcatgtta cgtaaaaaga 1020 tgacgggaat cgcaagaaag ttcgtgtcgt ctttgcgtaa tgtgcaaact tatgcgatat 1080 tgaaaaaaga gctgatcgcg gaatttgctc catttgtgag gtcgagtgat gtgcatcgga 1140 ttctcgcgaa tcggaagaag gaaacggccg agacgatgcg agagtacgtt tacgaaatgc 1200 aacgaattgc tgcccaaatc gatttggacg aaccaagctt gtgtgagtac atcgttaacg 1260 gtgtgaccga cgatgatttt ttcaaatcat tgctgtacga ggcgcaaaca attcgagtat 1320 taaaagaaaa gttgctcaac tttgaaaaag tgcgcatggc tcgaaagaag aaaacgacgg 1380 ataaagaaga aaacaaacga gttttgtcat ccagtagccg cgttgacaaa cgggcggagc 1440 agcgatgcta caattgcgga aacaaaggac accaagctcg cgcgtgcgcg cagacacagg 1500 gtggtccgaa atgtttctcg tgtcgtgagt acggtcataa ggcgagcgag tgtgcgcgga 1560 acaaaagcgt cgttcctgcg aaaatcaacg tgacggaaga atcggtggga atggttgatg 1620 tcgtgttgaa caaaacatcg gtcaaggcat 
tgtttgacag tggaagcaac caaaacttgg 1680 tgacaatagg ttgttacaaa agaatcgagg gatcaccgct gatcgatact tcgatgtggt 1740 tccagggctt tggtggcatg agaacaaagg cgatcggcat gttcacggtg gacgttacgg 1800 tggatgataa cgtttttagt ggtgtgcgat tttttgtggt gccaaatgaa agcatgtctt 1860 acgatgcagt attgggcaga gattccttga actattttga agttacgatg acaacggcgg 1920 gtgtcaaagt caggccatat ggttcaacgg atgaaatgtt ttctattgtg tgtgacaatg 1980 aagacaattt ggatgtgtct cctcgatttt cggaaagagt aaaggcggtt atttcggggt 2040 acaaacctgc ggtaaacgtg aatagtcgtg ttgagacgaa aattattttg catgacgaga 2100 cgcctgtgcg ttcgtcgcca aggcgttttg ctccgggtga aaaggcggtg ctggagaaaa 2160 caatcgacga gtggttagcc gcgggaataa ttcgagaaag tgagagtgat tttgcgagtc 2220 cggtaacgtt agcgagaaaa aaggacggtt ccttacgcgt ttgtgttgat tatcgcgaac 2280 ttaatcgaaa aatggttaag gattgttttc ccatgaggaa catagaagat caaatcgatc 2340 gcttgaagtc agccagagtt tttaccacac ttgacttgaa aaattcgttt tttcatgttc 2400 ctgtggaaaa gtcgagccag cggtacacag gctttgttac ccacacaggc cagtacgagt 2460 ttcttagaac gcctttcggg ttggtcaata gtccagcgag tttcagccgg tttgtagcgg 2520 atgtgtttcg ggaatttatc aagagtgagc gtgtgttggt gtatgtggat gatttaataa 2580 ttccttcatt agatgaggaa agtaattttc aaacgttgaa ggaattgtta aatgtcgcga 2640 gtgagaacgg tgtgcagttc aattggaaaa aatcgcaatt tttaaaggat gaagtggagt 2700 atctcgggta tgtgattcgc ggcgggtgtt atcgcatagc gccgagtaag ttgcgatcgg 2760 ttcagctttt tccggaaccg aaaaatgtga agcagctgca aagatttttg ggacttacga 2820 gttacttccg aaaatttatt gctggttacg cgacaatttc gaagcctttg acaagtttgc 2880 ttcagaaagg tgttgagttt gtgtttggtg aagaggagcg ttcgagtttt gatgagttga 2940 aacggtgttt ggtgaccgat ccggtgttaa agatctacga cgaaagtgcc gaaaccgagc 3000 ttcatacgga cgcgtcaaag tacggttatg gtgctgcgct tatgcagaag agcgacgacg 3060 acaagtttca tcctgttgcc ttcatgagtc aacaaacatc aaacgcggag aagaattata 3120 gtgcgtatca tttggaagtg ctagcggtgg ttcgcgctgt tgagaagttt cgtgtgtatc 3180 tcttaggcat caagtttaag atcgttacag attgtgcagc gtttgggcat actttaaaat 3240 cgaaagaact gtcggctaga atcgcgagat gggctttgat gctcgaagag tatgaatatg 3300 aagtggtgca taggccaggt tcatcgatga agcatgtgga 
tgcgttgagc agggcaccgg 3360 tgatgattgt gaaaagcgac cctatgatag aagcgatcag aaaaatgcaa caaagtgacg 3420 agcgtgcgaa ggcaattatt gaattgttga aaacacaatc ttttgaagat tttgtcatga 3480 gtgatgggct gctgatgaaa gtagtgaaag gtagggaagt gattgtggta ccatcgggaa 3540 tgcaaagcga tttgatacgt aggatacacg aaaagggtca cttaggagct cgtaagatag 3600 agggtattat cgaacaggag ttttacattc caaacgcgag tgagaaaata aaacaaacga 3660 ttgagtgttg tgtgaaatgt atcctcgcag agcgtaaaag gggaaaagtt gacggtttat 3720 tatacccaat cgcgaaaggt gacgttccgt tagacacgta tcacgtagac catttgggtc 3780 caatggacat tacagagaaa aggtataaat atttgtttgt agtagttgat gcgtttagta 3840 agtacacttg gatatatcct actaaaacga cgaattcatt tgaagtaatt cagcgattaa 3900 cgacacagag cgaggtattt ggtaatccaa ggcgtattat aagcgataaa ggggctgcgt 3960 ttacgtcaaa cgattttaag cggtattgtg aggatcagga tatcgagcgt gtggaaatta 4020 cgacaggtgt tccgcgcggg aacgggcagg tagagagggt aaatcaagtg attattgcta 4080 tgttgcgaaa gatgagtgta aacgatcccg caaagtggta taagcacgtt gccaatattc 4140 agcggtggat taattctagc ccacatcaga gcatcggtgt tacccctttt gaagcgatgt 4200 ttggggtacc gatgagacac gaaggagatt tacgactagg tgagctgatg gaagaaattc 4260 gagtggccca gcatcacgat caacgagagc agactcgagc tggtgccagg gtttctatcg 4320 agaaagccca agaagaacaa cggagatcgt acgacttacg agcgcggtcg gctacaactt 4380 accgcgaagg cgacctggtg gtgattaagc ggacgcagtt cgggcctgga aggaagtacg 4440 cggctgagta ccttggacca tataaggtaa ccagtgttcg tcctcatgat cgctatgacg 4500 tggaaaagat caatggcgaa gggccaaaag tgacgtcgac ggcttcgtca catatgaaac 4560 cctatcggtt ttaatgcggt gaggatcctt cggggcgaaa ggatcggtcg aggaaaggcc 4620 g 4621 // ID GYPSY4-I_AG repbase; DNA; ANG; 4427 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE GYPSY4-I_AG is an internal portion of the GYPSY4_AG LTR DE retrotransposon - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW AP protease; GYPSY4-I_AG; GYPSY4-LTR_AG; GYPSY4_AG; Gypsy clade; KW gag; integrase; reverse transcriptase. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4427 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "GYPSY4_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 77-77 (2003). XX DR [1] (Consensus) XX CC GYPSY4_AG is a family of autonomous gypsy-like LTR CC retrotransposons. CC GYPSY4-I_AG, an internal portion of GYPSY4_AG, is flanked by CC GYPSY4-LTR_AG LTRs. The GYPSY4-I_AG consensus sequence was CC reconstructed from a multiple alignment of 5 copies; they are CC less than 2% divergent from the consensus sequence. The A. gambiae CC genome contains about 10 copies of GYPSY4_AG. CC The consensus sequence encodes the 1423-aa GYPSY4_AGp protein CC (pos. 119-4387), composed of gag (zinc-finger, pos. 264-379), protease CC (pos. 417-502), reverse transcriptase (pos. 628-797) and integrase CC (pos. 1145-1290) domains. 
XX FH Key Location/Qualifiers FT CDS 59..4387 FT /product="GYPSY4_AGp" FT /translation="PACIRVYLCTCVYDTTPFSIMLRKKELRRALIVDVPD FT TATVTQLRQLYASHEPVARSPRAAPPTTSATTPAPACANHQDAAILCLPHY FT NGDDDFAHHENVANAAASTNNTTDAVSALPSAHGVAAALPRGPDDIEAQFE FT KLRQQQQLAELRQKVHQLETQQPAALCVKDFEAFIEPLDADKNPNVIRWFR FT DLERLFALYRVRDADKFFFTLRLLTGTAANVAKELVVTTYDELKKELIDNL FT HVVATPESVYRQLRNRRLRPQESALHYLFDMQRIAGQASIADSELIPIVID FT GLGSPSITSSLHFMPLTMDDFRKKLKLFESCRHLCTTQPPSADARATTNSR FT MERPRPSQEPIRCFNCSRFGHLQNACPRPKRPPGGCFRCFQTGHVYRNCPE FT RRANATVEGNTSSDEALATNQEVSLTFFHPSAKRTTLPCVRSLLDTGSPVS FT FISDTIVPVKMLGPLSATEYCTMIKGPLYSRGKIDCTIRFKNHSVRHSFII FT LPGIAWPVIIGRDLLNSLNIFLTYSSLTTSCITKPLSTELKEVDTILPEKL FT DDAIRSICALDVAEADNELDLGKTLSLEQRSIVNSIVENSYLNYTSDVIPL FT KHPMKINLTHDTPIFTKPRRLSYGERQQVKQIVDKLLAENIIRPSNSPYAS FT ALVLVRKKSGEVRMCVDYRPLNKITVRDNYPLPLIETCLEHLCGKKFFSLL FT DLKSGFHQVPMSEESIPYTSFVTPDGQFEYLKMPFGLRNAPSEFQRFINSI FT LREFIDDGRIVVYLDDIIIASTDLSSHFSTLRSVLEKIKQNNLELRLDKCK FT FVHEEIEYLGYKANFSGIQPSDRHIKALTNYPMPTNLKQLRRCLGLFSYFR FT RFVPSFSCIAKPMTNFFRRTKYLTSIQIACMLLKPYVTNLCILLSFPYSTQ FT NGKPNYTVTQVPLVLALFSFRNRMTISCTLLLTFPKPLQKTSPSYTVMSLK FT LFPSFTLLSASTLMFMGSPLVTDCNSLVETLKNRNASAKIARWSLFLENYD FT YTICHRSGTSMPHVDALSRTEAVGAIGEIDLDFQLQVAQTRDPSIEALKHR FT LESEEVDGFLLQDGLVYRDIPDGQPQLYVPSEMVDNVIRHTHERIGHLGIN FT KTFSKISQHYWFPHMKPTIDKFIKNCLKCIVYSAPHHTNARNMYSIPKEPL FT PFDTIHIDHLGPLPSSSLRKKYILVVIDAFTKLTKLYPTSSTNAKEVCSAL FT SQYMSYYSRPRRIVSDRATCFTSTLFEDFLESHNISHVLNATGSPQANGQV FT ERVNRVLRPILSKLSDAPDQTDWVSKLRSAEYALNNTVHTSTNFCPSVLLF FT GVEQRGKVPDELAEYLDEKFDRASRDLEAIRAKALENIEESQRKNEEYFSK FT KHKPPQCYKEGDLVAIRYSDTTDSGNKKLNPKFRGPYVIHKVLPHDRYVVR FT DVEGCQLTQLPYDGVLEANKLRRWTESSD" XX SQ Sequence 4427 BP; 1170 A; 1185 C; 898 G; 1174 T; 0 other; tctcagaagt gggattacca acaaaaatcg cctacaaaac cgcctgcaag ccgcctaacc 60 agcctgtatc cgtgtgtacc tgtgtacgtg tgtgtatgac acaacgccat tttccatcat 120 gctgaggaaa aaagaacttc gtcgggcgct tatcgtcgac gtgccggata ccgctaccgt 180 cacacaactc cgacagcttt atgcctccca cgagccggtc gctcgttccc cccgtgcggc 240 gccgcccacc 
acctcagcga cgacaccggc tcccgcttgt gcaaaccacc aagatgccgc 300 cattttgtgc cttccacatt acaatggcga cgacgatttt gcgcatcacg aaaatgttgc 360 gaacgctgcc gcttcgacga acaataccac tgatgccgtt tccgcccttc cttctgccca 420 tggtgtggcc gccgcccttc cccgcggccc tgacgacatc gaggcccaat ttgagaagct 480 gcgacagcag cagcagctag ctgaattacg ccaaaaggtg caccaacttg aaacgcagca 540 gccagccgcc ctttgcgtaa aggactttga agctttcatc gagccactcg acgccgataa 600 gaaccccaat gtcatccgat ggttccgcga cttggagcgt ctctttgcac tttaccgagt 660 gcgcgatgca gataaatttt tcttcaccct tcggctcctc accggcacag ccgctaacgt 720 cgcaaaagaa cttgttgtca ccacttatga tgagttgaag aaagagttga tcgacaatct 780 tcacgtcgtt gctacgcccg aatctgttta tcgccaactc cgtaaccgtc gattgcggcc 840 ccaggaatcc gccctgcatt acttgtttga catgcagcgc atcgcaggcc aagccagtat 900 cgccgattca gaactgatcc cgatcgtcat cgacggcttg ggaagcccgt caattacgtc 960 gagtctgcat ttcatgcctc ttacgatgga cgacttccgg aagaaattga aacttttcga 1020 atcttgccgt catctttgca ccacccagcc cccttccgct gatgcccggg ccacaacgaa 1080 cagccgtatg gaacggcccc gcccatcgca ggaacccatc cgctgcttca actgctcccg 1140 attcggacac cttcagaacg cgtgcccgcg gccgaagcgc ccacccggcg gatgttttcg 1200 ttgtttccag actggacacg tctaccgtaa ctgccctgaa cgtcgggcca acgccactgt 1260 cgagggcaat actagttcgg acgaagctct cgccacaaat caagaggtga gtttgacatt 1320 tttccaccct tctgctaagc gtaccaccct tccctgcgtt cgttcccttc tcgacacagg 1380 aagtcctgtg agcttcatta gcgacacgat agtaccagtt aagatgctag gacctctttc 1440 cgctaccgaa tactgcacta tgattaaggg accactttac tctcgaggaa aaatcgattg 1500 tactatccga tttaagaatc attccgttcg acactctttt attatattac ctggaattgc 1560 gtggccagtc attatcggtc gcgatttact gaactcactt aatatttttc ttacgtattc 1620 atctcttaca acttcatgta ttactaaacc tctatcgacg gaacttaaag aagtagatac 1680 gattcttcca gaaaaattag acgatgctat taggagtatt tgtgcgctcg atgtggctga 1740 agccgataat gaactggatt taggaaaaac actatctttg gaacaacgtt caatagtcaa 1800 ttctattgtt gaaaattcat acctcaacta tacttcagat gttataccgc tcaaacaccc 1860 tatgaaaatc aatctgactc atgatacacc aatatttact aagccgcgaa gactctctta 1920 tggtgaaaga cagcaggtta agcaaattgt 
tgataaactg ttagcagaaa acatcatccg 1980 gcccagtaat tctccttatg cttctgcgct tgtcctcgtt aggaaaaaga gtggcgaggt 2040 tcgtatgtgt gtggattacc ggcccctcaa caaaattaca gttcgggaca attaccccct 2100 accccttatc gaaacttgtt tggagcatct gtgtggaaaa aaattcttca gtttgctgga 2160 tttgaaaagc ggattccatc aagtcccaat gagtgaggag tctatcccct acacttcttt 2220 tgtgacccca gatggtcaat ttgaatatct gaaaatgcca ttcggtcttc gtaacgcccc 2280 ttccgaattc caacgtttta ttaattctat cttaagggaa ttcattgatg atggcagaat 2340 agtagtgtac ctcgatgaca tcatcatcgc ttctaccgat cttagctctc acttcagtac 2400 ccttcggtcc gtattagaaa agattaagca gaataattta gaacttcgtc ttgacaagtg 2460 caaatttgtc catgaagaaa tagaatactt gggctacaaa gctaactttt ctggaattca 2520 gcctagtgat aggcacatta aagcacttac taattaccct atgcccacta atttaaagca 2580 actcagacgt tgtcttggtc tgttttcata cttccgacgg tttgttccat ctttctcttg 2640 catcgctaaa cctatgacaa acttcttcag aaggacgaag tatttaactt cgattcaaat 2700 tgcgtgcatg cttttgaaac cttacgtgac aaacttgtgc attctcctat cctttccata 2760 ttcgacccaa aacgggaaac cgaattacac tgtgacgcaa gttcctttgg ttttggcgct 2820 attctccttc agaaacagga tgacaataag ttgcaccctg ttgcttactt ttccaaaacc 2880 acttcaaaag acgagtccaa gttacacagt tatgagcttg aaactctttc catcatttac 2940 gctcttaagc gcttccacac ttatgttcat gggctcccca ttagttactg actgcaactc 3000 tctggtcgag acccttaaga accgtaatgc ttccgctaag attgccaggt ggtccttgtt 3060 tctggaaaat tacgactata ccatctgtca tcgctcaggc acttctatgc cccatgtcga 3120 cgcactgagt cgcaccgaag ctgtgggtgc catcggtgag attgaccttg acttccagct 3180 tcaagtagct cagacgcgtg acccatctat cgaagctctc aaacatcggt tagaatcaga 3240 agaagttgac ggattcttac ttcaagatgg gcttgtctat cgcgacatac ctgatggtca 3300 acctcaattg tatgtccctt cggaaatggt cgacaacgta attagacaca ctcacgagcg 3360 aattggccac ctgggcataa acaaaacctt cagcaaaatc agtcagcatt actggttccc 3420 ccacatgaag cccactatcg acaaattcat taagaactgc ctcaagtgca ttgtttattc 3480 tgcacctcat catactaatg cccggaatat gtacagcatc cctaaagagc ccttaccctt 3540 tgataccatc catattgacc atttaggtcc gctccctagt tcttccttac gcaagaagta 3600 tatacttgtt gttatcgatg ctttcactaa attaaccaaa 
ctttacccaa cctcctcaac 3660 taatgcgaag gaagtgtgtt ctgccctttc ccaatatatg tcttactata gccgccctag 3720 gcggattgtt agcgatcgag ctacttgttt cacctcaacc ttgtttgagg acttcttgga 3780 atcgcataac attagccatg tcctcaacgc caccggatcc ccacaagcca atggacaggt 3840 agaacgggtg aaccgtgtgt tgcgtcctat ccttagcaaa ctatctgatg ctccagacca 3900 gaccgattgg gtatccaagt tgcggtcagc cgaatacgct ttaaacaata ccgtccacac 3960 atctacgaac ttctgcccct ctgtcctact ctttggtgtc gagcaacgcg gtaaagttcc 4020 agacgagtta gccgaatacc tggatgagaa atttgatcga gcctctaggg acttagaagc 4080 cattcgggct aaagcgttag aaaacataga agagtctcaa cggaagaatg aggaatactt 4140 tagcaaaaag cacaaaccac cacagtgcta taaggaaggt gacttagtgg ctatacgtta 4200 ctctgatacg accgatagcg gtaataagaa gctcaatcct aaattcaggg gaccttacgt 4260 catccataaa gtgttgcccc atgataggta cgtggtacgc gatgtagaag gatgtcaact 4320 cacacaacta ccctacgatg gggttctaga agcgaataag ttgcgacgtt ggaccgagtc 4380 cagtgattag gaaattgagg gcaatttatt gttcaggata gccgagc 4427 // ID TRANSIBN1_AG repbase; DNA; ANG; 978 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 13-JUN-2005 (Rel. 8.04, Last updated, Version 2) XX DE TRANSIBN1_AG is a TRANSIB-like DNA transposon - a consensus DE sequence. XX KW Transib; DNA transposon; Transposable Element; 5-bp TSD; KW TRANSIB superfamily; TRANSIBN1_AG; nonautonomous DNA transposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-978 RA Kapitonov V.V. and Jurka J.; RT "TRANSIBN1_AG: a family of target site-specific nonautonomous RT TRANSIB DNA transposons from African malaria mosquito."; RL Repbase Reports 3(4), 83-83 (2003). XX DR [1] (Consensus) XX CC TRANSIBN1_AG is a family of nonautonomous DNA transposons that CC belongs CC to the TRANSIB superfamily originally identified in Drosophila CC (see CC description of TRANSIBN1-TRANSIB4 in drorep.ref). CC TRANSIBN1_AG is characterized by a remarkable target site CC specificity. 
CC Its copies are inserted into the CCagtGG target site, and CagtG CC is a 5-bp target site duplication. There are ~100 copies of CC TRANSIBN1_AG in the genome. CC The TRANSIBN1_AG consensus sequence was reconstructed based on CC multiple alignment of 30 copies. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC TRANSIBN1_AG occurred recently (in the last 1 Myr). CC TRANSIBN1_AG has 17-bp terminal inverted repeats (2 mismatches). XX SQ Sequence 978 BP; 332 A; 167 C; 163 G; 314 T; 2 other; cacagtgggc aaccgccata caaacgccgg gatgaaaatc aattcctcgt gctattgcar 60 tttwtcttca ttcaatacaa ttgctcttac tatacagggt agtcctatac taaaatcgtc 120 aagacagcga ataaaactta ataattatcg ctcacaacat tgcattattg cgtatcagtt 180 aacagcatca ataataattg ttaggaatta aaacgaaggc ggaataagtt tctgactgaa 240 aacgaatttt taaagtatta cgcactaaaa aagttgtgtt tttcatggtt tgtttggaaa 300 agagccgatt cctatcttac gacctttttt aaactgtttt tcctctcttc agctttgttt 360 tctgcctgtt tgtttcaaat gcccgtttat gacaggtagt tggatactgg tggtgtatgg 420 ctcatacaac aacacgtcaa aatcgctcgc gctattttca agaaaaagtt taataatttt 480 cggggtcgca gatggtcgca gctaatttta acgacaatgc gtaggaaaat tgttgatctt 540 tccaatgata tacgactcac gagtgaaatc gagtcgaatc ataaaaaaaa tcactctcca 600 aaataaaaat atccaaaaat tcagcagtga tgtttggttt tcaatcattt atgaacttta 660 aaaacaagtt tttgcaaaat attaagacat aacatcaaag tatgacaaaa aacctttcca 720 acgacacatt gattatcaaa atctaaccat catatactaa aatatgatgg tttatgttcg 780 gtcgaaaaat agctcaaagt tgggacaaaa aacccaaagt ttacactttg atggcctata 840 tctcagtaag tttaagataa aaacgtgaaa tattttggtt tcaactaagt ttaagtatct 900 attttaagaa aatgattatg gtgtaaacct gcgatgaagt tggtttttcg atttttatac 960 aggcgtttgc ccaatgtg 978 // ID GYPSY64-I_AG repbase; DNA; ANG; 4359 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY64-I_AG is an internal portion of retrotransposon GYPSY64_AG DE - a consensus sequence. 
XX KW LTR Retrotransposon; Transposable Element; 5-bp TSD; gag; KW AP protease; GYPSY64-I_AG; GYPSY64-LTR_AG; Gypsy clade; RNase-H; KW integrase; GYPSY64_AG; mag lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4359 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY64_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 167-167 (2004). XX DR [1] (Consensus) XX CC GYPSY64_AG is a family of gypsy-like LTR retrotransposons that, CC according to the amino acid sequences of its reverse CC transcriptase, RNase and integrase domains, is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, CC GYPSY59_AG, GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, CC GYPSY65_AG, GYPSY66_AG, GYPSY67_AG, GYPSY68_AG and GYPSY69_AG CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY64-I_AG consensus was reconstructed from a multiple CC alignment of 4 copies. The consensus encodes the 1424-aa CC GYPSY64_AGP gag-pol-like polyprotein (pos. 74-4345). The CC sequence of the LTRs flanking GYPSY64-I_AG is deposited as CC GYPSY64-LTR_AG. 
CC GYPSY64_AGP: CC MEGAQEKDSAAGGGESSKNVSEAILKILANQQSLMSSMAQQLQLTNQSIQKFTHVEVVLDSLSS CC NMSEFVYDKENGYTFDAWYSRYSELFDRDACNLDNAGKVRLLLRKLSPQDHERYNSFILPKLAR CC EFTFEQTVKKLKSLFGATISTFRRRYNCLQMTKDDVDDYLSYSCKVNKSCVDFKLSELTEEQFK CC CLIYVCGLKSSSDAEIRMRLINKLNEAQDITLQQIVEQCNSLVNLKQDTVLVEQPSSVQYVANK CC GSSQQQRHPSGGNKQQDHPRTPCWSCGAMHFHKDCPSRNHKCKDCGKIGHSEGYCACFTSMATT CC SAPAQKAPWKKQYKRKQHQGQTSKIVTVNHVTQRRKFVSVHLNNIPHRLQIDTGSDITIISHQA CC WKRIGSPAVKPATCNARTASGDPLQLAAELECSITINNVTKQGKCFVTDPNVNLNVLGIDMMDL CC FGLWNEPITAFCNQICTTKTTNIAELRSRYPDVFNDKMGLYNKTAVQLTLKGTPTPVFRAKRPV CC AYMMEAVVEDELHRLESLGIIKKVDFSDWAAPIVVVRKPNGTVRICADFSTGLNNVLESNNYPL CC PLPEDIFVKMANCVIFSHIDLSDAYLQVPVDEASQPFLTINTHKGLFQFTRLSPGIKSAPGAFQ CC KLMDTMLAGLNSTTGYLDDILVGGRNEDEHQQNLHLVLNRLRDYGFTVRIEKCNFNMRQVKYLG CC QILDAQGIRPDPDKIAPIVSMPPPHDIPTLRSYLGAINYYGKYVQEMRTLRQPMDQLLKAGMKF CC HWSTACQRSFDRFREILQSPLLLTHYNPKMEIIVSADASNVGLGARIAHKFPDGSIKAIYHVSR CC SLTSAESNYSQIEKEALALIFAVTRFHRMIYGRRFILETDHKPLLAIFGAKKGIPTYTANRLQR CC WALTLLLYDFSINYISTDSFGHADVLSRLINRHVRPDEEMVIANLTFEKSIRSILNESLQAVPL CC SFKTIQNTTKNDDTLQQIIKFIKEGWPPKTIINDPKILQFYQRRDGLSVVADCIMYGERLVVPP CC SSRESVLKQLHKGHPGIERMRSIARQYVYWPNVDEDVAHIVKSCIECSSVAKTDRKTTLESWPV CC PEKAWQRLHLDYAGPVNGYYYLILVDAYSKWPEVMRTKDITTTATLRMLRNIFARHGQPETLVT CC DNGTQFTSNMFETFCEHYSIVHLKTAPFHPQSNGLAERFVDTFKRALKKITAGGETLEEAIDTF CC LLCYRSTPCRSSPEGKSPAEHIYKRPIRTALELLRPPSSSHKVHDNKQEKQFNLKHGAKKRHYS CC PQDLVWAKVYHNNKWSWAHGQVIEQIGSVLYNVWLSSTRKLIRSHCNQLRSRHEAEVSQQEQLA CC TTDVQIPLAILLDNCGLNDEVERETTTSTTLPSEMLADLAPPRQRSRVGSTRNNNQPPVPTRQS CC SRQRVPPTRYDAYHLY. 
XX SQ Sequence 4359 BP; 1371 A; 1013 C; 942 G; 1033 T; 0 other; gtggcgacga ggcggtagaa gtttgcaaaa aaaccagcga aaattttccc gggaccgtgt 60 gttatcatcg acaatggaag gtgctcaaga aaaagattct gcagcaggcg gaggagagtc 120 atcaaagaat gtgtcggaag cgatactgaa aattctcgcc aaccagcaga gcctcatgtc 180 ttctatggca caacagcttc aattgacgaa tcaatccata caaaagttca cacatgtgga 240 ggttgtactt gactcattat caagtaatat gtcagagttt gtctacgaca aagaaaacgg 300 atacactttc gacgcctggt attcccgtta cagcgaactt ttcgatcggg atgcttgcaa 360 ccttgacaac gcgggaaaag tgaggttact tttacgcaaa ctgagcccac aagatcacga 420 gcggtacaat agtttcatat tgccaaaact agctcgcgaa ttcacattcg aacaaacagt 480 gaaaaagcta aaatccctgt ttggcgctac catctctacg tttcgacgca gatacaattg 540 tcttcaaatg acaaaagatg acgtagatga ttatctttca tattcctgta aagtgaacaa 600 atcctgtgtt gattttaaac tttccgagtt aactgaggaa cagtttaaat gcttgatcta 660 cgtatgtgga cttaagtcaa gcagcgatgc agagattcgt atgaggctga tcaacaaact 720 gaacgaagca caggacatca cgctccaaca aatcgtcgaa cagtgcaaca gtctcgttaa 780 cctcaaacag gacactgtgc ttgtagagca accatcgtca gtgcagtacg ttgctaacaa 840 aggttcatca cagcaacaac gtcatcccag cggaggaaac aaacagcagg atcatcctcg 900 tactccttgc tggtcttgcg gtgcaatgca cttccacaaa gattgtccga gtcgaaacca 960 caaatgcaaa gattgtggta aaattggaca ttccgaggga tactgcgcct gtttcacatc 1020 aatggcgacc accagtgctc cagcgcagaa ggcaccgtgg aagaagcagt acaagcggaa 1080 gcaacatcag ggacagacat cgaaaatagt gacagtcaac catgtcacgc aaagaaggaa 1140 gttcgtctcc gttcatctca acaacattcc tcatcgactg caaattgaca cgggatcgga 1200 catcaccatc atctcacatc aggcatggaa gcgtatcggt tctccggcag tcaaaccagc 1260 cacttgcaat gctaggacag cgtcgggcga tccgttgcaa ttagcggcgg agctggagtg 1320 cagcatcacc atcaataacg ttacgaaaca gggtaagtgt tttgtaactg atccgaatgt 1380 caatcttaac gttttaggga ttgatatgat ggaccttttt ggactgtgga acgagccaat 1440 cacagcgttc tgcaaccaga tctgcaccac gaagacgaca aacatagcag aactacggtc 1500 tcgttatcca gacgtcttca acgacaaaat ggggttgtac aacaagacag cagtacaact 1560 tacgttgaag ggcacgccta caccagtatt tcgtgcaaag agacccgttg cgtacatgat 1620 ggaagctgtt gttgaagatg agctgcatcg 
tctggaaagt cttggcatca tcaaaaaagt 1680 ggacttttct gactgggcgg cacccatcgt cgtggtacga aaaccgaacg gcaccgttcg 1740 tatttgtgcg gatttctcga cggggttgaa caacgtgctg gagtcgaata attatccttt 1800 accactgcca gaggatatct tcgtgaaaat ggctaactgc gtcattttca gccatattga 1860 tttgtcggac gcctacctac aagtacccgt agacgaagca agccaaccat tcctaaccat 1920 caacacccac aagggactgt ttcaattcac acgattgtca cccggcatca aatcagcgcc 1980 aggggcattt caaaagttga tggatacgat gctcgctggg ctcaatagca ccacagggta 2040 cctggacgac atattagtag gtggacgaaa cgaagatgag catcagcaaa acttacatct 2100 cgtgctaaac cgtttgcgag attacggatt taccgtacgc attgaaaaat gtaatttcaa 2160 tatgcgccaa gtcaaatatc tgggacaaat ccttgatgca caaggaatcc ggccagatcc 2220 agataaaata gcaccaattg tgagcatgcc accgccgcac gacattccaa cgctgcgatc 2280 ataccttgga gccataaatt attatggcaa atatgtccaa gaaatgcgca cactccgtca 2340 gcccatggat caacttttga aggcaggtat gaaatttcat tggtccacag catgccaaag 2400 atcattcgat cgttttcgag aaattttaca atctccatta ctgctaacgc attacaatcc 2460 aaaaatggag ataatagtat ctgcagacgc ttcaaacgta ggattgggtg ctcgcattgc 2520 tcacaagttt cctgatggat caataaaagc catttaccat gtgtcgcgta gcttaacatc 2580 agctgaaagt aactatagcc aaattgaaaa agaggcgctg gctttgatat ttgcggttac 2640 acgctttcac agaatgattt atgggcgtcg attcattctg gaaacagatc acaaaccttt 2700 attggctatt tttggtgcaa agaagggtat accaacgtat acagcaaatc gtttacaacg 2760 atgggcatta actcttctgc tctacgattt ttcgattaac tatatctcta cagatagttt 2820 tggtcacgcc gacgtattat cacgtcttat caatcgacat gtgcggccag atgaagagat 2880 ggtcatagct aatctcactt tcgaaaaaag tattcggagc atcttgaacg aatcactgca 2940 agcagtccca ttgtcgttca agacaattca aaatacaacc aaaaatgatg ataccttaca 3000 acaaatcatc aagttcatta aggaaggttg gccacctaaa accatcataa acgatcctaa 3060 aatcctacaa ttttatcaac gacgagatgg actgtctgtc gtagcagatt gtattatgta 3120 tggagagaga ttggtcgtgc ctcccagttc cagggaaagt gtcctcaagc agctacacaa 3180 gggacaccct ggtatcgaac gcatgcgctc gatcgcacgg cagtatgtgt actggccaaa 3240 cgtcgatgaa gacgttgcac atatcgtaaa atcgtgtatc gaatgttcta gtgttgcgaa 3300 aacagataga aaaacaactc ttgaatcctg gccagttccg 
gaaaaagcat ggcaaaggct 3360 acatctcgat tatgcagggc ctgttaacgg ttactactat ctgattctgg ttgacgcata 3420 ctctaagtgg ccagaagtga tgcgcactaa agacatcacc acaaccgcaa cattgcgcat 3480 gctccgcaat attttcgcaa gacacggaca acctgaaaca ttggtcactg ataatggtac 3540 acaatttacc agtaatatgt tcgaaacatt ttgtgagcac tatagtattg tgcatttgaa 3600 gactgctcca tttcatccgc agtcaaacgg actagcagaa agattcgtcg acacattcaa 3660 gagagccctt aaaaaaatta cagcaggggg ggaaacgtta gaagaagcaa tcgacacctt 3720 tctgctgtgc tatcgttcaa caccatgtcg tagttcaccg gaaggaaaat caccagctga 3780 gcacatttat aaaagaccaa tacggacagc tctcgaatta ttacgtccac cttcttcatc 3840 gcacaaagtc catgataaca aacaagaaaa gcagttcaat ctcaaacacg gagcaaagaa 3900 gcggcattac tcccctcagg acttagtgtg ggctaaagtg taccataaca acaaatggag 3960 ttgggctcat gggcaagtta ttgagcaaat tggtagtgtg ctgtacaatg tatggttgtc 4020 atcaacgagg aagctcattc gatcgcattg taatcagcta cgaagtcgac atgaagcaga 4080 agtttcccag caagagcaac tagcaactac agacgttcag ataccgctgg cgatactcct 4140 cgataattgt ggtcttaacg atgaagtaga aagagaaacc acaacatcca caacactccc 4200 atcagaaatg ttggcagacc tggcaccacc acgtcaacga agccgtgttg ggtcaacacg 4260 taacaacaat caaccaccgg tacctacacg tcaatcatcc agacaacgcg taccaccaac 4320 cagatacgac gcgtatcatc tttactaaaa aaagggagg 4359 // ID TransibN3_AG repbase; DNA; ANG; 959 BP. XX AC . XX DT 21-MAR-2005 (Rel. 10.03, Created) DT 12-APR-2005 (Rel. 10.03, Last updated, Version 1) XX DE TransibN3_AG is a family of nonautonomous DNA transposons - a DE consensus sequence. XX KW Transib; DNA transposon; Transposable Element; KW Interspersed repeat; non-autonomous; TransibN3_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-959 RA Kapitonov V.V. and Jurka J.; RT "RAG1 core and V(D)J recombination signal sequences were derived RT from Transib transposons."; RL PLoS Biology0-0 (2005). 
XX DR [1] (Consensus) XX CC TransibN3_AG is a family of nonautonomous DNA transposons that CC belongs to the Transib superfamily. TransibN3_AG elements are CC characterized by 13-bp terminal inverted repeats (1 mismatch) and CC 5-bp target site duplications. XX SQ Sequence 959 BP; 336 A; 154 C; 145 G; 324 T; 0 other; cacagtgggc agctgccggg atgaaagtcc aaaaattaat tatcatattt ctttaatccg 60 aggaaagcat tgtaatctac agtatcaaac taagaaaaac acgagatttc agtcccctag 120 tcgtattagt tattacgata gagactccta aagtcttagg tcttgaaaaa actcatcata 180 acaactatat ttttttacct tctttaaacc attgtaactt tcacaaaaat caatcaaatt 240 ttaaaaatga atatgttttg gaaaggtttt aatctattct aaacaaaaat atttagttta 300 tgaaaatatg tcaattactt aaaaagttaa agagtatagt gtgcactact tcctctaaat 360 tgtatcaatc atgtatcttt gacatcatgc caaatttgaa caccgtaaaa caccgtggta 420 tcatgtatac cacgtcaaag ctttacatct aaacaaaaga tctgtaaaga aattagccgc 480 tactttcgcg caactttggc agatattggc attttaatga cagtataaaa tactttcaat 540 aaaacaaagc gttttctggt gaagtatgac taatatccgt atatttggga tgtttagggc 600 agtttttgat gaaatttgat catcgaatgt gatgatcaat gatgacgtga tgattgatca 660 tcgatatttc ggaatgttgc tagcaaacca aatgtaccat gtaagcttta ttttattaca 720 gtcagggtac gtagtatatt aagtataacc ccgtatattc gttacgcagc caaagaagaa 780 catgtatggg atagtaagga tattttcaaa aataattata atctttgcta cataatcata 840 aaacattcat attttgttta atttagattt tcaacattta cactataatg tacctaagaa 900 atcatgtatt caatgacaat tacttttatc ccgctttttg ttctacggtg cccactgtg 959 // ID RTE-2_AG repbase; DNA; ANG; 3172 BP. XX AC . XX DT 28-FEB-2009 (Rel. 14.02, Created) DT 01-MAR-2009 (Rel. 14.02, Last updated, Version 1) XX DE RTE-like non-LTR retrotransposon - a consensus sequence. XX KW RTE; Non-LTR Retrotransposon; Transposable Element; RTE-2_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-3172 RA Jurka J.; RT "RTE-like non-LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 9(2), 647-647 (2009). XX DR [1] (Consensus) XX FH Key Location/Qualifiers FT CDS 14..1654 FT /product="RTE-2_AG_1p" FT /translation="MLGAACPFCPGILRDFNQGCVGPTTRLEVGGEALQAT FT PGKTYATXGNQRISNRTNRYRPTQPNKANDWKLGTWNCRSLTAPGSTRILS FT DEVRARGFGIVALQEMRWKGVTERPYRSDCMIYQSGGEKHELGTAFLVIGE FT MRKRVIGWWPINERMCRLRIRGRFFNLSIINVHSPHLGSTDDDKDNYYTQL FT EREYDRCPQHDVKIVIGDFNAQVGREEAFKPTIGSFSAHRLTNDNGLRLVN FT FASSKHMNIRSTFFQHAPRFSYTWRSPQQTLSQIDHVLIDGRHFSDIIDVR FT TYRGANVDSDHFLVMVKLRQKLCVANKLRYQPTPRLNTDRLKQAVVARDFA FT IALGEALPEDNTTEAMSLNDHWRMVEQAISSTAERTIGRVTHNQRKEWFDD FT ECRRALSEKNAARTRMLQRETRQNVEDYRRLRRQQTLLFQDKKRXFEESDE FT QLMQQLSQSGETRKFYRMLNAARSGFTPMTAICRNEEGDILSDEREVIDRW FT KCYFDGHLNGADTGEADAGSRGEQPYDSSSMTMTKCPRHLWTKSSAPSNS* FT " FT CDS 1705..3150 FT /product="RTE-2_AG_2p" FT /translation="MGPERLAVIMHRLIVRIWDQEELPDEWKLGVIHPVYK FT KGDRLDCANFRAITVLNAAYKILSRILFCRLSPLATDFVGSYQAGFVGGKS FT TTDQIFTLRQILQKCREHQIPTHHLFIDFKAAYDTIDRNELWNTMQQYGFP FT GKLIRLLRATMDGVQCKVRVTNMLSESFESHRGLRQGDGLSCLLFNIALEG FT VMRSAGFDIRGTIFTRTLQFLGFADDIDIIGRTTAAVCEAYTRLKREAARI FT GLRINATKTKYLLAGGSDRDRARLGSRVSVDGDDLEVVEEFCYLGTIVTSD FT NNVSSEIRRRIVQGNRAYYGLHKLLRSRRLQQHTKCAIYRTLIRPVVLYGY FT ESWTILTEDANALAIFERRVLRTIFGGVCEHGVWRRRMNHELAELFGGADI FT LTVIKAGRIRWLGHVMRMPDSCPTRKVLVSDPFGTRRRGAQRARWLDQVES FT HLSEIGCSRGWRTAAQDRVSWKRIGDLAMSTRRAHT*" XX SQ Sequence 3172 BP; 813 A; 824 C; 930 G; 593 T; 12 other; gcgtctgttc tccatgttag gggcggcttg ccccttctgt cctggcatcc tacgggactt 60 taatcaagga tgcgtaggcc ctacgacgag actggaggtt gggggagagg ccttgcaagc 120 cacccctgga aaaacatacg caacgaawgg aaaccaaagg atttcgaacc ggaccaaccg 180 ataccgaccc acgcaaccga acaaggcaaa cgattggaaa ctcgggacat ggaactgcag 240 atctctcaca gcaccyggaa gtacccgcat tctttcggac gaggtgaggg cccgtggctt 300 cggaatagta gcacttcagg agatgcgctg gaaaggagtg acggagcgcc cstatcgtag 360 cgattgcatg atctaccaga gcggtggtga aaagcatgaa ctcggtacgg cgttcctggt 420 cataggtgag 
atgcgaaagc gagtgatcgg gtggtggccg atcaaygaac ggatgtgcag 480 gttgaggatt cgtggcaggt tcttcaacct gagcattata aatgtgcata gtccgcacct 540 tggragcacc gatgacgata aagacaatta ttatacgcag ttggagaggg agtacgaccg 600 ctgcccacaa cacgacgtta aaattgtcat cggggatttt aatgctcagg tcggacggga 660 ggaggcattt aaaccgacga taggaagttt cagtgcccac cggctgacca acgacaacgg 720 gcttcggctc gtaaacttcg cctcctccaa gcacatgaac atycgcagca ccttcttcca 780 gcacgcacct cgcttcagtt acacctggag atcaccgcag caaacacttt cccagatcga 840 ccacgttctc atcgatggaa ggcacttctc ggatataatc gacgtaagga cctatagagg 900 agcaaacgtc gactcagacc atttcctggt tatggttaag ttacgccaga aactgtgcgt 960 ggcyaacaaa ctgcgctatc agcccacccc aaggctcaac acagaccggc tgaaacaagc 1020 tgttgtggcg agggacttcg caatcgcgct tggggaagcg ctgccggagg acaacactac 1080 cgaggcgatg tctctcaatg accactggcg tatggtggag caagccatca gcagcacggc 1140 cgagcgaaca attggccgcg tgacccataa ccagaggaag gaatggtttg atgatgagtg 1200 cagacgagca ctctccgaga agaacgcmgc gcggacccgc atgctccagc gcgagacccg 1260 kcagaacgtg gaagactaca gacgactgag gaggcagcag accctrctct tccaggacaa 1320 gaagcgcsgc ttcgaagagt cggacgarca actcatgcag cagctatccc agtcggggga 1380 aactcgcaag ttctacagga tgctgaatgc ggcacggagc ggttttactc ccatgaccgc 1440 tatatgccgc aatgaggagg gagatatcct gtcggacgag cgagaggtga tcgacaggtg 1500 gaagtgctac ttcgacggac acctgaatgg agcagatacc ggagaggcag acgcaggaag 1560 cagaggagag caaccctacg acagcagcag catgacaatg acgaagtgcc cccgccatct 1620 ttggacgaag tcatcagcgc catcaaacag ctgaagtgta acaagtcagc tggcagcgat 1680 ggtctggtgg ccgaactgtt caagatgggt ccggagaggc ttgccgtcat catgcatcgg 1740 ctgattgtga ggatttggga tcaggaagaa ctaccggacg agtggaaact gggtgtcata 1800 cacccagtgt acaaaaaggg cgacaggctg gactgtgcta acttccgagc catcacagtc 1860 ctgaatgctg cctacaagat cctgtcccga atactcttct gcagactttc gccccttgct 1920 acagatttcg tcggcagcta ccaagctgga tttgttggag gcaaatcaac taccgaccaa 1980 atctttactc tacggcagat cctccagaaa tgccgagagc accagatccc tacgcaccac 2040 ctgttcatcg acttcaaggc ggcctacgat accatagatc ggaacgagct atggaacacc 2100 atgcagcagt acggattccc 
tgggaagctg atacggctgt tgcgggccac tatggacggg 2160 gtgcagtgca aggtgagagt gacgaacatg ctgtcggaat cgttcgaatc tcaccggggt 2220 ctgaggcaag gggacggact ctcctgtttg ctgttcaaca tcgctctgga aggtgtcatg 2280 cgaagcgcgg gcttcgacat ccggggcacg attttcaccc gaactctcca attccttggc 2340 ttcgcggatg acatcgacat catcgggcga acaactgcgg cggtgtgcga ggcgtacacc 2400 cgactgaaac gcgaagccgc aaggattgga ttgaggatca atgcgacgaa gacaaagtac 2460 ctgcttgccg ggggctctga ccgtgataga gcccgactcg gaagcagagt atcagttgac 2520 ggcgacgatc tcgaggtggt agaggagttc tgctaccttg gcacgatcgt aacttcggac 2580 aacaacgtaa gcagcgaaat ccgaaggcgc attgttcagg gaaatcgtgc ctactacggg 2640 ctccacaaac tcctgcgatc cagaagactc caacaacaca cgaaatgcgc gatatatcgc 2700 acactgattc gtccggtggt cctctacggg tacgagtcct ggactatact gacggaggac 2760 gccaatgcac tcgccatttt cgaacggcgg gtgctaagga ctatctttgg cggtgtgtgc 2820 gagcacggcg tgtggagaag gaggatgaac cacgagctag ctgagctgtt tggcggtgca 2880 gacatcctga cggtcatcaa agccggaagg atacgatggc tggggcacgt gatgaggatg 2940 ccggactcat gccccaccag gaaggtgctc gtcagcgacc cgttcggcac gaggcgtaga 3000 ggagcacagc gagctcgctg gctggatcag gtggagtcgc acctgtcgga gatcggatgc 3060 agccgtggtt ggaggactgc agcccaggac cgagtttcct ggaaacgaat tggcgacctg 3120 gccatgtcta cgagacgtgc tcatacatga gcaggccaag aagaagaaga ag 3172 // ID AGM1 repbase; DNA; ANG; 5983 BP. XX AC AF060859; XX DT 27-JUL-1999 (Rel. 4.06, Created) DT 27-JUL-1999 (Rel. 4.06, Last updated, Version 1) XX DE Anopheles gambiae Moose LTR retrotransposon, complete sequence. XX KW LTR Retrotransposon; Transposable Element; AGM1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5983 RA Biessmann H., Walter F.M., Chuan S., Le D. and Yao G.J.; RT "Moose, a novel family of LTR retrotransposons in the mosquito RT Anopheles gambiae."; RL Insect Mol. Biol0-0 (1998)In press. XX RN [2] RP 1-5983 RA Biessmann H., Walter F.M., Chuan S., Le D. 
and Yao G.J.; RT "AGM1."; RL Direct Submission to Repbase Update (21-APR-1998)Developmental RL Biology Center, University of California, Irvine, CA 92697, USA. XX RN [3] RP 1-5983 RA Biessmann H., Walter F.M., Chuan S., Le D. and Yao G.J.; RT "AGM1."; RL Direct Submission to Repbase Update (29-JUL-1998)Developmental RL Biology Center, University of California, Irvine, CA 92697, USA. XX DR GenBank; AF060859; Positions 1 5983. XX SQ Sequence 5983 BP; 1808 A; 1152 C; 1427 G; 1596 T; 0 other; tcgttgctaa ccaaaaccat ccaacgaaaa acacattcaa attgttcgat ctcaaatgca 60 tgcggttcta ccacagtgcc ggattatggc agctgacgtt ttgtcattct tgtgggtcta 120 gggacgtacc gctgtgaaag gggaaccgcc tttttgtggc aattgtcacg agccgaagaa 180 aagcgaagca cgcataaatg gtgtgtgaat aaagaagtga acttacttgc caacaaagaa 240 caaccgactg attatttcgt gtaaaaaaaa agtaaaaata tcacaacgta gggactttca 300 cggaacattt tggtgcaagt gaccaggata agtgcctgat tatcggtgtc aattgtggta 360 atcataatta gtgcaaaagt tctcgcttgt tctttgctgt gtgcgtgtga gtgtgtgctt 420 ggagctggct aggattttgt gtgcgtgtgc gcgtgtgcaa cgtattaaac agtgtgttta 480 acgcaagtgt aacaatggcg acggcgcgag aagacaaaat tcgtggcaaa gagcaaaaaa 540 gaaaaaacat tatagattcg atgcggcgaa tcgatttatt cgtacaaacg tacactgtgg 600 acaagattca tgaggtgacc acaaggctcg agaggctcga gaaagtgtgg tatggttttg 660 aagaagttca agaagagtta gacaagctaa cccttgaggg agacaacgct gaaaatgaaa 720 agacgcgtgc agaaatggag gagctataca tgagtgttag atccaatcgc tacggctgaa 780 gccatcgtct aatccaatag tcagcgacgt taaaccggta cccattgtgt cccaaatgaa 840 attaccagta atacaattgc ctgagtttgg aggtgatttt aatgattggt taccatttca 900 tgacacgttt gtgtccctga ttgataaatc agatgagctt tctggtgtgc aaaagctaca 960 ttatttgaaa gctgcgctca aaggtgaggc agcacggcta atgagccaat tctcactaca 1020 gatgagaatt acaaaatgca tggcaaatgt tggtagaccg ctatggcata aacatttgct 1080 taagaaacgt catatccaag ccattttgag gctaccgaag attatcaata gcaatctaga 1140 tttattgcga cgcactgtag atgatttcca gcgacacaca ttggtgttag aacaactcgg 1200 tgaaccaata aaacatctta gctcattcct tgttgagctt cttagtgaga aattggatag 1260 tgcttctctt gcggcgaggg 
aagaagcaca agcggataaa agttacacat acagtgacat 1320 ggttgagttt ctgcgtaagc gtgtgcgttt gttggaaacg cttgctaacg atacgggtga 1380 aacgagtaag cgacaaccgc gtgtaaaagt gagtgtgaac actgctgcag cggccgagaa 1440 aaaagtggat atgtgtgttg tgtgtggaaa gcaggggcat acaatagtga attgtcggcg 1500 tttcaatgaa ttcgatgcaa agaagcgcca cgaagttgtg aggcagcaca aattgtgttg 1560 gaattgcttg cagggcagcc attttgtaac cagttgtacg tcaaggtatg ggtgtcaaac 1620 gtgtggcaaa cggcatcata ccttgctgca tgctgaacga agtagtagtg taatagcaga 1680 tgattcggtg ggttctgtta gtacgatggt gcttgctaat attccaatgc agtgcaactc 1740 tactgaccga agctcatatt ctaatgttat gctgacaacg gtggtgttgt ttgtagttga 1800 cgcaaatggt acgcaacatc cagtacgtgc gttgttggat aacggtgcgc agcccaatgc 1860 gataagtgag cgattgagtc agcttttgtg cttgcgcgta tgcgtaccca tgtatccatt 1920 acaggtgtgg atggaacgac gactcaggcg tcatgtgaaa tgaaggtgga aatccgttcc 1980 agatttacgc aatttgctct gaaactaaat ttcttggttt tgagcaaggt aacagcaaac 2040 actccagcca catctttctc cacatcatgt tggaaattac ctgctgggtt ggcgctcgca 2100 gacccagaat tccatcagtc tggacgagtg gatatgctaa tcggtgcatc gcatttctac 2160 acgtttctga gggaaggccg gctcaaactt agtgaacatg gtccattgtt agttgaaaca 2220 gtgttcggtt gggtggtaac aggtgaagtg ctccgagaag aagctataat tcagcaacaa 2280 gcagctcagt gtcatgttat gttatcgtcg gaaaacatta gcgatcaact tgagcggttt 2340 tggaagatcg aagagctgca tgtttcgcat ttctcggcgg atgagcagag atgcgaagct 2400 tattatgagc aaacggtatc acgagacgaa acaggcagat atatcgtgaa actgcctaaa 2460 catcagcaac attctaccat gattggaaaa tcagaaacga catcacttaa gaggtttgct 2520 ggattagaac gtaagttatt tgctaactca caacttcgtc agcagtacaa tgagtttatg 2580 ttggagtaca tacaactcgg tcatatggtt cctgtgtcgc ctgacaacct ggatgcagca 2640 acctgctgct accttccaca tcaccctgtg tttaaggaga caagttccac gactaaaatg 2700 agagtggtat ttgatgggtc agcaccaact agcacaggac actctctgaa tgatgcgcta 2760 ttggttggac cagtcatcca agatgatctt ttgagcttaa taatccgttt tcgtaaattt 2820 caggtcgcgc tagttccgga cttggaaaaa atgtatcgtc aggtacttgt gcatccggag 2880 gatcgaccgt tacagagata tggtggggcg tatgagttgc ggacagtaac atacggtttg 2940 gctccttcat ctttcttagc tacaagaaca 
cttcagcagt tggctgaaga tgagggtgac 3000 gcgtttccta ctgccaagga tacgctgaaa aaacaactgt acatggatga ccttattgct 3060 gggtcaaata gtgttgatgg agctatacag ctacgtgagg aattgagtgc actggcgcag 3120 agaggcggtt ttacttttcg aaaatggtgt tcgaattcat tagccgtttt atctgacgtt 3180 cccgctgaac aattggcaac aaaatcatcg ttaaggttcg acgacaagga gacaattagt 3240 acgcttggta tatgttggga accagaaatt gatacgttcc agttcaacat ttctattact 3300 acgaaatcag agagagacac catgcgtacg atactatcaa tgattgctga actgtatgat 3360 ccattgggat tgatttcacc tgtaattatt acagccaaag ttttgatgca atcgctctgg 3420 cgtttgaagt tgagttggga tgatacggtg cctgaggaac ttcaaaggaa ttggattaga 3480 tttcgagcag aattaccaga acttaaagat tttagtatcc ctagattcgc tttcgctcat 3540 caatatcggc aagcagaaat acattgtttc acagacgcgt cagagcttgc atatggcgcg 3600 tgcatctata ttcggtctga agctgaggat ggaagcattc atgttaattt gctagcatca 3660 aaatcgagag tggctccact gaaggcattg acgattccta gacttgaact ttgcggtgca 3720 ttattaggag ctcgattgca tgagaaggta atggctgcaa tggagattaa gttcgttgct 3780 catcgatttt ggaccgattc taccgtggtg ctagattggc taaatgctga atcaaagact 3840 tggaaaacat tcgtagcaaa ccgagttgct gagatccaag caattcgaga tgctgtttgg 3900 caacatgtat ctggacaaga aaatccagct gaccttattt cacgcggagt tttaccgcat 3960 cagctcatca acaatcaatt gtggaaacaa ggtccacaat ggttatcaga gagaaaggag 4020 aattggcccc aacaaaaaga gagaacaggt caaattacaa cagacgaaat aagatcgaat 4080 gtcgttttaa caacgcaaat acaagaaaaa aatgaaatat ttacaagata tgggtcatac 4140 cagaagctaa tcgacgtggt tgcatattgt tttcgttttg ttcataatgc tcgtcgtctg 4200 caatcaagaa taagcaatag tgcattgacg gtaaaggaac ttgctgatgc taaaaagcga 4260 cttgtaaaac ttgtgcaagc tgaagaattt accaacgatc tttataaaat ccacaagggg 4320 attcctgttg ctcgaaattc aacgttgaag cttttaaacc cgtttataga taatgaagga 4380 ataatccgcg taggtggccg gctcagaaat tctgatttga attataatat taagcatcaa 4440 atagttcttc ctggattcca tccatttact caactcctca tcatggacaa acatgtaaaa 4500 gcaatgcatg gaggaatatc atcaactctt aacgcagtta gagatgagat ttggccaatc 4560 aatggaaaaa gagctgtacg taaggtcata cgaaattgtt ttcgatgttg cagagcaaat 4620 ccacaaccta taattcaacc tgaagggcaa ctaccagcag 
aacgtgttac agttaacgag 4680 gtgttcagtt gtacgggtct agattattgt ggacctttgt atttgaggcc aacacaccgc 4740 aaggccgcac caaataagtg ctacatttgt gtatttgtct gcatgagtac gaaagcagta 4800 catttggaat tagtcggaga tttgagcaca aattcgtttc tgatggcact tgatcgcttt 4860 gtttatcggc ggggtaaacc aaagcacatt tattcagaca atggtaccaa ttttattggt 4920 gccaaaaatg aacttcatca gatctacaaa atgctgttca acgattctgc agatagcaaa 4980 atagcaaaac atttggcaaa agaagagata caatggcatt tgataccccc acgcgcccca 5040 aatttcggag gcctttggga agcggcggtg aaggtagcca aaacccattt gattcgtcaa 5100 ctaggatcat cgcgtttatc atcggaggaa atgactactg ttttagtaaa aatagaaggt 5160 tgcatgaact cgcgaccatt agttccgctt tctgaagatc ccaatgattt gacggcatta 5220 actccagcgc actttcacat tacaaacaat ttgaaggtta ttctcgaacc tgacttgaaa 5280 gaggtgccta tgaatcgtct gggaagatac caactccttc acgggtacac gcaaaacttc 5340 tggatacact ggaaacagga ttatctgaaa aatcttactg ttttgcatcg gtcagctaag 5400 caatctaagc aattatctgt cggagatata gttattctga aggatgagca gcttccagca 5460 gttcaatggc cattggcacg tgtcgtcgaa atacaccctg gagctgatgg aatttcccga 5520 gttgcaacac ttcgtacggc atccggtatt gtgaagcgag cagtatctaa aatctgtccg 5580 ttacaatgca gtaatcaaag aatggattga aaactgagtg tttcaaggtg gccggtatgt 5640 tcgatctcaa atgacctgcg gttctaccac agtgccggat tatggcagct gacgttttgt 5700 cattcttgtg ggtctaggga gctaccgctg tgaaagggga accgcctttt tgtggcactt 5760 gtcacgagcc gaagaaaagc gacgacgcat aaatggtgtg tgaataaaga agtgaactta 5820 cttgccaaca aagaacaacc gactgattat ttcgtgtaaa aaaaaagtaa aaatatcaca 5880 acgtagggac tttcacggaa cacaaattaa ccagcaatca aatcaggtga gacagaaaac 5940 ctaccatgca ataaatgctt agcacacaca aaccgtagac agg 5983 // ID CR1-1a_AG repbase; DNA; ANG; 5247 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 8.02, Last updated, Version 1) XX DE CR1-1a_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW CR1 clade; CR1-1_AG; CR1-1a_AG; DNA/RNA-binding; endonuclease; KW Non-LTR retrotransposon; reverse transcriptase. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5247 RA Kapitonov V.V. and Jurka J.; RT "CR1-1a_AG, a subfamily of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 13-13 (2003). XX DR [1] (Consensus) XX CC CR1-1a_AG is a subfamily of CR1-1_AG non-LTR retrotransposons. CC The CR1-1a_AG consensus sequence was reconstructed based on CC multiple alignment of ~10 copies identified in the CC sequenced portion of the genome. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC CR1-1a_AG occurred less than 1 million years ago. CC The CR1-1a_AG and CR1-1_AG consensus sequences are 79% identical CC to each other. The 3' terminus of CR1-1a_AG is composed of the CC CAT microsatellite. CC CR1-1a_AG encodes two proteins: a 444-aa CR1-1a_AG-ORF1p CC (positions 441-1772) and a 932-aa CR1-1a_AG-ORF2p (positions CC 1776-4571). CR1-1a_AG-ORF1p is a DNA/RNA-binding protein composed CC of the PHD domain (aa positions 3-38) and gag-like zinc knuckle CC regions (aa positions 340-444). CR1-1a_AG-ORF2p is composed of CC the AP endonuclease and reverse transcriptase domains.
XX SQ Sequence 5247 BP; 1575 A; 847 C; 1255 G; 1570 T; 0 other; tcaacccttg aggtgaatgg tgacaagtgc agtcctgtga atgtgtagct cccgtttgtg 60 aactgtgaaa acttgtgaaa tagagagcta agtgttattt tgttttgttc tacttgttac 120 ttaatgttag tgctacctgc ggtatcattg tttagtattt gtattattgt tttgtgaatg 180 taaaacacgg gtgaagtatc tcggtcgaca gcgtcactgc gttgtcgaaa gtgattaatc 240 tttgctcggt atgtgtttgt gttcgtgcta gacatacgta tcgttgtccg tttgtcaaga 300 cgtgtacacg agtgtaagcg tttgatatcg caggctgtag tagagtagtg tttgtgtgca 360 tttgctgtta cctttttttc atgtctgtgt gataattttt ttaattctaa aacagctcgc 420 ccgaattttg tttttacggc atggagtgtt tagcatgctc cgccgtagtt ttaataaacg 480 atgacccaat tttgtgtgcg gggaaatgtg gtggcaactt tcatcgtcgc tgtgttaccc 540 cctcgctctc gaaaacagcg gccaaaataa ttaatgagaa taagaatgtg ctgtatatgt 600 gtgatagatg tttagaacac aaatcgggct tggcgggtat ggatgtagat gtgagtggat 660 cgtacgattt actcacacaa tccataaaaa atttggagtc gaatgtgagt gtatggatat 720 cgagtgcctt ggaaaaggga atcgagactc taaaaactga gctctgcgca caagtggaac 780 gcaaattgga gcaaacattg cgtgaaagct tgagtgcggt agaatgctca aataaggcaa 840 aggaagcctt gcgtgcaacc tttgacgata ctaaggccag agaaacagta gaggatgaaa 900 gttgggctac agtgactaag aaaagaaaaa ggacgaatag tgggaacagt aatgttcaaa 960 ccattattaa tcgttttgac acggggaatg ttaattcgac gcccaagatt tctgacaagg 1020 ttactggacc cattttagca aacaaaaata agaatagtaa gactctggtt attgtaccaa 1080 aggtgggtca atcttgtgat aagacaagag ctgaccttcg cgctaagctg gatccaagga 1140 agcagcaggt gtcggaattc cgtaacggca aggacggtca ggtgtttgtt caatgttctg 1200 ctcaggttaa attagatgaa ctcaggaaag aagtagaaaa cattttggga gatgagtatg 1260 caacggattt accattgtca cgtgtaaaga taattgggat gagcgaaaaa tacactgact 1320 cacatttagt agatctttta aaatctcaaa atgagggaat accctggaaa caggtcaagg 1380 tcataggaat gtttgaaaat aaaatttaca agtaccaaaa atataatgcg gtcttggaaa 1440 ttgattatga gtctgaccta tgtctggcaa aattaggaaa aataaatgtg ggatttgata 1500 ggtgtaaaat ttcgaagtcc gtgcatgtta tgaggtgtta tggttgtggt caatttaatc 1560 acaagagcac cgagtgcaag aataagcaag cttgttcaaa atgtggtgat gaacacaaaa 1620 cgtccgaatg tacttcatct tctttgaaat 
gtgtgaattg cgtgttagca aactctgtta 1680 gaaaccttaa actagatgta agacatgcgg ccaatgatta taattgtccg atgtttaaaa 1740 aacagataga aaggcgtatg caactttctc aatagcaggt aggggaggag ttaggacggt 1800 tcagagagat tttatatttc aatgtagccg gtctttcgtc caactacgcc atgtttcgtg 1860 agacagtaga gaaagttcaa ccgttgctgg tcttaatctc tgaaactcat gtgattgagg 1920 aagaagcatt tcagcagttt catattaatg gttatagggt tgtgtcgtgt ttgtcccact 1980 cacgtcatac aggaggtgta gctgtttatg ccaggagcga aattgtccta aaggtgattt 2040 ttaatgaatc attggagggt aattggtttc ttggtgttgc agtttctaag ggaatgacag 2100 caggcaatta cgggatattg tatcattcgc caagtgcaag tgactcaagg tttgttgaca 2160 ttttggaaga atggttagat aggttcttga attttagcaa acttaacatt atcgtcggtg 2220 atttcaatat tgactggtta aatgttgaaa aatctgcgaa gttgaaaagc ttaatggatt 2280 cagtaaacat gaaacaaaaa gttaacgaat tcacacgaat tgctaggcag agcagaacat 2340 tgatcgatca ggtttacagt agcacagact caatcaaggt cactactgat ccgttattaa 2400 aaatatcgga tcatgaaaca cttgttttga atataaacgt gaaacgttgt gaaacgattc 2460 aacgaaaatt taaatgctgg aataggtact cgaaatttgc tctttgcaat catgtgtcac 2520 aaggtttaag gcaagaagca ccgagcttca acgaagctgc agacttgtta tggaacacat 2580 tgaaaaaagc tatgggcacc ttggttgaag agaagacaat cttgtcaaga gagactagta 2640 ggtggtatac tttggaactt agccgtgcga aacggaaaag agacgaagca taccaacatt 2700 ttattagatc aaactcaggc tacgattgga ctgaatacac tagactaagg aatacataca 2760 gcaggaatct caaaactact cgtagaaatt actttagtgg tgagatatct aagcataaag 2820 gaaatagcaa ggagttgtgg aaagtgctta aaagtttact aaggccagag gaatcacgcg 2880 tttctgttgt aaaatttgat gggttggtag aatcggaaga ttcaataata tgccagaaat 2940 tcaatttgtt ttttgtaaac agtgttttag atataaatca aaacattgct gatgccagag 3000 cgcctgattc tttgttttgt aatgatgctc caggaaacca attcaaattt cagaaaatta 3060 caaaagagaa attaaaaact atttgtttca gcctgtccaa aacagcgggc atagggaatg 3120 ttaactgtaa tactattcag gattgcttcc atgtggtagg agagtctctt cttatagtga 3180 tcaatcagtc gctggaggag ggtatttttc cggagacttg gaaggaatca ttggtaatac 3240 ctattcctaa agtgagcgga gctgccagtg cggaagagtt tcgtcccatc aatatgttgc 3300 atgtactcga aaaggtgctg gaattggtgg ttaaggagca 
attggtccag tttctaactc 3360 gaaatgacct gttgattagt gaacaatcag gatatcgaca gggacactct tgtgaaactg 3420 ctttaaatct tgtactggcg aggtggaagg tgttgatgga tcgaaaggaa tcgatagttg 3480 ccgttttctt ggatctaaaa cgagcatttg agactatatc aagaccgtta ttgctgcaga 3540 ccttaaggcg ttttggtatt gtggggaaag agctcaattg gttcgaaaat tatttaaaag 3600 acagaactca gagaacagtt tttggaaact ctatatcaga gcctatagaa aatacccttg 3660 gagttccgca aggaagtgtt cttggaccaa ttttgtttat aatgtatatc aatgacatga 3720 aacaggtttt gaagtcttgt gagatcaatc tttttgccga tgatactgtt ttgtttatct 3780 cgcacaaaga catcaagcaa gcagagtctc tgattaattt cgatttaaac gctctggatg 3840 gttggcttag gtacaaaaag ctagcattaa acgttaagaa gactagttac atggtaatga 3900 ctgctggtgt attagacagt cccccatcca tcgtgataaa taaggaacca atcgaaagag 3960 tccgtcaggt taaatatctg ggggttattt tagacgacag attgaagttc aacactcaca 4020 tagactgggt catcgctaaa gtggcatcaa agtgtggggt tattagtagg ctggcaaaag 4080 atctcgattt ttttgggata gttaacctct ataagtcact gatttcacca cattttgatt 4140 tttgctcgtc gattctgttt cttggcaata agggacaaat taaaaggctt cagagattgc 4200 aaaatcgtat tatgcggtta gctttagggt gtggccgacg tacgtcgtct tttgttatgc 4260 tagatattct tcagtggatg tctgtagagc agaggattgt gtatcaaacc atgaccttta 4320 tatttaaact tttggggggc cttttgccag ggtatttagg ggaacgcatt gttcgaggat 4380 ctgatgttca tcggcactgt acacgcagag caaatgagcc gagggttccc aacttaattt 4440 cacatggtgc cagaaactct ttgtttttca aggggattca attatacaac agattacccg 4500 gagaaatcaa gaatgcgagt aacttgccag acttcaaacg taggtgtgcg gcatatgtta 4560 aacaaactgt gtaatgtcat atttgtgtag aagtcctatg tcactatgtt atgtacaact 4620 gcacttgtca tcatgagctt gatgatgatg ataagatttt tcttgatata tgaaaacaaa 4680 ttagaaaaaa tatataagaa agagacaaca taagtttgag acacgcgcgc gtacaagtgg 4740 acaatcggat ttagtttggg agagcttgct gtctgcatgt ctgggaggtc acagcgagat 4800 tcactcattg atgattgtaa gtggccaatt ccagatacat taggttctcc tgcatcggaa 4860 ggtagtgccc tggtgccatc gttggtacca gcggaactgg ccatatgcat gttgcagtgt 4920 gccctgataa tttccaatac attaggttct gctgcttcgg aaggtagtgc cctggagccg 4980 gtgtggcacc agtggaactg gccatatgca tgtcgcagtt atgtcctgat 
gtgttcgatg 5040 gagttgaccc gttttttctt gatgaaacta cgtcggtcgt cttgggtgtg tgtatgactt 5100 ggtggattct tttgaatgac gtccgatgcc accacttgct cgtaaatttt tattttattc 5160 aaaaactacc ttagagtaat attatcgtaa agatacttcc gtccttctca aacctgtgtt 5220 ggggtaagag gtgggactta tcatcat 5247 // ID RETRO25_AG_LTR repbase; DNA; ANG; 316 BP. XX AC . XX DT 06-FEB-2003 (Rel. 4.1, Created) DT 06-FEB-2003 (Rel. 4.1, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO25_AG DE retrotransposon - a consensus. XX KW Long terminal repeat; retrotransposon; BEL; DIVER; RETRO25_AG_I; KW RETRO25_AG_LTR. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-316 RA Jurka J. and Drazkiewicz A.; RT "RETRO25_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 8-8 (2002). XX DR [1] (Consensus) XX CC Related to BEL and DIVER from Drosophila melanogaster. 5-bp CC target site duplication. XX SQ Sequence 316 BP; 89 A; 73 C; 60 G; 94 T; 0 other; tgttacggca agcgtaatgg taacgcgtca cccttagtac gcctagcgta aggtagaggg 60 cgctttcaag aaatttacct acgtgtaaat tttacttaat cataattcta ttacgcagtg 120 caaattgctg tagatcggta tcgtgcagct cgaattcccg caagtgcagc tcgaattacc 180 gcaagtgcag ctcgaattcc cgaattctgc cgtatataag cgagtgtttt tcctgaataa 240 atttagtcaa gttccagagt tcaaagcaac aacatcgtgt cttcatcatc ttcaaacttc 300 ctcttcatca ttaaca 316 // ID GYPSY6-LTR_AG repbase; DNA; ANG; 181 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE GYPSY6-LTR_AG is an LTR of the GYPSY6_AG LTR retrotransposon - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW GYPSY6-I_AG; GYPSY6-LTR_AG; GYPSY6_AG; Gypsy clade.
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-181 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "GYPSY6_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 82-82 (2003). XX DR [1] (Consensus) XX CC GYPSY6-LTR_AG is a long terminal repeat of GYPSY6_AG (its internal CC portion is deposited as GYPSY6-I_AG). XX SQ Sequence 181 BP; 76 A; 33 C; 31 G; 41 T; 0 other; tgtaaacaat ttaaaattag gttacttata cttatctcgg gtgcaataat ctaatctaca 60 agaactgcaa taaaatagga atgacagcga gcagaacggc acttcctacg acacacgaag 120 aaacgataga aagccctcgt gtacgaaaaa agcaacaatt gtaaaaataa agctgtttac 180 a 181 // ID R7Ag1 repbase; DNA; ANG; 6470 BP. XX AC AB090820; XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE Anopheles gambiae retrotransposon R7Ag1 DNA, complete sequence. XX KW Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; gag-like; R7Ag1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol. Biol. Evol. 20(3), 351-361 (2003). XX DR Genbank; AB090820; Positions 1 6470.
XX SQ Sequence 6470 BP; 1443 A; 1666 C; 2157 G; 1204 T; 0 other; cagtcgcaat cgagagctcg acccgaacgg acgtgtttgc gcagtgctcc gtgtggtgag 60 agggtgttcg tgagagcgag agagtgggag cgattgatat tcgtgtatcg aataatagtg 120 agcgtgtgta tgcgaagcgg aagcgagggt gaacttctgg aactatctgg aactttctcg 180 gctgttgccg tgcgtggctt ccttttgccc gtgcgtttgt gtatgcgttt gtgagtgtgt 240 ccgcaagcga gtggaatatt ctcggctgtt gacgcgcgtg gcgcttcgtt gtccgggctt 300 gtgtttcgtg taagcgagtg tgtgcgcaag aacgaggaga gttttctgga actatctgga 360 actttctcgg ctgttgccgt acgtggcttc cttttgcccg cgcgggtgtg catgcgtgtg 420 tgagtgtgtg cgcaagcgag tggaacattc tcggccgttg acgcgcgagg tgctccgttg 480 cccgtgcgtg tgcttcgtga aagcgtgagt gtgtgcgcga gcgagagtag aatattctgg 540 aactatctag aactttctcg acagttgccg cgcgtggctt ccttttgccc gcgcggatgt 600 gcatgcgtgt gtgagtgtgt gcgcaagcaa gtggaacatt ctcggccgtt gacgcgcgcg 660 gtgctccgtt gcccgcgcgt gtgcttcgtg aaagcgtgag tgtgtgcgcg agcgagagta 720 gaatattctg gaactatcta gaactttctt gacagttgcc gcgcgtggct tccttttgcc 780 cgcgcgtgtg tgtatgtgtg tgtgagtgtg tgcgcaagcg agtggaacat tctcggctgt 840 tgccgcgtgt ggcgctccgt tgctcgcgcg tagcgttccg tgtcactgtg tgtgcgcttc 900 tcatagcatc acagcgccgg ggacattcgt ggtggcgggg tggtagcaac ccgccaccat 960 ggataagcaa ctgagaggaa ggaccatatc ggtcgatgag cgtcctgcgg tcgttataag 1020 gaagctgggg agcgagaaaa aactcgggac catcgtcgag gaaccatcat cggcaggagt 1080 gccagcgagg accatggcga cgggaggagt gaaatcggcg ggaacggcga cgaaactggc 1140 gacttcaaca ccggtttcca ccggagaagt gcgtcggatg ctggcggatg caaaagctga 1200 caatgagacg acggtcggca ttgtcaagcg gttggaggag caaatccagt tgctgcgctt 1260 gcaaatggag gcctccaacg agcagctgaa ggaagcgcaa agggaggcaa gagaggcgcg 1320 cgaagacgct cgggtacgcg aagcggaaca ccgcgaagag cttcgcaagg agaaggagct 1380 gttcaacgct cttctagcgc aaacccttgg tggaaccagc ggagctcggc tagagagcca 1440 gcaggaactg cagcgagagc aggagctgct tcggaggatg gaaagccagc aacgacagga 1500 acagcggcaa cagctggaag atcaacagcg ccaaaggtgg cgtcagcagc agcagaaaca 1560 acaacggcag cagcggctac ctgcgcagca atggccgacg gtgcagcaga gcgtgcgtgc 1620 tcagcgtcag ggcgtgacgg agtcggcatc 
ctcggcggta cctgacgagg caggaacgtg 1680 ggtggaggtt gttcgcggca atcagcgcgg gaataagcag aacggagtga atctgcccca 1740 gcagtcagcc cagcggcagc cagcacaccg gcagcatcag cagtggccgc accagcaaaa 1800 tgggcagcag cagcagcagc ggatgggcat tcatcagcag gagaagcggc gtccgcgacg 1860 aaaacgcccg gatgaaattg tcgttgtgcc cgccccagga gtgtccttca aggaaatgta 1920 tgtgaagata cggaccaacc cgcggattgc cgatttccag cggcaaattg gggttggcag 1980 aagaacgccg agggaccacc tcctgctgcc tttgtcccgc gacgtcgata gcgcggcgct 2040 gaaggacatc atccaggagg tcatcgggga acgtggatcg gtaaccgtca gaacagagat 2100 ggctgaggtc gtcctgactg gaatcgacaa catgatcgac gaggaggcga tcaaaaaggc 2160 gctcatgacc actcttggaa agcagtcatt ggtggccacc gtgaaccttt gggagcgccg 2220 agacatgacg aagcgggctc gcgtgcgcct cccacgagca gaagcggaac ttgtcaaaga 2280 tcgccgactg gagctgggct acacgtattg ttcggtacat gaagccccaa aagtatcggg 2340 tcagctgact cgctgcttcc ggtgtctgga gcggggacac atcgccgcga cgtgcacggg 2400 tgaggatcgg tccaagcgtt gtctacggtg tggtgaccaa actcacaagg cgtcgggttg 2460 caccaacgag gtcaagtgca tgctgtgtgg cggcgcccac cgtattggtg ccgcagcctg 2520 cggtggacaa ccctcgaact gaaatggaag tgctacagat caacgtcaat cgcagtagga 2580 gcgcgcaaga cctggcgctc aacacgatgc gggtggagcg ggcggatgtc tgtctgatgg 2640 tggaactgca cagtgtcccc aggaacaatg ggaactgggt ggccgacagg gacgggaagg 2700 tggccatcat agccagcagc gaaacgtacc cggttcagca agtggtctcc gtgacgcagt 2760 ctgggatcgc ggctgcccgg atcaatgacg ttctcttcat atgttgctac gtgtcgccct 2820 cagcgggcgt ctctgagttc gaggaggtaa tgcagcgcat cgatgtgttg gcgagaggtc 2880 acccgcgcgt cgtctttgcg ggggatctca atgcgtggca caccgcttgg ggaagttgcc 2940 gcaccaatgc aaagggagag gctgtggtcc agctcgtcga cagcttgggg ctggaggtgt 3000 taaacaccgg caccgcccca accttcctgg gcaacggagt ggctcgcccg agcgtggtgg 3060 acgttgcctt cgccagcagc agcatcgccg gagtcaacac cattccggag catcggtgga 3120 ggattattag catatactcg tacagcaacc acgtgtacat tcggtttgct gtaggggagc 3180 tgctccagcg gccagcggca gatagtcgtc gacaggaggg tccttccacg cgagaaagcg 3240 gcacgagatg gcgtacccgt catttcgacg ccgagctttt cggtgtagcg ctcgacgtag 3300 cttcgttcac agagcgggtt acaagtgccg aaagcttgga 
gagagtcatg acggaagctt 3360 gtgacgcagc catggcgcga gtgttccctt cacaaggtca ctcgggacga cctgcttatt 3420 ggtggacgcc agcaatcgag gtcctgtgtg agaactgccg cctcgctaag gaacgccttg 3480 aagctgccat cgacgaggaa gagcagatcg ccgcagccag cgaccttctc caggtgcgga 3540 ccgccctgga ctcatccatc accaccagca agaaggagca tttcgatgaa atactgcggg 3600 gcctcgcgga agacgagacg ggacaatggt accgcaacgt acttagccgc cttagtggaa 3660 gctggacggc gagggagcgc gattcatcgg tgctagaggt catcgtttcc actctgttcc 3720 cccagcatcc tccggttgac tggccagcgt caccaggcca agtcctggag aggggagagg 3780 aggaaccagt tcgggacgtt aatgaacagg agctgctgga catcgcgagc tcgctgaacc 3840 cgaggaaggc tccaggactg gatggtgtgc caaacgccgc tttgacggcc gccatccgga 3900 agcatacgga catcttcaag aagttgttcc aggaatgctt ggacaacgag cggttcccgg 3960 atgagtggaa gaagcagaaa ctggccctga tccccaagcc gggcaagcca ccggggctcg 4020 cttcatcctt ccgcccgatt ctgctgctga acaacccggg caaagtgtac gagcggttgc 4080 tgttgtcgcg aataaacgat gtcatcgagg atcctgaatc accgaggttg gcagaaaacc 4140 agtacgggtt caggaggggc cgttcgacag tgcaggcgat tcagctggtg gtggatgcag 4200 gcagtcacgc gatgtcattt ggccgtacaa acaacaggga taaacgctgc cttctagttg 4260 tggcgctgga tgtgcgtaat gcgtttaaca ctgccagctg gcagtgcatc gctacggcgc 4320 tggaggacaa aggcgtgccg aggcagctcc gcaacatcct tagagactac tttgccaaca 4380 gggagctcgt ctatgacacc gcagacgggc ccgttacacg ccgagtgact gcaggtgttc 4440 cacaggggtc cattctgggc ccgaccctgt ggaacatcat gtacgacggc gtgttgcggg 4500 tcgagctccc tgaaggggct agcgtcatcg gctatgcgga tgacatagtg gtcatggcac 4560 ggggttgcac accacaggag gcggcattgg tggctgaaca ggcggtggac gcgattgcgg 4620 cttggatgga ggaccatcac ctgcagctcg ctccggagaa gacggaagga gtaatgatct 4680 ccagtctgcg aagaggtcaa ctgaaggtgc cgttccgcgt aggggacacc atcatacaca 4740 gcaaacagtc gatccggtat ctgggggtcc agatccatga ccacctgtcg tggaagccgc 4800 acgtggagct gtcgacggct aaagccctcc gcgtggtagg tgtggtcacc gcagtaatga 4860 ggaaccacag tgggccccag gtggccaagc gtcggctgtt ggcggcagtg gcggagtcga 4920 tcatccggta tgctgccccc gtgtggtccg aggcgacgga tctgcagtgg tgccagagga 4980 agctggccca ggtgcagagg cccctggctc gcggtgtcac cagctcgttc 
gtgtcggtgg 5040 catatgagac cggagttgca ctggcaggcc ttgtgccgtt caggctgctg gtacgggagg 5100 acgcaaggtg ccatcggagg ctcctagctg ccccgggcgc cagccgcaag gacatccggc 5160 tggaagagag gcaggggact ttccaggagt ggcagcgagc gtgggatgcg gcggccgcag 5220 ctccaacggc cagtcggtac gcggtttggg cccaccgaat gatcccggac ctgcacttgt 5280 ggatgagtag gcgacatgga gaggttgatt tccacctctc acaggtgtta accggacatg 5340 gatatttccg ggaatacctc catgtctgcg gttttgcccc atcggcggaa tgtccacggt 5400 gcccggggtc ggttgaatca gtggcgcatg tgctgttcca gtgcgaggtc ttccacgaga 5460 tccgggtgga gctgttgggc tacggcacca gcgacccagt gaacgagaac aatctcggca 5520 tgaagctgct ggagagcccc gagcggtgga acagcatcca ggaagctgca cgcaaaatca 5580 cgaaggtgtt gcaacagctt tggcgtgagg acgagctgca gctcaatctc caggcacacc 5640 ttgcagcact accgacgaga gcagcagcgg tagacgccgg tccgctggat ggtgagcagg 5700 tgtcggttga tggggtggct gaactctttc gttccaaccg aggtcgagcc agacggacca 5760 gaagaggtcg tcggagggcg gaagaacggg tggaagtgcg ccttgcatcc gcaatggcag 5820 cagccgaacg agaacgtgag gactctatcc tgatggcggc ggtgcgggcg gaggaagcag 5880 gagaagcacc accacccatc cccatgagaa gacgcggact gcctccatct ccgagaacgg 5940 taagagcgcg gcacgaacgg agactgtatc tgcaacgtct ctaccgtcaa cgcgccaggg 6000 aagggactct accaacagtc ccacacggta gaaaccggag gagcaggtcg gccccttcgg 6060 aagccgatac catccgacgg cggatgagaa ggcgtgagat ggagcggcta cgccgaacag 6120 cgcgaagggt gccgtccaat cagggagtgc gtgaggtgct atccgccgac gccctggcag 6180 ccatcacgga agcaacaacc tccggccgtt agaaggagga ttatggggta ggaagtgcat 6240 gggacagcaa aagggtacgg cgcaaaagaa aaataaaaaa tgcgttgggg ttttaattcc 6300 gtaaggaaaa aagaataagg aacgaaataa ataagaacca ataaaggcga tgtcacaagt 6360 gacacatact cctggttcca agccccgcta gggaacgggt ccaggagcag gagtggggat 6420 ttaacgttaa gttaatcttg caaataaatc cttaagtatt aaaaaaaaaa 6470 // ID GYPSY7-I_AG repbase; DNA; ANG; 6534 BP. XX AC . XX DT 16-JUN-2003 (Rel. 8.05, Created) DT 16-JUN-2003 (Rel. 8.05, Last updated, Version 1) XX DE GYPSY7-I_AG is an internal portion of the GYPSY7_AG LTR DE retrotransposon - a consensus sequence. 
XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW GYPSY7-I_AG; GYPSY7-LTR_AG; GYPSY7_AG; Gypsy clade; Gypsy group; KW env; gag; integrase; protease; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6534 RA Kapitonov V.V. and Jurka J.; RT "GYPSY7_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(5), 87-87 (2003). XX DR [1] (Consensus) XX CC GYPSY7_AG is a young family of gypsy-like LTR retrotransposons. CC GYPSY7_AG belongs to the Gypsy group of the Gypsy superfamily. CC GYPSY7-I_AG, an internal portion of GYPSY7_AG, is flanked by CC GYPSY7-LTR_AG LTRs. The GYPSY7-I_AG consensus sequence was CC reconstructed based on multiple alignment of 5 copies; they are CC ~0.4% divergent from the consensus sequence. CC The consensus sequence encodes the 434-aa GYPSY7_AG1p gag-like CC protein (pos. 507-1808), the 1061-aa GYPSY7_AG2p pol-like protein CC (pos. 1757-4939), composed of the protease, reverse transcriptase CC and integrase domains, and the GYPSY7_AG3p env protein CC (pos. 4962-6506).
XX FH Key Location/Qualifiers FT CDS 507..1808 FT /product="GYPSY7_AG1p" FT /translation="MEALAGRIAALEARFSESNVTDDFQDPPLFFTKQDGS FT AVDPESFEKIPGVVKDLPIFCGDPSELNSWINDVDGIIRLYQTISSHSLEK FT QNKFHMICKFIRRKIRGEANDALVASNVGINWNMMRKTLITYYGEKRDLET FT LDFQLMSVYQKGRTLEVYYDEVNRLLSLIANQIQTDDRFNHPEASKAMIGT FT YNKKAIDAFIRGLDGDVYKFIRNYEPTSLAAAYSYCISFQNLECRKMLTKP FT KHFNTPPSAPRNQIPLPTPHLPPRVFQHQQRPMTANNVRPHFAHHPPIQNF FT AGNFTQRPVWNQPNQQRPIFQRTNFNQPNQMKNFTQQRNNFRQNGPEPMEI FT DPSIRSHQVNYANRPNSSNIRPLKRQRAFNIEAVPRRELEPTSYEDNLYDD FT DVESQASYERYMRNVEKQEKLNENSHYDEISREAELNFLG" FT CDS 4962..6506 FT /product="GYPSY7_AG3p" FT /translation="LIFYYRIAFFTLYGVLQASINIFDLTNNPLAIVPLGQ FT AKIRIGYLRTIHPIDLTELEEIISRVFENSTNSTGKSPLQSLINLKLEKLN FT ATISKIRPRRLRTKRWNSIGTAWKWIAGSPDAEDLTIINTTLNSLILQNNE FT QLLINNGLSRRFQETTNIANHVIDLQNRIQREHQTEIQQIIKIANLDALQA FT HIKTLQEAILAAKHGIPNSELLSIEDLNTVAEFLAQNGIYYTSVEEMLTQA FT TAQVTMNSTHVIFMLKFPRLSYETYEYNYIDSIIQNDKRILIKHNYIIRNL FT THMFELPQPCIDQSSHQLCESKDLEEPSRCIRQLVQGEHTECMYEKVYSTG FT LVKHINNANILLNDATAEISSNCSNINHILNGSYLIQFHNCNIFINGELFP FT STEVSITGKPYISTLGLIAKEDGIRDEPSIEHLRNITLQHREKLHTISLVN FT NSLTWKLHIFGSIGLTTIVLITIAILYFITSIRRTKISLNIPTNNTNRQDV FT HHIETFVKKPTTFHALGRL" FT CDS 1757..4939 FT /product="GYPSY7_AG2p" FT /translation="KFSLRRNFSRSRIKFFRLKSALPYFIYHGKAGQQIKI FT LIDTGSNKNFINPLHAKISHDVIKPFFVSSVGGDLLITKYSQAQIFAPYSD FT VNVKFYHLQGLKSFDAIIGYDTIKEMGAFVDAKRDNLVLENFIIPLSLHPL FT QEVNRIEIRDTHLNHQEKEKLHLFLNKFQDLFQPPDEKLPFTTKVEATIAT FT NDTEPIYCKSYPYPLSLKQEVETQIKKLLNNGIIRPSRSPYNSPVWIVPKK FT VDASNEKKYRLVIDYRKINLKTKSDRYPIPDTSTVLANLGNNKYFTTLDLA FT SGFHQIRLAEKDIEKTAFSINNGKYEFLRLPFGLKNAPSIFQRVMDDVLRE FT HIGKICHVYIDDIIVFGKTFDEHLKNLEIVLNTLREANFKIQPDKSEFLRT FT EVEFLGFIVSEYGLKPNEKKIESILKYPEPQTIRELRSFLGLSGYYRRFVK FT NYAALAKPLTKLLRGEDGQGHCKITKNQSKNFPIKLDDDAKRAFKTLKEVL FT SSDDVLAYPDFDHDFILTTDASDKAIGAVLSQNVNGVEKPITFISRTLSKT FT EENYATNEKEMLAIVWALHSLRNYIYGAKIIILTDHQPLTYAMSPKNNNAK FT LKRWKAFIEEHNYELSYKPGKTNVVADALSRIQINSLTPTQHSAEEDDLSF FT IPSTEAPINVFRNQLIFQKGTISSYEFVNPFPKFKRHTFIEPQFSIDFIKD FT 
KLKRFMIPGIINGIFTDEPTMGIIQETFKNLFNISTMKARFSQTQVQDICD FT QEQQIEEIRKIHNFAHRNAKENSLQAIKKFYFPSMRNKIEQYVKNCETCKV FT EKYERRPPEYIPVKTPIPKYPGEIVHVDIFAYNANFLFISSMDKFSKYLKL FT KPIKSKSIADVKEVLLQLLYDWNLPRQIIFDNECTFVSNVIEQSILNLGVS FT IFKTPVNRSESNGQVERCHSTIREIARCTKGLNPDMSLITLIQQAVYKYNN FT TIHSFTKETPRKVYIGEQSEELSFRDRSKLKEKIESKIIKIFEEKNEKIKD FT DKYQDYEPNQFAYEKNKTMNKRDSRYKTVVVKENHPTYIIDSNNRKIHKIN FT LRKN" XX SQ Sequence 6534 BP; 2503 A; 1267 C; 1070 G; 1694 T; 0 other; ggcgcccgaa tagggacctc tagtgaagtg aaaatcaagt gacgagtagt gatcgcgaaa 60 gagcgtaatc acaccgtgct tagaacatct cccgatcgca gattgcagcc gtgtgtatga 120 gcacctaagt agccggtagc agccaataac tcgttgccgg aggaaaggga caacgccgac 180 acaccgagaa ggaaaaattc accggaggcc cgagcaacag aaaccaaccc cgtcagagga 240 ggcgaagcaa ccagccccgt aagaagggtg aagcatgaat ttattaatgt aagtaaaact 300 ttcaatcata gtaagtataa acatttatta aagcgacaaa aaatattaaa acattataat 360 cagtgttaaa gtgagtgatt agtgagtgaa aagaaaatca ttatcacaaa agttaagtca 420 aactggagga attagttcaa tcattacgac aattctcagt cacagtaaaa tctatagata 480 acgaagaaaa actaacaaac aaagatatgg aggcactagc tggcagaatt gcagctttag 540 aagcacgttt tagtgaaagc aatgttacag atgattttca agacccacca ctttttttta 600 caaaacaaga tggtagtgca gtagatcccg aatctttcga aaaaatccct ggagttgtaa 660 aagatctccc aattttctgc ggtgacccaa gtgaacttaa tagctggatc aatgacgtag 720 atgggataat ccgactatac caaactatat ctagccatag tttggaaaag caaaataaat 780 tccacatgat ttgtaaattc atacgtagaa aaattagagg tgaagccaac gatgctttag 840 tagcatctaa cgtagggata aattggaata tgatgagaaa aactctcata acttattatg 900 gagagaagcg agatttggaa actctcgatt ttcaacttat gagtgtctac caaaaaggtc 960 gaactttgga agtttattac gacgaggtta atagacttct ttcacttatt gcaaatcaga 1020 tacagacaga cgatagattt aaccatccgg aagcttcgaa agctatgatt ggaacataca 1080 acaagaaagc gatcgatgct tttatcagag gtctcgatgg ggacgtttat aaatttattc 1140 gtaactacga accaacatcc ttagcagcag cctacagcta ttgcatttct tttcaaaacc 1200 tagagtgccg taagatgcta acaaaaccaa aacattttaa cacacccccg tcagccccca 1260 gaaaccaaat accattgccc acacctcatc taccaccaag agtgttccaa caccaacaaa 
1320 gaccaatgac agcgaacaac gtaagacctc attttgcgca ccacccaccg attcaaaatt 1380 ttgcaggaaa ttttacacaa cgtcctgttt ggaatcaacc aaatcagcaa agaccaattt 1440 ttcagcgcac aaattttaat caaccaaatc agatgaaaaa ttttacacag cagagaaaca 1500 attttcgcca aaatggacct gaaccgatgg aaatagaccc atcaattagg tcacatcaag 1560 ttaattatgc gaacaggccg aactcctcaa acattcgtcc attgaaaaga caaagagctt 1620 tcaatattga agcagttccg cgacgtgaat tagaaccgac ttcatatgaa gataatctct 1680 acgatgatga tgtcgaaagt caggcgtcat acgaacgata tatgagaaat gtagaaaagc 1740 aagaaaaact aaatgaaaat tctcattacg acgaaatttc tcgcgaagca gaattaaatt 1800 ttttaggtta aaatcagctt taccatattt tatataccat ggtaaggcag gtcaacaaat 1860 taaaattcta atcgacactg gatctaataa aaatttcatc aaccctttac atgcgaaaat 1920 ttctcacgac gttataaaac cattttttgt atcatctgtg ggaggagatt tactcatcac 1980 aaaatattca caagctcaaa tatttgcccc ttattccgat gtaaatgtca aattttatca 2040 tttgcaggga ctaaaatcat ttgacgccat aataggttat gataccatca aagaaatggg 2100 agcatttgta gacgctaaaa gagacaatct agttcttgaa aattttataa tacctctctc 2160 acttcatcca ttacaggaag ttaacagaat tgaaataaga gacacacatc ttaaccacca 2220 agaaaaagaa aaattacatt tatttcttaa caagtttcaa gatttattcc agccacccga 2280 cgaaaagttg ccctttacaa caaaggtaga agcaaccata gccacgaatg atacggaacc 2340 aatttactgt aagtcatacc catacccttt gtccctcaaa caggaagtgg aaacacagat 2400 aaaaaaatta ttaaataatg gtataattcg accatctagg tcaccatata attcacctgt 2460 gtggatagtt cccaaaaagg ttgacgcatc taacgaaaaa aaatatcgac ttgtgatcga 2520 ttacagaaaa ataaacctga aaactaaaag cgatagatat cccattcccg atacttcaac 2580 agtacttgcc aatctaggaa ataataaata ttttacaaca ctcgatctag catcgggatt 2640 tcaccagatt cgtttagcag aaaaagatat cgaaaaaacc gccttttcca tcaataatgg 2700 aaaatacgaa tttttaagat tacctttcgg tctgaaaaat gcaccttcga tttttcagag 2760 agtcatggac gatgttctta gagaacatat tggaaaaatt tgtcacgtat acatagacga 2820 tataatagtc tttggaaaaa cattcgacga acatctgaaa aacttggaaa ttgttttgaa 2880 tacattacga gaagccaatt ttaaaataca gccagacaaa tcagagtttt taagaacaga 2940 agttgaattc ttaggattca ttgtttcaga atatggcttg aaaccaaatg agaaaaagat 3000 
agaaagtatc ttaaaatacc ccgaacctca aactattcga gaacttagat catttttagg 3060 actgtctgga tattacagaa gatttgttaa aaattatgca gctttagcaa aacctttaac 3120 aaaactttta agaggggagg atggccaagg ccactgcaaa attacaaaaa atcaatctaa 3180 aaattttccg ataaaattag atgatgatgc caaacgcgct ttcaaaactc ttaaggaagt 3240 tttatcatcc gatgatgttt tagcataccc cgattttgat catgatttta ttttaactac 3300 cgacgcttct gacaaagcaa tcggagctgt tctttcccag aacgttaatg gtgttgaaaa 3360 accaataaca ttcatatcta gaacactatc aaaaacagaa gagaattatg ctacaaacga 3420 aaaagaaatg cttgctatag tttgggcttt acattctcta cgtaattaca tttacggtgc 3480 aaaaataata atattaacag atcaccaacc tttaacatat gcaatgtcac caaaaaataa 3540 caatgcaaaa ttaaagcgat ggaaagcatt catagaggaa cataactatg agctaagtta 3600 caaacctgga aaaaccaacg tggtagctga tgctctttca cgcatacaaa ttaactcact 3660 aactcctaca caacactctg ccgaagaaga tgatctttct tttatccctt ctaccgaagc 3720 tccaattaat gttttccgaa accaattaat ttttcaaaaa ggtactatta gttcctacga 3780 gtttgtaaac ccctttccta agtttaaaag gcacactttc atagaaccac aattttcaat 3840 cgattttata aaagacaaac ttaagagatt catgatacct ggtataataa atggcatatt 3900 cactgatgag ccaactatgg ggatcattca agaaaccttt aaaaatctat tcaatatatc 3960 aaccatgaaa gcaagatttt cacaaactca agttcaagac atttgtgatc aagaacaaca 4020 gatagaagaa attcgtaaaa tacataactt tgcccataga aacgctaagg aaaattcatt 4080 acaagctata aaaaaatttt atttcccttc catgagaaac aagatagaac aatatgttaa 4140 aaactgcgaa acttgcaaag tagaaaaata cgaaagaaga ccccctgaat acataccagt 4200 taaaacacca atcccaaaat atccaggaga aattgttcat gttgatatat ttgcgtataa 4260 tgcaaatttt ttattcatct cgtcaatgga caaattttcg aaatatttga aattaaaacc 4320 aattaaatca aaatccatag cagacgttaa ggaagtactg ctacaattat tatacgattg 4380 gaatttgcct agacaaatta tatttgataa cgaatgtaca tttgtatcga acgtcataga 4440 gcagtccata ctaaatttag gtgtatcaat ttttaagaca ccagtgaata gatcagagtc 4500 aaatggacaa gtggaacgtt gtcactccac gatcagagaa atcgcaagat gtacaaaagg 4560 tttgaatcca gacatgagct taattacctt aatacaacaa gccgtgtata agtataataa 4620 tactattcat tcttttacaa aagagactcc cagaaaagta tatattggag agcaatcaga 4680 agaactttca 
tttagagata gatcaaaatt aaaagaaaaa attgagagta aaattataaa 4740 aatatttgaa gaaaaaaatg aaaagattaa agatgataag taccaagatt acgaaccgaa 4800 tcaatttgcg tatgaaaaaa ataaaactat gaataagcgt gacagtcgtt ataagacagt 4860 ggtagttaag gaaaatcatc caacgtatat aatagattcg aacaatcgaa aaatccacaa 4920 aataaattta agaaaaaatt aatgatataa ttatgattta attaattttc tattacagaa 4980 tcgcgttttt cacactatat ggtgttctac aagctagcat aaacattttc gacttaacaa 5040 acaacccatt ggctattgtt ccgttaggac aagcaaaaat taggatcgga tacttgagga 5100 cgattcatcc aattgatctt accgagctag aagagataat ttctcgagtt tttgaaaata 5160 gcacaaacag tacaggaaaa tccccattgc aaagtttaat taatttgaag ctcgaaaaac 5220 ttaacgccac aatttctaag attaggccac gtagacttcg aacgaaaaga tggaacagta 5280 taggtaccgc ctggaagtgg atagctggca gtcccgacgc agaagatcta acgataatca 5340 acaccaccct gaattcgctc atcctacaaa acaacgagca gctattaatc aataatggtc 5400 tcagcagaag attccaagaa acaaccaata ttgctaatca tgttatcgac cttcagaata 5460 ggatccaaag ggaacatcaa actgagatac aacagatcat taagatagca aacctagacg 5520 cattacaagc ccatataaaa acactccaag aagccatact agccgctaag catgggatac 5580 cgaatagcga gctactatca atagaagact taaacaccgt tgcagaattt ctggcacaaa 5640 atggcattta ctatacatca gttgaagaaa tgttaacaca agccacagca caagttacca 5700 tgaattcaac acacgtgata tttatgctaa agtttccacg tctatcctat gaaacttatg 5760 agtacaacta tatcgactct atcatacaaa atgataagag aatcttaatc aagcataact 5820 acataatccg gaatctaacc catatgttcg aattaccgca gccctgtatc gatcagagca 5880 gccaccagct ttgcgaaagt aaagatctgg aagagccttc acgctgcata cgacaactcg 5940 tacaagggga gcatacagaa tgtatgtacg aaaaggtgta ttcaacggga ttagttaaac 6000 acattaacaa tgcgaatatt ctattgaatg atgccactgc cgaaatttca tccaactgca 6060 gcaatataaa ccacattctt aatggatcat atctcataca atttcacaac tgcaatatct 6120 ttattaacgg agaactcttt cccagcaccg aagtttcgat aaccggtaaa ccatatatat 6180 caacccttgg cctcatcgct aaagaagacg gcatcagaga cgaaccttca attgaacatc 6240 ttcgaaacat aacattgcag cacagagaga aactacatac catcagcctg gttaataatt 6300 ccctcacatg gaaacttcat atctttgggt caattgggct aacgacaatt gttctgataa 6360 caatagcaat tttatatttc 
attaccagta taagaagaac gaaaataagc ctcaacattc 6420 caacgaacaa caccaaccga caggatgtcc accacataga aaccttcgtg aaaaaaccca 6480 caacattcca tgctctcggc agactttgag ggcaaagtca tctaagaagg gagg 6534 // ID Q repbase; DNA; ANG; 4526 BP. XX AC U03849; XX DT 03-DEC-2002 (Rel. 7.11, Created) DT 03-DEC-2002 (Rel. 7.11, Last updated, Version 1) XX DE Q is a non-LTR retrotransposon. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; CR1 clade; KW ORF1; ORF2; Q; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4526 RA Besansky J.N., Bedell A.J. and Mukabayire O.; RT "Q: a new retrotransposon from the mosquito Anopheles gambiae."; RL Insect Mol. Biol 3(1), 49-56 (1994). XX DR Genbank; U03849; Positions 1 4526. XX CC Q is a CR1-like non-LTR retrotransposon. It encodes a CC DNA/RNA-binding CC protein (Q-ORF1p) and the reverse transcriptase (Q-ORF2p). 
CC Q-ORF1p (positions 210-1376): CC QCCAVITRDFAMAAICFSCAEPLEATGCIISCAYCDATFHRGCCKLPPELIDAVLSNVDLHWSCIGCTNM CC LKNPRCRSVKEIGAQVGFQAALNSAVAAIGKLVEPIVAEVRSGFTLLQTASTPHNRNSDPRPATGRKRRR CC IIEDSASPGVNKIVNSRGNTLCAASSPNAYTNTTIAVQPAPTQPHELVGTDPLSSPLQAAPREPFTDRIW CC IRLSAYQRPSLWNKWSLSVKRRLATDDVIAYCLLRRGVSVDSMNWLSFKVRVPAILRDAALTPSTWPVGI CC GVREFFQSRQHDHQTSSPIATRNRFTTRTPATSTEHRYTTRTPTTTHRLAARTSTPPDPETTSSQQCHPP CC VNDTLEAPNSTLVSGPPQNHRASSPHLHQSTIDRFFLN CC Q-ORF2p (positions 1264-4413): CC MIPWKHPTQHLFLDRPKTTVLHLRTYTSLQSIAFFSTRRSSAHCTQQTRQASCSDNAAQLTYRLPALSNR CC HNDNATAEYLSCYYQNVRGLRTKTKEFHLAVSEADFDLIALTETWLVDNIPSALLFNNNFSVYRCDRSLS CC GSSSRGGGVLLAVSNAYESIELPSRDRSLEYICVRVACNNAHLYVMVVYIPPQLSSEISTLRSLHDCISS CC FTLRLKPSDLLFVIGDFNQPSISWSTADPSSSPAYSSITHYEPTARSLANNTFVDGFKFNGLVQLNHINN CC SHGRMLDLLYANNAAAKLCSPVFPSVVPLVPLDSYHPALDFNIRINSSTRRNSTTRQNSTTTALYRYNFA CC KADYVKLNDMISMFNNSFHCSNFISLDEAVCSFSSFMLQAFVVCVPVQRPKPNPPWADRTLKRLKRVKRA CC AYRHYQTRRCQRSRSIYFDTHSLYCSYNRFRYRRYLSKIQRNLCRWPDSFWRFYNSKTKSTHTPKSITYK CC GATSANTNEMCNLFADRFADCFSPAMNDTDTIDAALVNTPAGAINMSTPFIDSEIVLSALAQLKPSFAPG CC PDGIPSTVLKRCQTTVAPILAKLFNASLANGYFPKAWRKSWMVPIYKKGDRTDAINYRGITSLCAIAKVF CC ELVIYKNLLHACRSYLSPYQHGFVPKKSTTTNLVEFVTYCTSQIDAGAQVDAIYTDLKAAFDSLPHAILL CC AKLDKLGIPSPLVQWLKSYLIHRTYIVKIDKHMSKEIVSSSGVPQGSNIGPLLFILFINDVTLALPPDSI CC SLFADDAKIFAPINNTGDCTFLQDCIEIFCSWCKRNGLTICIEKCYCVSFSRCRSPVTGTYFMDGTAVNR CC QNHAKDLGVLLDSSLNFKQHIDDVVARGNQLLGVVIRTTNEFRNPMCIKAVYNCIVRSVLEYSCVVWSPT CC TASSIARIEAIQRKLTRYALRLLPWQDRNNLPPYAARCRLLGLEPLSVRRRNAQCSFIAGLLNGSIDSSP CC LLHRVDIYAPSRTLRSRETLRLAQPRSSAGRSDPMFRMSAVFNTVSDCFDFDISTQCFKERLRLLPWPQ. 
XX SQ Sequence 4526 BP; 1117 A; 1250 C; 917 G; 1242 T; 0 other; acgtttgacg tttgacgttt aattgttcgg ccgcatttcc gtcgcgctcc gtgatttttc 60 aataattctc ttgtgttttc tgcgtcgcat cgcattgttg ccttttctac ggactattaa 120 actgtttaac atcatcgctt cgtcttggcg aattgtttag tcctgcatca caactgtggt 180 ttcctgctgt tttttcctgc tcggcttagc agtgttgtgc tgtgattacg cgtgatttcg 240 cgatggcggc tatttgcttc tcgtgtgctg aaccgctaga ggccaccggc tgcatcatca 300 gttgtgcata ttgtgacgct acgttccacc gcggctgttg caaattgccc cccgagctga 360 ttgacgcagt attgtccaat gtcgacctgc actggagctg cattggctgc actaacatgc 420 ttaaaaatcc gcgctgccgt tccgtaaaag aaataggcgc ccaggtcggt ttccaagccg 480 ccctcaactc agctgtagca gctattggca agctcgttga gcctattgtc gccgaagttc 540 gcagtggttt taccctcctt caaactgcat ccacgcctca caatcggaac tctgatcctc 600 gaccagctac tggtagaaag cggaggcgta taatcgagga ttcggcatct cctggtgtaa 660 acaaaattgt aaacagtcgc ggcaacaccc tttgtgccgc gtcatcgcca aacgcataca 720 ccaacaccac gattgccgtc cagccggcac ctacacaacc gcatgaactg gtgggaaccg 780 atccgttatc gtcaccgctt caagctgcac ctcgtgagcc attcacagat aggatctgga 840 tccgcctatc cgcttatcaa cggccgtcac tgtggaacaa gtggtcgctt tctgtaaagc 900 gtcgcttagc caccgatgac gttatagcgt attgcctgct gagaagaggg gttagtgtgg 960 acagcatgaa ttggctttca ttcaaagtga gggtcccggc tatccttcga gatgcggcac 1020 tcacaccatc cacctggccc gtcggtatcg gtgtacgtga gttttttcaa tcccgtcaac 1080 acgaccacca aacctcatct cctatagcca cccgaaaccg ctttactaca cgcacaccag 1140 ctactagtac tgaacaccgc tacaccacac gcacaccaac aacgactcac cgtttagccg 1200 cacgcacgtc tactccacct gatcctgaaa caacatcatc acaacagtgt caccccccgg 1260 tgaatgatac cttggaagca cccaactcaa cacttgtttc tggaccgccc caaaaccacc 1320 gtgcttcatc tccgcaccta caccagtcta caatcgatcg cttttttctc aactagacga 1380 agttccgctc attgcacgca acaaacacgc caagcctcat gctctgacaa cgcagctcaa 1440 ttgacttacc gtttacccgc tttatccaac cgccacaatg acaacgctac cgctgagtat 1500 ttaagctgct attatcaaaa cgttagaggt ttgcgtacta aaacaaaaga atttcactta 1560 gctgtatcag aggctgactt cgatctcatc gcactcacgg aaacttggct tgttgacaac 1620 attccatctg ctcttctctt caacaacaac 
ttctctgttt accgctgtga tcgctctctc 1680 agcggctcta gctctcgcgg tggtggtgtt ttgctagccg tttctaacgc atacgagtcg 1740 atagaattac cttcccgtga tcgttctctc gagtatattt gtgtgcgcgt cgcctgtaat 1800 aacgcacatc tgtacgtcat ggtagtatac attccaccgc agcttagctc cgagatatcg 1860 actcttcgct ctctacatga ttgtatcagc agcttcactc tcagactgaa gccttcggat 1920 ctgctgtttg ttattggcga tttcaatcag cctagcataa gctggtccac agctgatcct 1980 tcgtcctcgc cagcatactc atcaatcacg cactatgaac caactgcgcg ctcactggcc 2040 aacaacacct ttgtggatgg atttaaattt aacggattag tgcaacttaa tcacatcaat 2100 aattcacacg gacgcatgct tgacctgctc tacgctaaca atgctgcagc taaattgtgt 2160 tcgcctgtct ttccaagtgt tgttccgctt gtacctcttg actcctacca tccggcgtta 2220 gacttcaata tacgtattaa ctcgtcgacc cgacgcaatt cgacgactcg acaaaactcg 2280 acgacaaccg cattgtatcg ttataacttt gctaaagctg actatgtgaa actgaacgac 2340 atgatatcaa tgtttaacaa tagctttcat tgttccaatt ttatttcact tgacgaagca 2400 gtatgttcat tctcgtcctt catgctgcaa gcttttgttg tatgcgtccc agttcagcgt 2460 cctaaaccta accccccctg ggcggaccgc acactcaaac gactcaaacg tgtaaaaaga 2520 gctgcttatc gtcactacca aacgcgccgc tgccaaagat ctcgctcgat ctactttgac 2580 acgcactctt tatactgcag ctataacagg tttcgatatc gcagatattt gagtaaaatt 2640 caacgtaacc tctgcaggtg gcctgactcg ttttggcgct tctacaacag caaaacaaaa 2700 tccacgcata caccgaaatc cattacgtac aaaggagcaa caagtgccaa cactaatgaa 2760 atgtgcaatc tcttcgcgga tcgcttcgca gattgcttct caccggccat gaatgatacc 2820 gataccattg atgctgctct cgtcaacact ccggctggag caattaacat gagcactcct 2880 ttcatcgaca gtgagatcgt tttatctgcc ctagcgcaac taaagccttc cttcgctcct 2940 ggacccgacg gaattccttc taccgtgctg aaacgctgtc aaacgacggt agcacctatc 3000 cttgcaaaat tgttcaatgc atcgctagcc aatggctact ttcccaaagc gtggaggaaa 3060 tcttggatgg ttcctattta caaaaagggc gacaggacag atgccatcaa ctaccggggt 3120 attacatcct tgtgtgccat tgccaaggtg ttcgaactag tgatatacaa aaatctgcta 3180 catgcatgcc gcagctacct aagtccgtat caacatgggt tcgtgccaaa aaagtcgact 3240 accacgaacc tggttgaatt tgtaacctat tgcactagtc aaattgatgc cggagctcaa 3300 gtcgatgcaa tttataccga tctaaaagca gcattcgact 
ctcttccgca cgcaattctt 3360 ctcgctaaac tcgataagct aggaattccc agcccgctcg tacagtggct taagtcgtac 3420 ctaattcatc gcacatacat cgtgaaaatt gataagcaca tgtccaaaga aatagtcagc 3480 agctcgggtg tgccacaagg aagcaacatt ggcccgcttc tcttcatatt gtttatcaac 3540 gatgttaccc tcgctttacc tcccgacagt atcagtctgt ttgccgacga cgcaaaaatt 3600 tttgcgccta ttaacaacac aggtgattgt acattcctgc aagactgcat cgaaattttc 3660 tgttcgtggt gcaagcgtaa tggactgact atctgcatcg agaaatgcta ctgtgtgtct 3720 tttagtcgat gcaggagccc agtgactggg acctacttca tggacggcac tgcagttaat 3780 cgacagaatc atgccaaaga cctgggcgtt ctgcttgact ctagtttgaa ctttaaacag 3840 catatcgatg acgttgtagc cagaggaaat caattacttg gcgtggttat ccggacaact 3900 aatgaattcc gcaaccccat gtgcatcaaa gctgtgtaca actgtatcgt tcgttcggtt 3960 ctggaatatt cgtgcgtagt ctggagccca actaccgctt cttcaattgc tcgaattgag 4020 gcgattcaac gtaagctcac gagatatgcc ctacgcctac ttccctggca ggatcgcaat 4080 aatcttcctc cgtatgctgc gcggtgccgt cttctaggcc ttgaacctct ttcggtcaga 4140 agacgcaatg cacagtgttc tttcatcgct ggattgctaa atggctctat cgactcatcg 4200 ccattgttgc atcgagtcga catctatgca ccatcccgaa cacttaggtc tagagaaact 4260 ctacggctcg ctcaaccccg ttccagtgct ggtcggtcag accctatgtt ccgcatgtcg 4320 gctgtcttca acactgtctc ggattgcttc gacttcgaca tctcaactca gtgcttcaag 4380 gaacgtctcc ggcttttgcc gtggccgcag tgaattgcga tgcaaatctt gtttttgcta 4440 tgtatttttt tttttttatt gaactgttac atcttaatta ggccatacgg ccgttgaaga 4500 ttaataaata ataataataa taataa 4526 // ID RETRO19_AG_LTR repbase; DNA; ANG; 404 BP. XX AC . XX DT 06-FEB-2003 (Rel. 4.1, Created) DT 06-FEB-2003 (Rel. 4.1, Last updated, Version 1) XX DE Anopheles gambiae gypsy-type long terminal repeat from RETRO19_AG DE retrotransposon - a consensus. XX KW Long terminal repeat; retrotransposon; GYPSY superfamily; KW RETRO19_AG_I; RETRO19_AG_LTR. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-404 RA Jurka J. 
and Drazkiewicz A.; RT "RETRO19_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 4-4 (2002). XX DR [1] (Consensus) XX CC Related to MDG1, TABOR, DM412, STALKER2, and BLOOD CC retrotransposons CC from Drosophila melanogaster. 4 bp target site duplication. XX SQ Sequence 404 BP; 166 A; 40 C; 88 G; 110 T; 0 other; tgtggcgcat acacataggc gtaatgaata taattatagt aagctatata gcagaaatta 60 tagtaagcta tctttatagg aatagttgtg attggtagaa gcgtgttagt ataggttaga 120 ataggatagc taatacgaaa agataaggat agaataggta gcataggaaa ttaggcataa 180 gacggaatta aggatacata ggaagtaaat gaattaagct aggattagat agggaaattt 240 tagtataaat agcgggatag gatttaggat agtcagataa aagttagact tcataagggg 300 aaactctgtc caattataga gaaaattata gagagtataa ataaaagtta cagttccaac 360 aaaacacccc tttttcaatt gtacactaaa agtgatatac caca 404 // ID CR1-1_AG repbase; DNA; ANG; 5401 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 13-DEC-2002 (Rel. 7.11, Last updated, Version 1) XX DE CR1-1_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW CR1 clade; CR1-1_AG; DNA/RNA-binding; endonuclease; KW Non-LTR retrotransposon; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5401 RA Kapitonov V.V. and Jurka J.; RT "CR1-1_AG, a family of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 2(11), 1-1 (2002). XX DR [1] (Consensus) XX CC CR1-1_AG is a family of CR1-like non-LTR retrotransposons. CC The CR1-1_AG consensus sequence was reconstructed based on CC multiple alignment of ~20 copies identified in the CC sequenced portion of the genome. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC CR1-1_AG occurred less than 1 million years ago. CC Integrations of CR1-1_AG have not produced target site CC duplications. 
CC The consensus sequence encodes two proteins: a 440-aa CC CR1-1_AG-ORF1p CC (positions 425-1745) and a 941-aa CR1-1_AG-ORF2p (positions CC 1746-4568). CR1-1_AG-ORF1p is a DNA/RNA-binding protein composed CC of the PDH domain (aa positions 3-57) and gag-like zinc knuckle CC regions (aa positions 334-442). CR1-1_AG-ORF2p is composed CC of the AP endonuclease and reverse transcriptase domains. The 3' CC terminus CC is composed of the CAT microsatellite. XX SQ Sequence 5401 BP; 1641 A; 926 C; 1288 G; 1546 T; 0 other; tgatgagtga ctgttgacaa gtgcagttcg ttgacaagtg tgactgttta gctctgtgtt 60 gtgacgtgaa ataagttgtg taaaatagag ctaagtgctc aaatttttag atacgtgtgt 120 agataaatgt tcgtggatgc tctcggtgtc agtgtgtttt gcaaaggtaa aagtgacctt 180 gttaaaaaaa caaaagattg gtgcggtcaa cagcgttccg ccgttggtgg cagtaaccga 240 tccttgttgt gtagataagt gtttgtaccg tttttttatc catgtatgtg tgacaccggg 300 agatacgtag agtagcatat tagtatagta gtgtttgtgt cgcttttggc cggcgtttga 360 aaccaggtaa gcgaaaaaaa acacactttt ataagacttc gcccgtattt gttttttaca 420 cggcatggag tgtttaaaat gctccgccgt ggtgggaacc agcgatgacc cgataatttg 480 ttcagggagt tgtgggttta tttttcaccg tcggtgtatt acacccacac tcaacaagcc 540 tgcggtcaaa ctaattaatg agaaccgcaa tgtcgtatat atgtgtgaca tttgtttaga 600 tcaaagcgcg ggcttggttc atatggatac tgatgcaact aaatcaaatg atttgcttgc 660 acaaacactg agggatttgg aagccaatgt gagcgtgtgg atttctagcg ctttagagag 720 aggaatcgag actctcaaaa ctgagctttg cgcgcaagtg gagcgtaagt tggaaacaac 780 tttgcgcgaa acattaagcg ctatagaagc ctcgaaaatg tcgaaggcgg ccttgcgtgc 840 aacttctgac actccgcaaa ccagcaaaac agtgcaggat gtaaatttag aaacatgggc 900 tacagtaacg aaaaaaagaa aaaggacaaa tagtggagac agcaatgttc aaactattat 960 taatagattt gacgagggaa acaataaagt tactcccaaa attaagaaaa ttaacgatgt 1020 gaaggagcct atgggaaaaa ataaagaaaa taataaaaca ctggttattg ttcctaaggt 1080 ggtgcagtct tgcgatagac caagagctga ccttagcgcc agattggatc cgaggaagca 1140 gcaattgtcg gaattccgca acggcagaga cggacaagta tatgcacaat gtcctgctct 1200 ggcgaattta gatagcatta gaaaagaagt agaagacatt ttaggagacg attattcgac 1260 atccttacct
atggcacgcg ttaaaataat tggaatgagt gaaaaatatt cttcttctga 1320 cttagtagat cttttgaaat ctcaaaatga gggaattccc tggaaacagg agaatgtaat 1380 tggaatgttt gagagtaaga tctacaagta ccagatacat aatgtggttt tggaaatcga 1440 ccatgaaact gataagtgtc tggcaaaact tgataaaatc aatattggat ttgatcggtg 1500 taaaatttct agatccattc acgttatgcg ctgctttaaa tgtggtcaat ttagccataa 1560 aagcactgac tgccaaaata aggaagcgtg ttcaaagtgc agtggcgagc accgaacgtc 1620 ggattgcacc tcgtccatcc tcaaatgtgt aaattgtgtt ttggctaaca catccaggaa 1680 cctgaaacta caggtacaac atgcggccaa tagctatgaa tgcccgctgt ttaaaaaaca 1740 ggtagagaga cgaatgcaac tttctcaata gcaggggagg ggtggaatta gggcggttca 1800 gagagatttt atatttcaat gttgccggtc tttcatctaa ctatgctatg tttcgtgaga 1860 cagtagaaaa agttcaaccc ttgttggtct tgatctctga aacccacgta accgagaagg 1920 aggcattcga gcaattttat ttaaaaggat atagggtagt gtcgtgttta tctcattcac 1980 gtcacacagg aggtgttgca gcttatgcca gaagtgacgt tgtccttaaa gtgattttaa 2040 acgagtcatt ggaaggcaat tggtttctcg gtgtagcggt ttctcggggt atgacggtag 2100 gcaattatag catattgtat cactcaccta gtgcgagtga ttcgaggttc gtagatattt 2160 tggaagaatg gttagacagg tttttggatc ttagtaagtt gaacattatc gtcggtgact 2220 ttaatattga ctggttaaat gttgaaaaat ctgcgaaact gaaaagttta atggattcag 2280 taaacatgaa ccaaaaagtc aatgaattca cacgaattgc taggcagagc aggacattga 2340 ttgatcaggt ttacagtagt attgactcaa tcaaagtcac tactgatccg ttattgaaaa 2400 tatcggatca cgaaacactt gttttgaaca taaacgatga acgttgtaaa acgattcaac 2460 ggaaagttaa atgctggaat aggtattcga aacatgctct ttgcaataat gtgtcacaag 2520 gcttgcagtg tggtgcatct gattttgatg aggctgctga cttgttatgg aacacattga 2580 aacatgcaat gagcaccttg gtggaagaaa aaacaattgt ttctagagaa actagtaggt 2640 ggtatacttt ggatctcgca cgtgctaaac ggaaaagaga caaagtgtat aaaaaattta 2700 ttagaacgaa tagagataat gattggtctg agtatactaa acttagaaac agttatagta 2760 gggatctcaa aaatagacga agcgatttct ttagcaatga aataaacaag cacaagaaaa 2820 atagcaaaga gttatggaaa gtcctcaaaa gcatgttaca acctgatgaa tcatgcgttt 2880 cagttgtaaa atttaacggt gtgattgagg ctgacgactc catcatttgc aacaagttta 2940 actcgttctt tgtgaacagt 
gttttagata ttaatcaaaa cattgcttct gtcagtgaac 3000 ctagctatta cgtagatagt gctactccac gatgccattt cagatttcag aaaattactc 3060 ttgaacaact aaaaaccatt tgtttcaacc tgacaaaaac ggcaggtata gggaatgtaa 3120 gttcaacaac catacaggat tgctatcatg tgatcggaga ggaccttctt atggtgatta 3180 atcaatcact agagagggga tgttttccga aatcatggaa agaatcattg attataccta 3240 ttcctaaagt gaacggagct gccaatgcgg aagattttcg ccccataaac atgttgcatg 3300 tgctcgaaaa ggtgctggag acagtagtta aggagcaatt ggttcagttt ctgaacagaa 3360 acgagctgtt gatccgagag caatcaggat atcggcaagg acactcttgt gagactgctt 3420 tgaatcttgt actggcgagg tggagggtgt tgatggatcg gagggaatcg atagttgctg 3480 ttttcttgga tctaaaacgg gcatttgaaa caatatctag gccattgttg ctttctacct 3540 taaggcgttt tggtattgtg gggagggagc tcagttggtt cgaaagttat ttaaaagaaa 3600 gaactcagag aactttattt ggtagctctg tatcagagcc tatagaaaac acccttggtg 3660 ttccgcaagg tagtgttctt ggaccaattt tgtttatcat gtacatcaat gacatgaaac 3720 aggttttgaa ggcttgtgag atcaatcttt ttgccgacga tactgttttg ttcatctcgc 3780 acaaagaaat caagcaagca gagtctctga tgaatatcga tttaaacgct ctggatggat 3840 ggctgaagta caaaaagctg gcattaaaca ttaacaagac ttgttacatg gtgatgtctg 3900 cgggtgtatt ggaagaacct ccatctatcg taataaattc ggaactaatc gaaagagtta 3960 gacaggctaa atacctggga gttatcctag acgacaggtt gaagttccac gctcacattg 4020 actgggtcat cgctaaagtg gcaaagaagt gtggagtgat aagtagattg gcgaaggatc 4080 tcgatttttt tgggaaagtt catctctaca aatcattgat ctcgccacac tttgacttct 4140 gctcatccat tttgtttctt ggcaacaaag gtcaaattaa aagacttcaa aggttgcaaa 4200 accggattat gaggttaatt ctggggtgcg gtcgacgtac gccgtccgcg gttatgctga 4260 atattcttca atggatgtca gtagagcagc ggattgtgta ccagaccatg acttttatat 4320 ataaaatgtt aaagggcctg ttgcctgggt acctggggga gagcatagtt cgggggtccg 4380 atatccatcg gcaccacaca cgcagggcaa atgagccgag ggtacctaac ttgcattccc 4440 aaagtgccag aaactctttg tttttcaaag ggattcaacg gtacaacagt ctaccagatg 4500 aaattaagaa tgcgagaaac ttgccggatt tcaaacgtaa gtgcgtcata tatgttgaac 4560 aaactgtata atgtgaaata tgtgtagatg tcccatgtca ttatgtaatg tgtaactgca 4620 gttgtcatca cgatctttat gatgatgata 
agatttttct ttatatacta taattaaaat 4680 tagaaaaaat ataagaaaga gtcaacatag gtttgagaca cgcgcgcgta caagtggata 4740 ttcgggatct atttggggga atttacggtt tgcacacggt ctgggaggtc acaacaggat 4800 tcactcattg atgattgtaa gtggccaatt ccagatacat taggttcacc tgcttcggaa 4860 ggtagtgccc tagagccaat cattggcact agtggaactg gccatatgca tgttgcagtt 4920 gtgttctgat aagtagtgcg ccggatccat tagttggttc ttggcgagct acgtaaggga 4980 tacattagat tctcctgctt cggaaggtag tgccctggag cctgccattg gtgccagaga 5040 ccctggggcc agcggttggc actagtggaa ctggccatat gcatgtcgca gttgtgtcct 5100 gataagtggt gcgccagatc catttattgg ttcttggcgg gctacgtaaa ggatgctagg 5160 gaataccatt gtattcgatg gagtatgacc cgttttttct tgatgaaact atgttggtca 5220 tcttgggtgt gtgtatgact cggacagttc ctctgagagt tttccgatgc caccacttgc 5280 tcgtacaaaa tattttaata tacctatcag agtaatatta tcgtaaagat acttccgtcc 5340 ttctcaaacc tatgttgggg aaagaggtgg gacttatcat catcatcatc atcatcatca 5400 t 5401 // ID GYPSY9-I_AG repbase; DNA; ANG; 5973 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY9-I_AG is an internal portion of retrotransposon GYPSY9_AG - DE a consensus sequence. XX KW 4-bp TSD gag; AP protease; GYPSY9-I_AG; GYPSY9-LTR_AG; KW Gypsy clade; LTR retrotransposon; RNase-H; integrase GYPSY9_AG; KW mdg1 lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5973 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY9_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons."; RL Repbase Reports 3(9), 177-177 (2003). XX DR [1] (Consensus) XX CC GYPSY9_AG is a family of gypsy-like LTR retrotransposons CC that, according to the aminoacid sequence of its ORF2, is CC phylogenetically grouped with Drosophila representatives CC of the mdg1 lineage. 
CC GYPSY8_AG, GYPSY10_AG, GYPSY11_AG, GYPSY12_AG, GYPSY13_AG, CC GYPSY14_AG, CC GYPSY15_AG, GYPSY16_AG, and GYPSY17_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY9-I_AG consensus was reconstructed after multiple CC alignment of 12 copies. CC The consensus encodes the 458-aa GYPSY9_AG1P gag-like protein CC (pos. 890-2263) and the 1175-aa GYPSY9_AG2P (pos. 2215-5739). CC The sequence of the LTRs flanking GYPSY9-I_AG is deposited as CC GYPSY9-LTR_AG. XX FH Key Location/Qualifiers FT CDS 890..2263 FT /product="GYPSY9_AG1p" FT /translation="KQYEKGMITISDKIEILKYIFKKIQDPNQRTCTRTQL FT RIKAEETFKEIQNDIEKNRYKYTFNKLLEFSKISNALIQNIIAMSTSKTND FT DSRNKSSDFSTNSSEDKNINLTLLTTSRLSFKLLAQIIFVLLKLHKKIKMA FT NFDIKTATGLVPTYDGSPDTFNAFEDASTLLFELNPNHEEMLVKFIRTRLT FT GKARIGLPSNITTFNELISDIKRRCEEKTTPDKVIAKLKSIKTRDAQSICN FT EVELLSEKLKVIYLKQGIPEKIANDMAIKTGIDTLKEKVANSETKILLKAG FT TFATITDATQKVMENESEETNRTTNVLNINTQRYNRNYPRNQGNQNRNNQN FT NYGPNTNQPYRNYYNNRNNNNFNNRNNNNSSYNNNLRNNIRYNNRFIQGYN FT NERYNSNFNGNQNRPNKNLRAITNGETGQQQRNIYHTTAQIQEVEENNFLD FT QQESTQTLDQFTH" FT CDS 0..0 FT /product="GYPSY9_AG2p" FT /translation="FFRPTGVNSNPRSIYSLELNGVDYIKIRLSIANNETS FT ILLVDTGASISLXKASKLKKNHGPIRSDSISLTGISNTPIYSKGSTTCTIY FT FNNLELEHDFVLVPDEFNIGADGILGRDFYKLYRCSINYHLEMLTFTCQGE FT EIQHNIEEDDGKGFILPIRSEVVRKVYLPNITEDTIVFAQEIQPGVFCGST FT IISKDNQLIKFINTRQRNIYITNAEFKPITEPLANYEVKQVNNKVGEINND FT RLQKLLQKIKVDKIPTTEIYNLKKIVTEYNDIFCVEDDPITTNNFYPQKIE FT LKDNIPTYIPNYKQIYSQADEIQNQVDKMLKNDIIEHSVSPYNSPILLVPK FT KSTDDNKKWRLVVDFRQLNKKVIPDKFPLPRIDTILDQLGRAKYFSTLDLM FT SGFHQIPLENDSRKFTAFSTGSGHYQFKRMPFGLNISPNSFQRMMAIAMAG FT LTPELAFVYIDDIIVTGCSARQHISNIVKVFDRLRSYNLKLNPEKCSFFKT FT EVTYLGHKITDKGIYPDDSKFETIRNFPLPKNADEVRRFVAFCNYYRKFVQ FT NFAKIAKPLNNLIKKNVKFIWTDECQKAFNSLKESLLSPAVLKYPDFTKEF FT ILTTDASDVACGAVISQITDGKDHPIAYASKSFTQGEKNKPIIEKELTAIH FT WAINYFKPYLYGRKFTVKTDHRPLVYLFGMKNPTSKLTRMRLDLEEFDFKI FT EFLAGKTNVVADALSRIITDSDELKASIPKNKSDISNPILMVNTRAMIKKN FT NVVENKENKKKKEDIIEEYNPTNMYETDRPSDTTKMLKMKSNINEKDEFIE FT 
LVIYNHNYYKALGKFKIPITSAKQSQTLEFVLHESCLIARKYHKNLAIASN FT DRLFEFYSMSTIKEIINKMITDVHIVVFTPPRLVEDTEEQIKIMSNFHTTP FT SGGHIGQFKLYSKIKDKFKWKNMKADIIKYVKSCKACATNKILKHTKEKTV FT VTTTPSKPFNIITIDTVGPLPKTANNNRYAVTIQCALSKYIVIIPIQNKEA FT NTIAKALVENFILTFGNFLEMRSDQGLEYNNEILAQISKILEIKQTFAAAY FT HPQTIGSLERNHRSLNEYLRSYTNEHHDDWDQWTKFYEFVYNTSVHSTTNF FT TPFELIFGRQANLPQELYKTKVELLYNIEQYYNEMKFKLQKSHEIAKQKLI FT QAKLQRQAKLNENINK" XX SQ Sequence 5973 BP; 2513 A; 996 C; 950 G; 1514 T; 0 other; tggcgaccgt gaccttaatc tgcaatcgac aggcagaagt gtaagcaaat tatttcaagt 60 aaaatgtgat acgtacaacg gtaacgtaaa gcttaagtgt cgtgtgacga accaaaagta 120 ttgtgaccat aaccggcatt cggcggaaag tgaaaaaaaa agtgtcaaat gtgcaatttt 180 tttttacgga acattcatca acgcagctgg cgcaagtaac ccatatgtaa acaaaagcaa 240 taacgatgaa agaacagcta caagcaagtt tcgtacagtg cagaatgaaa aataaaaaca 300 tttagtgcag tgtaagtgcg aaatgaatgt ttaattgatg aacaaacatt aaccgtacac 360 agtgttgtga gaacctacct aaatgtgtaa aaacgtatta acgctcatgc gatcggttta 420 cctagtagag taaccttgaa atgggtaaac tatagaaaaa agcacacttt gttaaaaagg 480 gttgttgtta cccaagttca gcacagtaca gtatagcaat ctatacgatt cttggagcaa 540 gtgatatgga atgataaagt gcagtgcaaa acaagctgtg gtcagtgtga ttgaatggaa 600 caaccgttcc caccgtggaa gacgaaaaga aaaaccgagc ccactgccta atctggattg 660 gcagacgaga atgaacaagt gtcccaaccc cgacaagaca tcgagcgaca actgatgaca 720 tcgtggaaat tcacaccaag caccgattaa actgcaagca ccggtataaa tacagcgtag 780 aatcagcagc aacaaaatgc acgtcgcata ttccattaag gtactgtaac tttaaaatat 840 ggtcaaatgg gttttcaaag ctaggaaata ggaaatagga agtaattgaa aacaatatga 900 aaaaggaatg attactataa gcgataaaat cgaaattttg aaatatatat ttaaaaaaat 960 acaggatcct aatcaaagaa cttgtacaag aacacaacta agaattaaag ctgaagaaac 1020 atttaaagaa attcaaaatg acatcgaaaa aaatagatac aaatatactt ttaacaaact 1080 attagaattt agcaaaattt ctaatgcatt aatacaaaat atcattgcta tgagcacatc 1140 aaaaaccaac gacgattcac gtaacaagtc atctgatttt tctacaaata gctcagagga 1200 taaaaatatc aatttaacac tgttaacaac tagtagatta tcatttaaac tcttagctca 1260 gatcatattt gtcttactaa agctgcataa aaaaattaaa atggctaatt 
ttgatattaa 1320 aacagccaca ggtcttgtac ccacatatga cgggtcaccc gataccttta atgcatttga 1380 ggacgcttca acgctattgt tcgagttaaa tccaaaccat gaagaaatgc tcgtaaaatt 1440 cataagaacc agactgactg gtaaggctag aatagggtta cctagcaata taacgacttt 1500 taatgaatta atcagtgata taaagagaag atgtgaagaa aaaacaacac ctgataaagt 1560 aatagcaaaa ctaaaatcca taaaaacaag ggatgcacaa tcaatttgca atgaggtcga 1620 actactttca gagaaactga aggtaattta tcttaaacaa ggaattccag aaaaaatagc 1680 taacgatatg gcaataaaaa caggtatcga cacacttaaa gaaaaggtag caaactctga 1740 aactaaaata ctgttgaagg caggtacttt tgcgacaata accgatgcaa cacagaaagt 1800 aatggagaat gaaagtgaag aaacaaacag aactactaat gtccttaaca taaatacaca 1860 acgctacaat agaaactatc ctagaaacca aggtaaccag aataggaaca atcaaaacaa 1920 ctatggtcca aatacaaacc aaccataccg aaactactat aataacagaa ataataacaa 1980 cttcaacaat cgaaataata acaatagcag ctacaacaac aatctgcgaa acaatattcg 2040 ctataacaac agatttattc agggatacaa taatgaaaga tacaactcca acttcaatgg 2100 taatcagaac agaccaaaca aaaacctaag ggcaataaca aatggagaaa cagggcaaca 2160 acaacgaaac atataccata ctacagctca gatacaggaa gtagaggaaa ataatttttt 2220 agaccaacag gagtcaactc aaaccctaga tcaatttact cattagagct aaatggagtt 2280 gattatataa agattagatt gagcattgca aataatgaaa catcaatatt actagttgat 2340 acaggagcat caatttcttt attcaaagca agcaagttaa agaaaaatca tggcccaata 2400 cgttcagact caatatcatt aacagggata tctaacacac caatttattc aaaaggaagt 2460 acaacatgca ctatatattt caacaatcta gaattggaac acgattttgt attagttcca 2520 gatgaattca acataggagc agatggtata ttaggcagag acttttacaa actatacaga 2580 tgttcaatta attatcattt ggaaatgctt acattcacgt gtcagggaga agaaattcag 2640 cataacattg aagaagatga tggaaaagga tttattttac caatcagaag tgaagtggta 2700 cgtaaagtat atcttccaaa cattacagag gataccatag tcttcgcaca agaaattcaa 2760 ccaggagtat tctgtggcag cactattatt tcaaaggata atcagttaat taaattcatc 2820 aatacaaggc agagaaatat ttacataaca aacgcagaat ttaaacccat tacagaacca 2880 ttagcaaatt atgaagtgaa acaagtgaat aacaaagtag gagaaattaa taatgataga 2940 ttacaaaaac tattacaaaa aattaaagta gataaaatcc ccactacgga aatttataat 
3000 ttgaagaaaa tcgtaactga atataacgat atattttgcg tagaggatga tcccattact 3060 acaaataact tttaccctca gaaaattgaa ttaaaggata atattcctac gtatataccg 3120 aattataaac aaatatactc acaggctgat gaaatacaaa atcaggtaga caaaatgctt 3180 aagaatgaca taattgaaca ctcagtctcc ccatataact cacctatatt attggttcca 3240 aagaaatcaa cggatgataa taaaaaatgg agacttgtag tagatttcag gcagctaaac 3300 aaaaaggtta taccagataa atttccatta ccaagaatag atacaatatt agatcagcta 3360 gggagagcaa aatattttag cacccttgat ctcatgtcag gattccatca aataccttta 3420 gaaaatgatt caagaaaatt tacagctttt tcaaccggat cagggcatta ccaatttaaa 3480 cgtatgccgt tcggtttaaa cattagcccc aacagctttc aacgcatgat ggctatcgct 3540 atggcaggat taactcctga gctagcattt gtatatatag acgatataat cgttactggc 3600 tgcagtgcac ggcagcacat cagtaacata gttaaggttt ttgataggtt aaggtcttac 3660 aatttaaaat taaatccaga aaaatgttca tttttcaaaa cagaagttac ttatttaggt 3720 cataagataa cagataaggg tatataccca gacgattcta agtttgaaac aattagaaac 3780 ttccctttac ctaaaaatgc agatgaagta cgaagatttg ttgcattttg taattattat 3840 cgtaaatttg tacagaattt tgcaaaaatt gctaaacctt taaataacct aattaagaaa 3900 aatgtcaagt ttatttggac agatgaatgc caaaaagcat ttaatagttt aaaagaaagc 3960 cttttgtctc cagcagtctt gaagtatccg gattttacta aagaattcat actaacaact 4020 gatgcttcag atgtagcatg tggagcagtt atttctcaaa tcacagatgg aaaagatcac 4080 ccaatcgcgt atgcaagcaa aagcttcacg caaggagaaa agaataagcc tatcatagaa 4140 aaggaactta cagcaattca ctgggcaatc aattatttca aaccatatct atacggcaga 4200 aaattcaccg taaaaacaga tcatagacca ttagtatacc tgtttggtat gaaaaaccca 4260 acatctaaac taacaagaat gagactagat ttagaagaat ttgattttaa aatagaattt 4320 ttagcaggta aaaccaacgt agtagcagat gctttatcca gaattataac tgattctgat 4380 gagcttaaag catcaattcc aaaaaataaa tcagatatat caaatcctat tttaatggta 4440 aatactagag ctatgataaa gaaaaacaac gttgtagaaa acaaagaaaa caaaaagaaa 4500 aaggaagata taatagaaga atataatcca acaaacatgt atgaaacaga caggccatct 4560 gatacaacta aaatgttgaa aatgaaatca aacattaacg aaaaagatga atttattgag 4620 ctcgtgatat acaatcacaa ttattataaa gcgctgggaa aatttaaaat acccataact 4680 
tctgcgaagc aaagtcaaac actagagttt gtactgcatg aatcatgcct aatcgctaga 4740 aaataccaca agaatttagc aattgcatcg aatgacagat tattcgaatt ttactcaatg 4800 tcgaccataa aagaaataat taacaaaatg ataacagacg ttcacatcgt cgtatttaca 4860 ccacctagac tggtagaaga tacagaagaa cagatcaaaa taatgtccaa tttccacaca 4920 acaccatcag gaggtcatat aggacaattc aagctgtata gtaagataaa ggataaattc 4980 aaatggaaaa atatgaaagc ggatatcatc aagtatgtaa aaagttgtaa agcatgcgca 5040 actaataaga tcttaaaaca tactaaagag aaaactgttg tgacgacaac accctctaaa 5100 ccttttaaca tcataacgat tgataccgta ggtcctttac caaaaacagc aaataataat 5160 cgatacgcag ttacgatcca atgcgcatta tcgaaatata tcgtaatcat cccgattcaa 5220 aacaaagaag caaatacaat agcaaaagca ttagtagaaa attttattct tacatttgga 5280 aactttttag aaatgagatc agatcaagga cttgaatata acaatgaaat tttggcacaa 5340 atatcaaaaa tattagaaat caaacaaaca tttgcagcag catatcatcc acaaacaata 5400 ggatctttag agcgaaacca tagaagttta aatgaatatt tacgaagtta caccaatgag 5460 catcatgatg actgggatca atggactaaa ttttatgaat tcgtttataa cacatcagta 5520 catagcacaa ctaatttcac acctttcgaa ttaatatttg gcagacaagc taatttacca 5580 caagaattat ataaaacaaa agtagagcta ttatacaata tagaacaata ttacaatgaa 5640 atgaagttta aattacaaaa atcacatgaa attgcaaaac aaaaacttat tcaagcaaaa 5700 ttacaacgtc aagcaaaatt gaatgaaaat ataaacaaat gaaacatacg aataggagat 5760 tatgtctatt taacaaacga aaatagaagg aaattagatc cagcttatat aggaccattt 5820 acagtcttag aaattacaaa tacaaactgc gttataaaac acaatcaaac aggaaaaacc 5880 acaacagtac ataaaaacag attaaaacat ttttagtgaa taatgcactc tttcgtataa 5940 actcaatcgt acattattca aaaagggtgg agg 5973 // ID GYPSY2-I_AG repbase; DNA; ANG; 5178 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE GYPSY2-I_AG is an internal portion of the GYPSY2_AG LTR DE retrotransposon - a consensus sequence. XX KW 4-bp TSD; AP protease; GYPSY2-I_AG; GYPSY2-LTR_AG; GYPSY2_AG; KW Gypsy clade; LTR retrotransposon; gag; integrase; KW reverse transcriptase. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5178 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "GYPSY2_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 75-75 (2003). XX DR [1] (Consensus) XX CC GYPSY2_AG is a family of gypsy-like LTR retrotransposons. CC GYPSY2-I_AG, an internal portion of GYPSY2_AG, is flanked by CC GYPSY2-LTR_AG LTRs. The GYPSY2-I_AG consensus sequence was CC reconstructed based on multiple alignment of 20 copies; they are CC ~1% divergent from the consensus sequence. CC The consensus sequence encodes the 1408-aa GYPSY2_AGp polyprotein CC (exons 846-1789 and 1874-5156, predicted by FGENESH) CC composed of the putative gag-like (pos. 1-300), AP protease CC (pos. 325-410), reverse transcriptase (pos. 514-681), CC and integrase (pos. 1030-1200) domains. XX FH Key Location/Qualifiers FT CDS 0..0 FT /product="GYPSY2_AGp" FT /translation="MLHSPPVRDVSTPDGVTPSADPAASGSKSPHVPTPPV FT PNTPRVPGPSACDAMFMPPESQIDTLNAMQLKPPEMDTTDIQTFFFALENW FT FDAWNITTNQHIRRFNILRTRIPLRVLPELRPLLENIRQYATDRYEVAKRA FT IIEHFEESQRSRLHRLLAEMNLGDRKPSQLLAEMRRAANGAMTDSMLVDLW FT IGRLPPYVQSAVIATNTDTNDRAKVADSVMDSFALYHRTGPYQTIHEVRNE FT DFERLSRHVTELGQRLDAVLSKLNERERARPRSRTRQRQPNQDAVTPSGHC FT YYHTQYGQAARNCRAPCSFNNRRYRLVITDPKTNIKFLIDTGADVSVIPRQ FT HSSVPSKPSTMKLFAANSTPIQVYGESLYTLDLGLRRSFLWNFIIADVGTA FT IIGADFLQHFHLLVDLRKKCLVDALTNVRSTGVPSQNPSEPTVKVCDSTSP FT IATLLKEFPGLTALSTPGTLLQSEVTHRIETTGQPTFARPRRLPPEKYAAA FT RKEFESLVQLGVCRPSNSSWASPLHMTKKADGTWRPCGDYRALNAKTVPDR FT YPLPFLQDFTMHLQDKIIFSKVDLHKAYHQIPIHPDDIAKTAITTPFGLYE FT FTTMPFGLRNAAQTFQRLIHDVLRGLEFVFPYIDDMIVASTSEAEHHEHLR FT QLFERLEKHQLAINPAKCEFYRNEISFLGHLVNASGIRPLPDRVQAISELP FT QPTTIMELKKFLAMINYYRRFLPHALETQGILLEMTPGNKKKDRTPLTWSL FT EASEAFAQCKEQLKRATLLAHPVKNAELSLWTDASDFAAGAVLHQRTNEDL FT QPLGFFSKRLEKAQQKYSTYDRELTAIYLAIRHFRYQLEGREFCIYTDHKP FT
LTFAFRQTHDNASPRRARQLDFIGQFSTDIRHIAGKDNVTADLLSRIETVH FT ATPTIDYERLAEEQERDPELSDILSGKIQTDLFLQKTPIPGSPKSLYADCP FT GGIIRPYITRSFRTQLLHAVHDLSHPGARATARLITERFVWLNARKESQDF FT ARNCLACQRAKVGRHVKSPLIPYPATTARFSHINVDIIGPFPISNGNRYCL FT TIIDRFTRWPEAIPISDITASTVVSALLFHWIARFGVPAHVTTDQGRQFES FT SLFKELTKALGTKHIRTTAYHPQANGIIERWHRTLKAAITCKDTARWSEHL FT PLILLGLRTTFKNDINASPAELVYGTTLTIPAEFFIAKPQNALADQSDFAK FT TLEETMSSIRPQSTAWHTNRTPFVHSDLNKCTHVFIRDDTVRPALTTPYHG FT PYKVLTRNPKSFQILLRGQPTLVSIDRLKPAYGAEEEATPAPQCSWEGLTT FT NLLPPTTDHSETLPLPDVQANSDRRDATAASKPTSREQPVRNQTTPAPPSH FT PTTSRQTDRAAVDAPPPSILRRNDQTVSTGVTRSQRKVIIPLRYR" XX SQ Sequence 5178 BP; 1352 A; 1631 C; 1174 G; 1021 T; 0 other; actggtgacc ccgacgtgat cgcgtgcgcg agtgagtgag tggtaacctg acgaacaccg 60 tgtccagccg agaaaaaacg tgtttccatt gttccacggt ccggaccgac ggcaacgttc 120 ccccccatca tcgaggagcg gccgaccacg aaggaggcac cacgcaagcg cagccagcga 180 aaaaaaaccc cgtgcacaaa ccccgaaccc acgtgagtgc aaatcgacac cgaaggtggc 240 cgacagtgag gaacactgtt caggaacatt tttacccgac ggagcgaccg atcctagcgg 300 aaaagtttcc tctcggtgct gagcgatcgc cgaacatttt gctgacacac cccgcgccgt 360 gtcgcacacc cgccgagcat tttggtaccc gtacgtgttt gcgcacccgc cgatcataac 420 ctcacacgta ccgccgagcg cgctccagac ccacgcggtt tttgtgtgtg caccgtgtgt 480 gtgtgtgtgt gtgtgggtga atgtgcgcag gccgacgccg agcggattgc gtcagaattt 540 tgctcgagct acgttcgtca tttttttcga ccgtgcaccg aagacgtcgt cagcgcacgc 600 agccatcgtt ctcttctcgc cgacaccacc gaccgaacgc caccgaagat catcgcccct 660 cgtttctcac accaccggcg tcatcgacga acgcagccaa cgagcgacta atcctaacac 720 gatcgaccgc gtgtgcggat ttttcgtcgc cgaaggatcg acctagccaa cctccagctg 780 gacttgcttg cgcccccgcc actaaggtaa gatccaccct tttttaacta accttagtcg 840 taaggatgtt gcacagtccg ccggtccgcg acgtatcgac tcccgatggc gtaaccccga 900 gtgccgatcc agccgcgagt ggatccaaat cgcctcacgt accaacaccg cccgttccga 960 ataccccgcg cgtaccaggg ccgtccgcct gcgacgccat gtttatgccg cccgaatcgc 1020 agattgacac tttgaatgcc atgcagctga aaccaccgga gatggacacc actgacattc 1080 aaaccttttt cttcgcattg gaaaactggt tcgatgcgtg gaatatcacc acgaaccaac 1140 
atattcgccg ttttaacatt cttagaacgc gtataccgct tcgtgtcctt cctgagcttc 1200 gccccctgtt ggagaacatt cgacagtacg ctacggaccg ttacgaggta gcaaagcgtg 1260 caataattga gcactttgaa gagtcgcaac gaagccgctt gcatcgtctg cttgccgaaa 1320 tgaacctcgg ggaccgaaaa ccatcgcagc tattagcgga gatgcgccgc gccgcaaatg 1380 gagcaatgac ggactctatg ctggtagatt tgtggatcgg ccgtctcccg ccatacgtcc 1440 agtccgccgt tattgccact aacacggata ccaacgatcg agctaaagta gcagactctg 1500 ttatggattc gttcgcgtta taccaccgaa cgggcccgta ccaaaccatc cacgaagtac 1560 gcaacgagga cttcgaacgt ctttctcggc acgtaacgga attaggtcag cgcttggacg 1620 ccgtactgag caagctcaac gaacgagaac gcgcgcgacc acgctcacgt acccggcaac 1680 gtcaaccgaa ccaggatgcg gtaacaccca gcggacactg ctattaccac acgcagtacg 1740 ggcaagcagc gcggaactgt cgtgccccct gctccttcaa caatcggcgg cagggtagta 1800 actcggccac tgcttccgat tgacgcttaa ccagaggcca acctcaacag atacacgtac 1860 tttcgaccca tagctatcgt ctcgtaataa ccgatccaaa aactaacatc aaattcttaa 1920 tcgataccgg tgcagacgtt tcagtaatcc ctcgacaaca cagttccgtc ccgagtaaac 1980 cctccaccat gaagctgttc gccgctaatt ctacaccaat ccaggtttac ggagagtcgc 2040 tctatactct cgatttggga cttcgccgat ctttcctttg gaacttcatc atcgcagacg 2100 tggggacagc gattattgga gccgattttc tccaacattt ccatctgctc gtggacttgc 2160 gcaaaaaatg tcttgtcgac gccttaacga acgtacgttc taccggagtg ccgagccaaa 2220 acccgtcgga accaaccgta aaagtatgtg attccacctc accgatcgcc actctcctaa 2280 aggaatttcc cgggttaact gcactatcca ctcctggcac cttactgcag tccgaagtga 2340 cgcaccgaat cgaaacgacg gggcaaccaa cattcgcaag acctcgccga ttaccacccg 2400 aaaagtacgc agctgcccgc aaagagttcg aatcactcgt ccagctcgga gtgtgccgcc 2460 cctcgaatag cagctgggcc agcccgctac atatgacaaa aaaggccgac ggcacctggc 2520 gcccttgtgg tgattaccgc gccctaaatg caaaaaccgt acccgaccgt tatccactac 2580 cgtttttaca ggacttcacg atgcatttgc aagacaagat catattttcc aaggtcgatt 2640 tgcacaaagc ataccaccag ataccaattc atccggatga tatagcgaag acagccatca 2700 cgacaccctt tggactttac gagttcacta ccatgccttt cggattgagg aacgcagcgc 2760 aaacattcca acgccttatc catgatgtcc tacgaggact cgagtttgtt ttcccgtata 2820 tcgacgatat 
gatcgtagca tcaacgtccg aggcagaaca ccacgaacac ttacgccaac 2880 ttttcgaacg attggagaag caccaactag ccatcaatcc agccaagtgc gagttctacc 2940 ggaacgagat ttcctttctg ggccatctgg tcaacgcttc tggtattcgt cctctccccg 3000 atcgagtcca agccatcagc gagctgccac agccaacgac gattatggag ttgaagaagt 3060 tcctcgccat gataaactac taccgacgtt ttctgccgca cgccctggaa acgcaaggta 3120 tacttctcga gatgactcca ggtaacaaaa agaaggacag aacgccatta acctggtcgc 3180 tagaagcttc cgaagcattc gcccaatgca aagagcaact gaaacgtgca acgttattgg 3240 cacatcccgt gaagaacgcc gaactttctc tatggaccga cgcttcagat ttcgcagccg 3300 gagccgtact tcaccaacgc accaacgaag acctgcaacc actaggcttc ttctcgaaac 3360 gtctcgaaaa ggcacagcaa aagtactcga cctatgaccg agaacttacc gccatctatc 3420 tcgccatacg acacttccga taccagctag agggtcggga attctgtatt tatacagacc 3480 acaagcctct aaccttcgcc ttccgacaaa cgcacgacaa tgcctcacct cgacgagccc 3540 ggcagttaga cttcattggc cagttttcca ccgacatccg tcacatcgcc ggaaaagaca 3600 acgttacagc cgatctgctc tcccgcatag agacagtgca cgcgacaccg accatcgatt 3660 atgagcgatt agcagaagaa caagagcgcg accctgaact ttccgacatt ctcagtggga 3720 aaattcagac ggacttgttc ctgcagaaga caccaatacc gggaagcccc aagtcactct 3780 acgccgactg ccctggaggt atcatcagac cgtacatcac ccgatcgttt cgaacacaac 3840 ttctccacgc cgtacatgat ctcagtcatc ccggagcccg cgccacagct agactaataa 3900 cagagcgttt cgtgtggctc aatgcaagga aggaatccca ggacttcgct cggaactgct 3960 tagcctgcca gcgcgctaag gtaggaaggc acgtcaaaag ccccttgata ccgtaccctg 4020 caacaacagc gaggttcagt catatcaacg tagacatcat tggaccattt cccatcagta 4080 acggtaaccg atactgcctt acgataatcg accgatttac tcgctggcca gaagcaatac 4140 cgatctcgga tatcaccgca tctaccgtcg tatcagcact actattccac tggatcgccc 4200 gattcggagt tccggcgcac gtaacaacgg accaagggag acaattcgaa tcctccttgt 4260 tcaaagagtt gacgaaagcc ctaggaacga aacacatccg tacgacagcc tatcacccgc 4320 aggcaaatgg aataatcgag aggtggcacc gcactcttaa agcagcaatc acctgcaaag 4380 acaccgcaag atggagcgaa cacctaccgc taatactgct tgggctacga accacgttca 4440 aaaatgacat caacgcctcg ccagccgaac ttgtgtatgg aacgacgttg accatcccgg 4500 cagaattctt catcgcgaaa 
ccgcaaaatg ccctcgccga ccaatccgac ttcgccaaaa 4560 cgttagagga gacgatgagc agcattcgac cacagagcac cgcttggcat accaaccgca 4620 caccgttcgt gcattccgat ctgaacaagt gtactcacgt gttcatacgc gacgacaccg 4680 tccgacctgc actaactaca ccttaccacg gtccatataa ggttcttaca cgcaatccta 4740 agtcttttca gatactccta cgtggacagc caacgctggt ttcgatcgac cgcttaaaac 4800 cagcgtatgg cgcagaagag gaagccaccc cggccccgca gtgctcgtgg gaagggctaa 4860 cgacaaacct gctgccgcca acaaccgacc actcggaaac tctgccgtta ccggacgtcc 4920 aggcaaattc ggaccgcaga gacgccaccg cagcctccaa accgacgtcg cgcgaacaac 4980 cagtgcgtaa tcagacgaca cccgcaccac catcgcaccc gacgacatcg agacaaaccg 5040 accgagccgc cgtcgacgcc ccaccaccct ccatcctacg ccgcaacgac cagacggtat 5100 cgaccggcgt caccaggtct cagcggaagg tcatcatacc tctacgttac cggtgacacc 5160 gctctaggag gggagtac 5178 // ID COPIA2-I_AG repbase; DNA; ANG; 4232 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE COPIA2-I_AG is an internal portion of the COPIA2_AG LTR DE retrotransposon - a consensus sequence. XX KW LTR Retrotransposon; Transposable Element; 5-bp TSD; COPIA2-I_AG; KW COPIA2-LTR_AG; COPIA2_AG; Copia clade; Salto 7; integrase; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 578-2937 RA Parkes J.R., Warren M.A. and Crampton M.J.; RT "Salto, a Ty1-copia group retrotransposon from the malaria vector RT Anopholes gambiae."; RL Direct Submission to Genbankac. AF93,. XX RN [2] RP 1-4232 RA Pavlicek A., Kapitonov V.V. and Jurka J.; RT "COPIA2_AG, a family of copia-like LTR retrotransposons from RT African malaria mosquito."; RL Direct Submission to Repbase Update (31-MAR-2003). XX DR [2] (Consensus) XX CC COPIA2_AG is a young family of autonomous copia-like LTR CC retrotransposons. 
CC COPIA2-I_AG, an internal portion of COPIA2_AG is flanked by 99% CC identical COPIA2-LTR_AG LTRs. The COPIA2-I_AG consensus sequence CC was reconstructed based on multiple alignment of 3 copies. CC The consensus sequence encodes the 1348-aa COPIA2-I_AGp protein CC (positions 189-4232). Partial protein (ac AAL55241.1) previously CC submitted to GenBank by [1]. CC MDFSKVGVIRRNNRNYRSWAFKVQMLMMREGTWTYVDPGVAPTPVTPEWTEGDSKARATIALLVEDNQHN CC LIMTKNTAKETWDALKAHHHKATLTGKVSLLKEICNANYREGENMEDFLYGMEDHYSRLENSGEKLSANM CC QVAMILRSLPKAFDALTTALESRSDKELTMDLERAKLIDESEKLYGGKVQEERVLKAKSEAKPGACFFCG CC QPGHKKRECKEFLNRKSSGEGEKKKKIKPNKEQQVKTVRENDASSFTFMVRQPEIRGNDRSWLIDSGASS CC HMCSDKSAFTVMEQSLRSNVTVADGSENRVEGVGDCLIKCAVEYGEIIEITLRGVLYVPTLEGNMISIGK CC LAEKGVRAVFDNTGCKLVYGNTVVAVADKVSDMYWLRIAQDRVMKSVVKEHTKNCQHTWHRRLGHRDPAV CC IGEMKRRDLVSGLEVVDCGIRWTCECCIECKMARSPFPPVAEKTSTEVLDIIHSDVCGPMEETTLGGCRY CC YMTLIDDHSRYTFVYFLKKKSEAEDKFREHVKLVQNQFGRKPRIIRSDQGGEYSNKALRKFCADEGIKME CC FTAAYSPQQNGVAERKNRSLTEMGRCMLRDAGMHKRFWAEAINTTCYLQNRLPSAAVERTPFEIWFGRKP CC DLTNLRLFGCVGYVLIPSVKRKKLDVKAERMTFVGYSGEHKAYRMLNTQTGEIQISRDVRFLEIDDGSKE CC QTYGDPKIEDNPTESVEIEWSLDETKREAKTNVANDTISESEFYGWDCSDDGWPRGFWNDNDNNWLRGLW CC DDDDAEAGAAMPEAVLDAVPEAMPAAVPEVSNTPVRRLQRVTAGVPPARYDEEVYLVKESVAEPKTYKEA CC VSGPQSAEWKIAMAEEMQSHQENGTWELAELPPHRKAIGSKWIFKCKADEDGHFVRYKARLVAQGFCQKF CC GTDYDLVFVPVVKQITFRTMLVLASKRKMLTKHVDIKTAYLHGLLKKEIFMRQPQGFESDNPNEVCRLHR CC SIYGLKQAARVWNTKIDDVLKTMGFIQSTADPCLYIREKAGKSIFVLIYVDDVIVICNTEEEFSEVVHVL CC TLNFTISVMGNLRFFLGIRIRRNDGRYCMDQRAYLERVLERFGMLDAKPSKFPMDPGFLKRKEENGRKLD CC SPKAYQSLIGALLYAAEISRPDIAIATAILGRRVQDPSEADWNEAKRILRYLKGTLDSVLYLGSGGQKLE CC CFVDADWAGDESDRKSNSGFVFKFGGGLIGWGCHKQKCVALSSTEAEYVSLAECLQEVKWILKLMADVGE CC QLDGPVLVNEDNQSCIALTKGDRAERKAKHIDTKFNFVEDMVRDGIVKLQYCPTEHMQADLLTKPLQAVK CC LRQLREAIGIKPFSVEEE. 
XX SQ Sequence 4232 BP; 1188 A; 859 C; 1239 G; 946 T; 0 other; ggttatgggc ccagctctgt gtggccagtt caattgaaaa gtgcgcgacg cggttcggaa 60 agacagttat tttttcggtg tgaaaaaatt aggacatttc cggaaggtac gcagtgcggt 120 taaggaatcg tgtgtttttt cgtggtgaaa gaaaaaaccc aaccgggaag gtttttgcac 180 gggcaaaaat ggatttttcg aaagtgggcg tcatccggcg gaacaaccga aactatcggt 240 cgtgggcttt caaagtgcag atgttgatga tgcgggaggg tacgtggacg tacgttgacc 300 cgggtgtcgc gccgacaccg gtaactccgg agtggacgga gggtgattcg aaggcgcggg 360 cgaccattgc tttgttggtt gaggataacc aacacaatct catcatgaca aagaacacag 420 cgaaagagac atgggatgcg ctcaaggcac accaccacaa agccactctt accgggaaag 480 tttcgttgct gaaagagatt tgcaacgcaa actatcgtga aggtgagaat atggaagatt 540 ttttatacgg catggaggat cattattctc ggctggagaa ttcgggtgaa aaactctcgg 600 cgaacatgca ggtggccatg attttgcgga gccttccaaa agcatttgac gcacttacca 660 cagctttgga aagtcgttca gataaagagc taacgatgga tcttgagcgg gcaaagctga 720 tcgacgaaag tgagaagctg tacggcggaa aggtgcagga ggagcgagtg ctgaaggcga 780 aaagtgaagc aaaaccaggc gcgtgtttct tttgtggtca acctggccat aagaaacgag 840 aatgcaaaga gttcctgaat cggaagagca gcggggaagg tgaaaagaag aaaaagatta 900 agccgaataa agaacaacaa gtgaaaacag tgcgcgaaaa cgacgcaagt tcgttcacgt 960 tcatggttcg tcagcctgaa attcgcggta acgatcggtc gtggctaatc gactcgggtg 1020 caagttcgca catgtgtagt gacaaaagcg cgttcacggt aatggaacaa agcttgcgtt 1080 caaatgttac cgtcgcggat ggcagcgaaa atcgcgttga aggcgttggc gattgcctga 1140 tcaagtgtgc ggttgaatac ggtgaaataa ttgaaatcac gctacggggt gtgttgtatg 1200 ttcctacgct ggaaggaaac atgatttcaa tcggtaaact cgcggaaaaa ggtgtgcgtg 1260 cggtttttga caacaccggg tgcaagctcg tttacggaaa tacggtcgtc gcggtcgcgg 1320 ataaagtgag cgatatgtat tggttgcgaa ttgcacagga tcgagtgatg aaatcagtgg 1380 taaaggagca cacgaaaaac tgccaacaca cttggcatcg tcgtcttggg cacagggatc 1440 cagctgtcat cggtgaaatg aagcggcgcg atttggtgtc ggggctagaa gtggtcgact 1500 gcggtatccg ctggacctgc gaatgctgca tcgaatgcaa aatggcacgc tcgccatttc 1560 caccagttgc ggaaaaaacc tcgacagaag tgctggatat aatccatagt gatgtgtgcg 1620 gcccaatgga ggaaacgacc ttagggggat 
gccgttacta tatgacccta atagacgatc 1680 atagtcggta tactttcgtc tattttctca aaaagaaatc ggaggccgag gataagtttc 1740 gcgagcatgt aaaattggtt caaaaccaat ttggccggaa accgcgaatc attcgctccg 1800 atcagggagg agaatactcc aataaggcgc ttcggaagtt ctgtgcggac gaagggataa 1860 agatggagtt tactgcagca tattcacccc agcaaaatgg agttgcggag cggaagaacc 1920 gatcgctaac ggagatgggt cggtgtatgc ttcgggatgc aggtatgcat aagcgatttt 1980 gggcggaagc aatcaacacc acttgctact tgcaaaatcg attgccgtct gctgcagtag 2040 agcgtacgcc attcgagatc tggttcggca gaaaaccaga tttgaccaac ctgcgactgt 2100 ttggatgtgt tgggtacgta ctgattccgt cggtgaaacg aaaaaagtta gacgtcaagg 2160 cggagcgtat gacttttgtc ggctattccg gcgagcataa ggcgtatcgg atgctaaaca 2220 ctcaaacggg agaaattcaa attagtcggg atgtccgttt tcttgagatt gatgacggat 2280 ccaaggagca gacatacggt gatcccaaaa tagaggataa tccgactgaa agcgttgaaa 2340 tcgagtggtc tctcgatgaa acgaaacggg aagctaaaac taacgtggcc aatgatacaa 2400 tctccgaatc tgaattttac ggttgggatt gttcagacga tggctggcca cgaggttttt 2460 ggaacgacaa tgataacaat tggcttcgcg gactgtggga cgatgacgac gctgaagctg 2520 gagctgcgat gccggaggcg gtgctggatg ctgtaccgga ggcgatgcca gcagctgtgc 2580 cagaggtgtc gaatactccc gttcgtcgtt tacagagggt gacagctggc gttccaccgg 2640 caagatatga cgaagaagta tatctggtga aggaaagtgt agcagaacca aaaacgtata 2700 aggaagctgt gtccggtcct cagagtgctg aatggaaaat agcgatggca gaagaaatgc 2760 agtcccatca ggaaaatgga acgtgggagc tagcggagct gccgccacac cggaaggcta 2820 tcgggtcgaa atggatcttc aagtgtaagg cagatgaaga cggtcatttc gttcggtata 2880 aagcacggct ggtggcgcag ggtttctgcc agaaattcgg gacggattac gacctggtgt 2940 ttgtccccgt cgtaaagcag attactttcc ggacgatgct ggttctggcg agtaaaagga 3000 agatgttaac gaagcacgtt gacataaaga cggcgtatct acatggtctt ctcaagaagg 3060 agatttttat gcgccagcca cagggattcg aaagcgataa cccgaacgaa gtatgcaggc 3120 tgcatcgcag catttacggg ctcaagcagg cagctcgtgt ctggaatacg aagatcgacg 3180 acgtactgaa aactatgggt ttcatccaat caacggcgga cccatgtttg tacatacgcg 3240 aaaaagcggg taagtccatc tttgttctca tttacgttga cgatgtgatc gtcatatgta 3300 acacggagga agaattttct gaggtggtcc acgtcttgac 
actgaatttc acgatcagcg 3360 tcatgggtaa cctaagattt tttctcggca tacgaattcg gcgtaacgat gggcgttact 3420 gtatggacca acgagcttat ttggaacgag ttctggagcg tttcggcatg ctggatgcta 3480 aaccgtccaa attcccgatg gatcccggct tcttaaaacg aaaggaggag aatggcagga 3540 agttggattc gccaaaagcg tatcaaagtc tcataggagc tctgttgtac gctgcagaga 3600 tcagcagacc cgatattgca atcgccacag ccattctggg caggagagtg caagatccat 3660 cagaagcaga ttggaacgag gccaaacgga tactacgtta cctcaagggt acactggata 3720 gtgtattgta ccttggaagc ggcggacaaa agctggagtg ttttgtggac gccgattggg 3780 caggcgacga gagcgaccgc aaatccaact cggggttcgt gtttaagttc ggcggcgggc 3840 tcatcggatg gggctgtcat aagcagaagt gtgtggcact atctagtacc gaggccgaat 3900 atgtttccct tgccgagtgt ctacaggagg taaagtggat actgaaactg atggcggatg 3960 ttggcgagca actggatggt ccagttctgg tcaacgaaga caatcaaagc tgcattgcgc 4020 tgactaaagg agaccgagcc gaacgcaaag caaagcacat cgatacgaaa tttaatttcg 4080 tggaggatat ggttcgggac ggcatcgtga aactgcagta ctgcccaacc gaacacatgc 4140 aagctgattt gcttaccaaa ccgttgcaag cagtgaaact tcgacaactt agggaagcga 4200 tcggaataaa accattcagt gttgaggagg ag 4232 // ID R6Ag3 repbase; DNA; ANG; 5289 BP. XX AC AB090819; XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE Anopheles gambiae retrotransposon R6Ag3 DNA, complete sequence. XX KW Non-LTR Retrotransposon; Transposable Element; KW reverse-transcriptase; gag-like; R6Ag3. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol. Biol. Evol 20(3), 351-361 (2003). XX DR Genbank; AB090819; Positions 1 5289. 
XX SQ Sequence 5289 BP; 1630 A; 1218 C; 1378 G; 1063 T; 0 other; gcccggcaaa gtccgaccat cacctcctcg cggtgccggg cggggtagga gttcgtcctt 60 ggacaggctc tgagctaacg atggcggcac gctttggtag tggatgcaga ccatgctatg 120 gatcaggcta atgtattgga cgtaaaccat agatgaatgt tatcgggctg gcgaaaaact 180 caaactgatc ccttagtaag gacgggtgaa actttcttga aacggcgtga gggtcaatga 240 gtgattcgtt cccgtcctat taaaacctgg cgagtcttcg gtagcgctct aagtttgttg 300 tcggcctagc aaaccgacaa ctataggtgc aggtgagctt cggccctgct taaccctgcc 360 tggctctgag ctcacacctc ataagggtag tgagccaacc ctcgcgtgga gctccctgaa 420 ccaaagccaa ccacggggca tcaacaaggg gactatagtg gcgaacaagg atagcggatt 480 gaagtttgga cccaacagca atggaagcaa gaaaacaaaa acccaagaaa aatgttgaag 540 atgaggagca cgagcggctc attgaagagt ttatctcgaa gctcaaaaag agttataaga 600 aagcctcaaa ggctgaggag aacgaagcac cgagaaaagt ctcgcacaaa gcacaactcg 660 agcgcttcaa gaattacgcg aacaacctgg agattgagga ccttcgtgac ggcatgatcg 720 ctcagatgat tgagttcatg gagtccatga ttaaagaaat gagcgagttg aaaaagcagc 780 tcaaacagaa gagcactcaa gaaattgaag tgcaaacggc tcagccaagt gaactggcag 840 aagatgcccc ctttgtgccg caaactagaa aaggccgagt cccaaaagag gcccgcaagc 900 gagacaacaa tgcgcgtcaa agatccgcgc agagagaaac tccgaagagt agtgggggtc 960 aaagcaaaca gcccaagaaa aagaaaaaga agcgatcact tccaaaaccc gaagcagtgg 1020 tcattgagaa gtgtgaaaac atagacctgg cgaaagtact aaaaggacta acccatgacg 1080 atgccctcaa ggatgttggc gaccaggttg ccaaggtacg aagaactcaa aacggcgaca 1140 tgctgctagt tctgaagcga ggcaaggaag cggtgcgggt tgaagcaggc atcaaaaatg 1200 ccttgaagga caaggcgaac gtccgaacgc ttgctccctc ggtgatgatt gaaatcactc 1260 atcttgatga aatcaccctt gctgaggaga tcgctgaagc cctcaagcaa cagctggaga 1320 tagacgtcga tcataaggaa atcaaggtcc gagaagcaag aactaagggt acacaaaagg 1380 ctacgtttag agtgcccctg tcagcgaaag agcgtgtcct caatccaggc aagttgaaag 1440 ttggctggtg tgtatgcagc ctgagagagg ctacagtcca ggttaaatgc ttcaaatgct 1500 ggaaactggg tcacaagggc ttcgaatgta ctggccagga tcgaagcaag ctctgcatta 1560 aatgtggaca agagggacac aagatcagag agtgtccaaa cgctatgacg tgcctcgact 1620 gccgtgagga tatggttgag ccccacatca 
ctggcagcct cagatgtccc aaccggatag 1680 ctagacgaca acaacatggt taaattaata cagcataacc aaaaccattg cggagcggcc 1740 tttaatctta tgtggcaaac cgctcgtgaa cgagaagtag atttgtttat tgttgctgat 1800 ccggtaaaaa atcaacgaca caacaacaac atagtgtata gtgaggacca gttggcagca 1860 atagtgacat gtggaaacct acccattcag aagatcgtca acaaggcatc gagaggaatg 1920 ttagccgtag aggtgggagg catcctcatc gttagtgctt atgccccccc aagctggact 1980 gtgcaggaat tcgaggaact gctcgataat attgttttga ccgtcagcgg atcgtccaag 2040 tttgttgtgg caggggattt caatgcatgg tcttcaagct gggcaaacac ccttggagca 2100 agaggagagt cacagcgctt gagaggcgat acactattag cagccttcgc cggattagag 2160 atggtgctaa tgaacaacgg tcaagacaca ttcgttacac cagagagaaa gtcagcaatt 2220 gacctaacgt tcgtgagcca atctctaatg gagacgacag gatgggaagt actgccggac 2280 tatatgaatt cagaccacat tgggatactt atcacgattg gcaaagaaca aacacctagt 2340 ccccgagaca atgcgaagaa gggatggaaa accacccttt atcacaagga actttttgct 2400 gcggcactag atagaatcct gcatgagatg agagttgaca cgcccgatga cctggttaaa 2460 gccctagata aagcatgtga tgctaccatg tccaggctga agaaaacgtg caggtggagg 2520 ggcgtctact ggtggacgtc cgtgatagca gaccttcgga ggaaaagtaa agccgcaagc 2580 agagttgccc aaagggcgta tgacactcct gaattcccag acaaaaggag ggaatataag 2640 cttgccagga atgcgcttaa gagagaaatc aaaaggacaa aaaaagcgac ctggtacaga 2700 cttgtaaaca tgtctgacat cattctattc ggcgaggtct atgtaatttt gaagcgaatg 2760 gttggaggaa acagagtgcc caaagagttg gaccccgaga aactcaacac cattattgac 2820 gaactatttc cgagccatcc tgtcacagac tggccgacac atcaaccaac gacaagtcaa 2880 gaaaacccgg aaagtgtgac agacgaggaa atccgaaaca tcggaaggtc gcttaaatcc 2940 aggaaagtac caggtccaga tggaataccg aacgctgctt tggcaacagc gatgattgag 3000 gagccgacaa tcttcaagaa agtttaccaa agatgcctcg atactggtgt atttccagac 3060 aactggaaga aacagaggct cgtgctctcg attcccaaac cagggaaacg accgggagaa 3120 agcggctcat cacgtccgat atgtttgatt gatggagtag ctaaaggttt agaacgtgta 3180 atactccacc gactgaataa ccacatcgag agagtacaag ggttatccga aaaccaatat 3240 ggtttcagga aaggaagagc aacgactgat gccatcgaaa aggtcttaag catagcaagt 3300 gcatccagag ctcgaaatcg aggtgctaat cgattctgtg 
cagtagtgac acttgacgtg 3360 aaaaatgcat ttaatagcgc gagctggacg gcgatagcga ggtctctaca gagaatcaac 3420 atacccaaat acctctatga catcataggg aattacttcc ggaaccgcgt gctgctctat 3480 gaaaccaacg aaggaaatcg agaacgagtc gtaacagctg gagtgcccca agggtcagtg 3540 ctgggaccca ccttatggaa tttaatgtac aacgaggtgc tcggcttaac gctgtatgat 3600 ggagcatcac tcatcggatt cgctgatgat atagttctag tagctgtcgg aagccgaata 3660 gacgatctgg agaacacgat cgaaacatcc atcaacatca ttcggcaatg gatggagtca 3720 gtggagcttc aactaaatat atcgaagacg gagtacatcc tagtaagctc acacagaagt 3780 agacaagaat cacagatcat cgtcgaagga cacacaatta gatcatcgcg ccacttgaag 3840 tatctgggca tcatgattga tgatcgccta gaatacactc agcacatcaa gtatgtcgct 3900 gagagagcgg tgaccaacac caacgcccta gtgagaatga tgcccaaccg atcaggacca 3960 agaagcagcc gacgtcgaat tatagcaaac accatcattg caggcatcag atatgcctcc 4020 tcaatatggg ctgagtcact gaaatttgag tgcaggaagc aatggctccg gaggtgccat 4080 agaccattag tgaacagagt gataagcgca tttaggagca cctctcacga tgcagcctgt 4140 gtcatagcag gcatgatgcc gctccacata ctaatcgatg aagactaccg agttcgacaa 4200 cgaagcatca caacgggagt aagcagcaaa ctggcgagga tagctgaacg accatactct 4260 gtagaagctt ggcaaaggga atggtcgacc accacttcag gctcctggac aagacgattg 4320 atacccaaca tccaaccatg gatcaccagg agacacggaa acatcgagtt ccatatgagc 4380 cagttcctat caggccatgg gttcttcaga tctcatcttc accgaatggg gtatgtacca 4440 tcacccgtat gtccggcctg cggcgacgag aatcagactg cagagcacac catcttcatc 4500 tgcggcatgt atcttctgac gaggttacga ctggagcaag atcttcaagc cgatttcgac 4560 gtcgaaaacg caatcaacat catgtgcagc gacgaagtaa cgtggaatcg agtcgcagag 4620 tacgtccacg aagtgatgga aaatcagtac aatctccaat gcagctacag aggcaacagc 4680 gacagagaac tccaaaacca agaagcagca agccaagaga cttccccgga acgtgcggga 4740 atcagcatga atgaggataa ctgacatggc cgatacatcg acgaagatcc attcccctcg 4800 gaacgtgcgg gaatcagctg agatgccaaa gaccatacgc acacatgtct ctattctttg 4860 gtcccccgac ttacgagtag agggaccttg atggtgagct tgtctgcggt tgtcagcccc 4920 ggtgtggcca gctggtcaaa caccggactt tgacattata cttaggtgga tggcacactc 4980 actgagatac catgtgcggc gatgaggtgc ctgacaccta ggagagatgg 
ccctccgggt 5040 ctcgggacct gggcgcgggg tgtaatgttc tatacgcctg acagacacca ctgtcatagc 5100 attgcggcgc cgtggtcgat gttgaccaaa ggaatgctgg ttgaagtaat gctctagcgg 5160 gcgatccggc cagtatttct tgaggcacaa gagagtttaa gtggttaaaa tccatctgca 5220 tacgtaggta ctggtgctct gtctatcgta tgtcctataa aaggttctct cttgtctaaa 5280 cgggaaaaa 5289 // ID CR1-1a_AG repbase; DNA; ANG; 5247 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 19-MAY-2005 (Rel. 8.02, Last updated, Version 2) XX DE CR1-1a_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; endonuclease; CR1 clade; DNA/RNA-binding; KW CR1-1_AG; CR1-1a_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5247 RA Kapitonov V.V. and Jurka J.; RT "CR1-1a_AG, a subfamily of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 13-13 (2003). XX DR [1] (Consensus) XX CC CR1-1a_AG is a subfamily of CR1-1_AG non-LTR retrotransposons. CC The CR1-1a_AG consensus sequence was reconstructed based on CC multiple alignment of ~10 copies identified in the CC sequenced portion of the genome. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC CR1-1a_AG occurred less than 1 million years ago. CC The CR1-1a_AG and CR1-1_AG consensus sequences are 79% identical CC to each other. The 3' terminus of CR1-1a_AG is composed of the CC CAT CC microsatellite. CC CR1-1a_AG encodes two proteins: a 444-aa CR1-1a_AG-ORF1p CC (positions 441-1772) and 932-aa CR1-1a_AG-ORF2p (positions CC 1776-4571). CR1-1a_AG_ORF1p is DNA/RNA binding protein composed CC of the PDH domain (positions 3-38) and gag-like zinc knuckle CC regions (aa positions 340-444). CR1-1a_AG-ORF2p is composed of CC the AP endonuclease and reverse transcriptase domains. 
XX FH Key Location/Qualifiers FT CDS 441..1772 FT /product="CR1-1a_AG-ORF1p" FT /translation="MECLACSAVVLINDDPILCAGKCGGNFHRRCVTPSLS FT KTAAKIINENKNVLYMCDRCLEHKSGLAGMDVDVSGSYDLLTQSIKNLESN FT VSVWISSALEKGIETLKTELCAQVERKLEQTLRESLSAVECSNKAKEALRA FT TFDDTKARETVEDESWATVTKKRKRTNSGNSNVQTIINRFDTGNVNSTPKI FT SDKVTGPILANKNKNSKTLVIVPKVGQSCDKTRADLRAKLDPRKQQVSEFR FT NGKDGQVFVQCSAQVKLDELRKEVENILGDEYATDLPLSRVKIIGMSEKYT FT DSHLVDLLKSQNEGIPWKQVKVIGMFENKIYKYQKYNAVLEIDYESDLCLA FT KLGKINVGFDRCKISKSVHVMRCYGCGQFNHKSTECKNKQACSKCGDEHKT FT SECTSSSLKCVNCVLANSVRNLKLDVRHAANDYNCPMFKKQIERRMQLSQ" FT CDS 1776..4571 FT /product="CR1-1a_AG-ORF2p" FT /translation="QVGEELGRFREILYFNVAGLSSNYAMFRETVEKVQPL FT LVLISETHVIEEEAFQQFHINGYRVVSCLSHSRHTGGVAVYARSEIVLKVI FT FNESLEGNWFLGVAVSKGMTAGNYGILYHSPSASDSRFVDILEEWLDRFLN FT FSKLNIIVGDFNIDWLNVEKSAKLKSLMDSVNMKQKVNEFTRIARQSRTLI FT DQVYSSTDSIKVTTDPLLKISDHETLVLNINVKRCETIQRKFKCWNRYSKF FT ALCNHVSQGLRQEAPSFNEAADLLWNTLKKAMGTLVEEKTILSRETSRWYT FT LELSRAKRKRDEAYQHFIRSNSGYDWTEYTRLRNTYSRNLKTTRRNYFSGE FT ISKHKGNSKELWKVLKSLLRPEESRVSVVKFDGLVESEDSIICQKFNLFFV FT NSVLDINQNIADARAPDSLFCNDAPGNQFKFQKITKEKLKTICFSLSKTAG FT IGNVNCNTIQDCFHVVGESLLIVINQSLEEGIFPETWKESLVIPIPKVSGA FT ASAEEFRPINMLHVLEKVLELVVKEQLVQFLTRNDLLISEQSGYRQGHSCE FT TALNLVLARWKVLMDRKESIVAVFLDLKRAFETISRPLLLQTLRRFGIVGK FT ELNWFENYLKDRTQRTVFGNSISEPIENTLGVPQGSVLGPILFIMYINDMK FT QVLKSCEINLFADDTVLFISHKDIKQAESLINFDLNALDGWLRYKKLALNV FT KKTSYMVMTAGVLDSPPSIVINKEPIERVRQVKYLGVILDDRLKFNTHIDW FT VIAKVASKCGVISRLAKDLDFFGIVNLYKSLISPHFDFCSSILFLGNKGQI FT KRLQRLQNRIMRLALGCGRRTSSFVMLDILQWMSVEQRIVYQTMTFIFKLL FT GGLLPGYLGERIVRGSDVHRHCTRRANEPRVPNLISHGARNSLFFKGIQLY FT NRLPGEIKNASNLPDFKRRCAAYVKQTV" XX SQ Sequence 5247 BP; 1575 A; 847 C; 1255 G; 1570 T; 0 other; tcaacccttg aggtgaatgg tgacaagtgc agtcctgtga atgtgtagct cccgtttgtg 60 aactgtgaaa acttgtgaaa tagagagcta agtgttattt tgttttgttc tacttgttac 120 ttaatgttag tgctacctgc ggtatcattg tttagtattt gtattattgt tttgtgaatg 180 taaaacacgg gtgaagtatc tcggtcgaca gcgtcactgc gttgtcgaaa gtgattaatc 240 tttgctcggt 
atgtgtttgt gttcgtgcta gacatacgta tcgttgtccg tttgtcaaga 300 cgtgtacacg agtgtaagcg tttgatatcg caggctgtag tagagtagtg tttgtgtgca 360 tttgctgtta cctttttttc atgtctgtgt gataattttt ttaattctaa aacagctcgc 420 ccgaattttg tttttacggc atggagtgtt tagcatgctc cgccgtagtt ttaataaacg 480 atgacccaat tttgtgtgcg gggaaatgtg gtggcaactt tcatcgtcgc tgtgttaccc 540 cctcgctctc gaaaacagcg gccaaaataa ttaatgagaa taagaatgtg ctgtatatgt 600 gtgatagatg tttagaacac aaatcgggct tggcgggtat ggatgtagat gtgagtggat 660 cgtacgattt actcacacaa tccataaaaa atttggagtc gaatgtgagt gtatggatat 720 cgagtgcctt ggaaaaggga atcgagactc taaaaactga gctctgcgca caagtggaac 780 gcaaattgga gcaaacattg cgtgaaagct tgagtgcggt agaatgctca aataaggcaa 840 aggaagcctt gcgtgcaacc tttgacgata ctaaggccag agaaacagta gaggatgaaa 900 gttgggctac agtgactaag aaaagaaaaa ggacgaatag tgggaacagt aatgttcaaa 960 ccattattaa tcgttttgac acggggaatg ttaattcgac gcccaagatt tctgacaagg 1020 ttactggacc cattttagca aacaaaaata agaatagtaa gactctggtt attgtaccaa 1080 aggtgggtca atcttgtgat aagacaagag ctgaccttcg cgctaagctg gatccaagga 1140 agcagcaggt gtcggaattc cgtaacggca aggacggtca ggtgtttgtt caatgttctg 1200 ctcaggttaa attagatgaa ctcaggaaag aagtagaaaa cattttggga gatgagtatg 1260 caacggattt accattgtca cgtgtaaaga taattgggat gagcgaaaaa tacactgact 1320 cacatttagt agatctttta aaatctcaaa atgagggaat accctggaaa caggtcaagg 1380 tcataggaat gtttgaaaat aaaatttaca agtaccaaaa atataatgcg gtcttggaaa 1440 ttgattatga gtctgaccta tgtctggcaa aattaggaaa aataaatgtg ggatttgata 1500 ggtgtaaaat ttcgaagtcc gtgcatgtta tgaggtgtta tggttgtggt caatttaatc 1560 acaagagcac cgagtgcaag aataagcaag cttgttcaaa atgtggtgat gaacacaaaa 1620 cgtccgaatg tacttcatct tctttgaaat gtgtgaattg cgtgttagca aactctgtta 1680 gaaaccttaa actagatgta agacatgcgg ccaatgatta taattgtccg atgtttaaaa 1740 aacagataga aaggcgtatg caactttctc aatagcaggt aggggaggag ttaggacggt 1800 tcagagagat tttatatttc aatgtagccg gtctttcgtc caactacgcc atgtttcgtg 1860 agacagtaga gaaagttcaa ccgttgctgg tcttaatctc tgaaactcat gtgattgagg 1920 aagaagcatt tcagcagttt catattaatg 
gttatagggt tgtgtcgtgt ttgtcccact 1980 cacgtcatac aggaggtgta gctgtttatg ccaggagcga aattgtccta aaggtgattt 2040 ttaatgaatc attggagggt aattggtttc ttggtgttgc agtttctaag ggaatgacag 2100 caggcaatta cgggatattg tatcattcgc caagtgcaag tgactcaagg tttgttgaca 2160 ttttggaaga atggttagat aggttcttga attttagcaa acttaacatt atcgtcggtg 2220 atttcaatat tgactggtta aatgttgaaa aatctgcgaa gttgaaaagc ttaatggatt 2280 cagtaaacat gaaacaaaaa gttaacgaat tcacacgaat tgctaggcag agcagaacat 2340 tgatcgatca ggtttacagt agcacagact caatcaaggt cactactgat ccgttattaa 2400 aaatatcgga tcatgaaaca cttgttttga atataaacgt gaaacgttgt gaaacgattc 2460 aacgaaaatt taaatgctgg aataggtact cgaaatttgc tctttgcaat catgtgtcac 2520 aaggtttaag gcaagaagca ccgagcttca acgaagctgc agacttgtta tggaacacat 2580 tgaaaaaagc tatgggcacc ttggttgaag agaagacaat cttgtcaaga gagactagta 2640 ggtggtatac tttggaactt agccgtgcga aacggaaaag agacgaagca taccaacatt 2700 ttattagatc aaactcaggc tacgattgga ctgaatacac tagactaagg aatacataca 2760 gcaggaatct caaaactact cgtagaaatt actttagtgg tgagatatct aagcataaag 2820 gaaatagcaa ggagttgtgg aaagtgctta aaagtttact aaggccagag gaatcacgcg 2880 tttctgttgt aaaatttgat gggttggtag aatcggaaga ttcaataata tgccagaaat 2940 tcaatttgtt ttttgtaaac agtgttttag atataaatca aaacattgct gatgccagag 3000 cgcctgattc tttgttttgt aatgatgctc caggaaacca attcaaattt cagaaaatta 3060 caaaagagaa attaaaaact atttgtttca gcctgtccaa aacagcgggc atagggaatg 3120 ttaactgtaa tactattcag gattgcttcc atgtggtagg agagtctctt cttatagtga 3180 tcaatcagtc gctggaggag ggtatttttc cggagacttg gaaggaatca ttggtaatac 3240 ctattcctaa agtgagcgga gctgccagtg cggaagagtt tcgtcccatc aatatgttgc 3300 atgtactcga aaaggtgctg gaattggtgg ttaaggagca attggtccag tttctaactc 3360 gaaatgacct gttgattagt gaacaatcag gatatcgaca gggacactct tgtgaaactg 3420 ctttaaatct tgtactggcg aggtggaagg tgttgatgga tcgaaaggaa tcgatagttg 3480 ccgttttctt ggatctaaaa cgagcatttg agactatatc aagaccgtta ttgctgcaga 3540 ccttaaggcg ttttggtatt gtggggaaag agctcaattg gttcgaaaat tatttaaaag 3600 acagaactca gagaacagtt tttggaaact ctatatcaga 
gcctatagaa aatacccttg 3660 gagttccgca aggaagtgtt cttggaccaa ttttgtttat aatgtatatc aatgacatga 3720 aacaggtttt gaagtcttgt gagatcaatc tttttgccga tgatactgtt ttgtttatct 3780 cgcacaaaga catcaagcaa gcagagtctc tgattaattt cgatttaaac gctctggatg 3840 gttggcttag gtacaaaaag ctagcattaa acgttaagaa gactagttac atggtaatga 3900 ctgctggtgt attagacagt cccccatcca tcgtgataaa taaggaacca atcgaaagag 3960 tccgtcaggt taaatatctg ggggttattt tagacgacag attgaagttc aacactcaca 4020 tagactgggt catcgctaaa gtggcatcaa agtgtggggt tattagtagg ctggcaaaag 4080 atctcgattt ttttgggata gttaacctct ataagtcact gatttcacca cattttgatt 4140 tttgctcgtc gattctgttt cttggcaata agggacaaat taaaaggctt cagagattgc 4200 aaaatcgtat tatgcggtta gctttagggt gtggccgacg tacgtcgtct tttgttatgc 4260 tagatattct tcagtggatg tctgtagagc agaggattgt gtatcaaacc atgaccttta 4320 tatttaaact tttggggggc cttttgccag ggtatttagg ggaacgcatt gttcgaggat 4380 ctgatgttca tcggcactgt acacgcagag caaatgagcc gagggttccc aacttaattt 4440 cacatggtgc cagaaactct ttgtttttca aggggattca attatacaac agattacccg 4500 gagaaatcaa gaatgcgagt aacttgccag acttcaaacg taggtgtgcg gcatatgtta 4560 aacaaactgt gtaatgtcat atttgtgtag aagtcctatg tcactatgtt atgtacaact 4620 gcacttgtca tcatgagctt gatgatgatg ataagatttt tcttgatata tgaaaacaaa 4680 ttagaaaaaa tatataagaa agagacaaca taagtttgag acacgcgcgc gtacaagtgg 4740 acaatcggat ttagtttggg agagcttgct gtctgcatgt ctgggaggtc acagcgagat 4800 tcactcattg atgattgtaa gtggccaatt ccagatacat taggttctcc tgcatcggaa 4860 ggtagtgccc tggtgccatc gttggtacca gcggaactgg ccatatgcat gttgcagtgt 4920 gccctgataa tttccaatac attaggttct gctgcttcgg aaggtagtgc cctggagccg 4980 gtgtggcacc agtggaactg gccatatgca tgtcgcagtt atgtcctgat gtgttcgatg 5040 gagttgaccc gttttttctt gatgaaacta cgtcggtcgt cttgggtgtg tgtatgactt 5100 ggtggattct tttgaatgac gtccgatgcc accacttgct cgtaaatttt tattttattc 5160 aaaaactacc ttagagtaat attatcgtaa agatacttcc gtccttctca aacctgtgtt 5220 ggggtaagag gtgggactta tcatcat 5247 // ID GYPSY46-I_AG repbase; DNA; ANG; 6534 BP. XX AC . XX DT 16-APR-2004 (Rel. 
9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY46-I_AG is an internal portion of retrotransposon GYPSY46_AG DE - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; gag; KW AP protease; GYPSY lineage; GYPSY46-I_AG; GYPSY46-LTR_AG; KW Gypsy clade; RNase-H; reverse transcriptase; KW integrase; GYPSY46_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6534 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY46_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 84-84 (2004). XX DR [1] (Consensus) XX CC GYPSY46_AG is a family of gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its Reverse CC Transcriptase, CC RNase and Integrase, is CC phylogenetically grouped with representatives of the GYPSY CC lineage of other organisms. CC GYPSY39_AG, GYPSY40_AG, GYPSY41_AG, GYPSY42_AG, GYPSY43_AG, CC GYPSY44_AG, CC GYPSY45_AG and GYPSY47_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY46-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. CC The consensus encodes the 434-aa GYPSY46_AG1p gag-like CC polyprotein (pos. 507-1808), the 1061-aa GYPSY46_AG2p CC pol-like polyprotein (pos. 1757-4939) and the 515-aa GYPSY46_AG3p CC env-like polyprotein (pos. 4962-6506). CC The sequence of the LTRs flanking GYPSY46-I_AG is deposited as CC GYPSY46-LTR_AG.
XX FH Key Location/Qualifiers FT CDS 507..1808 FT /product="GYPSY46_AG1p" FT /translation="MEALAGRIAALEARFSESNVTDDFQDPPLFFTKQDGS FT AVDPESFEKIPGVVKDLPIFCGDPSELNSWINDVDGIIRLYQTISSHSLEK FT QNKFHMICKFIRRKIRGEANDALVASNVGINWNMMRKTLITYYGEKRDLET FT LDFQLMSVYQKGRTLEVYYDEVNRLLSLIANQIQTDDRFNHPEASKAMIGT FT YNKKAIDAFIRGLDGDVYKFIRNYEPTSLAAAYSYCISFQNLECRKMLTKP FT KHFNTPPSAPRNQIPLPTPHLPPRVFQHQQRPMTANNVRPHFAHHPPIQNF FT AGNFTQRPVWNQPNQQRPIFQRTNFNQPNQMKNFTQQRNNFRQNGPEPMEI FT DPSIRSHQVNYANRPNSSNIRPLKRQRAFNIEAVPRREIEPTSYEDNLYDD FT DVESQASYERYMRNVEKQEKLNENSHYDEISREAELNFLG" XX SQ Sequence 6534 BP; 2504 A; 1266 C; 1070 G; 1694 T; 0 other; ggcgcccgaa tagggacctc tagtgaagtg aaaatcaagt gacgagtagt gatcgcgaaa 60 gagcgtaatc acaccgtgct tagaacatct cccgatcgca gattgcagcc gtgtgtatga 120 gcacctaagt agccggtagc agccaataac tcgttgccgg aggaaaggga caacgccgac 180 acaccgagaa ggaaaaattc accggaggcc cgagcaacag aaaccaaccc cgtcagagga 240 ggcgaagcaa ccagccccgt aagaagggtg aagcatgaat ttattaatgt aagtaaaact 300 ttcaatcata gtaagtataa acatttatta aagcgacaaa aaatattaaa acattataat 360 cagtgttaaa gtgagtgatt agtgagtgaa aagaaaatca ttatcacaaa agttaagtca 420 aactggagga attagttcaa tcattacgac aattctcagt cacagtaaaa tctatagata 480 acgaagaaaa actaacaaac aaagatatgg aggcactagc tggcagaatt gcagctttag 540 aagcacgttt tagtgaaagc aatgttacag atgattttca agacccacca ctttttttta 600 caaaacaaga tggtagtgca gtagatcccg aatctttcga aaaaatccct ggagttgtaa 660 aagatctccc aattttctgc ggtgacccaa gtgaacttaa tagctggatc aatgacgtag 720 atgggataat ccgactatac caaactatat ctagccatag tttggaaaag caaaataaat 780 tccacatgat ttgtaaattc atacgtagaa aaattagagg tgaagccaac gatgctttag 840 tagcatctaa cgtagggata aattggaata tgatgagaaa aactctcata acttattatg 900 gagagaagcg agatttggaa actctcgatt ttcaacttat gagtgtctac caaaaaggtc 960 gaactttgga agtttattac gacgaggtta atagacttct ttcacttatt gcaaatcaga 1020 tacagacaga cgatagattt aaccatccgg aagcttcgaa agctatgatt ggaacataca 1080 acaagaaagc gatcgatgct tttatcagag gtctcgatgg ggacgtttat aaatttattc 1140 gtaactacga accaacatcc ttagcagcag cctacagcta 
ttgcatttct tttcaaaacc 1200 tagagtgccg taagatgcta acaaaaccaa aacattttaa cacacccccg tcagccccca 1260 gaaaccaaat accattgccc acacctcatc taccaccaag agtgttccaa caccaacaaa 1320 gaccaatgac agcgaacaac gtaagacctc attttgcgca ccacccaccg attcaaaatt 1380 ttgcaggaaa ttttacacaa cgtcctgttt ggaatcaacc aaatcagcaa agaccaattt 1440 ttcagcgcac aaattttaat caaccaaatc agatgaaaaa ttttacacag cagagaaaca 1500 attttcgcca aaatggacct gaaccgatgg aaatagaccc atcaattagg tcacatcaag 1560 ttaattatgc gaacaggccg aactcctcaa acattcgtcc attgaaaaga caaagagctt 1620 tcaatattga agcagttccg cgacgtgaaa tagaaccgac ttcatatgaa gataatctct 1680 acgatgatga tgtcgaaagt caggcgtcat acgaacgata tatgagaaat gtagaaaagc 1740 aagaaaaact aaatgaaaat tctcattacg acgaaatttc tcgcgaagca gaattaaatt 1800 ttttaggtta aaatcagctt taccatattt tatataccat ggtaaggcag gtcaacaaat 1860 taaaattcta atcgacactg gatctaataa aaatttcatc aaccctttac atgcgaaaat 1920 ttctcacgac gttataaaac cattttttgt atcatctgtg ggaggagatt tactcatcac 1980 aaaatattca caagctcaaa tatttgcccc ttattccgat gtaaatgtca aattttatca 2040 tttgcaggga ctaaaatcat ttgacgccat aataggttat gataccatca aagaaatggg 2100 agcatttgta gacgctaaaa gagacaatct agttcttgaa aattttataa tacctctctc 2160 acttcatcca ttacaggaag ttaacagaat tgaaataaga gacacacatc ttaaccacca 2220 agaaaaagaa aaattacatt tatttcttaa caagtttcaa gatttattcc agccacccga 2280 cgaaaagttg ccctttacaa caaaggtaga agcaaccata gccacgaatg atacggaacc 2340 aatttactgt aagtcatacc catacccttt gtccctcaaa caggaagtgg aaacacagat 2400 aaaaaaatta ttaaataatg gtataattcg accatctagg tcaccatata attcacctgt 2460 gtggatagtt cccaaaaagg ttgacgcatc taacgaaaaa aaatatcgac ttgtgatcga 2520 ttacagaaaa ataaacctga aaactaaaag cgatagatat cccattcccg atacttcaac 2580 agtacttgcc aatctaggaa ataataaata ttttacaaca ctcgatctag catcgggatt 2640 tcaccagatt cgtttagcag aaaaagatat cgaaaaaacc gccttttcca tcaataatgg 2700 aaaatacgaa tttttaagat tacctttcgg tctgaaaaat gcaccttcga tttttcagag 2760 agtcatggac gatgttctta gagaacatat tggaaaaatt tgtcacgtat acatagacga 2820 tataatagtc tttggaaaaa cattcgacga acatctgaaa aacttggaaa 
ttgttttgaa 2880 tacattacga gaagccaatt ttaaaataca gccagacaaa tcagagtttt taagaacaga 2940 agttgaattc ttaggattca ttgtttcaga atatggcttg aaaccaaatg agaaaaagat 3000 agaaagtatc ttaaaatacc ccgaacctca aactattcga gaacttagat catttttagg 3060 actgtctgga tattacagaa gatttgttaa aaattatgca gctttagcaa aacctttaac 3120 aaaactttta agaggggagg atggccaagg ccactgcaaa attacaaaaa atcaatctaa 3180 aaattttccg ataaaattag atgatgatgc caaacgcgct ttcaaaactc ttaaggaagt 3240 tttatcatcc gatgatgttt tagcataccc cgattttgat catgatttta ttttaactac 3300 cgacgcttct gacaaagcaa tcggagctgt tctttcccag aacgttaatg gtgttgaaaa 3360 accaataaca ttcatatcta gaacactatc aaaaacagaa gagaattatg ctacaaacga 3420 aaaagaaatg cttgctatag tttgggcttt acattctcta cgtaattaca tttacggtgc 3480 aaaaataata atattaacag atcaccaacc tttaacatat gcaatgtcac caaaaaataa 3540 caatgcaaaa ttaaagcgat ggaaagcatt catagaggaa cataactatg agctaagtta 3600 caaacctgga aaaaccaacg tggtagctga tgctctttca cgcatacaaa ttaactcact 3660 aactcctaca caacactctg ccgaagaaga tgatctttct tttatccctt ctaccgaagc 3720 tccaattaat gttttccgaa accaattaat ttttcaaaaa ggtactatta gttcctacga 3780 gtttgtaaac ccctttccta agtttaaaag gcacactttc atagaaccac aattttcaat 3840 cgattttata aaagacaaac ttaagagatt catgatacct ggtataataa atggcatatt 3900 cactgatgag ccaactatgg ggatcattca agaaaccttt aaaaatctat tcaatatatc 3960 aaccatgaaa gcaagatttt cacaaactca agttcaagac atttgtgatc aagaacaaca 4020 gatagaagaa attcgtaaaa tacataactt tgctcataga aacgctaagg aaaattcatt 4080 acaagctata aaaaaatttt atttcccttc catgagaaac aagatagaac aatatgttaa 4140 aaactgcgaa acttgcaaag tagaaaaata cgaaagaaga ccccctgaat acataccagt 4200 taaaacacca atcccaaaat atccaggaga aattgttcat gttgatatat ttgcgtataa 4260 tgcaaatttt ttattcatct cgtcaatgga caaattttcg aaatatttga aattaaaacc 4320 aattaaatca aaatccatag cagacgttaa ggaagtactg ctacaattat tatacgattg 4380 gaatttgcct agacaaatta tatttgataa cgaatgtaca tttgtatcga acgtcataga 4440 gcagtccata ctaaatttag gtgtatcaat ttttaagaca ccagtgaata gatcagagtc 4500 aaatggacaa gtggaacgtt gtcactccac gatcagagaa atcgcaagat gtacaaaagg 
4560 tttgaatcca gacatgagct taattacctt aatacaacaa gccgtgtata agtataataa 4620 tactattcat tcttttacaa aagagactcc cagaaaagta tatattggag agcaatcaga 4680 agaactttca tttagagata gatcaaaatt aaaagaaaaa attgagagta aaattataaa 4740 aatatttgaa gaaaaaaatg aaaagattaa agatgataag taccaagatt acgaaccgaa 4800 tcaatttgcg tatgaaaaaa ataaaactat gaataagcgt gacagtcgtt ataagacagt 4860 ggtagttaag gaaaatcatc caacgtatat aatagattcg aacaatcgaa aaatccacaa 4920 aataaattta agaaaaaatt aatgatataa ttatgattta attaattttc tattacagaa 4980 tcgcgttttt cacactatat ggtgttctac aagctagcat aaacattttc gacttaacaa 5040 acaacccatt ggctattgtt ccgttaggac aagcaaaaat taggatcgga tacttgagga 5100 cgattcatcc aattgatctt accgagctag aagagataat ttctcgagtt tttgaaaata 5160 gcacaaacag tacaggaaaa tccccattgc aaagtttaat taatttgaag ctcgaaaaac 5220 ttaacgccac aatttctaag attaggccac gtagacttcg aacgaaaaga tggaacagta 5280 taggtaccgc ctggaagtgg atagctggca gtcccgacgc agaagatcta acgataatca 5340 acaccaccct gaattcgctc atcctacaaa acaacgagca gctattaatc aataatggtc 5400 tcagcagaag attccaagaa acaaccaata ttgctaatca tgttatcgac cttcagaata 5460 ggatccaaag ggaacatcaa actgagatac aacagatcat taagatagca aacctagacg 5520 cattacaagc ccatataaaa acactccaag aagccatact agccgctaag catgggatac 5580 cgaatagcga gctactatca atagaagact taaacaccgt tgcagaattt ctggcacaaa 5640 atggcattta ctatacatca gttgaagaaa tgttaacaca agccacagca caagttacca 5700 tgaattcaac acacgtgata tttatgctaa agtttccacg tctatcctat gaaacttatg 5760 agtacaacta tatcgactct atcatacaaa atgataagag aatcttaatc aagcataact 5820 acataatccg gaatctaacc catatgttcg aattaccgca gccctgtatc gatcagagca 5880 gccaccagct ttgcgaaagt aaagatctgg aagagccttc acgctgcata cgacaactcg 5940 tacaagggga gcatacagaa tgtatgtacg aaaaggtgta ttcaacggga ttagttaaac 6000 acattaacaa tgcgaatatt ctattgaatg atgccactgc cgaaatttca tccaactgca 6060 gcaatataaa ccacattctt aatggatcat atctcataca atttcacaac tgcaatatct 6120 ttattaacgg agaactcttt cccagcaccg aagtttcgat aaccggtaaa ccatatatat 6180 caacccttgg cctcatcgct aaagaagacg gcatcagaga cgaaccttca attgaacatc 6240 
ttcgaaacat aacattgcag cacagagaga aactacatac catcagcctg gttaataatt 6300 ccctcacatg gaaacttcat atctttgggt caattgggct aacgacaatt gttctgataa 6360 caatagcaat tttatatttc attaccagta taagaagaac gaaaataagc ctcaacattc 6420 caacgaacaa caccaaccga caggatgtcc accacataga aaccttcgtg aaaaaaccca 6480 caacattcca tgctctcggc agactttgag ggcaaagtca tctaagaagg gagg 6534 // ID BEL14-I_AG repbase; DNA; ANG; 5776 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL14-I_AG is an internal portion of the BEL14_AG LTR DE retrotransposon - a consensus sequence. XX KW 5-bp TSD; BEL14-I_AG; BEL14-LTR_AG; BEL14_AG; Bel clade; KW LTR retrotransposon; PHD domain; integrase; protease; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5776 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL14_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 35-35 (2003). XX DR [1] (Consensus) XX CC BEL14_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL14-I_AG, an internal portion of BEL14_AG, is flanked by CC BEL14-LTR_AG CC LTRs. The BEL14-I_AG consensus sequence was reconstructed based CC on CC multiple alignment of 16 copies; they are ~1.5% divergent from CC the consensus sequence. CC The consensus sequence encodes a 1898-aa BEL14_AGp Bel-like CC protein CC (pos. 57-5750). CC BEL14_AGp is composed of the PHD (pos. 9-55), protease (pos. CC 277-400), CC reverse transcriptase (pos. 900-1060) and integrase (pos. CC 1600-1765) CC domains.
XX FH Key Location/Qualifiers FT CDS 57..5750 FT /product="BEL14_AGp" FT /translation="MGPKKARGCKACGNQVDDTLYVQCDECDAWWHFSCAG FT ITASVEAVEKCAWLCEECARKTLREQSSPREGNKEPKEGTSKHVDGDLVRN FT LSLEAATDGGARPVTNPQRRPLLSLDEANDEIAPGTSTHVAGGPVHNLNQD FT AATEGGVRPVMTPKRRPLSSLDEADRGKTSVSSNIVHRGSCPNLNLDAASD FT DVARQLAVLKRRQEVEKRRMELELQLKFVQEEEALLGFGENKSFSISPQLN FT SFQTEKRTVKRSEEEKEEPDLTPRQEAARHMVSKELPVFSGDPAEWPIFIS FT HYEYTTRRCGYSNWENMLRLQKCLKGPALEAVRSRLVLPDVVPQVIEKLRS FT KYGRPVHLIKTFIEKVRKIPAPQTDKLDSLVEYGEAVQCMVDHMVAAGERA FT HITNPLLLQEVVGKLPTDQQLRWSHHIRGMTSVDLSTFSDYMEDLAEDAAR FT LTTIDSPSVRGTSKGRPTKGYVHAHVDPDGATTSSAAERQCVSCNVAGHVL FT STCTNFRGLPVKDRWRRARELSVCFSCLEKHNWRSCKNRSRCGINDCAFRH FT HALLHDPDAIESPSTADRERRHFPRTSGSQTHQVINNYHQSNPMSALFRIV FT PVTAYGPGVMIKTFAFLDEGSSMTLMDEDLAKQLGVKGDRRPLCIKWTGDT FT TRVEPASMMIDLQIGPVTSTKRFTLKAVRTVTSLSLPQQTFTMDDKRWDHL FT KQLPLPEYRDARPQLLIGLDNLRLAVPLKTREGLAGEPVAVKTRLGWCVYG FT KTAGSQIGRVLHMCECGASDENSTIQGALRKFYELEQLGTVSSDVPDPDER FT RALTILETTTVRIGNRFESGLLWKTDNVELPSSLGMARRRLECLERRMERD FT PKLKTVVHHHIADMMEKGYIHKATSAELAECNSKRIWYLPLGVVTNPKKPG FT KVRIIWDAAAKVQGTSLNDMLLKGPDELISLPGVLFRFRMYGIAVCADVKE FT MFLQIRMRDEDKHAQRFLWREDPADDIATYFVDVVTFGSACSPATAQYVKN FT RNAKEHAEKYPRAVRGILTSTYVDDYLDSFGTFEEASRVSREVRGIFSNGG FT FVLRNWVSNNPVVLERLGGESSSPGMKSLTSTADDGERVLGLRWNPSSDQL FT SFYTQACVGMAEIFETECTPTKREVLKCVMSLFDPLGLLANFTIHGRILIQ FT DLWRAGTGWDEAISPSQMRDWRRWVDVFPLIAQLRIPRCYFPEAREKVYEN FT AELHLFVDASQLAYACVLYLRVVDSEGEPHCTMLCGKAKVAPLKPLTIPKM FT ELQACLLGARLLKSTEQHHPISVKKRVLWTDSTVALSWIHADPRNYRPFVA FT NRVAEIQENTNVNEWRWVPTQDNPADEATKWKGRANFNWDGIWFQGPSFLL FT QDEESWPTRRLVSTTPEEEIRRVNLHREKLNPGLLPLKAERFSRLERMIRT FT LAWIVRYVDNLMRKVGGAPLHLGILSQDELERAETIAWKQAQGEYFQDEVR FT VLSVGEGTGRSTVPKESPIYGLLPYADERGVLRMRGRIGAAPELPYAARYP FT IVLPRDAWITHLLVDKFHRRFRHANNETVVNELRQYFQIPKMRRLVSKVVR FT QCVFCHIRRTLPQIPPMAPLPKQRLTAFVRPFTFVGLDYFGPLLVRRGRAQ FT EKRWVALFTCLTIRAIHLEVVSSLSTDSCILAVRRFVARRGAPVEVFSDNG FT TNFVGASQQLRKEIDERNDALAATFTNANTRWTFNPPGAPHMGGVWERMVR FT SVKAAMSTMTELQRTPDDETLLTVIVEAEGMINTRPLTYIPLESADQESLT FT
PNHFLLGSSSGVKQRPVAPTSLQTGLRSNWKMVQHILDGFWRRWIKEYLPV FT LARQSKWFETVREIEVGDIVLIVDGGARNQWKRGIVERVVSGADGRIRQAW FT VRTNTGTLRRPAAKLALLEIRKGDK" XX SQ Sequence 5776 BP; 1463 A; 1325 C; 1755 G; 1233 T; 0 other; ctcttttgcc tacaaaaaag ggttcttcag tgcgttaagt gtaatagtga agaacaatgg 60 gaccgaagaa agcacgtggt tgcaaggctt gcggtaatca ggtcgacgac actttgtacg 120 tgcagtgcga tgaatgtgat gcgtggtggc atttctcgtg tgccggtata acggcatccg 180 tagaagccgt ggagaaatgt gcgtggttgt gcgaggagtg tgccaggaag acgctgagag 240 agcaatcatc gccacgcgag ggcaataagg agcccaagga aggaacctcg aaacacgtgg 300 atggggatct cgttcgtaac ctcagtttgg aagcagcgac ggatggcggg gcgcgcccgg 360 ttacaaaccc acagaggcgg ccgcttttat cgctcgatga ggccaacgac gagatagcgc 420 caggaacatc gacccacgtg gcagggggac ccgtacataa cctcaaccag gatgctgcga 480 cggaaggcgg ggtgcgcccg gtcatgacac caaagaggcg gccgctttca tcgctcgatg 540 aggccgacag aggtaaaacg tctgtctcat cgaacatcgt gcacagagga tcgtgtccta 600 acctcaacct ggatgcggca agtgatgacg tggcacgtca actcgccgtg ctgaagcggc 660 gacaggaggt ggagaaacgg cgcatggagc ttgaactgca gctgaagttc gtgcaggagg 720 aagaggcact tctcgggttt ggggaaaata agtctttttc aatttcacca caacttaact 780 cttttcagac tgaaaagaga acagtgaaac gcagcgaaga agaaaaagag gaaccagacc 840 taactccacg acaagaggct gcgcggcaca tggtttctaa agagctccca gttttctccg 900 gtgatcccgc tgagtggcca atttttatat cgcactacga gtatactacc aggcgatgtg 960 gatactcgaa ttgggagaat atgctgcgcc tgcaaaagtg cctgaaagga cctgccctcg 1020 aagctgttcg gagtcgattg gtgttaccgg acgtagttcc gcaggttatc gagaagctac 1080 gttccaaata tgggcggccg gtgcacttaa ttaaaacatt catcgagaag gtgcggaaga 1140 ttccggcacc ccaaactgac aagctggaca gtttagtcga gtatggggaa gcagtgcagt 1200 gtatggtgga ccatatggtt gcggctggtg aacgtgcgca tatcaccaac ccgctcttgc 1260 tgcaagaggt ggtcggcaag ttaccaacgg atcaacagtt acgttggtcg catcacatcc 1320 gcggaatgac ctcggtagat ctgtccacat tcagcgacta catggaggat ttggctgaag 1380 acgctgcgag gctgacgaca attgactctc cttcagtgcg cgggaccagc aagggaaggc 1440 ctacgaaggg ctacgtccac gcgcacgtgg atccagatgg agcgacaacg tccagcgcgg 1500 ctgagaggca gtgtgtatcc tgtaacgtcg 
cggggcatgt attgtcgaca tgcactaatt 1560 ttcgaggact gccggtaaag gatcgatgga ggcgagcgcg tgagctatcg gtgtgcttta 1620 gctgcctgga gaagcacaat tggcgatcgt gcaaaaatcg ctctcgttgt ggaatcaacg 1680 attgtgcatt ccgacatcac gcgcttctac acgacccgga tgcaatagag tcgccttcta 1740 ctgcagaccg agaacggcgg cacttcccga gaaccagtgg aagtcagacg caccaggtaa 1800 taaataatta tcatcagtcg aatccgatgt cggcgctttt tagaatcgtt ccagtaacag 1860 cgtatggacc cggagttatg ataaaaacct tcgcgttcct ggacgaaggt tcgtcaatga 1920 cgctgatgga cgaagacctg gcaaagcagt taggggtgaa gggagataga cgacctctat 1980 gtatcaagtg gacaggtgat acgactaggg tcgagccggc gtcgatgatg atcgatttac 2040 agatcggacc tgtgacgtcg acaaaaaggt tcaccctgaa agctgtgcgg actgtcacca 2100 gccttagcct cccacagcaa actttcacga tggatgacaa gagatgggac catcttaagc 2160 agctgccatt accggagtac cgtgatgctc ggcctcagtt gttgatcggg ctggacaacc 2220 ttcgattggc ggtgccgctg aagacgcgtg aaggccttgc aggggaaccg gttgccgtaa 2280 agactcggct tggatggtgc gtgtacggaa agacggctgg aagccaaatc ggaagggtgc 2340 tgcatatgtg cgagtgtgga gcatcggacg aaaactccac catccagggg gccttacgca 2400 agttttatga gttggagcaa ctcgggactg tctccagtga cgtgcctgat ccagatgaac 2460 gaagggcact gacgatcctg gaaacaacga cggtgcggat tggtaatcgg tttgaaagcg 2520 gtctgttgtg gaagacagac aacgtggagc ttccttcgag cttgggtatg gcgcgtcgca 2580 ggctggaatg cttggaaaga agaatggaac gtgaccctaa gctgaaaacc gtggtgcacc 2640 atcacatagc cgatatgatg gaaaagggtt atatccacaa ggcgacgtct gctgagcttg 2700 cagagtgtaa ttcgaagcga atttggtacc tgccgttggg agtggttacc aatccgaaga 2760 agccagggaa ggtgcgcatc atctgggacg ccgctgctaa ggtacaaggt acgtccctaa 2820 atgacatgtt gctgaagggg ccggacgagt taatttcttt gccaggggtg ttgttccggt 2880 ttcgaatgta cgggatagcg gtgtgcgctg atgtcaagga aatgttcctg cagatacgca 2940 tgcgcgacga agacaagcat gcgcagcggt tcctgtggcg ggaagatcct gctgacgata 3000 tcgcaacgta tttcgtggac gtcgttacct ttgggtcagc ctgctcccca gccaccgcac 3060 aatacgtgaa aaaccggaac gccaaggaac atgccgaaaa ataccctcgt gccgtacgtg 3120 gcatcttgac cagcacgtat gtcgacgact atttggatag tttcggaaca ttcgaagaag 3180 ccagtcgagt atccagagaa gtcaggggaa tcttctcgaa 
cggcgggttc gtactccgga 3240 actgggtttc caacaatccg gttgttttgg aacggctggg cggcgaaagc tccagtcccg 3300 gtatgaagag tttgacatct acggcggatg atggagaacg ggtgctcgga ttgcggtgga 3360 acccgagctc ggaccaattg tccttttaca cgcaggcgtg tgtgggaatg gcggagatat 3420 ttgagacgga gtgtacccct accaagcgag aagtgctcaa atgcgtgatg tcactttttg 3480 atccgcttgg actgttggca aactttacca tccatggaag gatcttgatt caagaccttt 3540 ggcgagctgg taccggttgg gatgaggcca tcagtcccag tcaaatgcga gattggcgta 3600 gatgggtgga tgtttttcct ctgatagccc agcttaggat tccgaggtgc tacttcccgg 3660 aggcacgaga gaaagtgtac gagaatgcgg agctacactt gtttgtggat gccagccagc 3720 tagcgtacgc ttgcgtgctg tatttacggg tcgtcgattc tgaaggagaa ccgcattgta 3780 ccatgctatg cggaaaggca aaggttgctc ctctgaagcc tttgacgata ccaaagatgg 3840 agttacaagc ctgcttgtta ggtgcacggc ttctgaagtc cacggaacag catcacccga 3900 tttctgttaa aaaacgggtg ctctggacgg acagcacggt ggcgctatca tggatacatg 3960 ccgaccctag gaattacagg ccatttgtcg cgaatagagt ggcggagatt caggagaaca 4020 ccaacgtgaa tgagtggcga tgggtgccca ctcaggacaa tccagcagac gaagctacca 4080 aatggaaagg gcgtgcgaac ttcaactggg atggcatttg gttccagggt ccatcatttc 4140 tgctgcagga tgaagagtct tggccgacga gaagactcgt ttcaactact ccggaggaag 4200 agatacggcg ggtcaacctt caccgtgaga agttgaatcc tggacttctc cctctaaaag 4260 ctgaacgctt cagccgcctg gaaagaatga tcaggacgtt ggcgtggatt gtcaggtacg 4320 tggacaattt gatgagaaag gtgggaggag cccctctaca ccttgggatc ctctctcaag 4380 acgaattgga gagagcggag acgatcgcgt ggaagcaagc gcaaggggaa tattttcagg 4440 atgaagtacg agtcctgagt gtcggtgagg gaacaggaag gagtaccgtg cctaaggaaa 4500 gtcctatcta tggtctctta ccctacgcgg atgagcgtgg tgttttgcgc atgcggggac 4560 ggattggagc agctccggaa ctgccatatg ctgccaggta cccaatcgta ttgccacgtg 4620 acgcatggat aacccacctg ctggtggaca aatttcatcg ccggtttcga cacgccaata 4680 acgaaaccgt ggtgaacgag ctgaggcagt atttccaaat cccaaagatg agacggttgg 4740 tttcaaaagt ggttcggcaa tgcgtgttct gccatattcg acgaacattg ccacagatcc 4800 ccccgatggc tccattaccg aaacagcggc tcactgcatt cgtgaggccg ttcacatttg 4860 tgggactgga ctactttgga ccgctgttgg tgaggagagg aagagcacag 
gagaaacgat 4920 gggtggcgct tttcacatgc ctaaccataa gagcaattca tttagaagtt gtgagtagtc 4980 tttccacaga ttcctgtatt ttggcagtga gacgctttgt ggccaggaga ggcgctcccg 5040 ttgaggtgtt cagcgacaac gggacgaatt tcgtgggagc cagccagcag ctaaggaagg 5100 aaatcgacga gcgcaacgat gccttagctg cgacctttac caacgcgaac acccgatgga 5160 cgttcaaccc ccctggcgca ccccatatgg gaggggtatg ggaacgcatg gtgcgatcgg 5220 tgaaggctgc gatgagtacg atgacggaac tacagcgtac acctgatgac gagacgctgc 5280 ttacggtgat agtggaagcg gagggaatga tcaacacacg cccactgacg tacatcccgc 5340 tggaatcggc ggatcaggag tctcttactc ctaaccactt cttgctgggc agttcatcgg 5400 gagtgaagca gagaccggtg gcaccgacta gccttcagac ggggttacgg agcaactgga 5460 aaatggtgca acatatcctg gacgggtttt ggagacggtg gataaaagag tatcttccgg 5520 tgttggcacg gcaaagcaaa tggtttgaga ctgtgagaga gattgaggtt ggagacattg 5580 ttctgatagt cgacggtggc gctaggaatc agtggaagag agggatagta gaacgagtgg 5640 tttcgggagc cgacgggcgg atacgacaag cttgggtgcg aacaaacaca gggaccctca 5700 gaaggccggc ggctaaactt gccttattag agataagaaa gggtgacaaa tagcgtattg 5760 gtcacgggct ggggga 5776 // ID RETRO23_AG_LTR repbase; DNA; ANG; 720 BP. XX AC . XX DT 06-FEB-2003 (Rel. 4.1, Created) DT 06-FEB-2003 (Rel. 4.1, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO23_AG DE retrotransposon - a consensus. XX KW Long terminal repeat; retrotransposon; RETRO23_AG_I; KW RETRO23_AG_LTR. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-720 RA Jurka J. and Drazkiewicz A.; RT "RETRO23_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 7-7 (2002). XX DR [1] (Consensus) XX CC 4 bp target site duplication. 
XX SQ Sequence 720 BP; 172 A; 112 C; 179 G; 257 T; 0 other; tgtaatgtga tgatcgtaca actgtacttc gagtaccgtc gcggtaggtc agtgttttac 60 tgacatgagc cgaggtggaa taaaacttcg ttcgaagcgg acttttcgac acttgatcgt 120 ccaggcgatc atagaaacct ttaggaggtt tcccttactg gaaacgttgg agtttagagg 180 ccttaccttg ggtaggggga cataactaaa ctctttaaat gagaagtcga tagatgtagg 240 atgggcatag tcctatcctg acagtccgaa cgaatctgac ttgccatgat cggagttggt 300 tttatttata ctcaaaagaa gttaagggtt tctcacattt ggttccgggt tgtatgttgg 360 aaagttattg ttttcactca gctagttacg tttgtctttt gttgacaaga acgatggttc 420 ttgtcactaa gttcctgttt tgggtttaat gcatatactg ttcctgcttt gggttataat 480 gcataaactg ttcctgctta tgttgtacgt ttcggttggt tctgataagt gtgtcacaat 540 tgtttttata tgaacggtgt ggcctgagct cttccgggtg tgtatggtat gatctgattt 600 actgaatgtg ttataatgaa tgtgttacaa tggtgtggtt tggctttcca aagtgtgtta 660 caatgtgtct atagctgata gtgtgtatta taataagttc tgtatagcag atatgctaca 720 // ID INVADER1-I_AG repbase; DNA; ANG; 4626 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 13-DEC-2002 (Rel. 7.11, Last updated, Version 1) XX DE INVADER1-I_AG, an internal portion of the INVADER1_AG Gypsy-like DE LTR retrotransposon - a consensus sequence. XX KW GYPSY superfamily; INVADER group; INVADER1-I_AG; INVADER1-LTR_AG; KW INVADER1_AG; LTR retrotransposon; endogenous retrovirus; gag; KW integrase; protease; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4626 RA Kapitonov V.V. and Jurka J.; RT "INVADER1_AG: a family of Gypsy-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 2(11), 15-14 (2002). XX DR [1] (Consensus) XX CC INVADER1_AG is a member of Gypsy-like retroviruses that belong CC to the INVADER group originally identified in Drosophila (see CC description of INVADER1_DM in drorep.ref). 
CC Members of this group encode one long polyprotein composed of CC the gag, protease, reverse transcriptase and integrase domains. CC A similar 1363-aa polyprotein, called INVADER1_AGp, is encoded CC by INVADER1-I_AG (positions 488-4576). CC Positions of domains in INVADER1-I_AGp: CC gag: 1-350; protease: 360-540; reverse transcriptase: 550-1050; CC integrase: 1085-1363. CC INVADER1-I_AG is flanked by long terminal repeats, CC INVADER1-LTR_AG. CC Solo LTRs or proviral copies are flanked by 4-bp target site CC duplications. CC INVADER1_AG has been integrated into the mosquito genome during CC the CC last one million years. There is 99% identity of INVADER1-I_AG CC copies CC with the consensus sequence. XX FH Key Location/Qualifiers FT CDS 488..4576 FT /product="INVADER1-I_AGp" FT /translation="MATVKQLCEDFTKIGLVRECERLGLETTGGKVEIANR FT IVKYRATVASGDNAGPSGSGHRAEPANATDDVENYAGNQPYESCDDDSEEK FT ADLDLPEEEHGDSFVAEDDDETEEDPFQTAVRISTPKRPQRVYAFRDVEDS FT IETFGAEDGHDVRIWLAHLDSVSKSAGWNDEQKLIMLRKKMTGIARKFVSS FT LRNVQTYAILKKELIAEFAPFVRSSDVHRILANRKKETAETMREYVYEMQR FT IAAQIDLDEPSLCEYIVNGVTDDDFFKSLLYEAQTIRVLKEKLLNFEKVRM FT ARKKKTTDKEENKRVLSSSSRVDKRAEQRCYNCGNKGHQARACAQTQGGPK FT CFSCREYGHKASECARNKSVVPAKINVTEESVGMVDVVLNKTSVKALFDSG FT SNQNLVTIGCYKRIEGSPLIDTSMWFQGFGGMRTKAIGMFTVDVTVDDNVF FT SGVRFFVVPNESMSYDAVLGRDSLNYFEVTMTTAGVKVRPYGSTDEMFSIV FT CDNEDNLDVSPRFSERVKAVISGYKPAGNVNSRVETKIILHDETPVRSSPR FT RFAPGEKAVLEKTIDEWLAAGIIRESESDFASPVTLARKKDGSLRVCVDYR FT ELNRKMVKDCFPMRNIEDQIDRLKSARVFTTLDLKNSFFHVPVEKSSQRYT FT GFVTHTGQYEFLRTPFGLVNSPASFSRFVADVFREFIKSERVLVYVDDLII FT PSLDEESNFQTLKELLNVASENGVQFNWKKSQFLKDEVEYLGYVIRGGCYR FT IAPSKLRSVQLFPEPKNVKQLQRFLGLTSYFRKFIAGYATISKPLTSLLQK FT GVEFVFGEEERSSFDELKRCLVTDPVLKIYDESAETELHTDASKYGYGAAL FT MQKSDDDKFHPVAFMSQQTSNAEKNYSAYHLEVLAVVRAVEKFRVYLLGIK FT FKIVTDCAAFGHTLKSKELSARIARWALMLEEYEYEVVHRPGSSMKHVDAL FT SRAPVMIVKSDPMIEAIRKMQQSDERAKAIIELLKTQSFEDFVMCDGLLMK FT VVKGREVIVVPSGMQSDLIRRIHEKGHLGARKIEGIIEQEFYIPNASEKIK FT QTIECCVKCILAERKRGKVDGLLYPIAKGDVPLDTYHVDHLGPMDITEKRY FT
KYLFVVVDAFSKFTWIYPTKTTNSFEVIQRLTTQSEVFGNPRRIISDKGAA FT FTSNDFKRYCEDQDIERVEVTTGVPRGNGQVERVNQVIIAMLRKMSVNDPA FT KWYKHVANIQRWINSSPHQSIGVTPFEAMFGVPMRHEGDLRLGELMEEIRV FT AQHHDQREQTRAGARVSIEKAQEEQRRSYDLRARSATTYREGDLVVIKRTQ FT FGPGRKYAAEYLGPYKVTSVRPHDRYDVEKINGEGPKVTSTASSHMKPYRF FT " XX SQ Sequence 4626 BP; 1323 A; 804 C; 1339 G; 1159 T; 1 other; tgggggctca accgggatac ttttttccca aacgagtgag gcttttgcga tgaaagcaat 60 cggtgcgata ttgtgtgatg tttcgtgagg ctattttgtg attaaatagc aaaaaactgt 120 gaggcttttt tataagcaaa tagtgagacg ttttgtgaaa gagaagaaca gtgaggctta 180 tgtagcaacg cgtgcggatc agtgaggctt tcgtagcaat ccgtgcgaat cagtgaggct 240 tgtgaagcaa ttagtgaggc tttattttta gcaaccagtg agaatccgtg agtgatagcg 300 aggcttttcg aagcacaccg agtgaagaaa ttaagcgagg cttcacagca atctgtgtgt 360 gaaaattctg tgtgaatttt acgggtcaac tagtgtgaca ttttgtgcgt gagtgcggga 420 agtcgcgaat tgcctggtag tgtgtgtcgg ctttgtgagt gaaagtaccg cgagagagag 480 agatagaatg gcaacggtca agcagctttg tgaagatttc acgaagattg ggttggtgcg 540 cgagtgtgag aggcttggcc tagaaaccac tggaggaaaa gtggaaatcg cgaatcgaat 600 tgtgaaatac cgagcgacgg ttgcgtcagg ggacaacgcc ggtccgtccg ggagtggaca 660 cagagcagaa ccagctaacg ccacagacga cgtcgaaaac tacgcgggta accaaccgta 720 cgagagctgc gatgatgatt ccgaggaaaa agcggatctg gatcttccgg aagaagaaca 780 cggtgatagc tttgttgccg aggatgacga tgaaacggaa gaagaccctt ttcaaactgc 840 tgtgcgaatc tcgacgccta aacgaccgca acgcgtttac gcatttcgag atgtggaaga 900 tagcattgag acgtttggag cagaggatgg gcacgatgtg cgtatttggc tggcacacct 960 cgattccgta tcaaagtcag caggatggaa tgacgaacaa aagctaatca tgttacgtaa 1020 aaagatgacg ggaatcgcaa gaaagttcgt gtcgtctttg cgtaatgtgc aaacttatgc 1080 gatattgaaa aaagagctga tcgcggaatt tgctccattt gtgaggtcga gtgatgtgca 1140 tcggattctc gcgaatcgga agaaggaaac ggccgagacg atgcgagagt acgtttacga 1200 aatgcaacga attgctgccc aaatcgattt ggacgaacca agcttgtgtg agtacatcgt 1260 taacggtgtg accgacgatg attttttcaa atcattgctg tacgaggcgc aaacaattcg 1320 agtattaaaa gaaaagttgc tcaactttga aaaagtgcgc atggctcgaa agaagaaaac 1380 gacggataaa gaagaaaaca aacgagtttt 
gtcatccagt agccgcgttg acaaacgggc 1440 ggagcagcga tgctacaatt gcggaaacaa aggacaccaa gctcgcgcgt gcgcgcagac 1500 acagggtggt ccgaaatgtt tctcgtgtcg tgagtacggt cataaggcga gcgagtgtgc 1560 gcggaacaaa agcgtcgttc ctgcgaaaat caacgtgacg gaagaatcgg tgggaatggt 1620 tgatgtcgtg ttgaacaaaa catcggtcaa ggcattgttt gacagtggaa gcaaccaaaa 1680 cttggtgaca ataggttgtt acaaaagaat cgagggatca ccgctgatcg atacttcgat 1740 gtggttccag ggctttggtg gcatgagaac aaaggcgatc ggcatgttca cggtggacgt 1800 tacggtggat gataacgttt ttagtggtgt gcgatttttt gtggtgccaa atgaaagcat 1860 gtcttacgat gcagtattgg gcagagattc cttgaactat tttgaagtta cgatgacaac 1920 ggcgggtgtc aaagtcaggc catatggttc aacggatgaa atgttttcta ttgtgtgtga 1980 caatgaagac aatttggatg tgtctcctcg attttcggaa agagtaaagg cggttatttc 2040 ggggtacaaa cctgcgggaa acgtgaatag tcgtgttgag acgaaaatta ttttgcatga 2100 cgagacgcct gtgcgttcgt cgccaaggcg ttttgctccg ggtgaaaagg cggtgctgga 2160 gaaaacaatc gacgagtggt tagccgcggg aataattcga gaaagtgaga gtgattttgc 2220 gagtccggta acgttagcga gaaaaaagga cggttcctta cgcgtttgtg ttgattatcg 2280 cgaacttaat cgaaaaatgg ttaaggattg ttttcccatg aggaacatag aagatcaaat 2340 cgatcgcttg aagtcagcca gagtttttac cacacttgac ttgaaaaatt cgttttttca 2400 tgttcctgtg gaaaagtcga gccagcggta cacaggcttt gttacccaca caggccagta 2460 cgagtttctt agaacgcctt tcgggttggt caatagtcca gcgagtttca gccggtttgt 2520 agcggatgtg tttcgggaat tcatcaagag tgagcgtgtg ttggtgtatg tggatgattt 2580 aataattcct tcattagatg aggaaagtaa ttttcaaacg ttgaaggaat tgttaaatgt 2640 cgcgagtgag aacggtgtgc agttcaattg gaaaaaatcg caatttttaa aggatgaagt 2700 ggagtatctc gggtatgtga ttcgcggcgg gtgttatcgc atagcgccga gtaagttgcg 2760 atcggttcag ctttttccgg aaccgaaaaa tgtgaagcag ctgcaaagat ttttgggact 2820 tacgagttac ttccgaaaat ttattgctgg ttacgcgaca atttcgaagc ctttgacaag 2880 tttgcttcag aaaggtgttg agtttgtgtt tggtgaagag gagcgttcga gttttgatga 2940 gttgaaacgg tgtttggtga ccgatccggt gttaaagatc tacgacgaaa gtgccgaaac 3000 cgagctccat acggacgcgt caaagtacgg ttatggtgct gcgcttatgc agaagagcga 3060 cgacgacaag tttcatcctg ttgccttcat gagtcaacaa 
acatcaaacg cggagaagaa 3120 ttatagtgcg tatcatttgg aagtgctagc ggtggttcgc gctgttgaga agtttcgtgt 3180 gtatctctta ggcatcaagt ttaagatcgt tacagattgt gcagcgtttg ggcatacttt 3240 aaaatcgaaa gaactgtcgg ctagaatcgc gagatgggct ttgatgctcg aagagtatga 3300 atatgaagtg gtgcataggc caggttcatc gatgaagcat gtggatgcgt tgagcagggc 3360 accggtgatg attgtgaaaa gcgaccctat gatagaagcg atcagaaaaa tgcaacaaag 3420 tgacgagcgt gcgaaggcaa ttattgaatt gttaaaaacr caatcttttg aagattttgt 3480 catgtgcgat gggctgctga tgaaagtagt gaaaggtagg gaagtgattg tggtaccatc 3540 ggggatgcaa agcgatttga tacgtaggat acacgaaaag ggtcacttag gagctcgtaa 3600 gatagagggt attatcgaac aggagtttta cattccaaac gcgagtgaga aaataaaaca 3660 aacgattgag tgttgtgtga aatgtatcct cgcagagcgt aaaaggggaa aagttgacgg 3720 tttattatac ccaatcgcga aaggtgacgt tccgttagac acgtatcacg tagaccattt 3780 gggtccaatg gacattacag agaaaaggta taaatatttg tttgtagtag ttgatgcgtt 3840 tagtaagttt acttggatat atcctactaa aacgacgaat tcatttgaag taattcagcg 3900 attaacgaca cagagcgagg tatttggtaa tccaaggcgt attataagcg ataaaggggc 3960 tgcgtttacg tcaaacgatt ttaagcggta ttgtgaggat caggatatcg agcgtgtgga 4020 agttacgaca ggtgttccgc gcgggaacgg gcaggtagag agggtaaatc aagtgattat 4080 tgctatgttg cgaaagatga gtgtaaacga tcccgcaaag tggtataagc acgttgccaa 4140 tattcagcgg tggattaatt ctagcccaca tcagagcatc ggtgttaccc cttttgaagc 4200 gatgtttggg gtaccgatga gacacgaagg agatttacga ctaggtgagc tgatggaaga 4260 aattcgagtg gcccagcatc acgatcaacg agagcagact cgagctggtg ccagggtttc 4320 tatcgagaaa gcccaagaag aacaacggag atcgtacgac ttacgagcgc ggtcggctac 4380 aacttaccgc gaaggcgacc tggtggtgat taagcggacg cagttcgggc ctggaaggaa 4440 gtacgcggct gagtaccttg gaccatataa ggtaaccagt gttcgtcctc atgatcgcta 4500 tgacgtggaa aagatcaatg gcgaagggcc aaaagtgacg tcgacggctt cgtcacatat 4560 gaaaccctat cggttttaat gcggtgagga tccttcgggg cgaaaggatc ggtcgaggaa 4620 aggccg 4626 // ID CR1-4_AG repbase; DNA; ANG; 4411 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 
8.02, Last updated, Version 1) XX DE CR1-4_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW AP endonuclease; CR1 clade; CR1-4_AG; DNA/RNA-binding; PHD finger; KW Non-LTR retrotransposon; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4411 RA Kapitonov V.V. and Jurka J.; RT "CR1-4_AG, a subfamily of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(2), 15-15 (2003). XX DR [1] (Consensus) XX CC CR1-4_AG is a young family of CR1-like non-LTR retrotransposons. CC The CR1-4_AG consensus sequence was reconstructed based on CC multiple alignment of ~50 copies identified in the CC sequenced portion of the genome. Given the ~1% divergence CC of these copies from the consensus sequence, transposition of CC CR1-4_AG occurred less than 1 million years ago. CC The 3' terminus of CR1-4_AG is composed of the TAAA CC microsatellite. CC CR1-4_AG encodes two proteins: a 349-aa CR1-4_AG-ORF1p CC (positions 363-1409) and a 965-aa CR1-4_AG-ORF2p (positions CC 1430-4324). CR1-4_AG-ORF1p is a DNA/RNA-binding protein that contains CC the PHD domain (positions 3-40). CR1-4_AG-ORF2p is composed of CC the AP endonuclease and reverse transcriptase domains.
XX SQ Sequence 4411 BP; 1287 A; 967 C; 779 G; 1377 T; 1 other; cggggagaga gttggccact gaagcgtttg gacgtgtttc tgcgttgctc ctgtgtactg 60 tctgtttttg gtgtgatttt ggtttgaaat cgttcgtttt tgagtttgtg aagctgttcc 120 tgtcgcaagt tttgctgtgt tgctgtgttg aagcattttg ttcctgcatc gtacgagaat 180 ttctacgaga acgcgtgcct gaatttgtat tttctgcgtg atttgtgttt catcggcgac 240 agcggttcag tgttattgac cctgcttaat acactcagag agtaattcgc accagctcat 300 caataagtgt cctctccgtt cactctcact gctcacctaa gtgatcgata ctctctctca 360 taatggactg tgcaatctgc tctactacta tcaacaaaga tccggttgtt tgtattggca 420 atcttccttt ttcggagtgc aattctgcct ttcatccgga atgcattaaa cttgccgcca 480 cttgtgttaa ggaggtggca cgcaatcgtg gtctctgctg gatgtgcgag aaatgccgtg 540 actcgagatc agatttattc tcatcaattt cgtgcttgat gaatacactg aaagatgagt 600 tgaaaaatgc aatacggagt gagcttgatc aaaggatatc tcagctggat ccaaatagag 660 tgatgccgca agagcgtgag aaaatcgctc cgactgtgag taccatctct ctcaccgata 720 aaacattcca cacatccacc gatactccta tgtcaccaac accaactccc gtcaaacaga 780 actcactacc acactctcaa tcgcgtatgc tctctgataa tgaacacata aatccaacac 840 aagcaattct tcacaccggc actgccaatg atcacatcaa cacagacacc atacaattca 900 ttcctgcacc tgagccaaaa gtttggatgt ttgtaactag gattgcaccc actgtaactg 960 aggagaacat gaaaatgttc attctaggaa ggttaaaatg cactgactgt tcggtgaagt 1020 gtgtcatacc aagaggtcgc gttacaagct cactaaagta tgtgtccttt aagatcggca 1080 ttccatcgga gtttggcgag ctcgctttct ctccttcaac ttggccatgc ggatttgttt 1140 accgccagtt tgaatttcac caacgtacac agaaacaatt cacaccaaca ctcccggtat 1200 cttgctttcc tgcttcaaac aattcaactg ccagaagttt ctcaactact aattttatgc 1260 ataatgatgt caactgcata aatgtgatcc caccaacaca cacaacacaa catagtccat 1320 caccgagcca tcatcttaaa aatgcaaact cacctgaaac tcacctgact caaaayaatt 1380 cctccggttc gactttttta agccaacatt aagtcattca actgtaccta tggagccagt 1440 actaaacaca ttcaatatat tttatcaaaa cgtgagaggc ttacgtacta aaacctctga 1500 atgttttgct aatactgcaa tcgccgactg ggatgttatc gttcttacgg aaacgtggct 1560 agatgacagc tttccatctg agcttttgtt tgataacaac cgctttaata cattccgtac 1620 ggatcgttct gcagcaaaca gtaacaaatg 
tagaggtggt ggtgtcctcg ttgcgatcaa 1680 tgcaaattat gcttcctctc tgtgctcgac aaacacatct acaattgaat gcctttgggt 1740 tcgagtaaaa gttcttaatg tttcgctaat catcggatca ttttatttgc ctcctgatca 1800 atcagccaac atggacacca taaatgcatt ctgtaattca ttgcacttaa cgagagagaa 1860 atataagaat gactttttta ttctgttcgg ggacttcaat caacctaatc ttaagtggga 1920 tatcaacggc aaatttccga cactgaatct tatgcttacc cgactatccc ctaccagtca 1980 agctctactc gatgaactaa gctttgaagg tcttcgccaa cttaacactg ttctaaacca 2040 caataacaac atgttagatc tagtgtttgc gaacgacaaa gttactgatt acatgaggcc 2100 aatcgaatta tgcatcgaaa gtattgttga acctgatgga catcatccag ctttacttac 2160 atacttcacg ctcccgcaat acagtgtgcc gtcatctaaa cccccacgtc aagcagattt 2220 caattttaga cgaactaatt tcactgatct cgtttctgca cttaaccaaa tcaactggga 2280 ctctatcgct gaccatgacg acatcaacga cagtgtggct gaattttctt ctcaaatgaa 2340 tgaactgtat gaacaattca tcccacggtt taatgttcgt gctcatccac catggaccaa 2400 ttctgcacta cgtttggcta aacgtcgaag atcacgggcc cttaaaaaac tgcatcgact 2460 aaaaaacagc actaatcaaa taaattttgc tcgcgcctca aaaatatata agcagctgaa 2520 ccgaaccgcc tacgccaact acgtcaggaa aattgaaatc aatatcaaaa gacatcccac 2580 atcattctgg aaatttgcta aagataagga atcctgtgga cgacttcctt cctcgatgca 2640 gtttgaggga aacaccatta ccggagatga agagttttgc aatgcatttg cctcatattt 2700 ctcttctgtg tacaccaaca attcctcggt accgtctaac tcaacatcgg ctttatcctt 2760 cataaatgat gaggttaatt tatgcacacc actaatcaac gatgatgagg tggaatctgc 2820 tatttcgttg ttgaagcttt cctacgcacc tgggcctgac aatattccca gcgcaattct 2880 cattaactgc aaggctgctc tcattcccat actgaccaaa ctattcaaca aatccttgca 2940 atcgaaatgc ttcccgcgtc tttggaaatc atcgtggatg tttcctgttt ataaaaaatc 3000 tgacaaaagt aatgtgtgca attatagagg aatttcaatg ttatgtgcgt gcagtaaact 3060 gtttgaaaag ataatgtctc gtcatatgct ccaagccttt tcaccattaa tttctaatgt 3120 tcaacatggc tttatgccga aacgatcgat tgagaccaat ctaatatact tacttaactt 3180 ttgccactcc tatattgaca agggcttaca ggttgacgtc atttatactg acttctgcgc 3240 tgcttttgac aaggtcaacc actttctatt gctatctaaa ttatcaaagt atggtgttca 3300 cacaaatgtt gttgagtggt taagaagtta tttaactgat 
cgttgcatta acgttaagat 3360 aggaactagt ttatctgcca cattccataa tttatctggt gttcctcaag ggagtatatt 3420 aggaccttta ctatttatta tatttattaa cgatgtcgtg tttgctattc ctcatgttaa 3480 attgttatta tatgctgacg atctaaaaat gttcctacct gttaaatatt ctgacgactg 3540 tgaaatgtta caagactcgt taaactattt ttctgcatgg tgttttaata atgaaatgtt 3600 acttaatgtt agcaaatgta gttgtataac attctcaaag aaaaaaaatc ctattattta 3660 taactacaaa atcaatgaag acagcgttcc acgtttttct caggttcggg acttaggagt 3720 tattttagac agtaagctta gtttatctag ccattatcaa actatcgtca ctaaagctct 3780 taaactgtta ggatttgtct tacgtgtctc ggccgacttt aaagatcctt tcagtttaaa 3840 aacattatac tgttctttag tccgtcccat cctggaattt gccagtgtag tttggtgtcc 3900 ccatcaaatc acatacatag ataaaattga aaaaattcag aaaaaaatca cacgtgttat 3960 gtttcatcgt cttccctggt caaaccaaat tccacgtcct tcgtataatg ttcggtgttt 4020 actatttggc ttagagactt tacagcatag gagaactacg gctcagataa cttttatgca 4080 taaattatta attggagatt ttgatgcacc tgacatttta aattttattt gcttttctac 4140 tccctctaga ggtcttagaa gtagagagct actaagaagc ccttttagat cgactggttt 4200 tggtgccaat gatccactgc tcaaaatgat tgatgtgtat aataggttag ggttatcagc 4260 agatttcaat caatccgtta gtcagctgcg tcagcatatt caagttagtt ctagggccat 4320 aatttaaact tgtactctgt aagcattaag ttatttaggc atactgcccg ataactttga 4380 tttaaataaa taaataaata aataaataaa t 4411 // ID GYPSY46-LTR_AG repbase; DNA; ANG; 328 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY46-LTR_AG is an LTR of retrotransposon GYPSY46_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY46_AG; GYPSY lineage; GYPSY46-I_AG; GYPSY46-LTR_AG; KW Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-328 RA Tubio M.J., Costas C.J. 
and Naveira F.H.; RT "GYPSY46_AG, a member of the Gypsy lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 85-85 (2004). XX DR [1] (Consensus) XX CC GYPSY46-LTR is a long terminal repeat of GYPSY46_AG (its internal CC portion is deposited as GYPSY46-I_AG). XX SQ Sequence 328 BP; 101 A; 79 C; 79 G; 69 T; 0 other; agttagaccg acccgctaga aaccagactg caagctggca ttccacaccg gctacgagca 60 gacgaagatg taacgctaca ccggccacga gcggacaacg gcgacatgcg caagtaatgc 120 gacacgcaga ccgatcgtga gaacggacca atgcagccag caaccagcgg cctcgttaga 180 acattagctc gttaggttta gtcagtcgaa gtcgaagtct agagtcagcc agatatagat 240 catagtttag ctttagtcag gagtaatcct gtttgtgtaa aataaaaatc ttttttttat 300 ggccaaccgg cctagataaa gattaact 328 // ID TRANSIB3_AG repbase; DNA; ANG; 2070 BP. XX AC AAAB01008960; XX DT 29-JAN-2002 (Rel. 7, Created) DT 29-JAN-2002 (Rel. 7, Last updated, Version 1) XX DE TRANSIB3_AG is a coding portion of a TRANSIB-like DNA transposon. XX KW Transib; DNA transposon; Transposable Element; KW TRANSIB superfamily; TRANSIB1_AG; TRANSIB3_AG; transposase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-2070 RA Kapitonov V.V. and Jurka J.; RT "TRANSIB3_AG."; RL Direct Submission to Repbase Update (27-DEC-2002). XX DR Genbank; AAAB01008960; Positions 16423 14354. XX CC TRANSIB3_AG is a copy of a young Transib-like DNA transposon. CC The copy encodes a 709-aa TRANSIB3_AGp transposase. 
XX FH Key Location/Qualifiers FT CDS 0..0 FT /product="TRANSIB3_AGp" FT /translation="TGKFNFLNHTIHYNRVSNMVSLQVREIVDQFNFGDSQ FT ITNINKALAYVDTKARNRELDIEAARSEVENIYLTRAQLQWDQCNRVRGTF FT EANNAEWLASKFPTQLLSNGASTSNTNSCSLPANYEQSSELAERMKIRFME FT HSETESANERTSPTRIARRKAYRLKEYNIMKQISLLLQSPRDLLISTDHSD FT MLSDEEVLAFYIDLGLTEAQYKSTMSSLVPSRFPSFNAIRRAEALCVPFHE FT QIVTTECAVKMEMQCLVEYTVRRILEMQKKVLRDFLTEKQLNVTLKLRCVF FT TYGIDATSGHGTDVLISVLSPIKLHLDDASHHIFWLNVTPQSYRFCRPISL FT QLAKPSKALVLQTTKHSVDEQISNLKPMHIVLEDDKAVDVEFEFVLSMIEG FT KVLSYMLDDSSTTCCLICCAAPSEMMDASILATSGFIGQEEPLVYSISPIH FT CWLTFFELLLQLSYRMDFKEWQVPEVQRTTFNERMKTVQQRLYEAFGVRVA FT ETADAGSSSSTNTMGYISRRVLADPALASTTLNIDQDLIERFRNILIAINS FT RHPLHPAKVQQYCSDTYRKYLTELYSWSRVPAAVQKVLAHAGQLIVRSPLR FT LGYVGKKSGEIEHNYFTSDRELCARTTSKQDALKDPFVEALTICSDPKISS FT ISLSNRIKRKKRAAYPDVVASFFIFDESYCETDDDDDSNSGDDDEDDLDSL FT DEFWTELDWLTNEDIAHVKG" XX SQ Sequence 2070 BP; 614 A; 461 C; 470 G; 525 T; 0 other; tttaatttct taaatcacac gatacactac aaccgagtga gcaacatggt ttcattacaa 60 gtgcgagaga ttgtggacca atttaacttt ggcgattccc aaatcacaaa cataaacaaa 120 gcgttagcct atgttgatac gaaggctcgg aaccgggagc tagacatcga ggctgcccgt 180 tccgaagtgg aaaacatcta cctacgtgct caactacagt gggatcagtg taatcgcgtc 240 cgaggcactt tcgaagcaaa taatgcagaa tggttggctt ccaaattccc gacccagtta 300 ctgtccaatg gtgcttctac atcgaataca aatagctgta gtctgccagc taactacgaa 360 cagtccagtg agcttgctga acgaatgaaa atacggttta tggagcattc agaagagtcg 420 gcgaacgaaa gaacatcacc tacacgaatc gctcgccgga aagcatatcg gctcaaagaa 480 tataacatca tgaaacagat cagcttgttg ctgcaatcgc cacgcgatct tctcatatca 540 acggatcatt ctgatatgct gtccgatgaa gaagtattag cattctacat agacctgggt 600 ctgaccgaag cccagtacaa atcaatgagc agccttgtgc cttcccgctt tccatcgttt 660 aatgccatta gaagagcaga agctttatgt gtgccgtttc atgaacaaat tgtgactaca 720 gagtgtgcag tgaaaatgga aatgcaatgt ctggttgagt atacggtgcg aagaatattg 780 gagatgcaga agaaggtgtt gagggatttt ttaacggaga agcagttgaa cgtactcaag 840 ttacgatgtg ttttcaccta cggcatagat gccacatcgg gccatggtac tgacgttttg 900 atcagtgtcc tatccccaat 
taaacttcac ctggacgatg cgtcacatca cattttttgg 960 ctcaacgtta ctccgcagag ttatcgtttt tgtagaccaa tttcattgca gctagcgaag 1020 ccatcaaaag cattagtcct tcaaaccaag cacagtgttg atgaacaaat ttccaaccta 1080 aaaccaatgc acattgtttt ggaggatgat aaagcagtcg atgtggagtt tgaatttgtg 1140 ttaagcatga tcgagggtaa agttttatca tacatgctgg atgattcgtc gaccacttgc 1200 tgtctgattt gctgcgcagc cccatcagaa atgatggacg cttcaatcct ggcatcgggt 1260 tttatagggc aagaggaacc gttggtgtat agtatttcac ccatccattg ttggctaaca 1320 tttttcgagc tgctgctaca actgtcatat aggatggact tcaaagaatg gcaagtcccg 1380 gaagtgcaac gaaccacatt taacgaaaga atgaaaaccg tgcaacagcg tctctacgaa 1440 gcgtttggcg taagagtagc agaagcagat gcagggtctt cctccagcac aaacacaatg 1500 ggatacatca gccgacgagt gttggccgat cccgctcttg caagtaccac gctcaacatc 1560 gaccaagatc ttatcgaacg cttcagaaac attctaatcg caatcaatag tcgtcatccg 1620 ctgcaccccg caaaggtaca acagtactgc agcgacacct atcgcaaata tctggagctc 1680 tacagctggt cccgagttcc ggcagcagtg cagaaagtct tagcacacgc cggtcaactg 1740 atagtacgct cacctctgcg tttaggctat gttggcaaga agtctggtga gatcgagcat 1800 aactatttca catcagatag ggagctctgt gcaagaacaa cttcaaaaca agatgctttg 1860 aaggatccat ttgtggaagc acttatctgc agcgatccta aaataagttc tatatcgctt 1920 agtaatagaa tcaagcgtaa aaaacgagct gcttatccgg atgtagtagc gagctttttc 1980 atatttgatg aaagctattg tgaaaccgat gacgacgatg attctaatag cggcgacgac 2040 gatgaggacg atttggactc gctcgatgaa 2070 // ID RT2 repbase; DNA; ANG; 6733 BP. XX AC M93691; XX DT 14-SEP-2005 (Rel. 10.09, Created) DT 14-SEP-2005 (Rel. 10.09, Last updated, Version 1) XX DE Anopheles gambiae RT2 retroposon. XX KW Non-LTR Retrotransposon; Transposable Element; RT2. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6733 RA Besansky N.J., Paskewitz S.M., Hamm D.M. 
and Collins F.H.; RT "Distinct families of site-specific retrotransposons occupy RT identical positions in the rRNA genes of Anopheles gambiae."; RL Mol Cell Biol 12(11), 5102-5110 (1992). XX RN [2] RP 1-6733 RA Butera S.T., Perez V.L., Besansky N.J., Chan W.C., Wu B.Y., RA Nabel G.J. and Folks T.M.; RT "Extrachromosomal human immunodeficiency virus type-1 DNA can RT initiate a spreading infection of HL-60 cells."; RL J Cell Biochem 45(4), 366-373 (1991). XX DR EMBL/GenBank/DDBJ; M93691; Positions 1 6733. XX FH Key Location/Qualifiers FT CDS 1533..2855 FT /product="RT2_1p" FT /translation="MKEQNAKLLEQITGMCQLLQEEKEEAKRREEKLEAQM FT EKLAAAHQRDRDVLNSLLAAKVGGGQPSASPRQPPTPLPRRSSAQPQQQQQ FT QQQRNQHEQEQPRASTSRAVMPPRSEALTAVRGDVVPELTYSEVVRRRYRG FT KATGKPRSQQQPQQQQQQRQLQRQAVGIAQHQQQQQQRQPQRQAVAGSQQQ FT QQERMQQQQQLQRKRKPRPDIIEVSPSEGETWDGIYDKVRKAIRLDAAHSE FT MKGHIKQGRRTHARLLRMELSKTANAPLMLEGVRKIIGDAGVSRLVTEMGE FT LLVVDIDPLATEEDIIAALDAKIGASAGVVSASIWELPDGSKRARIRLPVK FT SARQLEGLKLFLCDCVSKVRAAPPTPPERQRCFRCLEMGHIASNCRSTADR FT QNLCIRCGLTGHKARSCQNEAKCALCGGAHHIGHSECARSAQRCSRP" FT CDS 3280..6477 FT /product="RT2_2p" FT /translation="MVEQLGLIVINQGREYTFVGNGVALPSIVDVAFASPS FT IARPDTWVVSTSYTASDHRYVLYTVGGTPPSPEQLQNHQQTAQQSSQQQQQ FT QQQQQSLQQQQLSQQQQQQRQRQPSSQQGDSSSQRRVRHAGRRWKASQFSP FT SSFLEALFAADFVQRASTQEGMIAAMLKACDETMQRVTRLHQDPHRNIFWW FT SPLLARLRNNCEVARDRMLQTADLEERSIAAAEHRTARAELSRAIRASKRN FT LFQELIEIAEENAFGAGYRVVMSRLRGSRTPSEADRVVLERIISDLFPEHP FT PCDWSQLSNVGSVEGATTTAGIAPVTDDELLLIASQMANHKAPGLDGIPNA FT AVKTAIMLFPESSGFLYQDCLNRASFPAQWKRQRLVLLPKQGKPPGESSSY FT RPLCMLDALGKVLERLILNRLNEHLEEPSSPRLSDRQFGFRRGRSTVSAIQ FT RVVEAGRTAMSFRRTNGRDNRFLLVVALDVKNAFNTANWQSIASRLQAKGV FT PVGLQRMLRSYFEDRVLYFDTSEGPVVRHVTAGVPQGSILGPTLWNIMYDG FT VLDVPLPPDVEVIGYADDLALLVPATTTDEVRARAEEAVDQVQRWMQQHGL FT ELAPAKTEAVLISSKKTPPQVTFRVGDVEVQSSRSIRYLGVQLQDHLKWRD FT HVTKVSEKASRVVAAVTRLMQNHSGPRTAKSRLLAYVAESVLRYAAPVWAE FT ATQVRECRRMLQRVQRKAAIRVARAFRTVRYETATLLAGLVPICHLINEDA FT RVHQQLLAPDRAATREDIRATERQNTIDCWQEEWDADALQQDASRHTRWTH FT 
RVIPSVGDWQSRKHGDMTFHLAQVLSGHGFFRDYLCHNGFTSSPDCQLCVG FT VPETADDAFFECPRFAAVRQELLGEGGPDPVCPDTLQRHLLRDADSWSRIC FT EGAKRITAQLQRAWDEERAALAVNVIERQDEDAAELEAQRAEVRRARNERR FT NANRRAATARRREERRAGLPPTPPASPRTAQRRAALRERQARFRERRRNRR FT LLGMSGQAINNDDDGRERADLAAGPSGMRNRAIDEENEATDGGLNAAEEAA FT VVEAEVASR" XX SQ Sequence 6733 BP; 1602 A; 1879 C; 2003 G; 1249 T; 0 other; gagtgttcat cgagtagccg tcgagtgaag acggatgtaa acacgctccg acagagtttc 60 ggaaaaaagt gtgtaaaacg gcctcggatc gctcggaaag tgttaaaaac gagtttagag 120 cacccgtttt acctttagac ggtgttattt acccgaaaat cggtgatttt ttgtaaaagt 180 gaaaaatcgt gtttttgtgt atacatttgc ccccccccgg tagcaccaaa tttccgggtg 240 tgggaattaa gtgtaaaaat ctgcgatttt agtgccggaa acacttgtat ttaactaaat 300 ggacgcttct gaagcgattc caagtgtttt ggaaacgatc cgtgcagcaa aaagtgaaaa 360 aaacgcgaaa accaaaagtg tcgggggtgc taccgctggg gcatgtgtcc gcggtaaata 420 acgaaaaaaa cagtaaatcg tgaataactc ggtgagtttt ggtcggattc gagtgcggtt 480 ttcaccattg tgcgtgtctt tgtgcaaaca ataagaatca aacaaaaaaa cagtgaaaat 540 cggtgaaaaa atttttgaca tttgtgagtg acagttcagt aatacccacg ggacaaattt 600 ggccggggcg caattcccaa gggcaaaatt tgaattttcg gcccggtggc ggttggcggc 660 ggaaccgagg gtgccgaaaa attgtcaaat ttgcaggggt ggtagagcaa gtaacgccga 720 ctacgggaaa attattttcc cccctccccc ccctcccccc tcccccccag aggtcagaat 780 agagtagggc gcccgataag agtcggataa ggtcgaggtc gcccgaaaat tctcattttt 840 ggtagggtgg tagagcgtga ccccagcagt ccgggaaaat tattttcccc gctcccaccc 900 ccccccccct cccagtgaaa aattttataa gtgtcaaaaa gtcggaaaag tggtccaaaa 960 agggtggaat tccgggtcga gccgtttgga cgagtgaacg tgccgcgaaa atttgggagt 1020 ggtgcggaat aattccccac gtgtgtgacg caccctcctg aattttttcc ccaccccctt 1080 cccccctccc ccccgcccac cagggggcac aaattggacc tttttcgccc tgaacaccgg 1140 agttgaagcg gaaaccgaca tcatcgtgca gcagatcacc agctaggcgg cgcagaatcc 1200 agggtcattc cgaccccctg ggtcattcga cccccgggtc attcgacccc cagggttatt 1260 ttaccccacc catcgcgaac gtcggttatc gcgatggaag cacccgggag atcgacccgc 1320 tcggtgctag ggcgacgtcg gtggactgcc gcaccagctt ggctccttgc tccaagctct 1380 ttgcggccga gccacgtgtt gcgcttccta 
agctaagcgc cacaggcgct agtaagccca 1440 ttgccgagcc aaaagctgca tcagcgacac cggctccgga gctcgagttg ctcagagcta 1500 caatacaacg gctcgaggag cagaactgtg cgatgaagga gcaaaacgca aaactcctgg 1560 agcagataac cggcatgtgc caactgctgc aggaggaaaa ggaggaggca aagcgccgtg 1620 aggagaagct cgaggcgcag atggagaagt tagccgccgc acatcaacgc gatcgagatg 1680 tgctcaactc tctgctggcg gcaaaggttg gcggtggaca accgtcagct agtccacgtc 1740 aacctccaac tccgttgccg cgccgatcct ctgcgcagcc gcagcagcag caacagcagc 1800 agcagcggaa ccagcacgag caggagcagc cccgcgcgtc gacgtcgcgc gctgtcatgc 1860 cgccgcgtag cgaggcattg acagccgtcc gcggagacgt cgtgccggag ttgacctaca 1920 gcgaggtcgt gcggcgtaga tatcgcggca aagctacggg taagccacgc tcccagcagc 1980 agccgcaaca gcagcagcag cagcgtcagc tacagcgaca ggcagtcggt atcgcgcagc 2040 atcaacagca gcagcagcaa cgtcagccac agcgacaggc ggtcgctggc tcgcagcagc 2100 aacagcagga gcgtatgcag cagcagcagc agctacagcg taagcgaaag ccgaggcctg 2160 acatcatcga ggtgtctccc agcgaaggcg aaacctggga tggcatttac gacaaggtgc 2220 gcaaagccat tcgtctggac gcggctcaca gcgaaatgaa agggcatatt aagcagggcc 2280 gccgaactca tgctaggctg ctacgtatgg agctgagcaa gacagcaaac gctccgctta 2340 tgctggaagg cgtccgcaaa atcatcggcg acgcaggcgt cagtcggctt gtcacagaaa 2400 tgggtgagct gctggtagtc gatattgatc cccttgctac ggaggaagat atcattgctg 2460 ccctcgatgc taagattggc gcaagtgctg gagttgtttc tgccagcatt tgggaactac 2520 cggatggttc gaagcgagca cgcatccggc tacctgtgaa gtcggctcgg cagttggaag 2580 gacttaaact gttcctgtgc gactgtgtga gcaaggttcg agcagcccca ccaacgcctc 2640 cagagcgaca gcgctgtttt cgctgtctgg agatgggcca catcgcctcg aactgccgtt 2700 ccaccgcaga tcggcagaat ctgtgcatcc gctgtgggct taccggacac aaagcacgat 2760 cctgccagaa tgaggcaaag tgcgcactgt gcggtggcgc tcaccacata ggccacagcg 2820 aatgtgctcg ttcggcccaa cgatgttccc ggccctgaaa gttctgcagg cgaacctggc 2880 catggccgtg atgcccagaa cctggtgctg caagctgcca gagaggagaa agcagacgtg 2940 ctcattctct ctgatgttct gcgcccacct gaaaacaacg gccggtgggc attcagcagc 3000 tgcaaggcgg tagcggtggt agctgtcggt gagctaccaa tacagcgggt gtggtgcagt 3060 gaagctcagg ggttggttgc agcgcagatc ggcggagtgg 
ttttcatcag ctgctatgct 3120 ccaccaagcc ttaacctcgc agagttcgag cgcttcttgg aagcaataga actcgaaggc 3180 ttctcccacc ctcaagtcgt cgtcgccggc gattttaacg ccagacatga ggagtggggc 3240 agcccgagga cttgcgaccg cggggaagag ctgcacggaa tggtggagca gcttggccta 3300 atcgtgatta atcaaggtcg ggaatacacg tttgttggca acggggtggc tctcccgagt 3360 atagtggatg tggcattcgc gagcccgtcg atcgctcgcc ctgacacctg ggttgttagc 3420 acaagctaca ccgcgtcaga ccaccgctat gttctctaca ccgtgggagg aacaccacca 3480 tcccccgaac aactgcagaa ccatcagcag acagcgcagc agtcgtctca gcagcaacag 3540 cagcagcagc aacagcagtc cttacaacag cagcagttat cacaacagca gcagcagcaa 3600 cgacagcggc aaccatcgtc gcagcagggt gattcctcaa gccagaggcg tgtgcgtcac 3660 gctggccgcc gatggaaagc ctcgcagttt tctccttcct cattcctcga ggcgctgttc 3720 gctgcggact ttgtccaacg cgcatcgacc caggagggta tgatcgcagc catgctgaag 3780 gcctgcgacg agacgatgca acgggttacc cgactgcacc aagacccgca tcggaacatc 3840 ttttggtggt ctcccctcct ggctcgcctg aggaacaact gcgaggttgc ccgtgaccgg 3900 atgctgcaga cggctgactt ggaggagcgc agcatagcag cagctgagca caggacagca 3960 agggcggagc tcagccgagc aattcgagct agcaagcgga acttgttcca ggagctgatc 4020 gaaattgcag aagagaatgc gttcggggca ggataccgtg tcgtcatgtc ccggctccgg 4080 ggcagtcgga cgccttcaga agcggaccgg gtcgtcctcg aacgcattat atccgacctc 4140 ttccctgagc atccgccctg cgactggtcg cagctgagca acgttggaag cgtcgaggga 4200 gcaacaacaa cggcgggaat tgcaccggta acggacgacg agctgctgct catcgcgagc 4260 cagatggcaa atcacaaagc accagggctc gatggcatcc cgaatgcggc agtgaagacg 4320 gccatcatgt tgttcccgga gtcttccggg ttcctgtacc aggattgcct caaccgcgct 4380 tcgtttccgg cgcagtggaa gaggcaacga ctggtgttgc tgccgaagca ggggaagcct 4440 ccgggagaga gcagctcgta ccgacccctc tgcatgctcg acgcactggg gaaggtactt 4500 gagcgcctca tcctgaatcg cctcaatgaa catctcgagg agccgtcttc accccgactg 4560 tcggaccggc agttcggatt tcgtcgaggg cggtcgacag tgagcgccat ccagcgggtt 4620 gttgaggccg gccgtaccgc catgtcgttc cggcgaacga acggccggga taaccgtttc 4680 ttgcttgtgg tggcgttgga cgtgaagaac gccttcaaca cggccaactg gcaatccatt 4740 gccagccgcc ttcaggcaaa aggtgttccc gtcggcctcc aacggatgct 
tcgaagctac 4800 ttcgaggatc gtgtgctgta ctttgacacg agcgaaggcc ccgtcgtacg gcatgtaacc 4860 gccggtgttc cacaggggtc catcctgggc ccaactctgt ggaacatcat gtatgacggg 4920 gtgctggatg tgccgctacc tcccgacgtc gaagtcatcg gatacgcgga cgatctggcg 4980 ctgttggtac ctgctaccac cacggacgag gttcgcgcga gagcagagga ggccgttgac 5040 caggtccaac gttggatgca acagcacggt ctggagctgg ccccagccaa gactgaagcc 5100 gtcctgatct caagcaagaa gactccgccg caggtgacat ttcgcgtcgg tgacgtggaa 5160 gtccagtctt ctaggagcat caggtacttg ggcgtgcagc tccaagatca cctgaaatgg 5220 cgagatcacg tcacgaaagt ctccgaaaag gcgtcgcgcg tggtggcagc tgtaacgcgc 5280 ctcatgcaaa accacagcgg ccccaggacg gccaagtcaa ggctgctggc gtatgtggca 5340 gaatcggtgt tgcggtatgc tgctcccgtc tgggctgagg caactcaggt gcgagagtgc 5400 cgacggatgc tgcaacgagt tcagcgaaaa gcagccatca gggtggcacg ggcattccgt 5460 accgtcaggt atgagacggc caccctactc gctggactcg taccgatatg ccacctcatc 5520 aacgaggatg ctcgggttca ccaacaactc cttgctccag accgtgctgc aacgagggag 5580 gacatccggg cgacggagcg gcagaacacc atcgactgct ggcaggagga atgggatgcc 5640 gacgcactgc aacaggatgc tagccggcac acgcggtgga cgcaccgtgt aattccatcg 5700 gtcggcgact ggcagtctag gaaacatgga gatatgactt tccatctggc acaggttttg 5760 tccggacacg gatttttccg tgactacttg tgccacaatg gattcacgtc gtccccagac 5820 tgccagttgt gcgtcggcgt cccagagacg gcggacgacg ccttcttcga gtgcccacgc 5880 tttgcggcag ttcgacagga gctactcggc gagggaggtc cggaccctgt ctgtccggac 5940 accctccagc ggcacttgtt gcgcgacgcc gatagctgga gtcgtatttg cgaaggcgcg 6000 aagcgaataa cggcccaact gcagcgagcg tgggacgagg agcgggcagc attggctgtt 6060 aacgtcatcg agcgtcagga cgaagacgca gcagagctgg aggcccaacg cgcggaagtg 6120 cggcgggccc gtaacgagcg acgcaacgcc aaccggcgag cagcaacagc acgccggcgg 6180 gaagaacgtc gagctggact tcccccaacc ccaccagctt caccacggac cgctcagcga 6240 cgtgcagcgc ttcgtgagcg tcaagcgcgg ttcagggaga ggagacgaaa ccgtcgtctt 6300 ctcgggatgt ccggccaggc gataaacaat gacgatgacg gccgagaacg cgcggattta 6360 gcagccggac catctggtat gcgcaaccgc gcaatcgacg aggaaaacga ggccacggac 6420 ggtggcctga acgccgcgga agaagccgcc gtggtcgagg cagaagttgc ctcccgctag 
6480 gcgtggtatg cacgtaaaaa cagcgacgca tcgagcgtga aaaagcgccc tatttagggg 6540 aatgagtgaa tttttcggtt gaatttagaa atagaaataa aacatggaag gtgcttttcg 6600 cacggacaaa aggttggcca ttaagccatg aaaccccctg cagggtaagc cctcgcgggt 6660 aaaatgtagg ggagcgggag ggtttaattt tgaaacaaaa taaaaaaccc gtttataaaa 6720 aaaaaaaaaa aaa 6733 // ID HidaAg1 repbase; DNA; ANG; 5523 BP. XX AC AB090822; XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE Anopheles gambiae retrotransposon HidaAg1 DNA, complete sequence. XX KW Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; gag-like; HidaAg1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol. Biol. Evol 20(3), 351-361 (2003). XX DR Genbank; AB090822; Positions 1 5523. 
XX SQ Sequence 5523 BP; 1524 A; 1292 C; 1438 G; 1269 T; 0 other; gaagaggttc ccgaaggggt gacgtgtcaa cagtccaggc agaggagctc cgtgtttttc 60 ttccagattt tgttgtaaaa agacgcaaga agataggcat tttataagat aacagcagtt 120 aatcgcgttt acacttgaac cagtgtcata cgacagatta gggtgctgct acactccgcg 180 aattttatgc acacacatcg aaatgacagc cacacagcag ctggaggcag ttaatgtttt 240 ctccacaaag gggagcaagt ttcggtggta aagatgagtt tttcatctca aatgcagaat 300 gcgctcggat tgactcggag catgtcagcg gatacatcaa aaacaagcgt ggggaagcag 360 ttgcctgctt ctggaatacc aacactacgc gctccaatgg ccgctgggaa cgctggctct 420 gtagtctcca agacagtgga agacttacag cggtcgctag cagcagagaa agaggagaag 480 atgaagctca ccgtgctttt acaggaacta caagctcaaa ttagcattat gatgaaaaaa 540 tctcgagaga cgaaggaaga ggcacgtcgt gataaagaaa aagccatcag acatcgtgag 600 gagtaccggc gcgacatggc actaatccgt gaggagaata cgaaactcct ggcccaatta 660 atggcaatga aggtagttac aacaactgca gggagcattc catccgcttc cttatcgcag 720 aggcagcaat cttcaccgca gccctctatg gcatcggtag tggcaaatgg ggacacagcg 780 tctacatcac accgggtgac gctgactcaa agccagtata ggagggcacc aatttctaat 840 ttcgtggaat cagatggtat ttggcgcgag gtgactcgac gcaagtcacg tcggtctgac 900 aatcgccgga acgagaggga atctacgcag tatcaacaaa gcgtgcatca acctcaacaa 960 agttcccggg accagcagca tggtgcgcaa catcgacctc agacgacacg tccgaaccgg 1020 caagacataa ttgaggtaac atcatttaca ggtaaaatgt ggtatcaggt gtacaaacaa 1080 atacgcgaag cacctgaaat ggagaagatg aacgaaaaat tgcatattgg tcgtcgtaca 1140 gccaagctca atctacgcat gaaagtggct cgtagcattg acagttctga ggcgatggca 1200 cgcatacaag gagttctacg agatgaaggc tcagtccgtg tactcaccca gatgactgag 1260 gtcatcataa cgaatgtaga tcccttagca aatgacggag acattcgaag cgccatcggg 1320 aacgtgacag ggtctgcttc gagcatcgcg actatacaac tatggcagct gtcagatggt 1380 actcagaggg cacgagtttg cttgcccctg gctcatgcaa aactaataat tagactgcgc 1440 ctaaaggtgt tatacaccat gtgtgcagtg aaagaagctc ctcacactcc catagagaaa 1500 ctgcgttgct atcggtgttt ggaaaggggc catgtgtcgc gagactgtca cagcccagtc 1560 aatcattcga acgtatgtat ccgctgcggt actagtggtc acttagcggc cacctgcgag 1620 gcagaagtac gttgcgcttc ttgtgctggc 
ccgcatcgta tgggcagtgc tcaatgtgtt 1680 cagtctaatt ctcaatgatg attacgctgc aaattaatat ttcgaactgt agtacttcgc 1740 agaatcttat gcttcaagca gcgaaggaac aacatgctga cgtgatactg gtatcggagc 1800 tataccgaca cccaccaaat aatggtaatt gggcggtgga ctcatcggga agagtagcgg 1860 tggtagctgc tggatctcga ccaatccagc ggatgtgggg cagcgctgta ccgggtctag 1920 ttgctgctga cattggtggt ataaccttca tcagctgcta tgcttctcct cgaatgactg 1980 ttgctgagtt tgaggaattc cttaacgcag ttgaaatcga agtaagtgcg caccctaacg 2040 tagtgctagg aggggatttc aacgcctggc acgaggcttg gggaagcgct aggacgaagc 2100 gaaaaggcga ggagctgctc aataccgtcg aacagctcgg gttaatagtg ctaaaccgtg 2160 gtaatacctc aactttcacc ggacgaggaa tcgcaggaga aagcgtgata gacgtgactt 2220 ttgcaagccc atcgattgtg cgctacaata catgggaagt gcttaaaagt tattggtata 2280 gcgatcatcg ttatgtccga ttttctgttg acagctcatc tgttttaggt aatggtatac 2340 aacttcatcg tcatcaacac caacttcaac cacagcagcg tcgttttcat cgacaaagtc 2400 ctgcacaccg acgaaaacct cgctggcgcc gcgctggccg acgatggaag gtggggcaat 2460 ttctaccgga atctttttgt ttcgctctcg aagcagtcca cttcgcggag attgcaagga 2520 ctcccgagac acttcaggtg gctctctcta gggcgtgtga tgtagcaatg gaacgcgtca 2580 gttcatcgac accctactat caaacaaaac ctcaggtata ttggtggacg ccagagagag 2640 cacaattacg tgagctctgc aaagaggcta gcgacaatgc ccattcacgt gtggaccctg 2700 atgagagagc agcggcatcg gaaattcatc aagagaggcg aagtgagctg aaacgtgcca 2760 taactacggg gaaggggcag ttatttcaac agcaaataga cgaggttaat gcgaacgtgt 2820 atggttcggg ttatcaggtc gtcacctccc acttgcgcgg tagtcgcact cctcccgaat 2880 tggatcgaga agtgttggag cgcatagtca ccgacctgtt tcctgaccac aattcattcg 2940 attggcccat gcctacagaa acctcatccg aaccttatca catccgacct gtgacggatt 3000 tagagctaga acgtatagct gacgatatgt gctcgagaaa ggcacctggg ctggacggca 3060 ttcctaatat cgctctcaag actgccatca agaaacatac agcagtgttt cgttctatct 3120 atcaaggctg tttcgatcgc ggcgaattcc cggctcattg gaaaattcag cgtttggtgc 3180 tactgcagaa acccgggaag ccaccgggag aatcttcatc cttcaggcct cttggaatgt 3240 tgaatgggct tggaaaggtg ctagagcgcc ttattctgaa ccgtctaaat gagtttcttg 3300 agaacggtga gacttcccat ctttccccaa accagtacgg 
cttccgtcgg gggaaatcga 3360 cggtgcaagg tatcctgaga gtagttcaag ctggaagaac cgcgaaatct tttaatagaa 3420 caaacggacg tgacatgcgc tgtttgatgg tggtttcctt ggatgtgcgg aatgcattta 3480 atacggcgag ctggaaatcg attgcaatgg cgctaagatc caaagaagtc cctgcttctc 3540 tacaaaagtt actgcagcat tggatgactg atcggcaact tgtttttgac accgatgatg 3600 gccccgtaac gcgaaatttg tctacaggcg ttccacaggg gtccatattg ggccccacat 3660 tgtggaatgt tatgtatgat agtgtgctgg acgtacagtt gcctgaagga gcggaaatca 3720 tagcctttgc cgatgatctg ttgttattag atccgggaat tactcctgaa gcggcttcgc 3780 agcgtgcaga agaagcggta tccgcggtta atctttggat ggaaaatcat tgcctggagc 3840 tggcgccggc caagaccgaa ttggttacaa tctcgagcaa aaggcagggc aatataaacg 3900 ttccggtggt catcaatgga gtggagcgga gaacgacccg aagtattcgc taccttgggg 3960 tcgtgatcga caaccaattg tcttggaagt cgcatgtaga gtattgcacg acgaaagcgc 4020 ttcgaacagc aaaggcgctt gggtgtctta tgcgcaatca cagtggcccc aagtgtgcaa 4080 aacgacgtct tctggcatcc gtagtagact ccatccttcg ttatgccgcg cccgtttggc 4140 atgaagctac caagaatcag gaatgccgga ggatgctgca aagggtgcaa aaacattgtg 4200 caataaaagt gtcaagcgca tttccgacgg tacgttatca aacagctgtt gtgctagcca 4260 gcatgatacc tatctgtctg ttggtacaag aagacgctcg atgttaccaa cgacaacaag 4320 aggcgggagg agccctttcg gcagggatac ttcgagcaga ggagcgcatc aacaccatgc 4380 aaagctggca ggaagaatgg gacgcggacg caagtcaagc tgatgccagc agattcgtgc 4440 gatggacaca tcgtgtcatc cctgacatcg cggcatggca tttccgaaga cacggagaag 4500 ttaacttcca tttgtctcag gttttgtccg gtcatggctt tttccgtgac gacttgtgtc 4560 ggatggggtt cacaccgtcg ccagattgca tcaggtgtac gggtgttcca gagactgccg 4620 agcacgcaat gttcgagtgc cccagatttg cggaaatcag acaaaagctg cttggtgaag 4680 ccaacactga cgcgattacg cccgaaacac tgcagttcca tctcctacaa agtcaagaaa 4740 aatggagcag gatcgctgaa gctgcgaagc agatcacctc cgcattgcag cgggactgga 4800 acgaagaacg agcgcgtctg gcagtttcca gcacattatc accctcgcat cctgttggac 4860 caagcgatag aaaccaggtc attgctgcaa gacgtgaacg ccgcaatgcc aggcgccgtg 4920 aacgaagggc ccgtgagagg gaggcacagc aacaaaaccc atctcttgtc ttttcagcgt 4980 cttcagaagc aacagaaggg cgcgaatcag cccacccaga acgcagagag 
caggtacgac 5040 cacagcgcag gatcaggcaa catatgccgc aacaaaaaga ggtagtcgag ctttcagacg 5100 tcacccaata tgcaacagct atcagtgagg atgtctactc ctccaaccct attcagggag 5160 gacttacggc ggcagagtca gcagtggcta cagaaaccga aatatcctct cgctaatcgg 5220 tgacacaaca acaaatcgga agctccttca gactcgtcag tgtcctgttg gaattgggtt 5280 ggatagattt gtatttttct ttatttttgt cttaaaaatc attttgcttt attttgtttc 5340 gaactatcaa atagttatta ttaatgtaat acacacacaa gagaactgct atgacggcat 5400 tacaaaatcg cacttactaa ccctcgcggg aattgcatgt tgcgaaaggc gagggagggt 5460 tattctactt tgtttgtata aaataatacg aaataaatct cccgacattt atattaaaaa 5520 aaa 5523 // ID HELITRON1_AG repbase; DNA; ANG; 6666 BP. XX AC . XX DT 11-NOV-2002 (Rel. 7.1, Created) DT 11-NOV-2002 (Rel. 7.1, Last updated, Version 1) XX DE HELITRON1_AG, a rolling-circle DNA transposon - a consensus DE sequence. XX KW Helitron; DNA transposon; Transposable Element; HELITRON class; KW HELITRON1_AG; Rep/helicase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6666 RA Kapitonov V.V. and Jurka J.; RT "HELITRON1_AG, a rolling-circle DNA transposon from African RT malaria mosquito."; RL Repbase Reports 2(10), 8-8 (2002). XX DR [1] (Consensus) XX CC HELITRON1_AG is a first example of HELITRONs (rolling-circle CC DNA transposons) present in insects. The A. gambiae genome CC contains many families of HELITRONs. HELITRON1_AG is one of them. CC A defective copy of a HELITRON-like element is also present CC in the D. melanogaster genome (AE002840, positions 9827-9264). CC HELITRON1_AG has all the basic hallmarks of HELITRONs: it is CC flanked by 5'-TC and CTAG-3', is inserted into the AT target CC site and encodes a 1792-aa HELITRON-like Rep/Helicase protein, CC called AGHEL1p.
CC AGHEL1p (positions 820-6195): CC MQPSVSDIVSQNLTPPEETPNEKKARLQRKRQALYRAKKRLPGAAPTLAAVQDDQQAVPSTLAGSSLSAA CC SAILQQQQRRNTDDELLNEIAGRYSTPPDETPPEKKARLARKRQALYRAKKRLAGAAPTVAAVQDDQQAV CC PSTSAGTSLSGAPAILQQQQLRRNIDDDLLNEIAGRYSTPPDETPPEKKARLARKRQALYNARKRMVQTP CC NVATVVNRAPAPASAPVPAPAPAPTPALVDPTAAVAIRVPIHQQLQTYVRHLRDHEADQQRQFIARRRMT CC GHLRVAHGNHESYSCPSLHTLAPRVPCNFCSALKWPGEPPSVCCNSGQVVLPPFPEPPAELRQLFDVPQF CC LLNIRRYNNAFAFTSIGASIRGNDPVRQDLRVGHGGIYNYRIQGALCHRIGSLAVLPGRPPCYAQLYFYD CC SSSEGQYDEMLNARAKAYGSELNREILAILQRVFSTHNPIAEMFKHAYERMSVRDDLRLCIHSRIPGIDQ CC RRYNAPTADEVGGVFVCDDTTGATEHTRDIVLQHRATGYLQPVFDTNQLADALQYPILFPRGETGWTYGM CC PKVQRRTRQQQQPSRPTSSSNGAAHIDENDDAAEDPNPELDGAERSANQITPREYAAYRIAWRENAQTLM CC HRAGRLFQQYCVDQYCKIEMQRLKYLREHQAQLRTEAYTGFMDLAGAEGTIADLHPDNLPNVAVEIPVVQ CC EPPVAPRRSIIDPPSNLSRTGTRVILGPSFVGSNRYMRAQYQDAMAIVRALGKPDLFITVTCNPKWPEIT CC QCLLPRQQAPDRPDVIVRVFRLKLKAILNDLTMGALGIEVARIHVIEFQKRGLPHAHILVILAEEDKPQT CC PADYDKIVSAELPNPATSSQLFETVQSCMMHGPCGAANPAAPCMKDGTCEKGFPKSFCEQTRSMDNGYPQ CC YRRRNNGRSVTVKGIELDNRYVVPYNPWFTHKYNCHINVEVCTSISSVKYLYKYVYKGHDRLSVTLAVGN CC DEIQQHIDARYLSPTDSCWRIMRFELQAKTHTVVTLPIHLENQQNVFFRANETVSCVLNRGNHTMLTRFF CC QLAAHDNFARALLYHDVPTYYRYAKPTANQRLPWQEPGTKHWIRRIRTGHKTVIGRMVSCSMQLMERYCL CC RLLLCYRKGPTSYEDLRTVNGSVCETFQQAAINEGLLEDDSEWDRALEEAATYRMPAQLRHFFALILSSG CC MPQNPRTLWESYASEMSEDFHHRNRDRYTTEESDLNKLLRDVEHFRALSDIDRYLRGTTPSKVLTSSPGM CC PQLSEYEHVQAHIMDDDDVNEFIIAERSYLITDLDATLATVHQLNDEQRMVYETVTAAIDRQLATAASQA CC NAGDQRLFFLDGPGGTGKSFLVEKILAHVRRCGEIALATAASGIAALLLTGGKTVHSTFKLPLDLNNHST CC CSITVQSKRAEMLRQTALIVWDEASMSSRFALEAVDRTLQDITGVQLPFGGKVVLLSGDFRQILPIVPKG CC TDAQIINECIKKSTLWPLFRSLQLRDNMRVRTAPNANQASELRDFANLLLRIGEGRHDTFAGLDPSLAKI CC PHDMIVPHTANPTNDLNTLIDKIYPDMQRHFQHPSFFSDRAILSPLNVDVASVNNLVLDRIPGPEQEYRS CC VDTLVNPEEHEHLQLPSEYLNTLNVSGIPVHRLRLKRFAPVLLLRNLNSDMGLCNGTRLQIVGLKRNCLH CC AKILTGTRRGEDVLLPRIFCDSNDKGHPFQIRRKQFPVQVCFAMTINKSQGQSLHHLGLYLPQDVFAHGQ CC LYVALSRVTSRANIAVLIPNPKRADEEGVSTSNIVYREVVDR. 
XX SQ Sequence 6666 BP; 1706 A; 1761 C; 1639 G; 1560 T; 0 other; tctatatata aattttcgtt aacgctgacg ctcgatgtca cactttgcgt tacgctgttt 60 ctatctcgcc cgttctgcct tcgctgccgt ataacaccat cacgctcgca cgaccgcacg 120 acgaataccc ctaccatcgt gagcacgacc aacgctgcag gggtgctcaa ccgtttcctc 180 gctttccgaa ctcgaccgag cagtgcgcac actatcgact cgcccttgcg agcccctgcc 240 agcgcatgcg cgtacgagca agcgagcgag cgagcgaacg atcgagttgg ctacgcgata 300 tgatatattt tttgacaaaa cgaatatcgt ttattaatac ggttataggt gcgatttaca 360 cttttttaag ttgatttcca gtttttttag tttagtttga gtattaggtt aatttcttta 420 ggtgcgcatg gtgaagtata gtgtagtgta gtttagaaaa gtgttgtgta ttgtagtgta 480 gtttagtacg cagggcagtg tagatttagt gtaaggagat tattttcgta acgtacgttt 540 cgtgtacgtt ttgtggttca atttgttgtt cctgcgtgga acgtgtgttc ctgtgttcct 600 gtgttcctgt gtttctgtgt tccggcgttt tcttgcaatt ctgtttttgc tacgatatgg 660 agcataatga ggaaagtgtg ggcaacggtt gccctatgat tgacatcatg tcaggtataa 720 tatttcctac tttcttttat tcacagcact tatctaaaat tctttttttt tgttttacaa 780 ttttcaacag ccccagttac agaaccctat ccacaaacca tgcagccctc cgtaagtgat 840 atcgtctcac agaatctcac tccaccggaa gaaactccga acgaaaagaa agcaaggctt 900 caaaggaagc gacaggcgtt gtacagggcc aagaagcgcc tcccgggtgc agctcctacg 960 ctggcagctg tacaggacga ccaacaggct gtaccttcaa ctttggccgg gtcatcctta 1020 tcagcagctt ctgcaatcct gcaacaacaa caacgaagga atacggacga cgaattgctc 1080 aatgaaatcg ctggacgata ctccacccca ccggatgaaa caccacccga gaagaaggca 1140 aggctcgcca gaaagcgaca agcgctgtac agggccaaga agcgcctcgc gggagcagct 1200 cctacggtgg cagctgtaca ggacgaccaa caggctgtac cgtctacttc ggccgggaca 1260 tccttatcag gagctccagc aatcttgcaa caacaacaac tacgaaggaa tattgacgat 1320 gatttgctca atgaaatcgc tggacggtac tccaccccac cggatgaaac accacccgag 1380 aagaaggcaa ggctcgccag aaagcgacag gcactgtaca atgccaggaa gcgcatggtg 1440 caaactccga atgtcgcaac tgtggtcaac cgcgccccag caccagcatc agcaccagta 1500 ccagcaccag caccagcacc aacaccagca ttggtagacc caactgcagc agttgctatc 1560 cgagtgccaa tacatcagca gctacaaaca tacgtgcgcc atcttcgcga ccatgaagca 1620 gatcagcagc gacaattcat agcccggcgt 
cgtatgaccg gccatctgag agtagcacat 1680 gggaaccatg agtcgtacag ttgtccctcg cttcatacgc ttgcaccacg cgtaccttgc 1740 aatttctgtt ccgctctcaa gtggcctgga gagccgccca gtgtgtgctg caacagtgga 1800 caagttgtgc ttccaccgtt tcccgaaccg ccagcagagt tgcgccaact atttgacgtg 1860 ccgcaatttc tgctcaacat ccggcgatac aataatgcgt ttgccttcac ctctatcggt 1920 gcctcaatac gaggaaatga cccagtacgc caggatctac gagtcggtca cggtggaatt 1980 tacaactatc gcatacaagg tgcgctttgc catcgaatcg gttcgctcgc tgtgcttcca 2040 ggacgtccac catgctacgc acagctgtac ttttatgatt caagctcgga gggtcagtac 2100 gacgaaatgc tgaacgctcg ggccaaagca tatggtagtg agctgaatcg tgaaatattg 2160 gccattttgc aacgtgtttt ttcgactcac aatccgattg cagagatgtt taagcatgcg 2220 tacgaacgga tgtctgttcg agatgatctg cggctttgca tccactcgcg tataccaggc 2280 atcgatcaac ggcgttacaa tgcaccgact gcagatgaag ttggtggcgt gttcgtgtgc 2340 gatgatacta ccggagccac ggaacataca cgcgatattg tgctgcagca tcgtgccaca 2400 ggatacttgc agcctgtgtt tgataccaac cagttggcag atgcactcca atacccgatt 2460 ctgtttccgc ggggtgaaac cggttggact tacggtatgc caaaggtaca gcgccgaacc 2520 agacagcagc aacaaccaag cagaccaaca agttcgagca acggagcggc gcacatagat 2580 gaaaatgatg atgctgctga ggatcctaat cctgagttgg atggagcaga gcgttcagca 2640 aatcaaataa cgccacgcga atatgccgca tatcgcatag cctggcggga aaacgctcaa 2700 acgcttatgc atcgtgccgg acgattgttt cagcagtact gtgtagacca gtactgtaag 2760 atagagatgc agcggttgaa atatttgcga gagcaccagg cacaacttcg cactgaggcg 2820 tacacgggct ttatggatct tgccggcgct gaaggtacca tcgcggatct tcatcccgat 2880 aatttgccta atgtggctgt ggaaatccct gtagtgcaag aaccacctgt tgcaccccgt 2940 cgttccatca tcgatccgcc ctcgaacctg tcgcgtacgg gcacgcgtgt tattctcgga 3000 ccatcatttg tcggtagcaa tcgatatatg cgagcgcaat atcaagacgc aatggctatt 3060 gttcgagctt tgggtaaacc ggacctgttt atcaccgtca cgtgcaatcc gaagtggccg 3120 gaaatcacac agtgcctact tccacgtcag caggccccgg atcgacccga tgttatcgta 3180 cgtgtctttc ggttaaagct gaaagccata ctgaatgacc taaccatggg agctctcggg 3240 attgaggtag ctcgcattca cgtgatcgag ttccaaaaac gtggccttcc ccatgcacac 3300 atactcgtga ttctcgccga ggaagacaaa ccgcagaccc 
cggcagacta cgacaagatc 3360 gtgtctgccg aacttcctaa tcctgcaacg tcgtcgcaac tgttcgagac ggtacagagc 3420 tgtatgatgc atggaccgtg tggggctgcc aatcctgccg caccttgcat gaaggatggt 3480 acatgcgaaa aagggttccc aaagtcattc tgcgaacaga cgcgcagcat ggataatggc 3540 tatccacagt accgtcgtcg taacaatggg cgcagtgtga cggtgaaagg aatcgagctg 3600 gacaaccggt acgtcgtgcc ctacaaccca tggttcacgc ataagtacaa ttgccacatc 3660 aacgttgagg tgtgtacttc gatcagcagc gtgaaatatt tgtacaaata cgtgtacaag 3720 gggcacgacc gtctgagcgt taccctggcg gtaggcaacg atgaaattca gcagcatatc 3780 gatgcacgct acctttcacc gacggacagc tgctggcgaa taatgcgctt tgagttgcaa 3840 gcaaaaacgc acaccgttgt tacactgcct atccatctgg aaaatcaaca gaacgtgttt 3900 ttccgggcaa acgaaaccgt ttcgtgcgtg ttaaaccgtg gtaaccatac catgttgaca 3960 cgattcttcc agctggcggc acatgacaac tttgccagag cgctgctgta ccacgatgtc 4020 ccgacgtact accggtacgc gaaaccaact gcgaaccagc gactaccgtg gcaggaaccg 4080 ggaacaaagc actggatccg tcgcatacgt accgggcaca agacagtgat cggccgaatg 4140 gtgtcctgta gcatgcagct tatggaacgc tactgcttgc ggttgcttct ttgctaccgc 4200 aagggtccaa catcgtacga ggatctgcgt actgttaacg gttcggtgtg cgaaacgttt 4260 cagcaggctg ccatcaacga agggctgctt gaggatgatt ccgagtggga tcgtgctcta 4320 gaagaagcgg ccacttatcg tatgcccgcc cagttgcgcc atttcttcgc actcatcctt 4380 tcgtctggga tgccacaaaa cccgcgcacc ctgtgggaaa gctatgcgag tgaaatgagt 4440 gaagattttc accaccgcaa tcgcgaccgg tacacgacgg aggaatccga cctgaataag 4500 ctgctacgtg atgttgaaca cttccgggcg ctgagtgaca tcgatcgcta cctgcgcggt 4560 acgacgccat cgaaggtttt gaccagttcc ccaggaatgc ctcagctttc ggagtatgaa 4620 cacgttcagg cacacatcat ggacgatgat gatgtcaacg agttcattat cgctgaacga 4680 tcgtatctga tcacggatct cgatgctacc cttgccaccg tacatcagct caacgacgaa 4740 caacgcatgg tgtacgaaac ggtcacggcg gcaatcgatc gtcaattagc gacggcggca 4800 agccaagcga acgctggtga ccaacggtta ttcttcctgg atggccctgg tggaacgggt 4860 aaatcttttt tggtagaaaa gatactggca cacgtccgtc gctgcggaga aattgcgctc 4920 gcaactgcag caagcggcat agcagcactg ttgcttacag gagggaaaac agtacactcc 4980 acgtttaagt tgccgctgga cttaaacaat cattccacct gtagcattac 
ggttcagtcg 5040 aaacgggccg aaatgcttcg acaaacagca ctgatcgttt gggatgaggc gtcgatgagc 5100 agtcggtttg ctctcgaagc agtcgatcgg accctgcagg acataacggg tgtgcagctt 5160 cctttcggcg gtaaggtggt gctgctgtcc ggtgactttc ggcaaatttt accgatcgta 5220 ccgaagggca cggatgcaca aatcatcaac gagtgcatca agaagagcac attatggccc 5280 ctgtttaggt cgctacaatt gcgcgataac atgcgggtac gcacggcacc aaacgcgaac 5340 caagccagtg agttgcgaga ttttgccaac cttctgcttc gtatcggtga aggacggcac 5400 gatacgtttg caggactgga tccatcgttg gcaaaaatac cgcacgatat gattgtgccc 5460 catactgcga atccgacaaa cgaccttaac accctgatcg acaaaatcta cccggacatg 5520 caacggcact tccaacatcc gtcattcttt tcggatcggg ctattctgtc gccgcttaac 5580 gtggatgttg ccagcgtgaa caacctggtc ctagaccgaa ttcctggacc ggaacaggag 5640 taccgttcgg tcgatacatt ggtcaacccg gaggaacacg agcatctgca acttccttcc 5700 gaatacttga acacactcaa tgtcagcggc atcccagtgc atcgcttgcg gctgaaacga 5760 tttgcaccag tacttttatt gcgcaatctt aattccgaca tgggattgtg taacggtacg 5820 cgtttgcaaa ttgtaggcct aaagcgaaac tgtttacacg ctaaaattct gacaggcacg 5880 cggagaggcg aagacgtcct gcttccacgg atcttctgtg acagcaacga taagggtcac 5940 ccgttccaga tccgccgtaa acagtttccg gtgcaagtgt gctttgcgat gaccatcaac 6000 aagtcgcaag ggcaatcgct tcaccatttg ggcctatatc tgccgcaaga tgttttcgcc 6060 catggccagc tatacgttgc actctcgcgg gtgacatcac gagcgaacat tgctgtgctg 6120 ataccgaacc cgaaacgcgc tgacgaggaa ggtgtctcca caagcaacat cgtctaccga 6180 gaggtcgttg acagatgatt cttctactga accagagtgt gacgtaatta ctgaaaacag 6240 aaaattagac attacataaa tataaatatc cattctttca gtttgaaccg gctattcact 6300 aacaatacat ctaccagaac ctagaacaag acggcagcaa aatacaaaca atacatgtgt 6360 acatcttcct tggctatcct tactgcattt tcgtttcatg cagcgcctgc aaccaaaacg 6420 atcacgtaaa cgacgggaca atcaattgaa cttccatgct ggaaggtgga gctttacttc 6480 atgcaccacg ttccatgcgc acctgacttt gactgaacac cccacatttc caggctctcc 6540 agcaatcttt ccttggaatg ccatgagact tccaatttcc acgaggaacg aggtggcatg 6600 ctccagttgt ttctgtgcag cctggagttt ccgtgcgtag cactgtaacc gggcctctta 6660 agctag 6666 // ID T1_AG repbase; DNA; ANG; 4634 BP. 
XX AC M93689; XX DT 22-DEC-1995 (Rel. 2, Created) DT 25-MAR-1997 (Rel. 3, Last updated, Version 2) XX DE Anopheles gambiae T1 retroposon. XX KW reverse transcriptase; retroposon; Transpsoson; AGT1; T1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4634 RA Besansky J.N.; RT "A retrotransposable element from the mosquito Anopheles RT gambiae."; RL Mol. Cell. Biol 10, 863-871 (1990). XX DR GenBank; M93689; Positions 1 4634. XX SQ Sequence 4634 BP; 1059 A; 1101 C; 880 G; 1593 T; 1 other; agagagagat cgactgtcaa accgagtggt tgtgccttcc ttgtgctcct gatttttgat 60 gctgttttgc cgtgttactg cctgattttc aacatttgca acattggtgc tgctgagttg 120 cttgtactgg ctagtgcgtg ccttcgattg ttatcaagtt gcacttgaat tcttgcacct 180 agtctcctgt tatttcgtca gttttgccga gatattgttt ggcttcgttc ctgctcctgt 240 cactcgtcag tttgtgctgt gtcattcgtt atgcagtgct caacctgcaa tgcacccacc 300 gatagtgcaa attcggtgtc ctgcgccggt gtgtgtggct ccaagcatca tacccattgc 360 acgggtttgt cccgtgattc tactcgagag cttgggcgga ataatcaatt gttgtggttg 420 tgcaaaaatt gtaacgagtt tcgcaatggc acaaactcac ttctcacaag tgagatagca 480 gccctactcg agttggtgaa agcagaaatt ctcaccacga ttgactcatc tctctcttct 540 cttagatcgg ctatcaagag cgatttgctt gctgagatcc tcgctctcgc tgataagcta 600 acacccgtat tagctaagcc gtctgtttct cagccatcgc gaacgcacac gtccactaat 660 gcatcgtcac tcaacgccac taataccaga acgactaaaa cagcatccac tcgccgtaca 720 tttaccaact caatggagct cactgcagat atccaacaag cagcgaacga taccaacact 780 gtggaagctt ctgatagctg caaccactac actcatcgta ctaaggtgac tagtgatatt 840 agtgctgggc catgccgaac aaatacaaaa tcatcttctg atcctgtttt gaaccatgat 900 accacgaaca ctggcatagc agaaaaagta tggttatact tcacgaacat caaatcgcat 960 gtctccgctg atgatatgcg tgtgtggctt aaagctgtgc tgccaaccga caacatagat 1020 gtttaccgtc tcacgaaaaa gggtgcaaac cttgatttga tgtcctttat atcgtttaaa 1080 gtgagtattc ctaaatcact taaggatctg gcgcttcaat ctactatttg gccagtttca 1140 cttactgttc 
gtgagtttgt tgatcgtggc ctaccaaagc aacgtataca tgaaagggcc 1200 cgatttgacc cttctgcgct tacttcgcat cgttcaagca gtgcaaattg ctcttcagct 1260 gcgccaaaaa gcaccgctca tccggatcat tttttggatc atcgatcgcc atccccacag 1320 cgcgggaatc aatcactatc ccagatgacc gagatcctag aggctatcca accggagttt 1380 cctcccacac cccctcagtt atcaccgggg gtggggcttc aatcacagaa caatctcagc 1440 aacacgaatc gctcaccaca gatcagcccg tttgccaaac ggatagctca caattaatag 1500 acccttcgtg atctattacc aaaatgttcg aggccttcgc accaaatata atgaattgcg 1560 cctttctgcg aatgaatcag ggtttgaaat gcttgccctt actgaaacct ggttaaatga 1620 atcgattcca tccaatatgg tcctggatag tgattcttac aatatatacc gttgcgatcg 1680 cagcaggtta aacaatgaac gatcgcgtgg gggtggtgtg ctgcttgcat gttccagtcg 1740 ttatccgtct gtggcactta acatgaatca acctacgctt gaagctttat gtattcgtgt 1800 ttcttttcct aagtttcgtc tttatgtggg gattgtttat gtgccaccgt atttgagcag 1860 cgaccgcaac tatttcgaat ccctttctgc tttcatcagn gatgcataca tgcatatgaa 1920 accgaatgat catcttatcc ttcttggcga cttcaatcaa ccggcgttag ggtggtcgcc 1980 tgcagccgca gtaaggtcag attcatcttt acctatgaga cattatgtgc cacatatctc 2040 tttgaattca tccagttcct gctttttgga tgtgttaaat ttgcatgaac tctatcagct 2100 gaacggggtg cataaccatt caaatcatta tctggacctg gtgctctcta actctgctgc 2160 tgctgcttgt tcttctgtgt atcctgcttc gtcactgctc ctgccccagg atgcccatca 2220 tcctgctctg gaaattgcgt taccgtcttc tttatttagg gctagtaggg ttaggaatga 2280 attgccttct gctcctaatt cattgagtgt tcgttataat tttcgtctta cagactatcg 2340 taaacttaat tctattctat ctcgtgccga ctggtctttt ttttatcaat gtacatcggt 2400 cgacgaggct gtccaatcgt ttaatgcttt gttaacctct gcactccttt catgtacacc 2460 tatttttcgt tcccctccta atcctccctg gtccaatcgt actcttcgca acctgaaaaa 2520 ggatagaatg aaatatctta ggaggtatcg tctgaaccga tctgctttca actttcgttt 2580 atttaagtac gctgcctctg cgcatcgact atacaacagg gctcgttttg aggcctattc 2640 gagtagactg caatcgcgtt tccgttctga tccagcatcc ttctggcaat ttgttaggat 2700 tcgaagaggg tgcaatacgt tacctaatga aatggtactt gattctcgaa ctgcctctac 2760 gcctgttgag atctgtgagc tattctctgc acatttttcc caaatgtttg agccaccggt 2820 tagtgaccct aaccttattg 
agggtgggct actctacacg ccagagaact taattaatct 2880 ctccgatatt tcggttagct ctgaaacagt tgtacaggtg ttatttgggt tgaaacgttc 2940 ttttactcct ggtccagatg gcattcctgc ctcagtttta ataaactgta aggacgtgct 3000 tgctccacac cttgctaaaa ttttcaacct ttcactttct ctcggggtct ttcctgctct 3060 ttggaaatcc tgttggcttt ttccggtaca caaaaaggga tgccgtagca ttgtctctaa 3120 ttatcgtggg ataactcaaa catgtgccac agccaaaact tttgagctat gtatctttcc 3180 aaccatactt catagttgta gttccgctat tagccctaaa cagcatgggt ttatgcctgg 3240 taggtctact tctactaatc tcatgtcttt tgttaccaat attttcagat cttttgaggc 3300 aggtacccaa cttgatgcaa tatacactga ctttcatgct gcatttgata gtttgcccca 3360 ctctttacta ttagctaaac tatctaaact tggttttggt gatggcatta ttagctggct 3420 gtcctcatac ttaagtaatc gatcttgcag ggttaaaacc gggtcgtact tatctgagga 3480 gtttttttgt acgtcaggtg tccctcaggg ttgtgtgcta agtccacttc tgttttcttt 3540 gttcatcaat gatgtctgta atgttttacc tcctgatggt catctccttt atgcggatga 3600 tatcaaaatc tttttacctg tgtcctcttc ttctgattgt atgagtcttc agcattacct 3660 taatgcattt gttcattggt gttcatccaa cttacttcgc ttgtgccctg ataaatgttc 3720 tgttatttct ttctctcact ctctttctcc tatttcattt aactatactc tctctaactc 3780 gtctctctct cgtgttttgt ccatccgtga ccttggtatt atactcgaca gtcgtcttaa 3840 ctttaaactg cagcttgatg aggttctact aaaagctaat cgaactcttg ggtttatttt 3900 acgttttacc tctattttta gagatcaaag cttcttaaga aacctttatt atgctctggt 3960 aaggcctctt cttgaatatg ctagcatcat ctggaatcct cctactattg atggctgttc 4020 gagaattgaa agcattcagc gcctttttac cagggttgct tttcgtcgtt tgttcggtgc 4080 tgcctcacta cctccctatg aaacgcgatt gcagttattc aatcttcact ctttaagctt 4140 ccgccgccaa gtgtctcagg catgttttat tggtggctta ttactttctg atactgatgc 4200 tcctgattta ctctcgtcca tctcgttgta tgttccctct cgttcccttc gtcctcgtga 4260 tcctctgtca attgaaacac gtcatactct ttatactttc aatgatccta ttctatcctg 4320 tttcaggttg tttaaccact tttactatct ctttgatttc gactcctctc tcaactcttt 4380 ccgtaaccgt attttttctt ctaattctct ttaattattc ttctaagttt cattaagttt 4440 tgatagtctc tacgctctac ctatgttttt ttttctttaa tttttttgct aggtctagac 4500 tagtttagtt aggcttagtt tttcatgaat 
tattttgttt atttgttagg gtttatttag 4560 ttttaagtct gccttattta gccttgatgg cggatattgt attaataaat gaaatgaaat 4620 gaaatgaaat gaaa 4634 // ID AgaP8 repbase; DNA; ANG; 7508 BP. XX AC DQ301497; XX DT 22-AUG-2006 (Rel. 13.08, Created) DT 31-JUL-2008 (Rel. 13.08, Last updated, Version 1) XX DE Anopheles gambiae str. PEST clone AgaP8 transposon P-like, DE complete sequence. XX KW P; DNA transposon; Transposable Element; AgaP8. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-7508 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "P elements and MITE relatives in the whole genome sequence of RT Anopheles gambiae."; RL BMC Genomics 7(1), 214-214 (2006). XX RN [2] RP 1-7508 RA Quesneville H., Nouaud D. and Anxolabehere D.; RT "Direct Submission."; RL Direct Submission to Genbank (30-NOV-2004)Dynamique du Genome et RL Evolution, Institut Jacques Monod - CNRS - Universites Paris 6 RL Paris 7, 2 place Jussieu, Paris 75252, France. XX DR EMBL/GenBank/DDBJ; DQ301497; Positions 1 7508. 
XX SQ Sequence 7508 BP; 2063 A; 1583 C; 1513 G; 2119 T; 230 other; caaggttatt agactctata cagattatga caaaagccct ttttggccga tctccaataa 60 aaaaatgaat gaaaaaattt tgacagacaa attggaaagg tttgtttaca atttgaacgg 120 tttgtttaca atcagtttgc aatcggtcag tcgcaaagcc atcgttattt ccagagtgta 180 tttcaacaac ctttatcgtt ttcaacatat tataattaat tatttattat gtcttcaaac 240 tataagtgct ctgtagcctc ttgtaaaaac aataggtgca atgtgaagaa gatgggagct 300 aaaatttact tccataaatt cccagagtgt ttagcaacga agcaaaaatg gattgtattt 360 tgtggaaagg ataatcctta gatgccttct ccaaataatg ttgtatgctc ggaacatttt 420 gtaccatctg attatcaatt aagaaacgtg caggaaatta agcaggggtc taactggtta 480 aaacctcaag gtaagattct ataaattctg atgattattc tatttacgat ttttttaact 540 ttttgttgtg tttacagcca ttccgtcagt tttatcgccg ttagaaaatg atagtgatat 600 tgcattattg cactttgaca acaataccaa aaaccaaaac acaaacgaag ggagaatcgg 660 ttgacccctg ccggaaggta agttgttacc aaattatgtg ttagtattac atactttcat 720 ctatactaat atgcaaatca atataatttc agctggttct tcgttttgtg atgaacatat 780 cattgaacaa agtgcaggca actttactat tgaaataaat gaaaaagatg tgtggttcaa 840 gtgtaaagac tatgaagaga aaatatgtga tctagaatct aaacttaaat cagtagaaga 900 caggaacgag cggttagctg atgtaaataa tcagctaagc gaaaaattga aaatatttta 960 tggaaaagaa aaagctcata ttaagcaaat acaaacactt gaaaaagcaa ataacaaaat 1020 taaggaagaa tggccacgaa acaaatttat agaaaatata aaagaggcac tgaaaggcat 1080 tttatcaagc aaccaaattg atctcatttt agaaatcaag aaaacagtaa gatggtcaat 1140 gcaagaactt tctgcagcat ttacattaaa atatttcagt caaagggctt acaaatatat 1200 aattttttat ttaaaaattc ctctgccatc cataagaaca ttacaacgtt atgcaagcag 1260 gattgacctt aaacaaggca ttctagatga tattttatca ttcattgggt cctttgctta 1320 aacattgagt actatggata gagaatgtgt attgccgttc gatgagatga aggtgtctcg 1380 tatattggaa tacgatccat cggcggatga agtggtaggt ccttttgact tttgacaaat 1440 agtaatgatg cgcgggcttt ttaaacaatg gaaacaacct atttttattg cttttggcca 1500 aaaaatcacc aaagatattt tgattgacat aataacaagg ttcagtgata agatgataaa 1560 tgttgtagcg atagttagcg ataattgcca agcaaacata aaatgttgga aggagttagg 1620 tgctagggat gacatcgaaa aaccatattt 
tttacatcct aaaacacaaa acaatgtata 1680 tgtagtccca gatactccgc atcttctaaa attgttgaag aactggttat tggatcatgg 1740 gtttgaacat aacggcaaac acatagaaac cagcaacctt ttacgtatgg tagctcaaag 1800 aatggagtcc gaaatgactc ctttcccaac cgacaatttc aaatattttc gacgtcgggt 1860 gtacgtcgtg tgaacgttca cccgacgtcg ggtagtcgtc gcagttaagt caatatacga 1920 cgaagcgaaa ttttttgaaa cgaaaaacgc cacaattata gcagattttc tggagtattt 1980 cgctcgtggt ttttgtctag gacaacggcc ataaaaaggt ggttcaaaaa gtcgataccc 2040 tctctccccc tcactcttcc tctatgggta ccgtacgatc catctcgcta ttctcacatg 2100 gtagcgggag tgatatcgct tatctgctgt cttctttatc cgttctccca gatcatattc 2160 gttttgagtg acgttgcttc cccacctccc tcattcccac ctttcccttc cagcagttat 2220 tggcgaatgc ttgaccttgc cttgtgtgct tgacttggtt tttttttttc cttcccgttt 2280 ttttttggct gcttgatgca ttgcgcgttg gctagagccg tcgttcgaat ctggatttga 2340 tcagcgtgca agttgtcgtg tggatgtatg aatagcttta ccgttgaagc tatttgccgg 2400 gtcgatcgga tgcatgctgc gtacgggttt tatattgttt gttagtgcac ttagtgtcag 2460 ttagttgcac ttgttaaaaa aatcaatcaa aagacgatca tcaatgcact agcacacaca 2520 acatatcatg tgaagccaac caatgcttaa cattgtcaaa tattcaaatg agggatgatg 2580 tctttgaaca catttttctc ttcatttttt tattgttttt attcaacaat gcacaattct 2640 tccaattctc ccccccctct cctctccttc cttcccacac aatcgggaat acaggtaagg 2700 atttaacaac catattgtac cgtctttttc ctaattaacg ggggctcctc tccggaagag 2760 aaaaaggcac gtctcccctg cctctcgctc aatgctgtaa gcgagcaggg cccgcctcgg 2820 cttaggtggc gcacgcttct ctcgctcgcg ttgcttagct tgcatcttct ctcgcttttt 2880 ttggaaacgc atcatcgttg cggatggtgc tgcgcggcca tatcgctatc tagagggtgg 2940 catagatagt tgtgttctag cccgtcgctg cactcgccat tgacgcagca tgcgtaaggg 3000 acgtttatgc tgcatgcatt gtcgatagcg ccgcggctgc ggtcggggat tggctgtggc 3060 gaccggcccc ccgggagcct taaatgccgg tggtgcagtc ggagccacgc gccaagctgc 3120 tccacgtgtc cccgcattct atggatggct tagctggttt cgggccgccc gttatcgtgc 3180 tacccgaaca caacataaca acatacccgt ccaaggttca agcccggtat ggtccagtta 3240 gtaacccgga aaaccagtta gtaacaggta aagtcaactc gcacgaatta acaacatgct 3300 cgttttcggt tcaatcctcg tatggactat ctctcgtagc 
aagggctaac tatccgaaat 3360 ttctttccaa agtgtggctc gaaggaagaa catcgaccta caaaaaatgc atgttgctcc 3420 cttcctgtat gtcctacccg ttcctgtatg ctcccttcag ctcgcttcgc acttgtccct 3480 cattgtttag aacttatcgt cgctccggaa caatgcgcgc tgctgcacgc aacccacgcc 3540 ttcatctttt catctcacct cttcgattat cacccttccc gctcctcctt ccgctcttgt 3600 ttgccccctt tttccttact cccttccccc ttctttcctt tttattcgtg ttgctgttta 3660 ataaggctag catagtttag gatgtaaatg taaactaaat gtaaaatgta aactgtattt 3720 gccgtgctta aggcaaacaa tagtaaggtg cgggcagggg cggttcaacc agtacgcaat 3780 ctaagcggcc gcgtggggcc ccgaggagag gaacccatcc ccccactcgg aagtaaaagt 3840 gcttcgcaat ccttctctgg tttagaattc ctcatacaac ttctcgctcg acacatccga 3900 atggtcaact ttcttgtgtc cggttagggc ctccactcgt gctaatacgc cactggatgc 3960 gagttgacca accgggtcgg aaccatccgg atcatttggg aggtgcgatt gctttccaaa 4020 tcactgctcc cacgagtcgg aatggtacct gccgaactac ctggcgcccc gccgactaca 4080 tatcagcggt ccttggaacc acgtagtcga accacagccc gtggccaata agaccccaca 4140 gctcccgtat aaattttatt cttcttcttt gacgtaacga acaatgctgt ggacagggct 4200 aaaacttctt caccttacta aaatcaagct ataggatagt cggttcttgc tatcggggga 4260 atggtccgga tgagaatcga tctcgtatgg cctcatagga tccggtctct tagtgatgat 4320 gacgcagtgg atcatcaatg agtgacgatt aaaaacattc tgccgtttgg tttaaagggt 4380 atacaatatg cggcatttta ttgttgctct tccgcataaa aagtttacca ccatgtgtcc 4440 caaatttaca cacatcataa tactgtgttt gtggttgtgc taatcacaaa acgctttctg 4500 tatagcgtgg cacataaacg tgcacgaggt ggtacatcat ttggtgtgcc agagccattg 4560 cttctaacaa ttttccttta actaactaac ggatcatgag gatactgatc gttcgcagcc 4620 aatgccacac acgggaaaaa caaggagaaa gagaattccc tcacgtggga gaaagagcag 4680 caacaaccaa accgtagagc cgatgggatt gtgagaatgc gacgggagat catcgtctgt 4740 gagcgagagt gaaacagaag tcagaaccat aaacttccca tccataatct gtgaccataa 4800 tttgggtgtg tgtcttcgct ccatacaaat cgtcaaaaaa acattttcgc ttggtcggaa 4860 atcgcggcga aaaccgcctt ttgcggatag ggtttgttca agcttacaat ggcacacatc 4920 tatatgaccc cccaacagcg tcaaaatgtt cgtcgagctg cagaattgct gtctcgcact 4980 accgctgtag cttttcgaac ttattaccct tacaatgaaa atgcaaaagt 
tttagctgat 5040 tttattgaaa aggttgattt atggtttagt gtgtcaaatt cttgacgact aaacgacgag 5100 cgagtgacga tcaaattggc gccttgagag atatgtttga aactgtatca actatgacca 5160 taccaggaaa gtcaaatttg caagtattcc aacgatccat tataatgcaa atcacgtctt 5220 tgcaaatggt ttttgcagat aggaaaaaaa aacagtcttt taggccgtcc acaaggacag 5280 aggaggcgtg gtaggcccaa attgaggtgg caagatggcg tggaggcgtc cgccattaag 5340 gccgggataa cggactggca gacgaaggcg cgagaccgtg agcggtttcg gacactcctg 5400 aggcaggcca agaccgcaaa gcggttgtag cgccggataa gtaagtaagt aagaaaaaaa 5460 aacatgatat acaattcatc tgcacacata aagtaatcaa tatttaagta tttagagttt 5520 tgtatttaat aaagtttatt tttttatttt tcagctaaac caagacgtat tagagaattt 5580 attttcgcaa ataaggcaaa tgggaggagn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5640 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5700 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5760 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5820 nnnnnnnnnn nnnnnnnnna cttttcgggt caacccgaac gactccgggt cgcttacgga 5880 tggctccgac ccgataggtt cgggtcgacc cgcccatcac tattaacgat gcatttattt 5940 tgaatggtta ctcgcaacat ttttataatt tagcttcatg caaatagtaa aattagctat 6000 gttcaagtag actggaccaa gtgtaacaag acatcatgta ccgggtttct cttagcatgt 6060 ttgatgtgag atgtgtaacc agaccaacca atcctcttgt tgtcctacaa aaaagaataa 6120 aatattaatt ccataatcac aattgaatct gcacacacgc gcagagctcg catatcaaaa 6180 tataatatca atcaacaatg tgaagttagt ggcggcttca acgataatcg gactagttac 6240 cgcttaaggc acccggcacc ctaaatcctt tactgtgcga agcagcatat aacacttctt 6300 ggtcgcttcg tatagcatgt ttatttcgtg cctcatgagc taagcgcgat tttttagcca 6360 aatgattgaa accaacaagc atctgattga gcggtcagca cggttacatc aatacgtcta 6420 gatgacggat tgacttgtaa acatagggtt gtaatgcaca cctgataccc acctgattcg 6480 ccatcaagcg ctgaactggc cagagtcagt tatattatca gtcacctaat gctgagcgca 6540 ccgtaggatc cactctctgg tctgcaaatc ggtccatcaa ccgacctgct gatgtcattg 6600 tatgattagt ccatcgtccg atcagccggt tcactacccg cccggtccgg taaccatagt 6660 ggctttccgc acttgctcct gcactcagtc gttgctcctc agataaccat gttatgttat 
6720 agaattgacg ttaatcgttc gatcgtgggc cattcattga aatcctacat tcactttgta 6780 gaaccagcat acgcgtttga aaactttgcc cgctgcatta tttctcacag tcgccttagc 6840 agctgattgt tgtgaaagaa gaattacggc aaagctgtaa ggatcgacca tataccgggt 6900 ctctatgatt ggctatgctg agctaaaata taaaattata attatgttta cgattgcccg 6960 gagtaagttt ttaaaaaata ttactgggga tctttatttg cgaactgcta aactaagact 7020 aaagctaact tctgcattcg gtgctattta gctaggctgg tgttttcggt gctgccaggt 7080 ttgccggtca cgttgtcagc cacgtttatg atgatcatgt tgcctcggtc cgtctgacca 7140 taagacacca cacctggtcg tgggaacctg cttccagtgg agttcccgtt ggaacctgac 7200 cccgcgcgtg gagtaacatc gatggctcgc ttgcgaccgt tttcttactc agcacctgta 7260 tcattgcgag tagcccatgt gtacatatct ttatattaac tgtgggatat gttttagttt 7320 attgaaaaag tacgcaaaag ttttgtcttc cgttaactta cctctgttgt tgatgctaag 7380 catcaactga ttggcagcga ttttgacaga caaattgtaa gatattcaaa aagatattca 7440 ccaagatatt caaaatggtg agctgttcaa gagctctgtc gtaacctgta tagagtctaa 7500 taaccttg 7508 // ID GYPSY2-LTR_AG repbase; DNA; ANG; 368 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE GYPSY2-LTR_AG is an LTR of the GYPSY2_AG LTR retrotransposon - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW GYPSY2-I_AG; GYPSY2-LTR_AG; GYPSY2_AG; Gypsy clade. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-368 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "GYPSY2_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 76-76 (2003). XX DR [1] (Consensus) XX CC GYPSY2-LTR is a long terminal repeat of GYPSY2_AG (it internal CC portion is deposited as GYPSY2-I_AG). 
XX SQ Sequence 368 BP; 97 A; 115 C; 81 G; 75 T; 0 other; tgtagcgacc agaccgccat ctggcgtgag aatcgtgagc gatcgtgaca tccagggaca 60 aagacacgga tctccgtgga gcgcactcac caggcacacg aatgtgtcaa agtgacgtgc 120 gccgcgtgtc cagcgctaca cttcctgctg tcagcgagca cattctctct cttgcgaccc 180 aacctcgaaa gcgaacagac ctcttccttc gctccgcgct cgaaacttcc tcaacgtgaa 240 accctctcgc acgaacgtgc gcaaccgttc ataatatagt gtaaaataaa gttccgtatt 300 acctactcac gaaacccaac gcgttcgcga cataaaaaat tagggaccac ttttgtggcg 360 cccctaaa 368 // ID GYPSY59-I_AG repbase; DNA; ANG; 4552 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY59-I_AG is an internal portion of retrotransposon GYPSY59_AG DE - a consensus sequence. XX KW LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY59-I_AG; GYPSY59-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY59_AG; mag lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4552 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY59_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 157-157 (2004). XX DR [1] (Consensus) XX CC GYPSY59_AG is a family of gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains, is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, CC GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, GYPSY64_AG, CC GYPSY65_AG, GYPSY66_AG, GYPSY67_AG, GYPSY68_AG and GYPSY69_AG CC are other members of this same lineage in Anopheles gambiae. 
CC The GYPSY59-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. The consensus encodes the 1482-aa CC GYPSY59_AGP gag-pol like poliprotein (pos. 93-4538). The CC sequence of the LTRs flanking GYPSY59-I is deposited as CC GYPSY59-LTR_AG. CC GYPSY59_AGP: CC MEQADANQMDIFADVDPRRLSQNRASVPGVSMPNFAHREGASVVPPMIIQPPSQPTSTPSQSTS CC TPSLNNASPAIISSADSATMLQMMNLLQQQMTQQQQLLKDFLHARMPSQCAPPTTLQPEQIIDT CC LSHHISEFQYNKETGITFKSWFSRYLDLFQKDAARIDDAAKVRLLLRKLGPSEHDRYLSFIMPS CC RPPDFSLEQTVEKLTCLFDTQESLLSKRFKCLQIMKTRTEDHLSYACRINKACVEFELKKLNEE CC EFKCLLYVCGLKDEIDADIRTRLLARIEDKACVTLQQLSSECHRLINLKKDSAMIEAPVPERVL CC AVNTKMHRAPRQFQPKRDNPTTPCWSCGGLHWSRDCPYKQHRCTTCSRTGHKEGYCNTIRSRKP CC GKRPWKQRKTQLRMVTVNVQSVQQRRRFVSLTMNGTPVRMQLDTASDITVIDHTTWKLIGSPQL CC AAPSVIARTASGANLSLEGEFPCTVEVNGQAKQTVIRVSKSRLLLLGADVIDAFALWSVPMDSF CC CCHVTGTSTTPKQWQERFPTVFQGIGLCKKAGVTLTLKDNCRPVFRPKRPVAYAMQEPVNLELD CC RLEKLGIITPVKFSEWAAPVVVVKKANGKIRLCGDYSTGLNEALRPHDYPLPLPEDIFSRLSNC CC TMFSKIDLTDAFLQVEIDPQYRPLLTINTHRGLYHYNRLPPGIKVAPAAFQQLMDTMLAGLKGV CC SGYLDDIIVGGSSEHEHDTNLAEVLHRLQEYGFTIRADKCAFKQQKITYLGHVIDSHGLRPDPS CC KIELIKKLPEPKDISGVRSFLGAINYYGKFIPNMRKLRYPLDNLLKANNSFCWTPECKKSFATF CC KSLLSSDLLLTHYDPRQKIIVSADASSIGLGATISHVYPGGAMRVIQHASRALSEAERHYSQID CC REGLAIIFAVKKFHKMIFGRRFVLQTDHRPLLHIFGSKKGIPTVTANRLQRFALTLLAYDFSIE CC YVRTDDFGNADLLSRLINTQAKPEEDTVIACIETDIKAMVVSALHNTPLHFADLIRETRKDPLL CC QKLVHYIREGWPSNATYTGELSRFFARKDALSTVDGCILFGERVVIPRVLQQQCLQQIHHGHPG CC IQRMKALARSYLYWPSLDADITEWVKTCNACQAVARSPPHSSPVPWPKAAGPWQRIHVDYAGPL CC DGDFYLIVVDSFTKWPEVFRTSSTTSAATIGILRGIFARFGVPTTLVSDNGTQFTSEDFKFFCF CC QNGIEHITTAPYHPQSNGQAERFVDTFKRTVKKISADGRTMQEALDAFLLSYRSTPSSVLEGLK CC SPAEIMFGRPMRTTLDLLRPPLAGGLTDSPVAAKRREFRPSDLVYVKCYSRNGWSWTAGTIVSR CC IGNVMYTVRTVDRKTIRSHVNQLRERRERHHHRHRESSESDGLPLDILLDSYHLTPQPTTSTAE CC PSSASLQHTTSTHTSVSSASHPQTSAVIDPRTASNAEPLHPTNSPIPVQAPTREPRRFSRHRRP CC PSRFDPYRRF. 
XX SQ Sequence 4552 BP; 1264 A; 1186 C; 1071 G; 1031 T; 0 other; gtggcgacga ggacaattta ctttaacgtg tgttcagttt aacgtgtgca agtgtatcaa 60 acataggacc accgtgtcgc aagcgtcgta agatggagca agccgacgca aaccagatgg 120 atatttttgc ggacgttgac ccacgccgat tatcccaaaa tcgtgccagc gtaccaggcg 180 tgtctatgcc aaatttcgcg catcgagaag gagcatcagt agtaccgccg atgataatcc 240 agccgccatc gcagcctacg tcgacgccat cgcagtctac gtcgacgcca tcgctaaaca 300 atgcgtcacc agccatcatt agcagcgcgg atagtgcaac gatgctgcaa atgatgaatt 360 tgcttcaaca gcaaatgaca cagcagcagc agctgcttaa agattttctt cacgcgcgca 420 tgccgagcca atgcgcccca cccaccacct tgcagcctga acaaataatt gatactctgt 480 ctcaccacat ttcagagttt caatacaaca aggagactgg tataaccttc aaaagctggt 540 tttcacgcta tctcgacctg ttccagaagg acgcggcaag gatcgacgac gcagccaaag 600 tgcgattgtt gctgcgcaaa ctgggaccat ccgagcacga tcgctactta agcttcatca 660 tgccgagccg cccgccggat ttttcactcg agcagactgt ggaaaaactt acatgcctgt 720 tcgacaccca agaatcgctg ctgagcaagc ggttcaaatg cttgcagatc atgaagacgc 780 ggaccgagga tcatctcagc tatgcgtgtc gaattaacaa ggcgtgtgtc gagttcgagc 840 tgaagaagtt gaacgaggaa gagttcaagt gtttgctgta cgtgtgcggg ttgaaagacg 900 agatcgacgc agacatcagg acgcgactgc tagcccgcat tgaggataag gcctgcgtga 960 ccttgcaaca gctttcttca gagtgccatc ggttaatcaa cctgaaaaag gatagcgcaa 1020 tgatcgaggc accggtacca gagcgtgttc tagcagtaaa cacaaaaatg catcgcgcac 1080 cacgtcagtt ccaacccaag cgcgacaatc ctactacccc gtgttggtcg tgcggaggac 1140 tgcattggag ccgcgattgc ccgtacaagc aacaccgatg cacaacttgt tcccggacgg 1200 gccacaaaga aggctattgc aacaccataa gatcccgtaa gcccggcaaa cgcccctgga 1260 agcagcgtaa aacacaatta aggatggtga cagtgaacgt ccagagtgta cagcaacggc 1320 gcagattcgt gtcacttacg atgaatggta caccggttcg tatgcagttg gacacggctt 1380 ccgacattac tgtgatagat cacacaacgt ggaagctcat tggcagtccc cagttagcag 1440 caccatccgt aatcgctaga accgcctcag gtgctaacct atccctggaa ggagagttcc 1500 cgtgcactgt cgaagtgaat ggacaggcaa agcaaaccgt gattcgcgtc tctaaatcgc 1560 gtttgttact tttaggtgct gatgttatcg acgcgtttgc tctatggtcg gttccgatgg 1620 acagtttttg ctgccacgtt acaggtacgt 
ctaccacgcc gaagcaatgg caagaacggt 1680 tcccaacagt ctttcaagga ataggcctat gtaaaaaggc aggggttacc ttgacgctga 1740 aagacaactg ccgtccagtc tttcgtccaa aaagacccgt tgcatacgct atgcaagagc 1800 ctgtaaatct agagcttgat aggctagaaa agttgggtat aataacacct gtaaagtttt 1860 ctgaatgggc agcaccggta gtagtggtta agaaggcaaa cgggaaaatt cggctatgtg 1920 gcgattattc cactgggttg aacgaagcgc taagaccaca tgattacccg ttaccacttc 1980 cagaggacat attttcaagg ctgtccaact gcaccatgtt cagcaaaatc gacttaacgg 2040 acgcctttct ccaagtcgaa atcgatcctc aatacagacc gctacttacc ataaatacgc 2100 atagaggctt gtaccactac aaccgcctac cgcccggcat caaagttgct cctgctgcct 2160 tccagcaact tatggataca atgctcgcag ggcttaaggg cgtatcagga tatcttgacg 2220 atatcatcgt aggaggaagt agcgaacatg agcatgacac aaatttggca gaagtattac 2280 acaggcttca agaatatgga tttacgatac gagccgataa atgtgctttt aaacagcaga 2340 aaatcacgta ccttggacac gtgatcgata gccacgggtt acgaccagat ccgtcaaaaa 2400 tcgagcttat aaagaagctc cctgaaccga aagacatatc cggtgtgaga tcatttctgg 2460 gagcaattaa ctattacggg aagttcatcc cgaatatgag gaagctgcga tatccgttag 2520 acaatttgct taaggcaaat aattcattct gctggacgcc agaatgcaaa aagtcgttcg 2580 caacgttcaa gtctctccta tcgtccgacc tacttctaac gcactatgat ccacggcaga 2640 agataatcgt ttctgcagat gcttcgtcca tcggccttgg cgcaacgatt agccacgtgt 2700 atcctggagg tgcaatgcgc gtaattcaac atgcttcccg cgccctcagt gaagcagagc 2760 gtcattacag ccaaatcgat cgcgaaggct tggctatcat cttcgctgta aaaaagtttc 2820 ataaaatgat cttcggcagg cgttttgttc ttcaaacgga ccatcgacct cttcttcata 2880 ttttcggctc gaagaagggt atcccgaccg tcaccgcaaa ccgtttgcag cgtttcgcac 2940 tcactcttct agcgtatgat ttcagcatag agtatgtacg caccgatgac ttcggaaacg 3000 ctgacttgct ttcgcgcctg atcaacacac aagccaaacc tgaagaagac accgtgatcg 3060 catgcataga aacagacatc aaagcaatgg tggtaagtgc gcttcataac actccacttc 3120 attttgcaga ccttatcaga gaaacacgga aagaccccct actgcaaaag ctcgtgcatt 3180 acatacgtga aggatggccc agtaatgcaa cctacacagg ggaattgtct cgtttctttg 3240 cacgtaagga tgctttatca accgttgatg gttgtatttt gttcggggag agggtggtga 3300 ttcctcgagt tctacaacag caatgtctgc aacaaataca 
ccacggtcat cctggcatac 3360 aacggatgaa agcattggcc agaagctacc tttattggcc atccttggac gctgacatca 3420 cagaatgggt caaaacatgt aacgcgtgcc aagcagttgc acgatcaccc cctcactcaa 3480 gccccgttcc ttggccaaaa gctgcaggtc catggcagcg cattcatgtg gactatgctg 3540 gtccactaga tggagatttc tacctcatcg tagtagattc gttcacgaaa tggcctgaag 3600 tatttcgaac gagcagcact acatccgcag caaccatcgg catcttacgc ggaatattcg 3660 cccggttcgg tgttccaacc actctagtat ctgataacgg tacgcagttt acgagcgaag 3720 attttaagtt tttttgcttc cagaacggca tcgagcacat aacgacggca ccctatcatc 3780 cgcagtctaa cgggcaggct gaaaggttcg ttgatacttt caagcggacc gttaagaaaa 3840 tatcagccga tgggagaacg atgcaagagg cgctggatgc attcctactg tcctaccgaa 3900 gtactccgag ctccgtgtta gaagggctaa aatcaccagc ggagataatg tttggcagac 3960 caatgcgtac aaccttagac ctattacgtc caccactcgc aggagggtta acagacagtc 4020 cagtagcagc taaacgcagg gaatttcgcc cgtccgacct ggtatatgtt aagtgttact 4080 ctcgcaacgg ttggagttgg acagctggga ccattgttag ccgcattgga aacgtaatgt 4140 acaccgtcag gacagtcgat agaaagacca taagaagcca cgtgaaccag ctgcgcgagc 4200 gccgtgaacg tcatcatcat cgtcatcgtg aatcatcgga gtctgacggt ttaccgctag 4260 atattctttt ggattcctac caccttactc cgcaaccgac gacatcaaca gccgaaccat 4320 catcagcatc gctacagcat actacatcta cacacacgag cgtatcgtct gcatcacacc 4380 cacaaacgtc agcagtcatc gatccgcgta cggcctcaaa cgcggaaccg ctgcatccaa 4440 cgaattcccc aattccggta caagctccaa cccgagaacc acgtcgcttt tctaggcata 4500 gaagaccgcc cagtaggttc gacccgtaca ggcgttttta aaagagggga ga 4552 // ID BEL15-I_AG repbase; DNA; ANG; 5669 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL15-I_AG is an internal portion of the BEL15_AG LTR DE retrotransposon - a consensus sequence. XX KW 5-bp TSD; BEL15-I_AG; BEL15-LTR_AG; BEL15_AG; Bel clade; KW LTR retrotransposon; RING Zn-finger; integrase; peptidase; KW reverse transcriptase. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5669 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "BEL15_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 37-37 (2003). XX DR [1] (Consensus) XX CC BEL15_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL15-I_AG, an internal portion of BEL1_AG is flanked by CC BEL15-LTR_AG CC LTRs. The BEL15-I_AG consensus sequence was reconstructed based CC on CC multiple alignment of 6 copies; they are less than ~1% divergent CC from CC the consensus sequence. CC The consensus sequence encodes one 1741-aa BEL15_AGp Bel-like CC protein CC (pos. 311-5533). CC BEL15_AGp is composed of the peptidase A16 (pos. 120-290), CC reverse transcriptase (pos. 770-900) and CC integrase (pos. 1450-1600) domains. XX FH Key Location/Qualifiers FT CDS 0..0 FT /product="BEL15_AGp" FT /translation="MANAVNLLSRRRTLEEKIQRVIAFADNFVPERDEFRL FT GLFISDTERVAAEFDTVQQLIEDGAAPEAREMESHFRATTEDALMAARASL FT QALSRPSHNVIPASSTIATSGVRLPTISLPEFDGNEMQWATFRDTFEALIH FT CNEEVLTIQKFHYLRAALKGEAAKLLESIPLCASNYNIAWKSLVDRYANEY FT LQKKRHLQAMFNIGKVTKESNASLHRLVDDFDRHVKMLHQLGEPTAQWSTV FT LEYVLCTKLPDETLRTWEDYASTLSSPNYSMLIEFLQRKMRTLESISMNHP FT ATREATHPSFVRRAPQHLSSCSTMASSSKGCPHCQHDHALSSCYKFCRLPL FT SERFQIAIEKKVCHNCLRKGHLARNCASSSRCKHCGERHHSLLHRSSAVGT FT EPKLVYAEGQESTGRNDRYQAQSLNVTKHPIRSEEVFLLTVRLSIVDADGK FT EHSVRALLDCASQPNLMTEKLVKLLQLQRCPSNVKISGAGKISRDVRGSVF FT AEIRSKRQPFSCGVQFLVMDKLTSNLPSETVSVGHWCIPKGLELADPEFNT FT SQPVDLVIGVKHYYSFFPSAARVHLGDELPLLIDSVFGWIVAGSATLQCPE FT PQVTSSNAICMMSLEESIERFWKTDSLVMKDGYSPEERRCEQIFRDTTARN FT ETGRYIVRLPRHPDFGIRLGASKVSAVRRYDLLERRFAKNSKLKEEYHAFM FT KEYLELGHMSLVRDGDAVPAESYYLPHHPVFKESSTTTKIRVVFDGSSKTT FT SGYSLNDALCVGPVVQDDLLDQLLRFRTYKVALVGDIAKMYRQILLHPDDR FT PLVRILFRFEPQQPVQTYQLNTVTYGLAPSSFLATRALIQLADDEGNAYPR FT 
AGPALRKNFYVDDFIGGAQSVEEATCLRTELAELLQKGGFELRKWTSNCVD FT VLHGLDEAQVGTTTKMSFDSHEAVKTLGISWVPQGDWLVFEGVCQPDDDVI FT TKRSVLSAIAKMYDPLGMIAPIIIRAKMIMQEIWVSSRDWDESLPEDIVCK FT WKQFQKEIRSLSQYRMDRFILLPEARNIELHTFADASSAAYGACTYVRCED FT AGRVRVMLLASKSKVAPLKNLTIARLELCACVLAAHLHHRIKGAIDVAVNA FT SYFWTDSAICMHWIKAPPSTWKTFVANRVAEIQHFTSGAKWRHVAGVENPA FT DLVSRGMEVSEFNNSRAWRHGPAWLEHSQDAWPMPNPQSIPEVAEQERKEL FT ISAVTVCHNELFLYWSSYTRLVNVVSYCMRFIAMLPHLRQLRGRRVLRSSE FT GNAQLKDVSSRKVLSARERAAAVNALTRLAQRESFAEELRDLQGGKRVKKQ FT SELKRLTPFVDEKGIIRVGGRLNLSQLPYQSKHPALLPKGHPLARLIAEHD FT HKMLLHGGGRLLLSVIREKFWPLNGRMLVKSVVRSCIKCIRYQPTLAEQHT FT GQLPAARIIPSRPFAVTGVDYAGPLYLKPAHRRAASLKAYLCVFVCFATKA FT VHLELVGDLSTDGFLAALRRFTARRGVPDHLHSDNGKNFEGARNELRELFA FT TLTSEAAQSTIASSCADQGISWHMIPPRAPHFGGLWEAAVKTAKRHLFRHL FT GSTRLSFEGYYTVLHQIEAAMNSRPLLPLTDDPNDLAALTPSHFLIGSSLT FT ALPDPDMTMAHTNAHEHLAKLQLLVQKFWKHWQKEYLQELQKDPRVARQAD FT QIQPGRMVILMDELLPSTRWPLARVIEVHPGPDGLVRVVTLRTIKGLIKRP FT IAKICPLPVEERGNNTLPAAT" XX SQ Sequence 5669 BP; 1408 A; 1286 C; 1506 G; 1469 T; 0 other; tttggtgccg tgaccaggat tggtgtattg tgaagtgaaa ttgttgctcg ttttttgtga 60 ttttgctgcg tgtaaaaatc tgttgacttt acgttttttc gtcaccgacc atagacaacg 120 aacagtttct ttcgaaattc gttgtccagt gcttgttgcg tgttgtgact tgattgttga 180 acattgtgta attgtttgat tggtgtgata attgatcata cgcacggaca gaacatttgt 240 gaatttgtgt ttcgtgttca gtgtcctgtg tttttagcgc catcgttcca tctgccatta 300 cgagactgat atggctaacg ctgtaaattt gctgtccaga cggcgaaccc tggaagaaaa 360 aattcagcga gttatagcat ttgctgataa ttttgtgcct gagcgggatg aatttaggct 420 tggcttgttc atctccgaca ccgaacgtgt tgcagcagag ttcgatacag tgcagcagtt 480 gatcgaggat ggagcagcac ccgaagcgcg cgaaatggag agccattttc gcgctacgac 540 tgaggacgcc ttaatggccg cgagggccag cctacaagcg ttgtcgcggc catcgcataa 600 tgttattccc gcctcatcga ccattgctac atctggagtg aggctgccaa ccatttctct 660 tccagaattt gacggcaatg agatgcaatg ggcgacattt cgggacactt ttgaagcact 720 aatccactgc aacgaagagg tgctaactat ccaaaagttc cattatcttc gagctgcgct 780 caaaggtgaa gctgcaaagt tgctggaatc gattccgttg tgtgcatcta actacaacat 
840 tgcctggaaa tcgttggtgg acagatacgc caacgagtat ctacaaaaga agcgtcatct 900 acaggcaatg ttcaacatcg gcaaggtgac caaggaatcg aacgcatcgt tgcacaggct 960 ggttgacgat tttgatcgtc acgttaagat gctgcatcag cttggcgaac caacagcgca 1020 atggagcacc gtgctagaat atgtgttgtg caccaagctt cccgatgaga cgctacggac 1080 gtgggaagat tatgcttcca ccctcagcag cccgaactac agcatgctaa ttgagttcct 1140 gcaaagaaaa atgagaacat tagaatcgat ttctatgaac catccggcaa cgagagaagc 1200 tactcatcct agttttgtac ggcgagcccc acagcacctt tcttcctgct caaccatggc 1260 gagcagttca aaagggtgcc cgcattgcca gcatgatcat gccttaagca gttgctataa 1320 gttttgccgt cttcctctgt ctgagcgttt tcagatagct attgagaaga aagtttgcca 1380 taattgctta agaaaaggtc atttggcaag gaactgcgct tcatcgtccc ggtgcaaaca 1440 ctgtggtgag agacatcact ctcttttgca tcgttcgtct gcagtcggta cggaaccgaa 1500 actcgtgtat gcggaaggac aagaatctac aggaaggaat gatcgttacc aagcacagtc 1560 gctcaacgtt actaagcatc ctattcgatc ggaggaagtg tttttgctca ctgttcgttt 1620 gagcatagtt gatgctgatg gtaaagagca ttcggtacgc gctttgctag actgtgcttc 1680 tcaacccaac ctcatgacag agaaacttgt caaattgctg cagctacaac ggtgtccttc 1740 taacgttaaa atatcgggag ctggaaagat atctcgtgac gttcggggat cagtgtttgc 1800 tgagatacgc tccaagaggc aaccattcag ctgtggtgtt cagttcctgg taatggacaa 1860 gctgacatcc aatttgcctt ctgagactgt aagtgtcggt cactggtgta tcccaaaagg 1920 cctcgagcta gctgatcccg aattcaacac atcgcagccg gttgatttgg tgataggtgt 1980 caagcactac tattcattct tccccagtgc agccagagtt catttgggtg atgagttgcc 2040 gctattgatt gatagtgtgt ttggttggat tgttgctggt tcggctacgt tacaatgccc 2100 ggaaccacag gtaacaagtt caaacgctat ctgtatgatg tcgctggaag agagcatcga 2160 acgattctgg aagacagatt cattagtgat gaaggatggc tactcgcctg aggaacgaag 2220 atgcgagcag atattccgtg atacaacggc gagaaatgag actgggcgtt atatcgtacg 2280 cttaccccgt catcccgatt tcggcatcag actgggtgct tccaaggtaa gcgcagtacg 2340 aagatatgat ctgttggaga ggaggttcgc taaaaattcc aagttgaagg aagagtacca 2400 tgcgtttatg aaggagtatc ttgagctcgg gcacatgagt ttagttcggg atggagatgc 2460 agtacctgct gagtcgtatt atttgccaca tcatcctgtg ttcaaggagt ctagcaccac 2520 
aacgaaaatc agggtcgtgt tcgacggttc ttctaaaacc accagcgggt actcgttgaa 2580 tgatgctttg tgcgtgggac cagtggtgca ggacgacttg ctagatcagc ttttgcggtt 2640 ccgcacgtat aaggtggcat tagttggcga tatagcaaaa atgtaccgcc aaatacttct 2700 tcatcctgac gatcgaccgt tggtgcgaat cttgtttcgc ttcgagccgc agcagccggt 2760 gcagacctac cagctgaata ctgtaacgta tggtctcgca ccttcctcct ttctcgctac 2820 gcgcgctctt attcagctgg ctgatgatga gggtaatgca tacccacgag cgggccccgc 2880 tctacgaaag aatttttacg tcgacgactt catcggtgga gcccaatcag ttgaggaagc 2940 cacctgccta cgaactgaat tggctgagct actacaaaag ggcggatttg agctgcggaa 3000 atggacgtcg aactgtgttg atgtgctgca tgggctggat gaggcacagg ttggaaccac 3060 aaccaaaatg agcttcgatt ctcacgaagc cgtgaagaca cttggcatca gttgggttcc 3120 acaaggtgat tggctggtgt ttgaaggtgt gtgccagcca gacgacgatg tgatcaccaa 3180 gcgatctgtg ttatctgcca tcgcaaaaat gtacgatcct ttggggatga tagcgccgat 3240 aatcatccgt gctaagatga ttatgcagga gatatgggtg tcgtcacgtg attgggatga 3300 atcgttgcca gaagacatcg tatgcaagtg gaagcaattt cagaaggaga tacgatctct 3360 atcacaatat cgaatggaca gattcatact gcttcctgag gcgcgaaaca tcgagctgca 3420 cacttttgct gatgcatctt ctgcagccta tggtgcttgc acatatgtgc ggtgtgaaga 3480 cgctgggcga gttcgagtca tgcttttagc gtcgaagagt aaagttgctc cgttgaaaaa 3540 tctgactatt gctcggttgg agctgtgcgc ctgcgttcta gccgcgcact tgcatcaccg 3600 cataaaaggt gccattgatg tggccgtaaa tgcatcatat ttttggaccg actcagctat 3660 ctgcatgcat tggatcaagg cgccaccgag cacatggaaa acgtttgtgg caaaccgtgt 3720 agcggaaatc cagcatttca ctagtggtgc caagtggagg cacgtagctg gagttgaaaa 3780 ccctgctgat ttggtatcgc gtgggatgga ggtgtccgag ttcaacaaca gtcgagcgtg 3840 gaggcatgga ccagcttggc ttgagcattc gcaggatgct tggcccatgc ccaatccaca 3900 aagcattcct gaagtggcag agcaagagag gaaggagtta atttcggcag ttaccgtatg 3960 ccacaacgaa ttgtttctgt attggtcatc ttatactcgc ctggtgaacg ttgtcagcta 4020 ttgcatgcga ttcattgcca tgctccctca tctaaggcaa ttaagaggaa gaagagtttt 4080 acgatcctcc gaggggaatg cgcaactgaa ggacgtttca tctcgtaaag ttctcagcgc 4140 tcgggagcga gcagcagcag ttaatgcctt aacacgcctt gcccaacgtg agtcttttgc 4200 tgaggagcta 
cgcgatctgc aaggaggaaa gagggtcaaa aaacaatcag agctgaaaag 4260 actcaccccg tttgtggacg aaaagggtat catccgggtt ggtggacgac tgaatttgtc 4320 acagctgccg tatcagtcga aacatcccgc actgctgccg aagggtcatc cattggctcg 4380 attgattgct gagcatgatc acaagatgct tcttcatgga ggaggacggt tgctgctgtc 4440 agtcatacga gagaagtttt ggcctttgaa cggaagaatg ttggtcaaaa gcgtagttcg 4500 gagctgcata aaatgtattc ggtaccagcc tactcttgca gagcagcata ctggccagct 4560 gcccgctgcc cgaattatac caagccggcc ctttgctgtt accggggtcg attatgcagg 4620 tcccctgtac ctaaaacctg cgcataggcg cgcagcatcg ttgaaagcgt acctgtgtgt 4680 cttcgtgtgc tttgcaacaa aggcggtgca tttggagcta gtgggagacc tgtcaactga 4740 cggttttctc gcggctctac ggaggtttac ggcgaggaga ggtgttccgg atcatctcca 4800 ttcggacaat gggaaaaact tcgagggagc gaggaatgag ctacgagagc tttttgcgac 4860 tttgacgagt gaagctgcgc agagtaccat cgcttcatcg tgcgcggacc agggaatctc 4920 ttggcacatg attccaccga gagctcctca ctttggcggc ctttgggaag ccgcggtgaa 4980 aacggccaaa cgtcatctgt tccgtcacct cggaagtact cggctctcct tcgagggtta 5040 ctacaccgta ctgcaccaga tcgaggcagc tatgaactct cgtcctctct taccgcttac 5100 ggatgatccc aatgatttag ccgcactaac accttctcat tttttgatag gctcctcatt 5160 aacggcactt ccggaccctg atatgacgat ggcacacact aatgcacatg agcacttggc 5220 gaagctgcag ctattggtgc agaagttttg gaaacactgg caaaaggagt acttgcagga 5280 gttgcaaaag gacccgcgcg ttgccaggca ggcagatcaa atccaacccg gtcgaatggt 5340 tatcctgatg gacgagttgc tgcctagtac ccgttggccg ttagcgcgtg tgatagaggt 5400 gcaccccggt ccggacgggc tggtgcgagt agttaccttg cgtacgataa aaggtttaat 5460 taagcgtccg atagccaaaa tatgtccttt gccagtagaa gagagaggaa ataacacttt 5520 gccagcagct acgtaaccga aatgtccgta tttgtcgatt taatgtttgt gcaccgcgct 5580 tcgtcgttgt ataaagaaaa ccacgcggtt tatgtcaggt ctgagttagt gttagtgtta 5640 gtgttgatgg tacatcaagg cgaggagta 5669 // ID GYPSY11-I_AG repbase; DNA; ANG; 5781 BP. XX AC . XX DT 03-OCT-2003 (Rel. 8.09, Created) DT 03-OCT-2003 (Rel. 8.09, Last updated, Version 1) XX DE GYPSY11-I_AG is an internal portion of retrotransposon GYPSY11_AG DE - a consensus sequence. 
XX KW 4-bp TSD gag; AP protease; GYPSY11-I_AG; GYPSY11-LTR_AG; KW Gypsy clade; LTR retrotransposon; RNase-H; integrase GYPSY11_AG; KW mdg1 lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5781 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY11_AG, a member of the mdg1 lineage of the Ty3/gypsy group RT of LTR retrotransposons."; RL Repbase Reports 3(9), 164-164 (2003). XX DR [1] (Consensus) XX CC GYPSY11_AG is a family of gypsy-like LTR retrotransposons CC that, according to the amino acid sequence of its ORF2, is CC phylogenetically grouped with Drosophila representatives CC of the mdg1 lineage. CC GYPSY9_AG, GYPSY10_AG, GYPSY12_AG, GYPSY13_AG, GYPSY14_AG, CC GYPSY15_AG, GYPSY16_AG, GYPSY17_AG, and GYPSY8_AG are other CC members of this same lineage in Anopheles gambiae. CC The GYPSY11-I_AG consensus was reconstructed after multiple CC alignment of 7 copies. CC The consensus encodes the 445-aa GYPSY11_AG1p gag-like protein CC (pos. 772-2106) and the 1222-aa GYPSY11_AG2p (pos. 2067-5732). CC The sequence of the LTRs flanking GYPSY11-I_AG is deposited as CC GYPSY11-LTR_AG.
XX FH Key Location/Qualifiers FT CDS 772..2106 FT /product="GYPSY11_AG1p" FT /translation="AATTTNVFKKVPYTCNFYGRSCINMQKLLEKIEILDR FT TYDQVRQLNKCYRLCALTTLRNNTKELYDEIQELLRKHESSIKDEILTTLV FT KKSRHLYYEINKCIKIHFERHPDSLNTTLSENQFDITIETKSDKMADIIEL FT IKITTSLISKYDGNEKDLKGVVSNLNVLKKIVKPENRETIIELVLGRLTGK FT ARIVVGEAPKSIEDIVNKLQDRCSIKVTPEIVVSKMDNTKQTGTIEDFGSI FT IEKLTQQLEEAYIAEEITPEVARKKATKSGISALSYGLKDGETKIIMRSSK FT FETLHEAIEQAVKLELEDRTKKGKNEQTKILYSNATRNNRGYGNNYQGRNN FT YNRFTNNNNNRYQTQNPPRFPPARYGHNNNRNNNNYRNNNFNNTRNQHANR FT QNNSNRNQYVQSNQRNNSNLQNNRAPIHNTVTAEEQNNFLGQPQASENTQY FT " FT CDS 2067..5732 FT /product="GYPSY11_AG2p" FT /translation="FFRATSSIGKYPILTINPDADNFVKVKIEITKEIYST FT LIIDTGATVSVLKASKLKPGCKINTSKKLTLISSSDHESETLGTAMTTIHF FT GDYSIMHEFHIIEDVESIFSDGLLGKDFIKHRCIVDYVNWMIYFSSDNGLI FT SHPIEDNVNGNYILPKRSEVVRKISIPNLTEDSIILSQEIQPGVFCGNTIV FT SKRNQYIKFINTTDKDVSFNIKSYTPEVEPLREYEQLQKKLDTSKERIQKI FT HNKIHIENIPQIAREELENLITKFSDIFCLEDEPVSTNNFYTQEISLKDNI FT PSYIPNYKQIHSQTEEMQSQVEKMLKNNIIEHSVSSYNSPILLVPKKSGEG FT KKKWRLVVDFRQLNKKILPDKFPLPRIDTILDQLGRAKYFSTLDLMSGFHQ FT IKLDKNSRKYTAFSTPTGHYQFTRMPFGLNISPNSFQRMMAIAMAGLTPEL FT AFVYIDDIIVTGCSARHHISNLGKVFDRLRKYNLKLNAEKCCFFKTEVTYL FT GHKITDKGIYXDDAKFDTIKNFPIPTNADEARRFVAFCNYXRKFVQNFAKI FT AKXINNLIKKDVKFAWTSECQAAFDTLKQSLLSPTILQYPDFKKQFIITTD FT ASDMACGAVLSQITDGNDLPVAFASKSFTPGEKNKPIIEKELTAIHWAINY FT FKPYVYGQKFIVRTDHRPLAYLFGMKNPTSKLTRMRLDLEEFDFEIEYLAG FT KANVAADALSRIILNSDDLKASIPKSKTILMVNTRAMVKKNNEKTDINKDE FT PIATTGTDHPAMWKTDRPLEVRKVLKIGTQRIKNNVEFIIYNHSYSKALGK FT FLLRNDVNGSQALEFALLEMCKIAKQYGRNKLAWSEEDHLFEEYSQQTIKE FT IANRAITKFEIILFTPTRWITTEKDRLRIISDYHMTPSGGHIGQYRLYQKI FT REKYKWKNMKDDIKKYVRNCKACIVNKTTRHTKEKTVVTTTPTKPFNIISI FT DTVGPLTKTNKNNRYAITIQCDLTKYIVVIPIHNKEANTIAKALVENFILT FT FGTFIELKSDQGLEYNNEILHKISEILKIKQTFSTAYHPQTIGSLERNHRC FT LNEYLRSYTNEHHDDWDDWTKFYEFVYNTTEHSDTNYTPYELVFGRKANLP FT QDIFKTKIEPVYNIDQYYFEMKYKLQKSNEIARENLIKAKIKRQQTLNKDT FT VPLIINLGDQVYLENENRKKLDPVYIGPFTVVSDQGPNCVIQNNTTKKIST FT VHKNRLIKYTGE" XX SQ Sequence 5781 BP; 2420 A; 980 C; 980 G;
1401 T; 0 other; tggcgaccgt gacttttaaa ctgtaatctt cggatgtgca aaaaaaaagt gacgaatgaa 60 acctcaaaca cggacaaaaa gtgcaaagtg gaaacgtttt ataaaaatcg caagtgcttc 120 tgaaatgact tggaaagtga ttaattacca acaatagtga actacgagtg aaaaccaatc 180 attttcaaaa tgggaggcaa ggctgcaaaa cctgaaacaa atataaaagg agaccatgat 240 ctcacaatag ttcaaactca gaatattcat acagaatatc atctgactca ggatttaaaa 300 ctaaacatta ttttagggct gctaatcacc ctgtgcattg ttaaaatagc gaaaacttgt 360 tacaaacacc ttcgtaacca agcgcaaaaa cacgctttaa aagtgcttac gctaccaaag 420 tagcaacgta aacattgaat cgagaacagt gaatgaatga tatacgaaaa aggtgatatg 480 ctatttgacc cacaaaaata ggtaaaggct gtgggaaagt accgtaagtt caccaatgca 540 gctatggacc gcgattatga aaaacaatta tgcgcgtatg agaacctacc tcaacgtgta 600 aaaacactgt gaccagcggt atggtttgga acagccgttc ccaacgcgga agacgagaag 660 aataaggcaa gaatggacgg ttgctcactg tcggacagac aggtgaaaga agagatccct 720 gggccaaggg acagcagttc accgagcacc gggaacaacg tagcaccgtg agcagcaaca 780 acaacaaatg tatttaaaaa ggtaccgtac acatgcaatt tttacggacg aagctgcata 840 aatatgcaaa agctgttaga aaaaatagaa atactagata gaacgtatga tcaggttaga 900 cagctaaaca aatgctatag gctctgcgcg ttaaccacac taagaaataa cactaaggaa 960 ttatatgacg aaatacaaga gcttctacga aagcacgaat catccattaa agacgaaata 1020 ttaacaacct tagttaaaaa aagtagacac ctatattacg aaataaataa gtgcattaaa 1080 atacatttcg aaagacatcc agattcgtta aatacaacat tatcagaaaa ccagttcgac 1140 ataacgatag aaactaaatc tgacaaaatg gctgacatta tagaactaat taaaatcacc 1200 acttctctca tatcaaagta tgatggtaat gagaaagatt taaaaggtgt ggtgtcaaat 1260 ttaaatgtat taaagaaaat agtgaagccg gagaataggg aaacaataat agagctagta 1320 ttaggacgtc tgacaggtaa agcgcgaatt gttgtaggag aagccccaaa atcaatagaa 1380 gatatagtta acaaactaca agacagatgc agcataaagg taacaccaga aatagtagta 1440 tccaaaatgg ataatacgaa acagacagga acaatagaag atttcggaag cattatagaa 1500 aaactaacgc agcaacttga agaagcatac atagcggaag aaataacacc agaagtagcc 1560 aggaaaaaag caactaagtc tggaattagc gcattgagtt atggacttaa ggatggcgaa 1620 accaaaataa taatgagatc aagcaaattt gaaaccctgc atgaagcaat agaacaagca 1680 gtaaaattgg 
aactagaaga cagaacgaaa aagggaaaga atgaacagac aaaaatttta 1740 tattcaaacg ctactaggaa caatagaggg tatggtaaca actaccaggg aaggaacaat 1800 tacaatagat tcacaaataa taataataat aggtatcaga cacaaaaccc acccaggttc 1860 ccacccgcaa gatatggaca taacaataac cgaaacaata ataactacag aaataacaac 1920 tttaacaaca ctagaaatca gcacgcaaat cgacaaaata attccaacag aaatcagtac 1980 gtacaatcca atcagagaaa taatagcaat ttgcaaaata atcgagcgcc tattcataac 2040 acagtaacag ccgaagaaca gaataatttt ttagggcaac ctcaagcatc ggaaaatacc 2100 caatactaac cataaatcct gatgcagata attttgttaa agtaaaaata gaaattacaa 2160 aggaaatcta tagcacactc atcatagata ccggagcaac cgtatccgta cttaaagcta 2220 gtaaattaaa accaggttgt aaaatcaata catcaaaaaa attaaccttg ataagctcta 2280 gtgatcatga atcagagact ttaggaactg ctatgacaac aattcacttt ggcgattatt 2340 ccattatgca cgaatttcat ataatagaag atgtagaatc cattttttcc gacggactat 2400 taggaaaaga ctttataaag cacagatgta ttgttgatta tgttaattgg atgatatact 2460 tctcatctga taacggattg atttcacacc caatagaaga caatgtaaac ggaaattata 2520 ttttaccaaa acgaagtgaa gtagtacgaa aaataagtat accaaacttg acagaagatt 2580 caatcatctt atcacaagaa atccaaccag gggtattttg cggaaacaca atagtctcaa 2640 aacgtaatca gtatatcaaa ttcattaata ccacagataa agatgtttct tttaacataa 2700 aatcttatac accagaagtt gaaccattaa gagagtatga gcaattacag aagaaacttg 2760 acacatctaa ggaacgaatt cagaaaattc ataacaaaat ccatatagaa aatattccac 2820 aaatagcaag agaagagtta gaaaatctta tcacaaaatt ctcggatata ttttgtttag 2880 aagatgaacc ggtctctact aacaattttt atacccagga aatttcatta aaagataata 2940 ttccttctta tataccaaat tataaacaaa tacattcaca aacagaggaa atgcaatcac 3000 aggtagaaaa gatgttgaaa aataacatta tagaacattc tgtttcatca tataattcac 3060 cgatactatt agtaccgaag aagtcaggtg aaggaaaaaa gaaatggcgt ttagtagtgg 3120 attttcggca attgaacaag aaaattttac cagacaaatt ccctttacct cgcatagaca 3180 cgatactaga tcagctagga agagccaaat atttcagcac attggatttg atgtcagggt 3240 ttcatcaaat caaacttgat aaaaattcga gaaaatacac agctttttcg acacctacag 3300 gccactatca gtttacaaga atgccatttg gactcaacat tagcccaaac agctttcaaa 3360 gaatgatggc tatcgctatg 
gctggtttaa caccagagct agcatttgta tatatagatg 3420 atattatagt tactggatgc agtgcacggc atcatatcag taatttaggt aaagtttttg 3480 ataggctaag aaaatataac cttaaactaa atgcagaaaa atgttgtttc ttcaaaacag 3540 aagtaacgta cttaggtcat aaaataacag ataaaggaat ttatccggac gacgcgaagt 3600 ttgacacaat taaaaacttc ccgattccta ctaatgctga tgaagcaaga cgttttgtcg 3660 cattttgtaa ttattaccgt aaatttgtac agaattttgc taagatagct aaacctatta 3720 ataatttgat taagaaagac gttaagtttg catggacttc agaatgtcaa gcagcttttg 3780 atacattgaa acaaagctta ctctcaccca caattttaca atatccagat tttaaaaaac 3840 aattcattat tacgacagat gcatcggata tggcatgtgg tgcagtgtta tcacaaataa 3900 cagatggaaa cgatttgcca gtcgcatttg cgagtaaaag ttttacacca ggggaaaaga 3960 ataagccaat tatcgagaaa gagcttacag ctatacattg ggcaattaat tattttaaac 4020 cttatgtata tggacaaaaa tttatagtta gaacagatca tagaccatta gcatacttat 4080 ttggtatgaa aaatcctact tctaagctta ctagaatgag actagattta gaagaatttg 4140 actttgaaat agaatattta gcaggtaaag ctaatgttgc ggcagacgca ctatcaagaa 4200 taatccttaa ctcagatgac ctaaaagcat caataccaaa atccaaaacg attttaatgg 4260 ttaatacaag agccatggtt aagaaaaata acgagaaaac tgatataaac aaagacgaac 4320 caatcgcaac aacagggact gatcaccccg cgatgtggaa aacagataga cctttagaag 4380 tgagaaaggt actaaaaata ggtacgcaga gaattaagaa caacgttgaa ttcataatat 4440 acaaccattc atacagtaaa gcactaggaa aatttctttt gagaaatgat gtaaatggaa 4500 gtcaagcatt agagtttgct cttctagaaa tgtgcaaaat cgcgaaacaa tatggaagaa 4560 ataagctagc atggtcagaa gaagatcatt tattcgaaga atattcccaa caaactatta 4620 aggaaatagc caacagagcc attaccaagt ttgaaataat cctgtttact ccaactagat 4680 ggataacaac agaaaaagat aggctgagaa taatttcaga ttatcatatg accccttcgg 4740 gaggacatat aggccagtac agactgtacc agaaaataag ggaaaaatat aaatggaaaa 4800 atatgaaaga tgatatcaag aaatacgtac gaaattgtaa agcatgcata gttaataaga 4860 cgactagaca tactaaagaa aaaacagttg taactacaac accgacaaaa ccatttaaca 4920 taatttcaat cgacacagta ggacctctaa caaaaactaa caaaaacaac aggtatgcaa 4980 taaccataca atgtgactta acgaaatata tcgtagtaat acctatccat aacaaagaag 5040 caaacactat agccaaggca ttggtagaaa 
acttcatcct tacattcgga acatttatag 5100 aattaaaatc agatcaagga ctagaatata acaatgaaat attacacaaa atctcagaaa 5160 ttttaaaaat caaacaaact tttagcacag cttaccaccc acagacaata ggatcattag 5220 agagaaatca tagatgtcta aacgaatacc taagaagcta tacaaacgaa catcatgatg 5280 actgggacga ttggacaaaa ttttacgaat ttgtttacaa tacaacagaa cactcagaca 5340 caaactacac accatacgaa ctagtatttg gtagaaaagc gaatttacca caagatatat 5400 ttaaaacaaa aattgaacca gtttataata ttgaccaata ttatttcgaa atgaaatata 5460 aactccaaaa atcaaacgaa attgccagag aaaatttgat aaaagcaaaa attaaaagac 5520 agcaaacctt aaataaagat acagtaccac ttattataaa cttaggagat caggtatatt 5580 tggaaaatga aaataggaaa aaattagatc cagtctacat tggacctttc acagtagtaa 5640 gtgaccaagg gcctaattgc gtaatacaaa ataatacaac aaagaaaatc tctacagtac 5700 acaaaaatag attaattaag tacacaggag aataacttca atcattgtaa ttcattacgt 5760 tattctatta aaggggggag g 5781 // ID GYPSY55-I_AG repbase; DNA; ANG; 4545 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY55-I_AG is an internal portion of retrotransposon GYPSY55_AG DE - a consensus sequence. XX KW LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY55-I_AG; GYPSY55-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY55_AG; mag lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4545 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY55_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 149-149 (2004). XX DR [1] (Consensus) XX CC GYPSY55_AG is a family of gypsy-like LTR retrotransposons that, CC according to the aminoacid sequence of its reverse CC transcriptase, RNase and integrase domains is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. 
GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, GYPSY59_AG, CC GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY63_AG, GYPSY64_AG, CC GYPSY65_AG, GYPSY66_AG, GYPSY67_AG, GYPSY68_AG, and GYPSY69_AG CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY55-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. The consensus encodes the 1478-aa CC GYPSY55_AGP gag-pol-like polyprotein (pos. 98-4531). The CC sequence of the LTRs flanking GYPSY55-I_AG is deposited as CC GYPSY55-LTR_AG. CC GYPSY55_AGP: CC MSQEEDERRSSQGWQNVMAPVTVHAPQDTGQRPQVPLTQFVQQNGSVPPFSASSVQNVPESDNS CC ATAQILKLMQQQMAQQQQILLQLMQQTKAPIEQTPLEQILDSLCGHIKEFRYDEENGSTFAAWY CC TRYEDLFLKDAARLKDEAKVRLLMRKLGTSEHDRYTSYILPKHPRDYSLQETVAKLTSLFGTKE CC SLLHRRYKCLKLTKLRTEDFITFACRVNRGIVDFELGRLTEEQLKCLVFVCGMKGEEDIEFRTR CC LLHRIEENQDVTLDQLSAECQRMTNLRQDSAMIERDRSEQVFSVQRNANRNQRTNTGPSHYTNE CC QRTGKPSRPCWLCGSLHWTRECTYKTHTCTQCGSVGHREGFCKQQRTKKQPHRKKTFRKRANMQ CC SVTVNVNSLHHRRKFLSCSVNGSQIRLQLDTASDITVISRRLWKRIGSPQLFPSSVIAKSASGD CC KLEMDGEFRATVGINGQEKQAVIFVAKGELALLGIDLAEAFSLWSVPIDQLCNNVTSTATTPEG CC IVRKFPGLFKSTMGFCAKAEVTLHLKPNCSPVFCAKRPVAYAMRDAVDAELDRLERLNIITPVQ CC HSEWAAPIVVVRKSNGQLRICGDYSTGLNASLHPHDYPLPLPQDIYTKLGNSTIFSQIDLSDAF CC LQVPIAEQSRRLLTINTHKGLYLYNRLPPGIKVAPGAFQQLMDQMLAGMERVASYMDDVIVGGR CC TQREHDDVLNETLRRIQEYGFTIRPEKCSFNKRQVRYLGHILDNHGIRPDPAKIAAIKDLSAPT CC DVSGVRSFLGAVNYYGRFIPNMRKLRYPLDNLLKEGSSFKWSPECQKAFEQFKSILSSELLLTH CC YDPRREIVVSADASSVGLGATIGHKFPDGTFKVVQHASRALTKAEKNYSQIDREGLAIIFAVTK CC FHNFIFGRHFTLQTDHKPLLRIFGSKKGIPVYTANRLQRFALTLQLYDFDINYVSTDNFGNADI CC LSRLIRNHEKLEEDYVIASIGLEEDIRSVVVNSLSSVALNATDVATATKSDPIMSKVIQFVRQD CC WPRNSTFSGELACFYARKEALSEMGGCLLFGERVIIPKALRQRCLRQLHHGHPGVQRMKSIARS CC YVYWPKIDTDIAELVASCNACASAAKSPPHASPVSWPEITAPWQRIHIDYAGPIDGFSYLIVVD CC AFSKWPEVIRTASTTSKATIRILNTMFARYGMPVTLVSDNGRQFISSEFEDFCICNGIEHLTSA CC
PFHPQSNGQAERFVDTFKRAITKITSDGTAIEDALDTFLQTYRATPNPQVPNNEAPATVMFGRQ CC IRTCLELIRPVPKPQETNNDEQRRNFVPNDLVFAKIYSQNGWKWKPGRILRKCGNVMYRVMTED CC HKIIRSHINQLRRRVPSNQQSSKRDQHLLPLHILLDEWNLTPPSSSPDSSSSSSSSSPSSSSLS CC PPSDSAPCNLSPSSAESSSYRTVSQSSASPPPERDCIPEESHRGAQPVPPPRRSFRDRKAPRWF CC DPYLLY. XX SQ Sequence 4545 BP; 1272 A; 1154 C; 1051 G; 1068 T; 0 other; gttggcgacg aggattttta aagtttgtga cgaattttta tcggaagtga atttcgcgta 60 gtgcaaagct attccaccga caaggaaacg ccgcaggatg agccaagaag aagatgaacg 120 tcgatcatcg caaggttggc aaaacgtgat ggccccggtt acagtgcacg caccgcaaga 180 tacaggtcaa cgtccccaag tgccactgac ccagttcgta cagcaaaatg gttcggtccc 240 accattttct gcatcgtcag tacagaatgt tccggaaagt gataattctg caacggcaca 300 aattttgaaa ctaatgcaac agcagatggc gcaacagcag caaattttgt tgcaattaat 360 gcagcaaacg aaagcgccta tcgaacaaac gccccttgag caaatactcg attccttatg 420 cggtcacatt aaggaattca ggtatgacga agaaaacggt tcaactttcg ctgcatggta 480 cacgagatac gaggatcttt tccttaaaga cgctgcacgc ttgaaagacg aggcaaaagt 540 gcgtttattg atgagaaagt tgggaacctc tgaacacgac cgttacacca gctacatact 600 gccaaagcat ccgcgtgatt acagtttgca ggaaacagtt gctaaattga ccagcttatt 660 tggcacgaaa gaatccttgc ttcatcgacg atacaagtgt ttgaagctaa ccaaacttcg 720 caccgaggat tttatcactt tcgcgtgccg ggtcaatcgt ggcatcgtcg attttgagct 780 gggaaggcta acggaagagc agttaaagtg tttagtgttc gtttgcggta tgaagggcga 840 agaagacatc gagtttcgta cacgtctttt acatcgcatc gaggagaacc aggatgtaac 900 ccttgaccaa ctttctgcag agtgtcagcg tatgacgaat ttgcgtcagg atagtgcgat 960 gattgaacgt gaccgcagtg aacaagtatt ttccgttcaa cgtaacgcaa atcggaatca 1020 gcgcaccaac acaggcccgt ctcattacac caacgaacag cgtacaggta aacctagtcg 1080 accatgctgg ttatgtggat ccctgcattg gactcgcgag tgcacttaca agacgcacac 1140 atgcacacaa tgcggatcgg tcggccatcg cgaaggtttc tgtaagcagc aacgtaccaa 1200 gaagcagccc catcgaaaga agacttttcg taaacgagcc aatatgcaaa gtgtcaccgt 1260 gaatgtgaac agcctacatc atcggaggaa atttttatcg tgttcggtca acggttcaca 1320 aattcggctc caactggata cagcatcgga catcaccgtg attagtcgtc ggctttggaa 1380 acgcatcggt agcccgcaat tattcccatc 
atctgtaatc gcgaaatcgg catctggtga 1440 caaactcgaa atggatggcg aatttcgagc aacggttgga atcaacgggc aggagaagca 1500 agctgtaatt ttcgtggcaa aaggagaatt ggccttgctg ggaatcgacc ttgctgaagc 1560 tttctcctta tggtctgtgc ctatagacca gctgtgtaac aatgttacca gcacagctac 1620 tactccggag ggaatcgttc gaaaatttcc cggcctgttt aaatcaacga tgggtttttg 1680 tgcgaaagct gaggttacac tacacctgaa gccgaattgc agtccagttt tctgcgcaaa 1740 acgtcccgta gcttatgcca tgcgcgatgc agtggacgcc gaacttgata gactggagcg 1800 gctcaatatc atcacaccag tccagcactc cgagtgggca gcaccgattg tagtggtacg 1860 caagtcgaat ggacaactga gaatatgcgg agactattcg acgggactca acgcatcact 1920 ccacccccat gactatccgc tgccactgcc acaggacatc tacaccaaac tgggtaactc 1980 gacaattttt agccaaattg acttatctga tgctttctta caggtcccca ttgccgagca 2040 aagtcgccgc ttgctcacca taaacacgca taaagggttg tacctgtata atcgcctgcc 2100 accagggatt aaagtagcac caggcgcatt tcagcagctt atggaccaaa tgcttgctgg 2160 aatggaacgt gtcgcgagtt acatggacga cgttatcgtt ggtggacgaa cacaacggga 2220 acacgacgac gtactaaacg aaaccctgag acgcattcag gagtatggtt tcaccatccg 2280 cccggagaaa tgttcgttca acaaacgcca agttcgctac cttggtcaca tcctcgataa 2340 ccacggcatt cgtccagacc cagccaaaat agctgcgata aaggatcttt cagcgccaac 2400 cgacgttagt ggtgtacgtt cctttcttgg cgccgttaac tattatggaa gattcatccc 2460 gaacatgcgc aagcttcgat acccgttgga caacctactg aaagaaggca gttcgtttaa 2520 gtggtcccct gaatgccaaa aagccttcga gcagttcaaa agtatccttt catcggaact 2580 gctcctgacg cactacgatc caaggcgtga gatcgttgta tccgcggacg cgtcatccgt 2640 cggattaggc gcaacaatcg gtcacaagtt ccccgatgga acttttaaag ttgtgcagca 2700 tgcgtcccgt gcacttacaa aggcagagaa aaactacagc cagatagatc gggaaggctt 2760 ggcaattatc ttcgctgtga cgaaattcca taattttata tttggacggc attttactct 2820 acaaacagac cacaagccac ttttgagaat tttcggcagc aagaaaggca tccccgttta 2880 caccgccaat cgcttgcaga gattcgcgct caccttacag ctgtatgatt tcgacatcaa 2940 ttacgtatcg accgacaatt ttgggaacgc cgacatcctt tcgcgactga ttcgaaatca 3000 cgaaaagtta gaggaggatt acgtgattgc tagcatcggc ctagaggagg acatacgatc 3060 agtcgtcgtt aactcattga gttcagttgc attaaatgca 
acggatgtcg caacggctac 3120 taaatccgat cccattatga gcaaagtcat acagtttgta cgacaagact ggccacgtaa 3180 cagtacgttc agcggtgagc tggcatgctt ctacgctagg aaggaagctt tatcagagat 3240 gggaggctgt ctcttattcg gggagagagt catcatacca aaggctctcc ggcaacgatg 3300 cctgcgtcaa ctacatcacg gccatccagg tgtacaaagg atgaagtcga tcgcacgaag 3360 ctacgtctac tggccgaaga tagacactga tatcgctgaa cttgtagcat cttgtaacgc 3420 ttgtgcatcc gcagcgaaat cacccccaca cgctagcccg gtatcatggc ctgagataac 3480 tgcaccgtgg caacgtatcc acatcgatta tgctggcccc atcgacggtt tctcctactt 3540 aatcgtagtc gatgctttct ctaaatggcc agaggtaata aggactgcca gcacaacttc 3600 caaagcgacc atacgaatcc ttaataccat gtttgcgcga tacggtatgc cagtaaccct 3660 tgttagcgat aatggtcgcc aattcatcag ttccgaattt gaggattttt gtatttgtaa 3720 cggcatagag cacctcacat ccgccccgtt ccaccctcag tctaacgggc aagcggaacg 3780 cttcgtggat acattcaagc gcgccataac caaaatcacg agcgatggaa ctgcaataga 3840 agatgcactt gacacgtttt tacaaacata tcgcgccacg ccaaatcctc aagtgccaaa 3900 taacgaagcg ccagcgacag taatgttcgg acgacaaatt cgcacttgtt tagagcttat 3960 acgccctgtg cctaaaccgc aagaaaccaa taacgatgag caacgacgca atttcgtccc 4020 aaatgatcta gtcttcgcta agatttactc gcagaacgga tggaaatgga agccaggaag 4080 aatattgcgg aaatgcggca atgtaatgta ccgtgttatg actgaggacc ataagattat 4140 acgaagccat atcaaccaac tacgtcgtcg ggtcccttct aaccagcaat ccagtaagcg 4200 tgatcaacac cttttgccgc tacatattct cttggatgag tggaacctta cgcctccgtc 4260 gtcgtcacct gattcatcgt catcatcgtc ctcatcgtct ccatcgtcat cgtcgttgtc 4320 gccaccgtcc gattcagctc cgtgtaatct atcaccgtcg tccgctgaat catcatctta 4380 caggaccgtg tcccaatcat cagcatcacc acctcctgaa cgtgattgca tccctgaaga 4440 gtcgcatcga ggagcacagc cggtaccacc accacgccgc tcttttagag atagaaaagc 4500 accgcgttgg ttcgacccgt acctgctgta ttaaaagaag ggaga 4545 // ID GYPSY63-I_AG repbase; DNA; ANG; 4342 BP. XX AC . XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE GYPSY63-I_AG is an internal portion of retrotransposon GYPSY63_AG DE - a consensus sequence. 
XX KW LTR Retrotransposon; Transposable Element; 5-bp TSD gag; KW AP protease; GYPSY63-I_AG; GYPSY63-LTR_AG; Gypsy clade; RNase-H; KW integrase GYPSY63_AG; mag lineage; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4342 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY63_AG, a member of the mag lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(6), 165-165 (2004). XX DR [1] (Consensus) XX CC GYPSY63_AG is a family of gypsy-like LTR retrotransposons that, CC according to the amino acid sequence of its reverse CC transcriptase, RNase and integrase domains, is phylogenetically CC grouped with representatives of the mag lineage of other CC organisms. GYPSY19_AG, GYPSY20_AG, GYPSY21_AG, GYPSY22_AG, CC GYPSY23_AG, GYPSY24_AG, GYPSY25_AG, GYPSY26_AG, GYPSY27_AG, CC GYPSY28_AG, GYPSY55_AG, GYPSY56_AG, GYPSY57_AG, GYPSY58_AG, CC GYPSY59_AG, GYPSY60_AG, GYPSY61_AG, GYPSY62_AG, GYPSY64_AG, CC GYPSY65_AG, GYPSY66_AG, GYPSY67_AG, GYPSY68_AG and GYPSY69_AG CC are other members of this same lineage in Anopheles gambiae. CC The GYPSY63-I_AG consensus was reconstructed after multiple CC alignment of 4 copies. The consensus encodes the 1432-aa CC GYPSY63_AGP gag-pol-like polyprotein (pos. 35-4330). The CC sequence of the LTRs flanking GYPSY63-I_AG is deposited as CC GYPSY63-LTR_AG.
CC GYPSY63_AGP: CC MSLENQNIQLEILKALQKLSETSTGTNNTERFVAMNMTEFTFDPENGGTFQKWFRRYEDLFESD CC AKELEDVAKVRLLLRKLDAQAHNQYTNYILPKLPKELTFKETVQTLSKIFGSQSSLFSRRYRCL CC QLVKTEADDIISYAAKVNRACEDSEFHNMKADHFKCLVFICGLKGQTYADIRARLLSRIDAETA CC DAPITLQNLVDDFQKLVNLKADTSIVEQQPNSSTTVNALHEKTEHHHHEQYRQRYQESKTSEQP CC RRPCWRCGQMHFVRDCQYSTHQCRKCNRVGHKEGYCGCFSKFKPAGEEKTNTKPSTSDQGKLNA CC RGVYIVNHITQHSSKRKFVPATINGVTINLQLDTASDITVISQQTWQKLGSPNIQPVTIQAINA CC SGKPLHLSGEFQCTININGQTQQGRCFVTTAVNLNLLGIEWIELFELWSIPIDTICNQLTTESI CC DQQMREIQAKHADVFKDTLGHCKKTKVKLYLKSNAKPVFCQKRPVPFNTIPLVDAELTRLQNLG CC IIETVDFSEWAAPIVAVRKPNGRVRICADYSTGLNAALEANHYPLPTPEEIFSQLNGSTIFSII CC DLSDAYLQLEVDDDSKHLLTINTHRGLFRFNRLAPGVKSAPGAFQRLVDGMIADIPGVRSFIDD CC VIVFGKDMKSHKDSLNTLFARLKEYGFHVKAEKCHFCKTQLVYLGHVVDKHGIRPDSEKIKTIA CC SIPPPSNVSELRSYLGAVNFYGRFVRNLHELRYPMDQLLKKESKWKWTPECQEAFVKFKEALQS CC HLLLTHYDPKLPIIVAADASNTGIGAVIFHQFTDGKMKAIQHASRTLTPAEQNYGQPEKEALAL CC VYAVCKFHKYLLGRHFTLLTDHKPLLSIFGSKKGIPLHTANRLQRWALTMLNYDFEIQYVSTQD CC FGCADLLSRLIDRNKQPEEEYVIATLTLEDDLSSILSDTSQKVPISFQALRKATASSSTLQAVC CC KFIREGWPNCSTNLPTAIQPYYARRESLSIVQGCVMFGDRVVVPNIFQKKILQQFHRGHPGIVR CC MKSIARSYVYWPGIDKEIEDFVKCCSPCAITAKTPTKTTLEFWPIPSKPWSRVHIDYAGPVDGF CC YFLVIVDPHSKWPEVYATRSITARTTIRILKQIFATFGVPEVLVSDNGTQFTSYEFKEFCVSQG CC INHLRIAPYHPQSNGLAERFVDTLKRSIQKIRKGGESLEDALTTFLQVYRTTPSGDLDGKAPAD CC IMFSRPLRTISSLLKPSEHGNVEPRNRMKEAEFFNKKHGAVKRCYQQGDAVYVKIYRRNSWQWE CC AATVIDKIGNVNYNVFLKEKQQLVRSHTNQLKSRLANGQNMAEFSTPLSVLLDDFGLKTPLPSD CC QQSTSSQFVSCDEPVTTSSDTELASQTLCSTPDHSELGNISSENDNAEESEGEEPVVQQQEQQS CC SILERSRRVIKLPERFKSYWMPNP. 
XX SQ Sequence 4342 BP; 1395 A; 968 C; 894 G; 1085 T; 0 other; attggcgacg aggatagttg aactaaaagc aagaatgtct ctcgaaaatc agaacattca 60 attagagatt ttgaaggctc tgcagaagct atcagaaaca tcaacgggta caaataatac 120 ggaacgattc gtggcaatga acatgaccga gtttactttc gatccagaga acggaggaac 180 ttttcaaaaa tggtttcgac gttatgaaga tttgttcgag tcggatgcca aagaactcga 240 agatgtcgct aaggtcaggc ttttactgag gaagttggat gcacaagcac acaaccagta 300 caccaattac atcctcccga aacttccgaa ggagctaact ttcaaggaaa ctgtgcaaac 360 attatcgaaa attttcgggt cgcagagttc actttttagc agacgttacc gttgtcttca 420 gctcgtcaaa accgaagcag atgacattat tagttacgcg gcgaaagtga accgggcttg 480 tgaagattca gaattccaca acatgaaggc tgaccatttc aagtgtttag tatttatttg 540 tggattgaaa ggtcaaacct acgcagacat acgagccaga ctactctctc gcattgatgc 600 cgaaactgca gatgcgccca ttacactgca aaatttggtt gacgatttcc aaaaactcgt 660 caatttgaaa gcagatactt ccatagttga gcaacagcca aactcatcta caacggtaaa 720 cgctctacat gagaagacag aacatcacca tcacgaacag tatcgacagc ggtatcaaga 780 atcaaaaaca tcagagcaac ctcgcagacc atgctggcgt tgcggtcaaa tgcactttgt 840 tcgagattgc caatactcga cacaccagtg tagaaaatgc aatcgtgttg gtcacaaaga 900 aggctactgt ggatgttttt caaaattcaa acctgccggg gaggagaaga ctaacacgaa 960 accatcaaca agtgatcaag gaaaactgaa tgccagaggt gtctatattg tcaatcacat 1020 cactcaacac tccagcaaaa gaaaatttgt tcctgcaacc atcaacggcg ttacaatcaa 1080 tctgcagctc gacacagcaa gtgatattac ggtgatttca caacaaacat ggcaaaagtt 1140 gggctcaccg aacatccaac cagtgacgat tcaggccatc aatgcatctg gtaagccact 1200 ccatttatcg ggcgaattcc agtgcaccat caacatcaac ggccagaccc aacaaggcag 1260 gtgtttcgtc actacagcag ttaacctaaa cttgctaggg atagaatgga ttgaactatt 1320 cgagctttgg tccattccaa ttgatacgat ttgtaatcaa ctaacaacag aatcgatcga 1380 ccagcagatg cgagaaattc aagcgaagca tgcggatgtt tttaaggata cattggggca 1440 ttgcaagaaa actaaggtta agctttacct caaatcaaac gcaaagcctg ttttctgcca 1500 aaagcgtcct gtacctttta acacaatacc tttggttgat gccgaactta ctcgattgca 1560 aaacttgggc ataattgaaa ctgtcgattt ctccgagtgg gcagctccaa ttgtggcggt 1620 gaggaaaccg aatggacgtg ttcgaatatg 
tgccgattat tcaacaggat tgaatgctgc 1680 gttggaggca aaccattatc cattgccaac accagaagaa attttctcgc aacttaacgg 1740 cagcaccatc tttagcatca tagacctgtc cgatgcctat cttcagctcg aagttgacga 1800 cgattcaaag catttactaa ccatcaatac acatcgtgga ttattccgat tcaaccgtct 1860 cgcaccaggg gtaaaatcag caccaggagc attccaacgc ctcgtagatg gaatgatagc 1920 tgatattcct ggggtgcgat cattcattga tgatgttatt gttttcggca aggatatgaa 1980 atcacacaag gattcactca acaccttgtt cgcacgtctt aaggagtacg gatttcacgt 2040 aaaagccgaa aaatgccatt tttgcaagac tcaacttgtg tacttgggac acgttgtaga 2100 taagcatggt attcgtccag attccgaaaa gatcaagaca attgcttcga ttccaccacc 2160 aagcaatgtg tccgagctac gatcttatct tggagcagtg aatttttacg gaagattcgt 2220 tcgtaacctg cacgaattac gttaccctat ggatcagctg cttaagaagg aatcgaaatg 2280 gaaatggacg ccagagtgtc aggaagcttt cgtcaagttt aaggaagcac ttcagtcaca 2340 tttgctccta acgcactacg atccaaaact tccgatcatc gttgctgcgg acgcatcaaa 2400 cacaggaatt ggtgcagtca tttttcatca atttactgat ggaaaaatga aagcaattca 2460 acacgcgtca cgaacactta cacccgctga acagaactat ggacaaccag aaaaagaagc 2520 tctcgcatta gtttacgcag tatgcaagtt tcacaaatac ttgcttggac gtcatttcac 2580 tttgctcacg gatcacaaac cattactttc aatttttggt tcaaaaaaag gtataccact 2640 tcataccgct aaccgtttgc aaaggtgggc acttaccatg ttgaattacg atttcgaaat 2700 tcaatacgtg tccacacaag atttcggatg cgcagatctt ttatcacgat tgattgaccg 2760 aaacaagcag ccggaagaag agtacgtaat tgcaacactg actttagaag atgacctttc 2820 gagcattctg tccgatacat cacagaaggt tccgatttca tttcaagcac tccgtaaagc 2880 aaccgcttca agctcaacac tacaagcagt ctgcaaattc attcgtgaag gttggccgaa 2940 ttgttccact aatcttccaa ctgcaatcca accttactac gcgaggcgcg aatcattatc 3000 aatagtccaa ggatgcgtta tgtttggtga cagagttgtt gtaccaaata tatttcaaaa 3060 aaagattcta caacaatttc atcgaggaca cccagggata gttcgaatga agtcaatcgc 3120 tcgaagctat gtttactggc ctggtattga taaagaaatc gaggattttg ttaaatgttg 3180 tagtccgtgt gcaattacag cgaaaacacc gacaaagaca actttggaat tttggcccat 3240 accatcaaaa ccatggtcca gagtacacat tgactatgca ggcccagtag acggattcta 3300 cttccttgta atcgtggatc cacactcgaa atggccggaa 
gtttacgcta ccagatcaat 3360 aactgcgaga acaacaataa gaattttgaa acaaattttc gcaactttcg gagtgccaga 3420 agttctcgtg tctgataacg gtactcaatt taccagttac gagtttaagg agttttgcgt 3480 tagtcaaggc atcaaccact tgcgcattgc tccatatcat ccgcaatcca acgggttagc 3540 tgaacgattt gtggatacac tgaaacgaag tattcaaaaa attcgcaagg gaggggaatc 3600 tctcgaagat gcactaacca ctttccttca agtatatcga accacaccat ctggagattt 3660 ggatggaaaa gctcctgctg acattatgtt ctctagacca ttacgaacta tatcgtcgct 3720 cctcaaacca agcgagcacg gaaatgttga gccgaggaac agaatgaagg aagccgaatt 3780 tttcaacaaa aagcacgggg cagtgaaacg atgttatcaa cagggcgatg ctgtttatgt 3840 caagatatat cgtagaaact cctggcagtg ggaagcagca accgtaatcg acaaaatcgg 3900 caacgttaat tataacgttt tccttaaaga aaaacagcag ttagtacgat cacacaccaa 3960 ccagctgaaa tctcgattgg caaacgggca aaacatggca gaattttcaa caccactatc 4020 tgtactgctt gatgattttg gtttgaaaac acctttaccg tcagaccaac aatcaacgtc 4080 gtcacagttt gtttcctgtg atgaacctgt aacaacatca agcgatactg aactagcatc 4140 tcagacactc tgttcaactc cagatcattc tgaattgggt aacattagtt ccgaaaatga 4200 caacgcagaa gaatccgagg gagaagaacc agttgttcaa cagcaagagc agcaatcatc 4260 aattttggaa agaagtcgaa gagttatcaa attaccagaa cggttcaaat cttactggat 4320 gccgaatcca taagggggga ga 4342 // ID P1_AG repbase; DNA; ANG; 4946 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 13-DEC-2002 (Rel. 7.11, Last updated, Version 1) XX DE P1_AG, a P-like DNA transposon - a consensus sequence. XX KW P; DNA transposon; Transposable Element; P superfamily; P1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4946 RA Kapitonov V.V. and Jurka J.; RT "P1_AG: a family of P-like DNA transposons from African malaria RT mosquito."; RL Repbase Reports 2(11), 21-20 (2002). XX DR [1] (Consensus) XX CC The A. gambiae genome harbors many divergent families of P-like CC DNA transposons. One of those families is P1_AG. 
CC P1_AG elements are flanked by 8-bp target site duplications. CC Terminal inverted repeats are 11 bp long. CC The P1_AG consensus sequence was reconstructed from CC several copies that are only ~1% divergent from each other. CC Presumably, P1_AG copies have multiplied in the genome CC during the last 1 million years. CC P1_AG encodes an 895-aa P-like DNA transposase called P1_AGp. CC Putative exon/intron structure (based on FGENESH): CC 302-563, 624-734, 797-2262, 2433-3281. XX FH Key Location/Qualifiers FT CDS 0..0 FT /product="P1_AGp" FT /translation="MSSNYKCCVAFCKNNRYNVKKVGVNVCFHKFPEGKET FT KQKWIAFCQREISWIPSSSNVVCSQHFLPSDYQLSSSHNTKRGANWLNPEA FT VPSILLPQDTGLNLSQFNEHQQNNNNDESSRAELQGATSSHLLPDENDLIL FT IGKENVCKKCKGLQQKIFNLEAKLNELEERNKQLVTINNKLSDTLKEVNEK FT EKEHLKKLEELDKATKKIKEEWPRANFVQNMKDSLKGSLSSNQIDLILGIK FT KTVRWTKEELSLAFTLRYFSQRAYRYIGDDMKIPVPVPRTLQRYSSKIDLK FT QGILEDILKFIGSYSQTLKPMDRECVLSFDEMKVSRVLEYDPSADEIVGPF FT NYLQVVMMRGLFKQWKQPIFIGFDTKMTKDIIIEIISRLSEKDINVVAIIS FT DNCQTNIGCWKELGARDDVEKPYFPHPKTNKNVYVIPDTPHLLKLLRNWLL FT DHGFEYNGQLIETTNLLRMVAKRMESEMTPLFKLTTSHIDMTPQERQNVRR FT AAELLSRTTAVALRTYFPDDDNAKILANFIEKVDVWFSISNSYTPFAKLDF FT KKAYTASDDQVRALTDMYDIISNMTIPGKNGLQIFQRSIMMQIKSLQMVFA FT DMKLKHDAKFICTHKLNQDVLENFFSQLRQIGGVYDHPSPLSCIYRIRLMI FT LGKTPTILHNQTTTVEAANCEQDEFITTTESNVAGARSNDVFISASMFEKA FT EITPELPDVKAMEEANNFVSQDVSSLSSVSDTSIELPHQEADGLSYILGYL FT AKKHHSQFSHLNLGEHTFKTRIDHNYCQPPTFVHHLSCGGLIEPSDEFLNL FT GKKMEKIFLKMNPDGGLLKGERIVDRITNKIKRHLTELPVEIIRSFAKQRV FT IVRMRYLNLKATAEQLNKMKHKKRKFVTENTKAAKKMKKIIN" XX SQ Sequence 4946 BP; 1701 A; 839 C; 944 G; 1462 T; 0 other; caaggttatt agactgtata caggttacga caaaaccccg ttttgacaca tcttcaataa 60 aaaaatgaat taaaattttg acagacaaat tggaaagttt gtttacctac aaattgcaat 120 ccttccgaac atacagcttt cgcaattgat gagtgcttaa atactaccgt ttttctgcgt 180 ttcaattttc gtttttgtgc ttttttagtt tacggctagt taagtttatt taaggaatta 240 ggtttatcta agctaagtaa ttcattaaat agctataaat tacgaaacaa agcagcgaaa 300 tatgtcgtcc aactacaagt gttgtgtagc attttgtaaa aataatagat
ataatgtaaa 360 gaaagttggt gtcaatgtat gcttccataa attcccagaa gggaaagaaa ccaaacaaaa 420 atggatagcc ttttgccaaa gggaaatatc gtggatacct tcctcaagta atgtggtgtg 480 ttcgcaacat ttcttgccat ctgattacca attgagcagc tcgcacaata ccaagcgtgg 540 ggctaattgg ctaaaccctg aaggtgagac gttatcaaac caaacgatgt tgtttcttta 600 taccgatata aactctattt tagctgtccc atctatcctg ttaccacaag acactggttt 660 gaatttgtcg cagtttaatg aacatcagca gaataataac aatgatgaat cttcccgggc 720 ggaactgcaa ggtggtatgt taatgaattg gcgatgttta tttgatccat tctaaaattt 780 atgaatatga tttcagccac ttcatctcat ctcttgccgg acgaaaatga tttaatattg 840 attggaaaag agaacgtttg caagaaatgt aagggattac aacaaaaaat atttaattta 900 gaagcaaaac ttaatgaatt agaagagcgc aacaagcaac tggtcactat aaataacaaa 960 ttaagtgata cactgaagga agtaaacgaa aaagaaaagg aacatctgaa aaagctagaa 1020 gaacttgata aagctacaaa aaaaattaaa gaagaatggc cacgcgctaa ttttgtgcag 1080 aacatgaaag attcgcttaa gggttcgcta tcatcaaatc aaattgacct tattttagga 1140 attaagaaga ctgttcgatg gactaaagaa gaactttcct tagcatttac tcttagatat 1200 tttagtcaaa gggcataccg ttacataggc gatgatatga agataccagt accggtacca 1260 agaactttac aacgatattc ttccaagatt gatctaaaac aaggcattct ggaagatata 1320 ttgaaattca ttggatcata ttcacaaaca ttaaagccta tggatcgaga atgtgtttta 1380 tcgttcgacg agatgaaggt gtctcgagta ttagaatacg atccatcggc agatgaaata 1440 gtgggcccat ttaattatct gcaagtagtc atgatgcgag gattgtttaa acaatggaaa 1500 cagccgatat ttatcggctt tgataccaaa atgacaaagg atatcattat cgaaataatt 1560 tcacgattga gcgaaaaaga tataaatgta gtcgctatca tcagtgataa ttgccaaaca 1620 aacattggat gctggaagga gctaggcgcg cgggatgacg tagaaaagcc gtattttcca 1680 catccaaaaa ctaataagaa tgtgtatgtg atccctgata cacctcattt gctgaagttg 1740 ttaaggaatt ggcttctaga tcatggtttt gagtacaacg gccaacttat cgaaaccacc 1800 aatctgttgc gtatggtagc caaaagaatg gagtcagaaa tgactccttt atttaaactc 1860 acaacatccc atatagacat gactccccaa gaacgacaaa atgttcgaag ggcagcagag 1920 ttattgtctc gtacaaccgc tgtagctctc cgtacatatt ttccggacga tgataatgct 1980 aaaattttgg ctaattttat agagaaggtg gatgtgtggt ttagcatatc taattcttac 2040 
acacctttcg caaaattgga ttttaaaaaa gcatatactg ctagcgatga ccaagttaga 2100 gctttaacag atatgtatga cataatatca aacatgacta tcccaggtaa aaatggttta 2160 caaatttttc aacgttctat aatgatgcag attaagtcac tgcaaatggt gtttgcagat 2220 atgaagttaa aacacgatgc caaattcatc tgtacccata aggtaagcaa catgataatt 2280 caattgcaat taattcattg aacattcaat accatatgac tatacaggaa aatctcagct 2340 agagctggtg aattcattgt tacaaaatta atcattttaa ctaatacatt tattgttgtg 2400 tttctttttc tttttttttt ttcattttac agttaaacca agacgtatta gagaattttt 2460 tctctcagct taggcaaatt ggtggtgtat atgaccatcc ctcacctcta agttgcattt 2520 atcgaattcg tcttatgatt ttggggaaaa caccaaccat tttgcacaat caaacaacta 2580 ctgttgaggc tgcaaattgt gagcaagatg aattcattac aacaactgag agcaatgttg 2640 ctggtgctag gagtaatgat gttttcatct cagcttctat gtttgaaaag gccgaaatca 2700 ctcccgaatt accggatgtc aaggcaatgg aagaagcaaa caatttcgta tcacaagatg 2760 taagttcact tagttccgtt agtgatacaa gcattgaact acctcaccaa gaagctgatg 2820 gattgtctta catacttgga tatcttgcaa agaagcatca ctctcagttt tcccatctta 2880 atctgggaga acacacattt aagactagga ttgaccataa ctattgccaa cctcctacat 2940 ttgtacacca tttgtcctgt ggtggactga ttgaaccgtc ggatgaattt ttgaatttag 3000 ggaaaaaaat ggaaaaaata ttcttgaaaa tgaatccaga cggaggatta ttaaaaggag 3060 aacgaatcgt agatcgaatc acaaacaaga taaagcgaca tctaacagaa cttccggttg 3120 agataattcg ttcttttgca aaacagcgtg taattgttcg tatgcgctat ttgaacttaa 3180 aagcaacagc tgagcagcta aacaaaatga aacataagaa aagaaaattc gtgacagaaa 3240 atacaaaagc tgccaaaaag atgaaaaaaa ttattaatta aagtttatgg taatgtaaga 3300 ggtagcatat agcactcatt aataacgatt aagcatatat ataatacata ggactttatt 3360 ataggaataa aaatgacaat gtgataaaaa tttcatactc ctctaaattt aagttttcat 3420 tgtattgtag tattaagaag agattattaa atggattgtt aaatagaaaa taaaaaagta 3480 caatttgtgc aaaatataga gtaatttgac agtagagatt cgttgtacaa tgtaatcact 3540 cggaccacat catttttagt ttttatcttc cgtcaataac accaaccttg atatagaatg 3600 ttaatacgat gaaccaacaa aaacacccta actagccatt gctgtgtgac agtactttcc 3660 aaactacagg ccgcatgcag tcccgagtgc ctataatgtg ccccacgatg atttggcaat 3720 ttttcatcca 
acagattgat cattgttcat atagcagagc ggcctaccaa acagaaagat 3780 caaaagaaaa atgttgaaag ttctactaaa tatatgaacg gttaagagcc gcaatctcaa 3840 aaataaagag ctgcatgcga gccgcacttt gccgttcgct ggtctacaag ttacaaagca 3900 tttatagtac atcgttgctt tcgaaaacat aatggtgtaa tttatcatta tctgtatcat 3960 gaaaatagaa tgcaactata atcaaattaa aggaaacgtt attgatgtgt tctattccta 4020 cacgacacct tttcaacgat agtcgctacc gaaattttgt ataaagcgta accacatgtt 4080 ttttttatat gaagtttccc tgaagaaacg tcgttcgacc ggtgcatcct agagattaat 4140 catgggttag gtgttgttga attcaagctt agggtgttga agcaaagaat taatctgtat 4200 gaaggaaggt gaaggatgta gcttgtggag cagtacggtc aatgggatat tctcgaggtg 4260 taggcttgca gtataaattg gctggggatt gtttttaaaa agaactgctg tcccaccgag 4320 caggtgcttg taagatgact ctgatttatc gttgtctggc tgatgtaggt tgtccggcat 4380 tggtcacata acacagtgag ctcgatggtt catggatcaa ctagtcaatc tgtcggatga 4440 tggatgcaga cgagaggtag gaggcagtca tgaaccgacg ttaggtgcta atggagcagg 4500 acgtgctcag cagtgattgg agataaaaca aagagagggt gaaattataa aactttgcca 4560 cagtatgagt agtattgatg gtttgaaagc aacatacaac ttgcattatg ccttaattct 4620 cctatacacc tgattgtgct ctgcgatgtt cgatgtcgat ggaccttgca gaataggtca 4680 acaatttggg gatgtttttc acctggaatc agctcactca ctagaattac ctggagtctt 4740 tcagctgcag aaaaaaaatc gagaaattta ttatgcttac tacatttgtt tcgtatttta 4800 aattctaatc ttcgtcattc gtgaacagca taagctgatc aaaacaaagc cggcgtgatt 4860 tgacagacaa attgtaaaca atataaccaa catggttcca gcaaaaacga agtgtcttaa 4920 cctcagtaga atatataata accttg 4946 // ID CR1-2_AG repbase; DNA; ANG; 4665 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 13-DEC-2002 (Rel. 7.11, Last updated, Version 1) XX DE CR1-2_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW CR1 clade; CR1-2_AG; DNA/RNA-binding; PHD finger; endonuclease; KW Non-LTR retrotransposon; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. 
XX RN [1] RP 1-4665 RA Kapitonov V.V. and Jurka J.; RT "CR1-2_AG, a family of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 2(11), 2-2 (2002). XX DR [1] (Consensus) XX CC CR1-2_AG is a family of CR1-like non-LTR retrotransposons. CC The CR1-2_AG consensus sequence was reconstructed based on CC multiple alignment of ~100 copies identified in the CC sequenced portion of the genome. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC CR1-2_AG occurred less than 1 million years ago. CC Integrations of CR1-2_AG have not produced target site CC duplications. CC The consensus sequence encodes two proteins: a 416-aa CC CR1-2_AG-ORF1p CC (positions 256-1503) and a 996-aa CR1-2_AG-ORF2p (positions CC 1559-4546). CR1-2_AG-ORF1p is a putative DNA/RNA-binding protein, CC which includes the PHD domain. CR1-2_AG-ORF2p is composed of CC the AP endonuclease and reverse transcriptase domains. The 3' CC terminus CC is composed of the AATA microsatellite.
XX SQ Sequence 4665 BP; 1046 A; 1223 C; 888 G; 1508 T; 0 other; tgtcacttgt cacttgtcac tcatagcggc tggttgtgct ttctcactct gctatttcca 60 agttgatttt ttaccgcgat tttcttgttg attaaggctt gaactgtttt agttcggtta 120 atttgcactc gcgcgttcac gtatcacaaa tttgtgcggc acctgtgaat tgtacggttc 180 aacatataaa cattgcaatc cgcctgcggt tgttgtgacc actcactgtt cagtcgaact 240 ttgcgctatt ttgcgatggc gtctgtgatc tgtaagaaat gtgaaggtgc tattagcaac 300 gatccaattc cgtgcttcgg tctttgtgaa cactattacc acgataagtg cattggactt 360 tcaaccccgc tcctgcgtga ttttaagaag tcacaaaatt tattctgggc ttgcgcggat 420 tgtgctcagc gtttgcgggc cgttgacact ttacgcttct cgcacggcct ttctcgtgac 480 gctgcctacc tgttggaatc gttgcagtcc gatttccgcg atacctcacg ttctgtgcag 540 gcggcatcag ctggcttgcg acttgaactt tcctcttcac tggattgttt tagaaacgag 600 atagctttga tgaaacagga atcagcatcg tcaattcgtt ccgtgaaaga tttcatcgac 660 tcacttactg cttctcactc aatggaacgc aactactcac aggctccact actcacaacg 720 cttgatgaag ttaagcatgg catcaaggag cttgatctca tgcaccgtga gcttctcact 780 tccttcaact cactaatgaa caagctcaac tctcatcttg ccacgcatac cactacatcg 840 agtgctcacc attctgcgat tcctgctacg cactcaacga ccacgattcc agttgcggcc 900 tctaagctca cccatcaagc tgttggtgag aatccttcta aacgtcgatt gttggatcgc 960 tctcccgacc catcgcctac caataccgtt acacgcgcta tgctttcatc gggcacaggg 1020 ttgtcctgca ataatattac gaccgttcct gaacgcccac cccgtacttg ggtctttatc 1080 tcccgtattg ctcctgatac tccgattgaa gcgatccgcg aaatggcctg ttctaacata 1140 gggacagacg acatcttggt atacagcctt gtacgacgcg accgggatct ttctacgctc 1200 tcctacgtat cttttaaaat tggtgtaccg gattcgcacc gggctattgc tttggctgct 1260 tcaacctggc ctcgcgggat ctcttttaag gagttcatcg accttaatcc ccgctccgtc 1320 aatgtttggc gacccactac tgcagcatcc catgcacctt ctgcgcctgt tactcgtgaa 1380 tcggatcatc attcatcgcc accttctatc aaccacacga cacaactgcg aaacgctgat 1440 ttcactattt cgcccgatca tggtcctatg agcttgcctt atacgcagta ctttcagcaa 1500 gcgtaaaccg gctgatccgg ctgaagattt tcatgaacta ccgactttac ccgaattgat 1560 ggaagctaac cctgcatcta atacacacaa tccgtccgac tctcttccgc cgttaataac 1620 ttgcagcgag agcactcccg gcgcatctcg 
ctctctcatt ccttcgatcg atcgactaaa 1680 catctactat caaaacgtaa gaggattgcg cacaaaacta gacgagttac gcctctctct 1740 atctgagctt gatatggatg tgttggtgtt aactgaaaca tggctcgatg gctcaattcc 1800 gtcctctctt atctcggagg atgcgtatgt catctatcgc tgcgaccgaa attctctcaa 1860 cagtaaccgt tgccgtggtg gtggtgtact cattgcctgc tcttccgtgc tgaacacgtc 1920 aactttatca ctgccgttcg actcgctgga gtctgtttgg acaattgtta agcttcaaaa 1980 ccttgcaatc tatatcggcg ccgtttacat cccacccgat ttgcgttctt ccgaagtagt 2040 ccttgacgat ttacatgaga gtgttagttt cgttgctggc aaacttaaac caaatgatct 2100 tatggtgtta cttggtgatt ttaattcacc atcactctcc tggcaaccgt cggcatcgtg 2160 cgttaatcag ttcattccta ctggtgtatc tcgtgaaaat gtttctctgc ttgatggaat 2220 gtcggtaaac ggtttgctgc aattatccgg cataaaaaat atacgcggaa gacaattaga 2280 tcttctattt gccaatgctg ccttcttgga gtgctgctcc cccgtcatag cttcacctgt 2340 tccactcgtc gcgctagaca atcatcatcc tgctcttgag acaagcgtgc tccttacgca 2400 ctctcgccct gcatcctccc aacgaatacc tacggctcgt atgttcaatt ttaggaaatt 2460 ggactaccaa aagcttcatc gtatattagc cgataccgat tggtccttta ttgatgctga 2520 ctgtgacata aatcaagctg tagcagcgtt tactaatgta atcacttccg ccttcccttc 2580 ctgctgtcct ctccttaaac ccgctcctaa tcctaagtgg tcgaacagag ctttacgtct 2640 attaaaatcg gacaaaaacc gcgctcaacg cgcctaccgt ttaaataata ctttacacaa 2700 tctttgtgta tataaatatg cggctaaggc ttatcgtctt cttaatcgcc atctgtatcg 2760 tcgctacgtt cggcgtcttc aaatgcgctt cactattgat cctgggtcat tctttcgttt 2820 tgcgaactct cggcgaggct ctgctagcct tccatccacc ctgtttcttg atctgtcctc 2880 tgctacgtcc aatcctgata tatgtaacct gtttgccaaa catttctcta gtgtatttgt 2940 cgatcctagt acttttaagg ttcctttgga cgtaggcttg tcttacacgc cctctgatgt 3000 gatatcatgt aattcagtcg ttgtaagtga aagcctagta aaatccgctt tatccaaact 3060 taaaacttcg ttttcaccag gccccgatgg catccctgcc tgtgttttga aaaaatgtgg 3120 caataccctc actcctattc tcactcgtct tttttctcgc tcgcttagtg tagggatttt 3180 tcctagtcaa tggaaactag cttggcttgt tcccatttat aagaaaggtg accgcacact 3240 agcatcaaac tacagaggca tttctattat ttgtgcgtgt tctaaaatcc tcgagtcaat 3300 tgtccattta tctgtaatgc cttgtgtaaa aaattatatt 
tctacggaac aacacggctt 3360 tatgcccaat cgctcagtgt ccaccaacct aatgtgtttt ttgtcttctc tatatcacta 3420 tctgtctagt ggtaagcaag tagacactat ctataccgat ttcaaagcgg catttgacag 3480 tatacctcta tcattacttg ttgctaagct tcgaaaacta ggtttcggtg gctctatatt 3540 gccgtggttc aactcctatc ttgagaatcg ttcatatgca gttaaaatct gtggctcttt 3600 ctctgaatgt tttcttagtt cttcgggtgt ccctcaaggt agtgtcctta gtcccttact 3660 gttcattctc ttcctcaatg actgtacttc gatccttcct cctaacggct tcttgctata 3720 tgcggatgac gttaaaattt ttcttcctgt atcttctaca gctgattgtc tagtccttca 3780 atcctggctc tgtaaattct ctacatggtg tgcttctaac ggtttagtcc tgtgtcctga 3840 aaaatgttcc gtcttgtctt tcttccgatc ttctacaagt ataactcatg cttatagtgt 3900 ctgtgatgcc cctattcccc gtgcgtcctt gtctaaggat cttggcgtct tctttgaccc 3960 gagtctttct ttcaaggagc acacggacta tgtcatcaac aaggccaaca aaagtcttgg 4020 ttatatttgt cgcatgtcca ctgaaattcg tgatcccttt tgtcttaagt ccctttattg 4080 ttgttgggtc cgttccgtac tggaatatgc ctgtgtcatc tggtctcctg tccaactatc 4140 tctgctccaa aggattgaga ggatccagag acgttttaca aggatcgtat ttcgtaggtc 4200 gctgggtcat cactctattc cgcttccttc gtatgacgat agatgcactc tattaggcct 4260 tgccaagttg gagcaccgcc tctcggtcgc tcaggcttct ttcgtcgctg gcatattgct 4320 caatacgatt gatactcctt cacttctgtc gcgcttacat ttgtatgcac cttgtcgcac 4380 cttacgttat cgttttcgtc tccagttacc tatatgtcgt acacgttttg ctcgcaatga 4440 gccttttgta agagctatgt cgtcttttaa tagtacttct gatctgttcg atttcaacat 4500 atcctatcct gtctaccgat cccgtcttcg ctccttttcc gtaccataaa ctccttccaa 4560 ctccttcgtg aataattgta tactatagtc agtaagcact attgctgtaa cccaacgtgg 4620 ccgagcaatt aataaaataa aaataaataa ataaataaat aaata 4665 // ID GYPSY34-LTR_AG repbase; DNA; ANG; 271 BP. XX AC . XX DT 16-APR-2004 (Rel. 9.03, Created) DT 16-APR-2004 (Rel. 9.03, Last updated, Version 1) XX DE GYPSY34-LTR_AG is an LTR of retrotransposon GYPSY34_AG - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; KW 4-bp TSD GYPSY34_AG; GYPSY34-I_AG; GYPSY34-LTR_AG; Gypsy clade; KW MDG3 lineage. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-271 RA Tubio M.J., Costas C.J. and Naveira F.H.; RT "GYPSY34_AG, a member of the MDG3 lineage of the Ty3/gypsy group RT of LTR retrotransposons in Anopheles gambiae."; RL Repbase Reports 4(3), 61-61 (2004). XX DR [1] (Consensus) XX CC GYPSY34-LTR_AG is a long terminal repeat of GYPSY34_AG (its internal CC portion is deposited as GYPSY34-I_AG). XX SQ Sequence 271 BP; 74 A; 73 C; 54 G; 70 T; 0 other; tgtaacgtcc ggactaatat cgcccactgt gcactcaacc cgaaccccga acgcagcggt 60 ataaacgcat tataccttga ccgctggagc acccggtgtg ctagatgaac tgtcatagaa 120 taaagctctc ttcttggcgc gacattgaac tgaacagacg taagccactg acttctgcgt 180 ataattattt gtgtgctctt ccgaattgtg ctaaatctta ttaaaacggc caattaacct 240 tccgccaacc gtaaaacgct tggtcgttac a 271 // ID R6Ag1 repbase; DNA; ANG; 5075 BP. XX AC AB090817; XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE Anopheles gambiae retrotransposon R6Ag1 DNA, complete sequence. XX KW Non-LTR Retrotransposon; Transposable Element; gag-like domain; KW reverse transcriptase; R6Ag1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol. Biol. Evol. 20(3), 351-361 (2003). XX DR GenBank; AB090817; Positions 1 5075.
XX SQ Sequence 5075 BP; 1417 A; 1080 C; 1398 G; 1179 T; 1 other; gcccggcaaa gtccgaccac cacctccacg cggtgccggg cggggtaggg gcccgtcatt 60 aagttgacga ggccccgagc taacggtggc ggtgaagacg gcgttgagca tgacaacgtc 120 gcaggggagg gtgtggtgtt aagtcgccac accctacatt ctgtcaagtg ggatactgaa 180 ctcgagagga taatacctct gcgcggatta gactagggta cggtttatgg aataactagg 240 atgtacacgg gctggcggat gaacccgccg aacccttagt aaagcagggt gaaacctgct 300 tgaaacgggg gagggctaag ggggattctg tcaccgctct attaaaactt agcaagtctt 360 aggtagcgct ccaagactgt cgccatacga tacgctctat ggcaaatgca ggtgtgggtg 420 ggttcagccc cactttgctc cggttcggca ctgagcccgt cgtctcataa gggcgcgggc 480 caacccttgc gtggagctcc ctgtttcata gccacccacg gaaccaaata ggagactgca 540 taaacagacg cggatagcgg gcctgagttt gagcccaata gaatgagtaa cactaaaact 600 aagaaggtca aaaaggccaa gggggttgac aagaaagatc cggaccaagc gttcatggat 660 cttcaagatc ggatggaggc tatgcgtttg ggataagtaa actcgatgat gagtctyccc 720 attttacgct agtaacggta atgatggact tcctcgatga ggtaatgtcc gaatttcgaa 780 ggtacagagc cgttaatcgc atgcagcagg tcctcgaccg tcaaacgcaa acgtcgtttg 840 acggtaacga tggattcggg ccgcaaacgc gaaaaggcag aagaccagtg gctgatgacc 900 aacagcctgg tccaagtggg ttgcaaaggt tgcaacaaca acaacaacaa ccatcgaggt 960 tgacccctgt tagggaagcg gtggaaaata ttccaagtcc gagaaacgga ccgaatatta 1020 atgagggtag cattaataag aggaagaaga agaagagcaa gaagaagcag aataagccca 1080 ggaagcggcc tgaggctctg ctgatatcgg actgcacttc cgaggagctg gcgaaattgc 1140 tcaaggaaat gaagcagtcc gatgctctta aatcggttgg agagacaatc tctaaggtcc 1200 gacgggccca gaatggtggc atgttgctag aattgaagca gggtagttct gctagtgcaa 1260 ttgccccaaa ggttaaggaa gcggtgaagg gcaaggcgtc agtgagaacg ctagctcctt 1320 ctaaaatgat tgagatcatg catctcgatg aaattaccac cccagaggag gttgcggagt 1380 ctgtcaaagc acagctcaac atcgagatcg aaatagatcg tatcaaaatg aagaaaggcc 1440 gcgcggccgg tacgcagtgg gcacggatca acgtatcgct gccagacttt caaagcttcc 1500 tgaatttggg aaagctgaaa gttggttggt cgatatgcca tatccgtgag gtaatggagg 1560 agcagaaatg ctataagtgt tggaaggtag gccatacgag ctaccattgt agggaaccag 1620 acagaagtaa tctgtgctgg aaatgcggtt 
tgagtggaca caagaagcaa gcttgtacca 1680 attctgttaa gtgtttggat tgcggtacga ggtcacagaa ccttcacgca acgggcagtt 1740 acatgtgtcc ccgtaggcga acgattagat cttaatggtt aggttgttac aacataacca 1800 gaatcatagt tatgctgcat ttcagttaat gtggcaaacg attagggaag aatctgcgga 1860 tatagtgttg attgcagatc cgtatctggc aacaacaaac gtcaaagtgt tacgcaatga 1920 cgataacaca gcagcggtag tggttaacgc ggacttacca gttaaggtag tcagtaaggc 1980 tctgaagggt ttaatggtag ttgacatagg tgatatgcga gtggttagcg tttacgcgcc 2040 acctagattt agtatggaag agttccagat catgttggat aacacggtaa tggccgtaac 2100 cggtatccac aaattcgtta tcggggggga cttcaacgct tggtcagcaa gttggaacaa 2160 ccaacttggc gagcgtggag aaacccagaa acgaagaggt gagttggtct tatcaacctt 2220 tgcgcagatt gacgcaattt tattgaacga cggcagcacc ccaacttatg ttggaccagg 2280 gcgcacgtca gtagttgatc ttacttttgc gagcagaaca gttgcaagat catttaagtg 2340 ggaggtgtta tctagctata tgaactctga tcatcgtgcg atacgcatag atcttgagac 2400 gcaaagcgtg cgtaatctgt cccgacccat aacgggatgg agcatcaagt attttagcaa 2460 agatatattt gaagttatga tgcaagccgc tgttgatacc gaggtcacaa caagcgaaga 2520 cttaatgcgt atacttgtca cggcgtgtaa tgcgacgatg actaagcgta agaggtacac 2580 tcctaacaag agtgcatttt ggtggacgct cgagattgag gcacttcgca aagagtgcaa 2640 acaccgcgat cgattagcgc aaagagcttt tcatactgat ctctattcta cttttaggga 2700 cgagttcaag gtggcaagga atgccctcaa gcgattgatc aagcataccc gacagaggaa 2760 gtggaaagag ttcctgggaa cagcgaacaa cgcatcattt ggtatagtat atcatacgtt 2820 caagaaagtg gccgagggtt cgattggacc ccgaactatg acattagacg agtttaggga 2880 agtggtgagc gagctttttc ctactcaccc aaacacggtg tggcctgatt atcgtatcga 2940 tcagccacga gagtttgaaa ggattactaa tgatgagatt cttgcggttg ccaggagact 3000 acccaacaag aaggcgccgg gaccagatgg tatcccgaat gaggcgctga aagttggtat 3060 gttgactgca accgatgcat tttgcagggt ttaccaaggc tgtttagaga acgcgaagtt 3120 ccccgatgag tggaaaaggc agaggttggt gttaataccg aagccgaaca aaccaccagg 3180 ggaaccgggt tcggttcgcc ccatttgtct attagacggg gcaggtaaag gtctagaacg 3240 catcatagtg caacggttaa atgcacacat cgaggaggtc aacggactgt ctgacgacca 3300 atttggtttc agaagtcgtc gatcaacagt tgatgcgatt 
caacgggtag tggacattgt 3360 ttcggtagct agaagcagaa accgatacag tggacggtat tgtgcagttg ttacattaga 3420 tgttactaat gcttttaaca gtgcttcgtg gttggcgatt gcaaatgcat tacagagaat 3480 taacactcct aaatatcttt atgatatcat tggtgattat tttaggaatc gtgtgctgat 3540 gtatgatacc acagatggac cggcagagat tgcagtcaca tcgggtgtac ctcaaggctc 3600 ggtactcggc ccaacgttat ggaatctcat gtacgacggt gtcctacgag ttgcaatggt 3660 ggaaggtgca cggattatcg gctatgcaga cgatatagtg ttgttggtgg aaggtaattg 3720 tgttgatgat attgaaattc tcgtttccag ccagattcgc atcatcgaca gatggatgac 3780 cgacaacgga ttaaagatag ccccgaccaa gaccgagttt attatggtca gttcccatca 3840 gaggatacag cacggggcta tcagggtagg tgatcacgta gtacattcgt cgcgcacttt 3900 aaagtatttg gggatggtct tagatgaccg cctcgaatac acttcacaca tcaggtatgc 3960 ggtggagaga gcgacgaagc tatggaccac cttggtaagg atgatgccta ataaggcagg 4020 tccaagtagt aatgttaggc gagtaattgc tcttactgtt gtggcgaagg tccggtatgc 4080 ctcgcccatt tggtgtcata cccttagatt tgctaaccgt agacaatggc tacgtcggtt 4140 ttaccggcca gtagtccagc gagttatctc ttctttcagg acaacttctc acgatgcagt 4200 ctgcgtgctt gcgggaatga tcccgctgca tctcctcctg gacgaggact ccaggacttt 4260 tcatcggaga cgagcagaga acatcgccgg atcggttgca cgtaacatgg aacgtgtcac 4320 aactatggaa cgatggcaac gagaatggga tgagagtgtt cacggtcggt ggacataccg 4380 tctcataccc gacgtcaaca gatggataag tagaagattt ggtggtgtag atttctttct 4440 ttctcagttt ctttccagcc atggcttcta cgcctaccag cttcatcgga tgcagttaac 4500 gggctcgccg ctatgcgatg cgtgcgagga acctgaggac gccgaacaca cgatattcca 4560 ttgtgtacgt catcgtgaac tgatcattag acttcagcat caagtcgacg aggagttaac 4620 gccggagaac atcatcgaag ttatgtctgc taacagatat aactggagca tggttcatca 4680 agcagtacgg acgattatga ttcgacaaca acatcggaga cacgtcatcg aacgaggcga 4740 acgacgtgct ttgctcgcca acatccagtt ggccttgcag agcagcgaca gtgacgacga 4800 gtaacgacaa ggattcatcg tagttcatcg tagcttcatc gccgagggct agacagtggc 4860 taatcaccac tgttggaagc cattcgttgc ctgggatgat ggacatccac cgcccgagtg 4920 acgtcgatac cctaacgggt gatccactcg gggccggttg aaggcacgga ggggttttag 4980 tgagtaagaa tctcacacta ccggggttga tcacccaggt gtcttatgca 
agatttcccc 5040 ttcgataaca aaaaaaaaaa aaaaaaaaaa aaaaa 5075 // ID GYPSY2-I_AG repbase; DNA; ANG; 5178 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 21-JUL-2005 (Rel. 8.04, Last updated, Version 2) XX DE GYPSY2-I_AG is an internal portion of the GYPSY2_AG LTR DE retrotransposon - a consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW AP protease; GYPSY2-I_AG; GYPSY2-LTR_AG; GYPSY2_AG; Gypsy clade; KW gag; integrase; reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5178 RA Kapitonov V.V., Pavlicek A. and Jurka J.; RT "GYPSY2_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(4), 75-75 (2003). XX DR [1] (Consensus) XX CC GYPSY2_AG is a family of gypsy-like LTR retrotransposons. CC GYPSY2-I_AG, an internal portion of GYPSY2_AG, is flanked by CC GYPSY2-LTR_AG LTRs. The GYPSY2-I_AG consensus sequence was CC reconstructed based on multiple alignment of 20 copies; they are CC ~1% divergent from the consensus sequence. CC The consensus sequence encodes the 1408-aa GYPSY2_AGp polyprotein CC (predicted by FGENESH), which is composed of the putative gag-like CC (pos. 1-300), AP protease (pos. 325-410), reverse transcriptase CC (pos. 514-681), and integrase (pos. 1030-1200) domains.
XX FH Key Location/Qualifiers FT CDS join(846..1790,1875..5153) FT /product="GYPSY2_AGp" FT /translation="MLHSPPVRDVSTPDGVTPSADPAASGSKSPHVPTPPV FT PNTPRVPGPSACDAMFMPPESQIDTLNAMQLKPPEMDTTDIQTFFFALENW FT FDAWNITTNQHIRRFNILRTRIPLRVLPELRPLLENIRQYATDRYEVAKRA FT IIEHFEESQRSRLHRLLAEMNLGDRKPSQLLAEMRRAANGAMTDSMLVDLW FT IGRLPPYVQSAVIATNTDTNDRAKVADSVMDSFALYHRTGPYQTIHEVRNE FT DFERLSRHVTELGQRLDAVLSKLNERERARPRSRTRQRQPNQDAVTPSGHC FT YYHTQYGQAARNCRAPCSFNNRRYRLVITDPKTNIKFLIDTGADVSVIPRQ FT HSSVPSKPSTMKLFAANSTPIQVYGESLYTLDLGLRRSFLWNFIIADVGTA FT IIGADFLQHFHLLVDLRKKCLVDALTNVRSTGVPSQNPSEPTVKVCDSTSP FT IATLLKEFPGLTALSTPGTLLQSEVTHRIETTGQPTFARPRRLPPEKYAAA FT RKEFESLVQLGVCRPSNSSWASPLHMTKKADGTWRPCGDYRALNAKTVPDR FT YPLPFLQDFTMHLQDKIIFSKVDLHKAYHQIPIHPDDIAKTAITTPFGLYE FT FTTMPFGLRNAAQTFQRLIHDVLRGLEFVFPYIDDMIVASTSEAEHHEHLR FT QLFERLEKHQLAINPAKCEFYRNEISFLGHLVNASGIRPLPDRVQAISELP FT QPTTIMELKKFLAMINYYRRFLPHALETQGILLEMTPGNKKKDRTPLTWSL FT EASEAFAQCKEQLKRATLLAHPVKNAELSLWTDASDFAAGAVLHQRTNEDL FT QPLGFFSKRLEKAQQKYSTYDRELTAIYLAIRHFRYQLEGREFCIYTDHKP FT LTFAFRQTHDNASPRRARQLDFIGQFSTDIRHIAGKDNVTADLLSRIETVH FT ATPTIDYERLAEEQERDPELSDILSGKIQTDLFLQKTPIPGSPKSLYADCP FT GGIIRPYITRSFRTQLLHAVHDLSHPGARATARLITERFVWLNARKESQDF FT ARNCLACQRAKVGRHVKSPLIPYPATTARFSHINVDIIGPFPISNGNRYCL FT TIIDRFTRWPEAIPISDITASTVVSALLFHWIARFGVPAHVTTDQGRQFES FT SLFKELTKALGTKHIRTTAYHPQANGIIERWHRTLKAAITCKDTARWSEHL FT PLILLGLRTTFKNDINASPAELVYGTTLTIPAEFFIAKPQNALADQSDFAK FT TLEETMSSIRPQSTAWHTNRTPFVHSDLNKCTHVFIRDDTVRPALTTPYHG FT PYKVLTRNPKSFQILLRGQPTLVSIDRLKPAYGAEEEATPAPQCSWEGLTT FT NLLPPTTDHSETLPLPDVQANSDRRDATAASKPTSREQPVRNQTTPAPPSH FT PTTSRQTDRAAVDAPPPSILRRNDQTVSTGVTRSQRKVIIPLRYR" XX SQ Sequence 5178 BP; 1352 A; 1631 C; 1174 G; 1021 T; 0 other; actggtgacc ccgacgtgat cgcgtgcgcg agtgagtgag tggtaacctg acgaacaccg 60 tgtccagccg agaaaaaacg tgtttccatt gttccacggt ccggaccgac ggcaacgttc 120 ccccccatca tcgaggagcg gccgaccacg aaggaggcac cacgcaagcg cagccagcga 180 aaaaaaaccc cgtgcacaaa ccccgaaccc acgtgagtgc aaatcgacac cgaaggtggc 240 cgacagtgag gaacactgtt 
caggaacatt tttacccgac ggagcgaccg atcctagcgg 300 aaaagtttcc tctcggtgct gagcgatcgc cgaacatttt gctgacacac cccgcgccgt 360 gtcgcacacc cgccgagcat tttggtaccc gtacgtgttt gcgcacccgc cgatcataac 420 ctcacacgta ccgccgagcg cgctccagac ccacgcggtt tttgtgtgtg caccgtgtgt 480 gtgtgtgtgt gtgtgggtga atgtgcgcag gccgacgccg agcggattgc gtcagaattt 540 tgctcgagct acgttcgtca tttttttcga ccgtgcaccg aagacgtcgt cagcgcacgc 600 agccatcgtt ctcttctcgc cgacaccacc gaccgaacgc caccgaagat catcgcccct 660 cgtttctcac accaccggcg tcatcgacga acgcagccaa cgagcgacta atcctaacac 720 gatcgaccgc gtgtgcggat ttttcgtcgc cgaaggatcg acctagccaa cctccagctg 780 gacttgcttg cgcccccgcc actaaggtaa gatccaccct tttttaacta accttagtcg 840 taaggatgtt gcacagtccg ccggtccgcg acgtatcgac tcccgatggc gtaaccccga 900 gtgccgatcc agccgcgagt ggatccaaat cgcctcacgt accaacaccg cccgttccga 960 ataccccgcg cgtaccaggg ccgtccgcct gcgacgccat gtttatgccg cccgaatcgc 1020 agattgacac tttgaatgcc atgcagctga aaccaccgga gatggacacc actgacattc 1080 aaaccttttt cttcgcattg gaaaactggt tcgatgcgtg gaatatcacc acgaaccaac 1140 atattcgccg ttttaacatt cttagaacgc gtataccgct tcgtgtcctt cctgagcttc 1200 gccccctgtt ggagaacatt cgacagtacg ctacggaccg ttacgaggta gcaaagcgtg 1260 caataattga gcactttgaa gagtcgcaac gaagccgctt gcatcgtctg cttgccgaaa 1320 tgaacctcgg ggaccgaaaa ccatcgcagc tattagcgga gatgcgccgc gccgcaaatg 1380 gagcaatgac ggactctatg ctggtagatt tgtggatcgg ccgtctcccg ccatacgtcc 1440 agtccgccgt tattgccact aacacggata ccaacgatcg agctaaagta gcagactctg 1500 ttatggattc gttcgcgtta taccaccgaa cgggcccgta ccaaaccatc cacgaagtac 1560 gcaacgagga cttcgaacgt ctttctcggc acgtaacgga attaggtcag cgcttggacg 1620 ccgtactgag caagctcaac gaacgagaac gcgcgcgacc acgctcacgt acccggcaac 1680 gtcaaccgaa ccaggatgcg gtaacaccca gcggacactg ctattaccac acgcagtacg 1740 ggcaagcagc gcggaactgt cgtgccccct gctccttcaa caatcggcgg cagggtagta 1800 actcggccac tgcttccgat tgacgcttaa ccagaggcca acctcaacag atacacgtac 1860 tttcgaccca tagctatcgt ctcgtaataa ccgatccaaa aactaacatc aaattcttaa 1920 tcgataccgg tgcagacgtt tcagtaatcc ctcgacaaca 
cagttccgtc ccgagtaaac 1980 cctccaccat gaagctgttc gccgctaatt ctacaccaat ccaggtttac ggagagtcgc 2040 tctatactct cgatttggga cttcgccgat ctttcctttg gaacttcatc atcgcagacg 2100 tggggacagc gattattgga gccgattttc tccaacattt ccatctgctc gtggacttgc 2160 gcaaaaaatg tcttgtcgac gccttaacga acgtacgttc taccggagtg ccgagccaaa 2220 acccgtcgga accaaccgta aaagtatgtg attccacctc accgatcgcc actctcctaa 2280 aggaatttcc cgggttaact gcactatcca ctcctggcac cttactgcag tccgaagtga 2340 cgcaccgaat cgaaacgacg gggcaaccaa cattcgcaag acctcgccga ttaccacccg 2400 aaaagtacgc agctgcccgc aaagagttcg aatcactcgt ccagctcgga gtgtgccgcc 2460 cctcgaatag cagctgggcc agcccgctac atatgacaaa aaaggccgac ggcacctggc 2520 gcccttgtgg tgattaccgc gccctaaatg caaaaaccgt acccgaccgt tatccactac 2580 cgtttttaca ggacttcacg atgcatttgc aagacaagat catattttcc aaggtcgatt 2640 tgcacaaagc ataccaccag ataccaattc atccggatga tatagcgaag acagccatca 2700 cgacaccctt tggactttac gagttcacta ccatgccttt cggattgagg aacgcagcgc 2760 aaacattcca acgccttatc catgatgtcc tacgaggact cgagtttgtt ttcccgtata 2820 tcgacgatat gatcgtagca tcaacgtccg aggcagaaca ccacgaacac ttacgccaac 2880 ttttcgaacg attggagaag caccaactag ccatcaatcc agccaagtgc gagttctacc 2940 ggaacgagat ttcctttctg ggccatctgg tcaacgcttc tggtattcgt cctctccccg 3000 atcgagtcca agccatcagc gagctgccac agccaacgac gattatggag ttgaagaagt 3060 tcctcgccat gataaactac taccgacgtt ttctgccgca cgccctggaa acgcaaggta 3120 tacttctcga gatgactcca ggtaacaaaa agaaggacag aacgccatta acctggtcgc 3180 tagaagcttc cgaagcattc gcccaatgca aagagcaact gaaacgtgca acgttattgg 3240 cacatcccgt gaagaacgcc gaactttctc tatggaccga cgcttcagat ttcgcagccg 3300 gagccgtact tcaccaacgc accaacgaag acctgcaacc actaggcttc ttctcgaaac 3360 gtctcgaaaa ggcacagcaa aagtactcga cctatgaccg agaacttacc gccatctatc 3420 tcgccatacg acacttccga taccagctag agggtcggga attctgtatt tatacagacc 3480 acaagcctct aaccttcgcc ttccgacaaa cgcacgacaa tgcctcacct cgacgagccc 3540 ggcagttaga cttcattggc cagttttcca ccgacatccg tcacatcgcc ggaaaagaca 3600 acgttacagc cgatctgctc tcccgcatag agacagtgca cgcgacaccg 
accatcgatt 3660 atgagcgatt agcagaagaa caagagcgcg accctgaact ttccgacatt ctcagtggga 3720 aaattcagac ggacttgttc ctgcagaaga caccaatacc gggaagcccc aagtcactct 3780 acgccgactg ccctggaggt atcatcagac cgtacatcac ccgatcgttt cgaacacaac 3840 ttctccacgc cgtacatgat ctcagtcatc ccggagcccg cgccacagct agactaataa 3900 cagagcgttt cgtgtggctc aatgcaagga aggaatccca ggacttcgct cggaactgct 3960 tagcctgcca gcgcgctaag gtaggaaggc acgtcaaaag ccccttgata ccgtaccctg 4020 caacaacagc gaggttcagt catatcaacg tagacatcat tggaccattt cccatcagta 4080 acggtaaccg atactgcctt acgataatcg accgatttac tcgctggcca gaagcaatac 4140 cgatctcgga tatcaccgca tctaccgtcg tatcagcact actattccac tggatcgccc 4200 gattcggagt tccggcgcac gtaacaacgg accaagggag acaattcgaa tcctccttgt 4260 tcaaagagtt gacgaaagcc ctaggaacga aacacatccg tacgacagcc tatcacccgc 4320 aggcaaatgg aataatcgag aggtggcacc gcactcttaa agcagcaatc acctgcaaag 4380 acaccgcaag atggagcgaa cacctaccgc taatactgct tgggctacga accacgttca 4440 aaaatgacat caacgcctcg ccagccgaac ttgtgtatgg aacgacgttg accatcccgg 4500 cagaattctt catcgcgaaa ccgcaaaatg ccctcgccga ccaatccgac ttcgccaaaa 4560 cgttagagga gacgatgagc agcattcgac cacagagcac cgcttggcat accaaccgca 4620 caccgttcgt gcattccgat ctgaacaagt gtactcacgt gttcatacgc gacgacaccg 4680 tccgacctgc actaactaca ccttaccacg gtccatataa ggttcttaca cgcaatccta 4740 agtcttttca gatactccta cgtggacagc caacgctggt ttcgatcgac cgcttaaaac 4800 cagcgtatgg cgcagaagag gaagccaccc cggccccgca gtgctcgtgg gaagggctaa 4860 cgacaaacct gctgccgcca acaaccgacc actcggaaac tctgccgtta ccggacgtcc 4920 aggcaaattc ggaccgcaga gacgccaccg cagcctccaa accgacgtcg cgcgaacaac 4980 cagtgcgtaa tcagacgaca cccgcaccac catcgcaccc gacgacatcg agacaaaccg 5040 accgagccgc cgtcgacgcc ccaccaccct ccatcctacg ccgcaacgac cagacggtat 5100 cgaccggcgt caccaggtct cagcggaagg tcatcatacc tctacgttac cggtgacacc 5160 gctctaggag gggagtac 5178 // ID QUETZAL repbase; DNA; ANG; 1680 BP. XX AC L76231; XX DT 21-AUG-1997 (Rel. 2.07, Created) DT 01-JUL-2005 (Rel. 
2.07, Last updated, Version 2) XX DE Quetzal, a DNA transposon of the Tc1 superfamily. XX KW Harbinger; DNA transposon; Transposable Element; TIR; KW Tc1 superfamily; Quetzal. XX OS Anopheles albimanus OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-1680 RA Ke Z., Grossman G.L., Cornel A.J. and Collins F.H.; RT "Quetzal: a transposon of the Tc1 family in the mosquito RT Anopheles albimanus."; RL Genetica 98(2), 141-147 (1996). XX DR GenBank; L76231; Positions 1 1680. XX CC TA target-site duplication. CC The closest known element to Quetzal is the Uhu transposon CC from D. heteroneura. CC It has 236-bp TIRs (positions 1-236 and 1445-1680). CC The transposase is encoded by the sequence 373-1398. XX FH Key Location/Qualifiers FT CDS 373..1395 FT /product="QUETZAL_1p" FT /translation="MTREELSVSKRQDIIRLHGAQGKSYTEIAMLTNINRN FT TVARVIQRYKYEGRVSNLPRKGRPSVCTDRMRRAIKRLVDAEPEISAQSVA FT IVLNERHGIAISCETVRRYIHKFGYKAYNRRKKPQISPINRKRRLEFAKKY FT VNHPPEFWKKVLFTDESKFNIFGWDGTIKVWRPPGEGLNPKYTAKTVKHNG FT GGVLVWGCMAANGVGNLQVIDGIMDQYVYINILKQNLGPSLEKLGMSQDYW FT FQQDNDPKHTAFNSRLFLLYNTPHQLKSPPQSPDLNPIEHAWELLERKIRQ FT TRIKNRVDLENKLKEAWITISEDYTQNLVNSMPRRLAEVIKMKGYATRY" XX SQ Sequence 1680 BP; 561 A; 320 C; 355 G; 444 T; 0 other; cacttctcca caaaagtgaa tacacagcaa acagttttag gaataaatgc ctctagttgt 60 gcatagaccg aattaaaaat atgggaaaat tatcaattat gctttgacct atttgcaatc 120 gattgacgct tggtacgatg tccgtccgat gttgaagttc cgagaaattc tcggaaaact 180 gctgggatgc ctcaaaatcg ttccacaaaa gtaagtacac agcacaggtg ttcgttttgt 240 tttgaatcgt agataaactt ttaattgatg cgttaatgat cgattgggtg caatatgctc 300 agtcgttcag atagtttcga ggtgaacagt ttttagcttc caaatcgact gcagtttacc 360 ataattccaa aaatgaccag agaagaactt tctgtctcta aaagacaaga tattataaga 420 ttgcacggcg ctcagggcaa aagctacaca gaaattgcaa tgttaacaaa cattaataga 480 aatactgtcg ctagggtcat ccagcggtac aaatacgagg gccgtgtatc taatttacct 540 agaaagggtc 
ggccctcggt gtgcactgat cgtatgcgac gggcgataaa acgattggtg 600 gatgctgaac cagaaatcag tgctcaatct gtagctatag tacttaacga aaggcacggt 660 attgccattt catgtgagac agtgcggcgg tacattcata aatttggcta caaggcttac 720 aacaggcgca aaaaacctca gatcagccct atcaatcgga aacggcgatt agaatttgcg 780 aaaaaatacg ttaaccaccc acccgagttt tggaaaaaag ttttatttac agacgagagt 840 aaatttaaca ttttcgggtg ggatggcaca ataaaggttt ggcggccacc cggagaaggc 900 ctgaacccta aatacacagc caagacggta aaacataacg gagggggtgt gctagtttgg 960 gggtgtatgg cggcaaatgg tgttggaaat ttgcaagtta tagatggaat tatggaccaa 1020 tatgtttata tcaacatttt aaagcaaaat ttaggaccaa gtttggaaaa attagggatg 1080 tctcaagatt attggttcca acaagacaat gatccaaaac acacggcatt caattcacgg 1140 ctatttttgt tgtacaacac tccccaccag ctaaaatcac cgccccaaag tcccgacttg 1200 aacccaatag aacatgcttg ggaattactt gaacgaaaaa ttcgtcaaac acgaattaaa 1260 aaccgtgtcg atctagaaaa caaattaaaa gaagcgtgga tcacaatttc tgaagattat 1320 acgcaaaatt tggtaaattc aatgccacga aggttggcag aagttataaa aatgaaaggg 1380 tatgctaccc gatattgaaa acgttacaaa atgtcgaagg acacgaaata aaaacgaaat 1440 acccaacgaa cacctgcgct gtgtacttac ttttgtggaa cgattttgag gcatctcagc 1500 agttttccaa gaatttctcg gaacttcaac atcggacggc atcgtaccaa gcgtcaatcg 1560 attgcaaata ggtcaaagca taattgataa ttttcccata tttttaattc ggttctatgc 1620 acaactagag gcatttattc ctaaaactgt ttgctgtgta ttcacttttg tggagaagtg 1680 // ID GYPSY7-LTR_AG repbase; DNA; ANG; 327 BP. XX AC . XX DT 16-JUN-2003 (Rel. 8.05, Created) DT 16-JUN-2003 (Rel. 8.05, Last updated, Version 1) XX DE GYPSY7-LTR_AG is an LTR of the GYPSY7_AG LTR retrotransposon - a DE consensus sequence. XX KW Gypsy; LTR Retrotransposon; Transposable Element; 4-bp TSD; KW GYPSY7-I_AG; GYPSY7-LTR_AG; GYPSY7_AG; Gypsy clade; Gypsy group. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-327 RA Kapitonov V.V. 
and Jurka J.; RT "GYPSY7_AG, a family of LTR retrotransposons from African malaria RT mosquito."; RL Repbase Reports 3(5), 88-88 (2003). XX DR [1] (Consensus) XX CC GYPSY7-LTR_AG is a long terminal repeat of GYPSY7_AG (its internal CC portion is deposited as GYPSY7-I_AG). The A. gambiae genome harbors CC ~30 copies of GYPSY7-LTR_AG. XX SQ Sequence 327 BP; 101 A; 79 C; 78 G; 69 T; 0 other; agttagaccg acccgctaga aaccagactg caagctggca ttccacaccg gctacgagca 60 gacgaagatg taacgctaca ccggccacga gcggacaacg gcgacatgcg caagtaatgc 120 gacacgcaga ccgatcgtga gaacggacca atgcagccag caaccagcgg cctcgttaga 180 acattagctc gttaggttta gtcagtcgaa gtcgaagtct agaatcagcc agatatagtt 240 catagtttag ctttagtcag gagtaatcct gtttgtgtaa aataaaaatc tttttttatg 300 gccaaccggc ctagataaag attaact 327 // ID AGM1 repbase; DNA; ANG; 5983 BP. XX AC AF060859; XX DT 27-JUL-1999 (Rel. 4.06, Created) DT 01-JUL-2005 (Rel. 4.06, Last updated, Version 2) XX DE Anopheles gambiae Moose LTR retrotransposon, complete sequence. XX KW LTR Retrotransposon; Transposable Element; AGM1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5983 RA Biessmann H., Walter F.M., Chuan S., Le D. and Yao G.J.; RT "AGM1."; RL Direct Submission to GenBank (29-JUL-1998) Developmental Biology RL Center, University of California, Irvine, CA 92697, USA. XX RN [2] RP 1-5983 RA Biessmann H., Walter M.F., Le D., Chuan S. and Yao J.G.; RT "Moose, a new family of LTR-retrotransposons in the mosquito RT Anopheles gambiae."; RL Insect Mol Biol 8(2), 201-212 (1999). XX DR GenBank; AF060859; Positions 1 5983. 
XX FH Key Location/Qualifiers FT CDS 836..1960 FT /product="AGM1_1p" FT /note="ORF1" FT /translation="MKLPVIQLPEFGGDFNDWLPFHDTFVSLIDKSDELSG FT VQKLHYLKAALKGEAARLMSQFSLQMRITKCMANVGRPLWHKHLLKKRHIQ FT AILRLPKIINSNLDLLRRTVDDFQRHTLVLEQLGEPIKHLSSFLVELLSEK FT LDSASLAAREEAQADKSYTYSDMVEFLRKRVRLLETLANDTGETSKRQPRV FT KVSVNTAAAAEKKVDMCVVCGKQGHTIVNCRRFNEFDAKKRHEVVRQHKLC FT WNCLQGSHFVTSCTSRYGCQTCGKRHHTLLHAERSSSVIADDSVGSVSTMV FT LANIPMQCNSTDRSSYSNVMLTTVVLFVVDANGTQHPVRALLDNGAQPNAI FT SERLSQLLCLRVCVPMYPLQVWMERRLRRHVK" FT CDS 1960..5607 FT /product="AGM1_2p" FT /note="ORF2" FT /translation="MKVEIRSRFTQFALKLNFLVLSKVTANTPATSFSTSC FT WKLPAGLALADPEFHQSGRVDMLIGASHFYTFLREGRLKLSEHGPLLVETV FT FGWVVTGEVLREEAIIQQQAAQCHVMLSSENISDQLERFWKIEELHVSHFS FT ADEQRCEAYYEQTVSRDETGRYIVKLPKHQQHSTMIGKSETTSLKRFAGLE FT RKLFANSQLRQQYNEFMLEYIQLGHMVPVSPDNLDAATCCYLPHHPVFKET FT SSTTKMRVVFDGSAPTSTGHSLNDALLVGPVIQDDLLSLIIRFRKFQVALV FT PDLEKMYRQVLVHPEDRPLQRYGGAYELRTVTYGLAPSSFLATRTLQQLAE FT DEGDAFPTAKDTLKKQLYMDDLIAGSNSVDGAIQLREELSALAQRGGFTFR FT KWCSNSLAVLSDVPAEQLATKSSLRFDDKETISTLGICWEPEIDTFQFNIS FT ITTKSERDTMRTILSMIAELYDPLGLISPVIITAKVLMQSLWRLKLSWDDT FT VPEELQRNWIRFRAELPELKDFSIPRFAFAHQYRQAEIHCFTDASELAYGA FT CIYIRSEAEDGSIHVNLLASKSRVAPLKALTIPRLELCGALLGARLHEKVM FT AAMEIKFVAHRFWTDSTVVLDWLNAESKTWKTFVANRVAEIQAIRDAVWQH FT VSGQENPADLISRGVLPHQLINNQLWKQGPQWLSERKENWPQQKERTGQIT FT TDEIRSNVVLTTQIQEKNEIFTRYGSYQKLIDVVAYCFRFVHNARRLQSRI FT SNSALTVKELADAKKRLVKLVQAEEFTNDLYKIHKGIPVARNSTLKLLNPF FT IDNEGIIRVGGRLRNSDLNYNIKHQIVLPGFHPFTQLLIMDKHVKAMHGGI FT SSTLNAVRDEIWPINGKRAVRKVIRNCFRCCRANPQPIIQPEGQLPAERVT FT VNEVFSCTGLDYCGPLYLRPTHRKAAPNKCYICVFVCMSTKAVHLELVGDL FT STNSFLMALDRFVYRRGKPKHIYSDNGTNFIGAKNELHQIYKMLFNDSADS FT KIAKHLAKEEIQWHLIPPRAPNFGGLWEAAVKVAKTHLIRQLGSSRLSSEE FT MTTVLVKIEGCMNSRPLVPLSEDPNDLTALTPAHFHITNNLKVILEPDLKE FT VPMNRLGRYQLLHGYTQNFWIHWKQDYLKNLTVLHRSAKQSKQLSVGDIVI FT LKDEQLPAVQWPLARVVEIHPGADGISRVATLRTASGIVKRAVSKICPLQC FT SNQRMD" XX SQ Sequence 5983 BP; 1808 A; 1152 C; 1427 G; 1596 T; 0 other; tcgttgctaa ccaaaaccat ccaacgaaaa 
acacattcaa attgttcgat ctcaaatgca 60 tgcggttcta ccacagtgcc ggattatggc agctgacgtt ttgtcattct tgtgggtcta 120 gggacgtacc gctgtgaaag gggaaccgcc tttttgtggc aattgtcacg agccgaagaa 180 aagcgaagca cgcataaatg gtgtgtgaat aaagaagtga acttacttgc caacaaagaa 240 caaccgactg attatttcgt gtaaaaaaaa agtaaaaata tcacaacgta gggactttca 300 cggaacattt tggtgcaagt gaccaggata agtgcctgat tatcggtgtc aattgtggta 360 atcataatta gtgcaaaagt tctcgcttgt tctttgctgt gtgcgtgtga gtgtgtgctt 420 ggagctggct aggattttgt gtgcgtgtgc gcgtgtgcaa cgtattaaac agtgtgttta 480 acgcaagtgt aacaatggcg acggcgcgag aagacaaaat tcgtggcaaa gagcaaaaaa 540 gaaaaaacat tatagattcg atgcggcgaa tcgatttatt cgtacaaacg tacactgtgg 600 acaagattca tgaggtgacc acaaggctcg agaggctcga gaaagtgtgg tatggttttg 660 aagaagttca agaagagtta gacaagctaa cccttgaggg agacaacgct gaaaatgaaa 720 agacgcgtgc agaaatggag gagctataca tgagtgttag atccaatcgc tacggctgaa 780 gccatcgtct aatccaatag tcagcgacgt taaaccggta cccattgtgt cccaaatgaa 840 attaccagta atacaattgc ctgagtttgg aggtgatttt aatgattggt taccatttca 900 tgacacgttt gtgtccctga ttgataaatc agatgagctt tctggtgtgc aaaagctaca 960 ttatttgaaa gctgcgctca aaggtgaggc agcacggcta atgagccaat tctcactaca 1020 gatgagaatt acaaaatgca tggcaaatgt tggtagaccg ctatggcata aacatttgct 1080 taagaaacgt catatccaag ccattttgag gctaccgaag attatcaata gcaatctaga 1140 tttattgcga cgcactgtag atgatttcca gcgacacaca ttggtgttag aacaactcgg 1200 tgaaccaata aaacatctta gctcattcct tgttgagctt cttagtgaga aattggatag 1260 tgcttctctt gcggcgaggg aagaagcaca agcggataaa agttacacat acagtgacat 1320 ggttgagttt ctgcgtaagc gtgtgcgttt gttggaaacg cttgctaacg atacgggtga 1380 aacgagtaag cgacaaccgc gtgtaaaagt gagtgtgaac actgctgcag cggccgagaa 1440 aaaagtggat atgtgtgttg tgtgtggaaa gcaggggcat acaatagtga attgtcggcg 1500 tttcaatgaa ttcgatgcaa agaagcgcca cgaagttgtg aggcagcaca aattgtgttg 1560 gaattgcttg cagggcagcc attttgtaac cagttgtacg tcaaggtatg ggtgtcaaac 1620 gtgtggcaaa cggcatcata ccttgctgca tgctgaacga agtagtagtg taatagcaga 1680 tgattcggtg ggttctgtta gtacgatggt gcttgctaat attccaatgc 
agtgcaactc 1740 tactgaccga agctcatatt ctaatgttat gctgacaacg gtggtgttgt ttgtagttga 1800 cgcaaatggt acgcaacatc cagtacgtgc gttgttggat aacggtgcgc agcccaatgc 1860 gataagtgag cgattgagtc agcttttgtg cttgcgcgta tgcgtaccca tgtatccatt 1920 acaggtgtgg atggaacgac gactcaggcg tcatgtgaaa tgaaggtgga aatccgttcc 1980 agatttacgc aatttgctct gaaactaaat ttcttggttt tgagcaaggt aacagcaaac 2040 actccagcca catctttctc cacatcatgt tggaaattac ctgctgggtt ggcgctcgca 2100 gacccagaat tccatcagtc tggacgagtg gatatgctaa tcggtgcatc gcatttctac 2160 acgtttctga gggaaggccg gctcaaactt agtgaacatg gtccattgtt agttgaaaca 2220 gtgttcggtt gggtggtaac aggtgaagtg ctccgagaag aagctataat tcagcaacaa 2280 gcagctcagt gtcatgttat gttatcgtcg gaaaacatta gcgatcaact tgagcggttt 2340 tggaagatcg aagagctgca tgtttcgcat ttctcggcgg atgagcagag atgcgaagct 2400 tattatgagc aaacggtatc acgagacgaa acaggcagat atatcgtgaa actgcctaaa 2460 catcagcaac attctaccat gattggaaaa tcagaaacga catcacttaa gaggtttgct 2520 ggattagaac gtaagttatt tgctaactca caacttcgtc agcagtacaa tgagtttatg 2580 ttggagtaca tacaactcgg tcatatggtt cctgtgtcgc ctgacaacct ggatgcagca 2640 acctgctgct accttccaca tcaccctgtg tttaaggaga caagttccac gactaaaatg 2700 agagtggtat ttgatgggtc agcaccaact agcacaggac actctctgaa tgatgcgcta 2760 ttggttggac cagtcatcca agatgatctt ttgagcttaa taatccgttt tcgtaaattt 2820 caggtcgcgc tagttccgga cttggaaaaa atgtatcgtc aggtacttgt gcatccggag 2880 gatcgaccgt tacagagata tggtggggcg tatgagttgc ggacagtaac atacggtttg 2940 gctccttcat ctttcttagc tacaagaaca cttcagcagt tggctgaaga tgagggtgac 3000 gcgtttccta ctgccaagga tacgctgaaa aaacaactgt acatggatga ccttattgct 3060 gggtcaaata gtgttgatgg agctatacag ctacgtgagg aattgagtgc actggcgcag 3120 agaggcggtt ttacttttcg aaaatggtgt tcgaattcat tagccgtttt atctgacgtt 3180 cccgctgaac aattggcaac aaaatcatcg ttaaggttcg acgacaagga gacaattagt 3240 acgcttggta tatgttggga accagaaatt gatacgttcc agttcaacat ttctattact 3300 acgaaatcag agagagacac catgcgtacg atactatcaa tgattgctga actgtatgat 3360 ccattgggat tgatttcacc tgtaattatt acagccaaag ttttgatgca atcgctctgg 
3420 cgtttgaagt tgagttggga tgatacggtg cctgaggaac ttcaaaggaa ttggattaga 3480 tttcgagcag aattaccaga acttaaagat tttagtatcc ctagattcgc tttcgctcat 3540 caatatcggc aagcagaaat acattgtttc acagacgcgt cagagcttgc atatggcgcg 3600 tgcatctata ttcggtctga agctgaggat ggaagcattc atgttaattt gctagcatca 3660 aaatcgagag tggctccact gaaggcattg acgattccta gacttgaact ttgcggtgca 3720 ttattaggag ctcgattgca tgagaaggta atggctgcaa tggagattaa gttcgttgct 3780 catcgatttt ggaccgattc taccgtggtg ctagattggc taaatgctga atcaaagact 3840 tggaaaacat tcgtagcaaa ccgagttgct gagatccaag caattcgaga tgctgtttgg 3900 caacatgtat ctggacaaga aaatccagct gaccttattt cacgcggagt tttaccgcat 3960 cagctcatca acaatcaatt gtggaaacaa ggtccacaat ggttatcaga gagaaaggag 4020 aattggcccc aacaaaaaga gagaacaggt caaattacaa cagacgaaat aagatcgaat 4080 gtcgttttaa caacgcaaat acaagaaaaa aatgaaatat ttacaagata tgggtcatac 4140 cagaagctaa tcgacgtggt tgcatattgt tttcgttttg ttcataatgc tcgtcgtctg 4200 caatcaagaa taagcaatag tgcattgacg gtaaaggaac ttgctgatgc taaaaagcga 4260 cttgtaaaac ttgtgcaagc tgaagaattt accaacgatc tttataaaat ccacaagggg 4320 attcctgttg ctcgaaattc aacgttgaag cttttaaacc cgtttataga taatgaagga 4380 ataatccgcg taggtggccg gctcagaaat tctgatttga attataatat taagcatcaa 4440 atagttcttc ctggattcca tccatttact caactcctca tcatggacaa acatgtaaaa 4500 gcaatgcatg gaggaatatc atcaactctt aacgcagtta gagatgagat ttggccaatc 4560 aatggaaaaa gagctgtacg taaggtcata cgaaattgtt ttcgatgttg cagagcaaat 4620 ccacaaccta taattcaacc tgaagggcaa ctaccagcag aacgtgttac agttaacgag 4680 gtgttcagtt gtacgggtct agattattgt ggacctttgt atttgaggcc aacacaccgc 4740 aaggccgcac caaataagtg ctacatttgt gtatttgtct gcatgagtac gaaagcagta 4800 catttggaat tagtcggaga tttgagcaca aattcgtttc tgatggcact tgatcgcttt 4860 gtttatcggc ggggtaaacc aaagcacatt tattcagaca atggtaccaa ttttattggt 4920 gccaaaaatg aacttcatca gatctacaaa atgctgttca acgattctgc agatagcaaa 4980 atagcaaaac atttggcaaa agaagagata caatggcatt tgataccccc acgcgcccca 5040 aatttcggag gcctttggga agcggcggtg aaggtagcca aaacccattt gattcgtcaa 5100 
ctaggatcat cgcgtttatc atcggaggaa atgactactg ttttagtaaa aatagaaggt 5160 tgcatgaact cgcgaccatt agttccgctt tctgaagatc ccaatgattt gacggcatta 5220 actccagcgc actttcacat tacaaacaat ttgaaggtta ttctcgaacc tgacttgaaa 5280 gaggtgccta tgaatcgtct gggaagatac caactccttc acgggtacac gcaaaacttc 5340 tggatacact ggaaacagga ttatctgaaa aatcttactg ttttgcatcg gtcagctaag 5400 caatctaagc aattatctgt cggagatata gttattctga aggatgagca gcttccagca 5460 gttcaatggc cattggcacg tgtcgtcgaa atacaccctg gagctgatgg aatttcccga 5520 gttgcaacac ttcgtacggc atccggtatt gtgaagcgag cagtatctaa aatctgtccg 5580 ttacaatgca gtaatcaaag aatggattga aaactgagtg tttcaaggtg gccggtatgt 5640 tcgatctcaa atgacctgcg gttctaccac agtgccggat tatggcagct gacgttttgt 5700 cattcttgtg ggtctaggga gctaccgctg tgaaagggga accgcctttt tgtggcactt 5760 gtcacgagcc gaagaaaagc gacgacgcat aaatggtgtg tgaataaaga agtgaactta 5820 cttgccaaca aagaacaacc gactgattat ttcgtgtaaa aaaaaagtaa aaatatcaca 5880 acgtagggac tttcacggaa cacaaattaa ccagcaatca aatcaggtga gacagaaaac 5940 ctaccatgca ataaatgctt agcacacaca aaccgtagac agg 5983 // ID IKIRARA1 repbase; DNA; ANG; 610 BP. XX AC U55049; XX DT 15-SEP-1998 (Rel. 3.08, Created) DT 15-SEP-1998 (Rel. 3.08, Last updated, Version 1) XX DE Anopheles gambiae transposon Ikirara1. XX KW IKIRARA1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-610 RA Romans A.P., Bhattacharyya K.R. and Colavita C.A.; RT "Ikirara, a novel transposon family from the malaria vector RT mosquito, Anopheles gambiae."; RL Insect Mol. Biol0-0 (1998)In press. XX RN [2] RP 1-610 RA Romans A.P.; RT "IKIRARA1."; RL Direct Submission to Repbase Update (15-APR-1996)Zoology, RL University of Toronto, 25 Harbord Street, Toronto, Ontario M5S RL 3G5, Canada. XX DR GenBank; U55049; Positions 1 610. 
XX SQ Sequence 610 BP; 181 A; 122 C; 123 G; 184 T; 0 other; cagggtttca cactttatct caagaccgcg ggaccccttc ccgaatctgt cttagccaaa 60 gccaagactg cggtatgatg cttgaaaacg ttgatgatac ctgaattcat cggatgcaat 120 gctgaatctg cggcacattt ctcgaaacac actggtccgc gccgaggaca tgcggtcgaa 180 tattttttta cacggataac ctttgtgtac atcagtgtac tgaaaacagt taattatttt 240 ttcgtgtttt taccagaaaa gattatttaa acagttcaaa ctaatttcta aacacttgta 300 aatattttaa ttaaacaaat atgcattcac tcattttatt aaattgtctt tttttggtaa 360 aaaaacgaaa aggataattg agtgtttcta atagactgat gtacacaaag gttatccgtg 420 taaaaaaata ttcgaccgca tgtcctcggc gcggaccagt gtgtttcgag aaatgtgccg 480 cagattcagc attgcatccg atgaattcag gtatcatcaa cgttttcaag catcataccg 540 cagtcttggc tttggctaag acagattcgg gaaggggtcc cgcggtcttg agataaagtg 600 tgaaaccctg 610 // ID Waldo1_AG repbase; DNA; ANG; 5580 BP. XX AC AB090814; XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 1) XX DE Anopheles gambiae retrotransposon Waldo1_AG DNA, complete DE sequence. XX KW Non-LTR Retrotransposon; Transposable Element; gag; KW reverse-transcriptase; Waldo1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol. Biol. Evol 20(3), 351-361 (2003). XX DR Genbank; AB090814; Positions 1 5580. 
XX SQ Sequence 5580 BP; 1512 A; 1273 C; 1549 G; 1246 T; 0 other; gttgagtgat gagttggagt tgagtcatga aggacgtgct tttgctgtcg cttctgatcg 60 aagctgtttt tatagtgttt tagtgcgatt cgctggtgct tttcctggac aaggaatttg 120 gttctgcctg tggtttccgt gcgtgtagcg ggttctggcc gcgggccttt ttctgctgag 180 gggctgaagc gcggaacgtg ctaatcttgt gcaacgcggg tgaagccgcc actatagaac 240 aaggtcgtgt tagaaggaag tgcggttgtt tgtagtggtg ttgccgtgga aagtatctcg 300 tgcatttagg atacacactg taccacttaa atattgattg tgcgcgtgaa gtatagttcg 360 gtattacggg tgcccctacg cacacttgag tttagcgagt agtgtaattg tgctggtgaa 420 gctagcagca cccagtgcgt gagtgactcg aacgtaagtt ctgtgagttt cagtgactta 480 tcgttgcgtt cgttgcgttg tgtgtggtat ccaaacacca ccggtatact gtgtgcgtag 540 tgagtgaatt acgatcatcg agtgcgtgag tgcctttgag tttacgtgcg gaagaagcat 600 taagatgcgc atacacgtac cgtgcgcggt tagcgtctta gggtgcgcat acactcaaag 660 aactaaatgc gctgcgtgtc tggattcgcc tgacggaatg aacggaaacg aatcgcttca 720 tcctcgtccg ctagggtctg cccttaagga cataggcgcc ttttttggtc gtagcagcaa 780 gacgccaaga tcgcccccga atgataatgt gcagggttct gtttcaccgg cggtggatgt 840 cgttgaagtt atgccggaag agcagacttc agccagcatg gagtgccagg aaacttccca 900 tcctatcaag gaacagggtt tcgaggttag tgccagcaaa ctgcaagaag ctctgatggt 960 agcaagggag ctgcatacgt atacgaagga tcggaacaat gtacatgctc cgataaaaaa 1020 gatgtcagtg agcatcctct cggcgttatc gtgtatcgaa cgggagctgc tgactatgaa 1080 gttgcgagcg gagagggctg aaaaggcgct tcgcgaggtc caatcggaac ctccagaaac 1140 acctatgact gggaagagaa gtaggaaagc gagaacgcca gaggaagcag aggacgctaa 1200 acgagcaaaa aacgatgctc cctcttgtaa ccgtccagat gcagagtaca gcgaaggggt 1260 taaaaactca gagaacggtg agctatggag cacagttgtc agcaaaaagg cccaacgcaa 1320 gaagaagatg ggaaccatgg cggagggtaa gcaaacaaga gcgggtgaac acaacggccc 1380 agttaaaccc gtaccgcgac gaccaaaaac ggaggcaatc ctagttgaga caactgaagt 1440 atcaacgcac aaagacatcc tccgcaagct taaagctgac cctgagctac agtcgttcgg 1500 caaacaagtt gttcgaataa gaagcacaaa aaatggagga ttgctatttg agcttaagaa 1560 aagtgatcaa acggaatgcg aaagcttttc cggaaagatt caacaagcca ttggtgaagc 1620 tggcaacgta aagtctttgg gacaaatgga 
gacagtagaa attcgtttca tcgacgaaga 1680 aacggaagca gctgacgtag agagggatct gagaaaccag ataactggcc tcgaaggcta 1740 taaggtggag gtaacgatga agacttcctt ctctggtatg cagaccgctc tagtgaaact 1800 cccggtaaag ctggtgtcgg tggttacagg agcaggaaaa gtgcaaatcg gttggtcagt 1860 ctgcccggtg cgtataaata taccgagcag aagatgttat cgctgctggc aaaccgacca 1920 catttctcag gactgttgtg gaccagacag aagggactgc tgcctgcgtg gcggggaaaa 1980 agggcacttc gctgctacgt gtcgtctgcc accacgatgt gtgctttgtc cagatggatc 2040 caacgcgcat cactctagcg gagcattctg tccggcggct aagaaaacag caccatggaa 2100 gtagcccaga taaacctaaa ccattgtgaa gaggcacagg cactactgag ccaggtgatg 2160 gtggaggaga taggagatat tgccatagtc tctgagccat acagcgctcc aacaggctct 2220 agtagctggg tggcagataa gactgggaac gctgcaatat gggtgacagg cacaatacaa 2280 cgggtagtat ctaacacctt cgagggtttc tgcatagccg aagtaaatgg agtgtttttc 2340 tgcagctgtt atgctccccc aagttgggag ctagagaggt tccatgttat gttagacaac 2400 ctcgtagcag agctggatgg gcatagacca cttgtaattg ccggtgattt caatgcctgg 2460 gctgttgaat gggggagcaa gcgaactaat agcagaggtg atgctgttct cgaaagcttc 2520 gcccggctgg gagtaacgct gggaaatgcc ggcacaaccc ccacgttcaa cagaaacaag 2580 aggacatcaa tcgttgatat tacgttctgc agtaccaccc tctcggagag attgaactgg 2640 cgagttagtg atgcactcac tctcagcgac cataatgttg tcagatacgc catcatccaa 2700 gggcataggc taacatcatc atcagcgcat gggtcccgag ttggtggtag aggctggaaa 2760 accgaatcct tcaatgagga tttctttaag gagcttttag ctttcggaga tttcggcgaa 2820 gccgtgagcg cctctcaaat cgtgatagcc cttaccaaag cctctgatgg cgctatgcca 2880 agaagaaaac ctcctaacgg ggcgacaagc cgtcggcagc ctgtgtactg gtggaacgcg 2940 tcgattaaaa tacaacgggc tgaatgcgta gcagcacggc gcaaaatgca gcgagaaaga 3000 tgccctgaag taaaacaaca actcaggatc gtatacattg cggctaggtc agaacttcaa 3060 agagcgatta aagccagtaa aaggcaacac ttcctcaagc tatgcgacga aatcgctcgg 3120 aagccctggg gacttgcctt caatacactc atgaataagg taaagtcttc ggaacctgta 3180 gaacagtgtc ctgtaaagct gaagagcatt attgaaactc tgtttccaac acatcctacg 3240 atcaacacgc cagagactga ccctgtgcct ataacgagac aggagattat taacctggcc 3300 aatcgtttaa aatgtggcaa tgctcctggg atagatggca 
tccccaatat ggcgatcaag 3360 gctgctatgt tggcataccc tgatgtgttc aaaaaacgtg caccaccgaa cctcagtcct 3420 acgatcaaca cgccagagac tgaccctgtg cctataacga gacaggagat tattaacctg 3480 gccaatcgtt taaaatgtgg caaagctcct ggaatagatg gcatccccaa tatggcgatc 3540 aaggctgcta tgttggcata ccctgatgtg ttcaaaaacg tgctaatgaa caccctgact 3600 accggccagt ttccgtcaat gtggaaaata cagaaattag tcttgatacc gaagccaggt 3660 aagccgccag gccacccgtc agctttcagg cccttgggac ttgtggataa cctggccaaa 3720 gtgcaagaga tggtgatttt ggaccggctt acaaaataca ctgagggacc acacggtctg 3780 tctgatcgcc aattcggctt ccggaaaaaa cgatctacag tggacgcgat actggcggtg 3840 ctggaaaaag gtattgcggc gtttcaacgg aagcgatgtg gagctcgata ctgtgcgctc 3900 atcaccattg acgtgaagaa tgccttcaac agtgctagtt gggaggaaat tgcggcagcg 3960 gtagagcgca tgaagattcc cccgcacttg tgcaggctgt cgaggaatta tcttgacggt 4020 cgcgtgctgc agtatgatac ggcggaagga gtgaaaacca atgccatccc cgcaggcgta 4080 ccccagggat cagtgcttgg tctcaccctg tggaacgtca tatatgatgg agtactgacc 4140 ctcgccctcc ctcatggtgt cgagatcact ggttttgcag acgacatcgc tatcactgtc 4200 tcagctgtgt ctattgagga ggtagagatg ctcgccaccg acgcagtcgg tcggattgac 4260 cggtcgatgc gggacgcaaa gctagtgatt gcgcatgcga agactgagtt catcgtcatc 4320 agcagtcaca aggtgcacca aaaagcatcg ataatggcag gaactgtaca agtcgagtct 4380 accagatcgc tgaagtatct tggggtagtc attgatgacc gactgaaatt taagagccac 4440 ctcgaggaag cctgcaagaa ggtcatgaaa gcaataaatg cactagcggc attcactcca 4500 aacattggcg gaccgtgtag cagcattagg cgccttcatg ctaactgcgc catctcggtg 4560 ttgaggtacg gagcgccggt ttgggcgcat atactaaagg agaagcagca ccagaacacc 4620 gtgaataagg tgcacaggaa gttggccatg cgtgttacta gcgcgtaccg taccatttcg 4680 tacgaagcgg tatgcgttat tgcgagcatg atgcctcttt gcatcaccct cgaggaggac 4740 tcgaagaact tccggaaatc gcgtgcgggt gaatctttca ctgagactgc caagaaagcc 4800 tcaaggcaag catcgatgcg gcaatggcag aacgagtgga gtaactcgtt gaacggaaga 4860 tggacctact tgctgatccc cgacgttgga gcatggctag acaggaaaca tggtgatgtg 4920 gactactttg tcacccaggt tctttccggc catggctgtt ttagaagcta tctgcacagg 4980 ttcaatcgcg cctcttcatc tcggtgccct gcgtgcaagg atgaagacga 
gacggtggac 5040 cacgtcatgt tccactgccc tcgattcgcc gaggaacgcc tgcagttgaa cgagagttgc 5100 agagtagaag tgggctgttc taacctggta caagtcatgc tgcagcacac cgactcatgg 5160 gaggcagcgg caacaacaat gcgtctgatc ctgaccaagc tgcaccaaaa atggaagcaa 5220 gaccagcagc tcggcgatat ttaaattcgt cgtgagtgta tgtgttagtg aacgcgtgag 5280 tgcgcggcgg aaaaagtgtc gtttagtgtc tagtgtttcg cgtcgtcgtc atgaccgtct 5340 tgtgttcgcg tgataactgt caaatctgtc tcgcgaactt tggctcatcg tgcagtagcg 5400 aggatgaaat gctaatgcat aagcccttcc ccaagaagca taccgaaagg tgaacccatg 5460 gggaagggta tatggcccaa ggagggggtt tactgggtaa gaatcccatg tcaacacccg 5520 tgcgacaacg ggagtctttc gaagattccc cctccttgta gaacaaaaaa aaaaaaaaaa 5580 // ID BEL17-I_AG repbase; DNA; ANG; 5596 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL17-I_AG is an internal portion of the BEL17_AG LTR DE retrotransposon - a consensus sequence. XX KW 5-bp TSD; BEL17-I_AG; BEL17-LTR_AG; BEL17_AG; Bel clade; KW LTR retrotransposon; PHD Zn-finger; integrase; peptidase; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5596 RA Kapitonov V.V., Pavlicek A., Drazkiewicz A. and Jurka J.; RT "BEL17_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 41-41 (2003). XX DR [1] (Consensus) XX CC BEL17_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL17-I_AG, an internal portion of BEL17_AG, is flanked by CC BEL17-LTR_AG LTRs. The BEL17-I_AG consensus sequence was reconstructed CC based on multiple alignment of 11 copies; they are ~2% divergent from CC the consensus sequence. CC The consensus sequence encodes a 1842-aa BEL17_AGp Bel-like CC protein (pos. 34-5559). CC BEL17_AGp is composed of the PHD Zn-finger (pos. 9-56), peptidase A16 CC (pos. 
223-353), reverse transcriptase (pos. 839-1006) and CC integrase (1540-1700) domains. XX FH Key Location/Qualifiers FT CDS 34..5559 FT /product="BEL17_AGp" FT /translation="MGPKRNDHCQACGDQRDDPDFVACDKCNLWWHFSCAG FT LTEPAEAVEQRKWLCVACQAKERSGLIQRTPVKHTEQMEAANANLNLEEVT FT QNVAEKHNLKLVEVIQQAAETPMPSSSGTSYNAVSKPNLTPAKVSDSVANR FT LAIMKRRQEAERKRMELELQLKFVKEEEDLLSEELGVMAISGASTVTPSRP FT DGESMRGSQQDERGPAPRLESFHRNVPTELPEFSGDPAEWPVFIAHYDYTT FT EKCGFSNWENMIRLQKALKGPALEVVRSRLVLPEVVPQVIATLRSRYGRPE FT HLISALIGKMRRMPAPCREKPDTVVAFGEAVRSMVDHMQAAGLRAHLTNPL FT LLQEIVERLPTSEQYSWARHIRGVTEPDLIVFGEFMTEWMDDAETLTRLDS FT PSLKAVDRKKPNTKGYVHAHVENQGATTSGTRAIQVGQSCFVCNKRGHLVS FT KCFAFGAMAVKDRWRKARALSLCFSCLERHNWRTCQNRAVCSIGGCTRRHH FT ALLHGAEESREIGSEQNNIESREGGIDGGVIAESNHHQYVSSSSKALFRIV FT PVTVYGPAATVTTFAFLDEGSSMTLVDDDLAEQLGVEGKVEPLCIRWTGNT FT TRVEAGSRRVNLKVGPVGSTKRFAIHSVRTVPGLNLPRQSFVQDEGRWQHL FT ERLPIRQYRDAEPKLLIGLDNLRLAVPLRTKEGGVGDPIAVKTRLGWCIYG FT KPANLECERLLHICECNDQGNIHETIREYFDMQLIGAAHGVEQDPDERRAK FT QILDTTTARIGKRFESGLLWRKDDIELPPSIDMARRRFNCLERRMERDGHL FT KEQVHRQIRDLLSKQYVHKATLRELEEADQRRVWYLPIGVVTNPNKPGKVR FT LIWDAAAKAHGTSLNDMLLKGPDELSSLLGVLFRFRLYAVAACADVKEMFL FT QIMIRKEDKHAQRFLWRYEPTDELETYIVDVVMFGSACSPATAQYVKNRNA FT REHMEQFPRAVEGIIESTYVDDFLDSFETEEEACQVSHDVREIFRNGGFEL FT RNWTSNSMELMRCLGEANGDIKCLSSMGDEAERVLGMRWNPASDELGFCTR FT ACTTVSDLLIAERIPTKREVLRCVMSLYDPLGLLAMFVIHGKILIQDLWRT FT GTQWDEEINDMQLRHWRRWIDLLPAIADLRIPRSYFAAASKKMYENGEWHL FT FVDASQHAYACVLYLRIFDDAGEPQCTLIGGKAKVAPLKPLTIPKLELQAC FT VLGARFLRYTQEHHPINVRRRVLWSDSTVALSWIRSDPRNYKPFVAHRVVE FT ILESTSVDEWRWVPTDHNPADEATKWKGKPNFDFGGNWFQGPEFLLHGEDD FT WPSQRHNSDNPSEEIRQVNLHVEDSNTGLLPIRYERFSRLERLQRMIGWIV FT RYVGNLRRKYRGEPILGGALRQEELYEADKILWRQTQLEYYPEEVRILSLD FT DNDGKPGGRTVSKQSHIYHLLPFVDDEGVLRMRGRIGAAADVPYSAKYPVI FT LPRGSRLAELIVERYHRLYRHANNETVTNELRQQFQIPKLRALVTKTVKNC FT VFCKIRRSLPQVPPMAPLPKERLTPFVRPFSYVGLDYFGPVLVKRGRSNEK FT RWIALFTCLTVRAIHLEVVHSLSTESCVLAVRRFVARRGAPVEIFSDNGTN FT FLGASRQLRREIEERNETLAAIFTNAHTRWTFNPPGAPHMGGVWERMVRSV FT 
KAAISTVMEAKHAPDDETFETVILDAEAMINSRPLTYVPLDPENQEAITPN FT HFLLGSSSGVKQQPVLPTNYRDSLKGNWKLAQHMLDGIWRRWIKEYLPVIS FT RQSKWFENVREIRKGDLVLVVDGTIRNQWKKGIVERIMAGPDGHIRQAWVR FT TNTGAVRRPVAKLALLDIAT" XX SQ Sequence 5596 BP; 1465 A; 1184 C; 1717 G; 1230 T; 0 other; ttaaaaagcc tttaaaacgg ttccaaaggt gaaatgggcc caaaaaggaa cgatcactgc 60 caggcttgcg gtgatcagcg ggatgatccg gattttgttg catgtgacaa gtgcaattta 120 tggtggcatt tttcgtgcgc cggattgacg gaacccgcgg aagctgtgga acagcggaaa 180 tggttgtgcg tggcttgcca ggctaaggaa cggtccggct tgattcagag gacgccggtc 240 aagcacacag aacaaatgga agcggcaaat gctaacctca atctggagga ggttacgcaa 300 aacgtggctg agaaacataa cctcaaactg gttgaggtta tccaacaagc ggcagaaaca 360 cccatgcctt cttcaagtgg gacaagctac aacgcggtga gcaaacctaa cctcactcct 420 gctaaggtat ccgacagtgt ggctaatcgg ctagccatta tgaagcggcg gcaggaggcg 480 gagaggaaac ggatggagct cgagttgcag ctgaagtttg tgaaggagga ggaggacttg 540 ttgtccgagg agttgggtgt gatggcgatt tcgggagcat cgactgtgac accgtcccgg 600 ccggatggag aatcgatgcg aggctcgcag caggatgagc gtggtcccgc cccgcgactg 660 gaaagttttc atcgaaacgt gccaaccgaa ttacccgagt tttcagggga cccggcggaa 720 tggcccgtat ttattgcgca ctacgattac acaacggaga aatgtggttt ttccaactgg 780 gaaaacatga tacggctgca gaaggcgctc aaaggacctg cgctagaagt tgtgcgaagc 840 cgtttagtgc taccggaggt ggtgccacaa gtgattgcga cgctgcgatc gcgctatggt 900 cggccggaac atctcatttc agcgctgatt gggaaaatgc gtcggatgcc tgcaccttgc 960 agggagaagc ccgatactgt tgtggcgttc ggcgaggcag tgcggagcat ggtagatcac 1020 atgcaggctg ctggtctacg ggcacacttg accaacccgt tgctgctgca agagatcgtg 1080 gaaaggttgc cgacgagcga gcagtacagc tgggcacggc acatacgagg cgtgacggaa 1140 ccggatctta tcgtgtttgg cgaattcatg acggaatgga tggacgatgc tgaaacgtta 1200 accaggctgg attcaccctc attgaaagca gtagacagga agaaacctaa caccaagggt 1260 tacgtgcatg cgcacgtgga aaaccaggga gcgaccacat cgggaacaag agccatccag 1320 gtaggacaat cttgttttgt gtgtaacaaa cggggacacc ttgtgagcaa atgcttcgcg 1380 tttggagcaa tggcggtgaa ggaccgctgg cggaaggccc gtgcactttc tttatgtttt 1440 agctgtctgg agcggcacaa ctggcggacg tgccaaaaca gagcggtttg 
cagcattggt 1500 gggtgcacgc gccgacatca tgcgttgcta cacggcgctg aggagtctcg cgagatcggc 1560 agcgagcaga ataatataga aagccgtgaa ggaggaatcg atggtggtgt cattgcggag 1620 agtaaccatc accaatacgt gtcatcatca tcgaaggcgc tatttcgaat agtgcctgtc 1680 accgtgtatg gaccggctgc tacggtgacg acgtttgcgt ttttggatga aggctcgtcc 1740 atgacgctgg tggatgacga tttggctgaa caattaggtg ttgagggcaa ggtggagcca 1800 ctttgcatcc gttggacagg caacactaca agggtcgagg ctggatccag acgagtaaac 1860 ctgaaggtgg gacctgttgg ttctacaaag cggtttgcca tccactcggt acggactgtg 1920 ccagggctaa acctacctcg acaatccttc gtgcaggacg aagggagatg gcaacatttg 1980 gagcggctac cgattcggca atatcgggat gcggaaccca agctgcttat tgggttggac 2040 aatttgcggt tggcggtccc tctcaggact aaggagggag gcgttggtga tccaatcgca 2100 gtaaagacac gccttgggtg gtgtatttac ggaaaaccgg cgaatctgga gtgtgaacgg 2160 ttgttgcata tttgcgagtg caacgaccaa ggtaacattc acgagacgat tcgggaatat 2220 ttcgacatgc agttgattgg tgctgcacac ggcgtcgaac aggatccaga tgagcggcgt 2280 gcgaagcaga ttctggacac taccacggca cgaatcggga agcgattcga atcgggatta 2340 ctgtggagga aagatgacat cgagctacct cctagcatcg acatggcgcg tcgtaggttc 2400 aattgtttgg agaggaggat ggagcgagat ggacatctta aggaacaagt acatcgccag 2460 attcgagatt tgttgagcaa gcagtacgtt cacaaggcta cgttgcgtga gctggaggag 2520 gctgaccagc gacgcgtatg gtacctacca ataggggtgg ttaccaaccc aaataaacct 2580 ggaaaggttc ggctgatttg ggacgcggca gctaaggcgc atggaacatc gttgaacgac 2640 atgctgctga agggaccaga cgagctgagc tcgttacttg gcgtactctt ccgatttcgt 2700 ctgtacgcag tggcggcgtg tgcagacgtg aaggagatgt ttttgcagat tatgatacga 2760 aaggaagaca aacacgcgca gcgtttcttg tggcgttacg aaccaacgga cgagttggaa 2820 acctacatcg tggatgttgt aatgttcggg tctgcgtgtt cacccgcaac ggcgcagtat 2880 gtcaagaacc gaaacgctcg agagcacatg gaacaattcc cacgagcggt ggagggaatt 2940 attgaaagca cttacgtaga tgactttctg gacagcttcg agacggaaga agaagcatgt 3000 caggtatccc atgacgtaag ggagattttt aggaacggcg gatttgagtt gaggaattgg 3060 acctcaaaca gtatggaatt gatgagatgc cttggcgaag caaatggcga catcaaatgt 3120 ttatcatcca tgggagatga ggcggaacga gtattgggaa tgcgatggaa ccctgcatct 
3180 gacgagcttg gattttgcac tagggcgtgc acgacggtgt ctgacctctt gatagcagag 3240 aggattccaa cgaaaagaga ggtattgcga tgtgtgatgt ctctttatga tcctcttggg 3300 ctgcttgcga tgttcgtgat ccacggcaag atcctgatac aagatctttg gcgaactggc 3360 acgcagtggg acgaagagat taacgacatg cagttgagac attggcgtag atggattgat 3420 ctgcttccgg caatagcaga cctacgcatt ccgcgcagtt actttgcggc agcatcgaag 3480 aagatgtacg agaatggcga atggcatttg tttgtcgatg caagtcagca cgcttatgca 3540 tgcgtcttat atctgaggat atttgatgat gctggagaac ctcagtgtac actcatcggt 3600 gggaaagcta aggtggcgcc actgaagcca cttactattc cgaagcttga gcttcaggct 3660 tgcgtgttgg gagcgagatt tttacgctac acgcaggagc atcatccgat taatgtgaga 3720 cgacgagtgc tttggtcgga cagcacagtt gcgttgtcgt ggataaggtc ggatccaaga 3780 aactacaaac ccttcgtggc tcatagggtg gtcgaaatac tggagagcac atcggttgac 3840 gagtggagat gggtgcctac tgaccacaac ccggcagacg aggccacgaa gtggaaggga 3900 aagccgaatt tcgactttgg tggcaactgg tttcaagggc cagagttctt actccatggg 3960 gaggatgatt ggccatcaca gaggcacaac agcgacaacc cgtcagaaga aatacgacag 4020 gtaaatcttc acgtggaaga ctctaacacc ggactattac ctatccggta tgaacgtttt 4080 agccgattgg agaggctaca aaggatgatt ggttggattg tcaggtatgt gggcaatctg 4140 agacgaaagt atcgtggcga acctatttta ggaggtgctc tacgacaaga agaactctat 4200 gaagcggata agattctgtg gaggcagaca caactcgagt actatccgga agaagttcgc 4260 attctgagtt tggacgataa cgatggaaaa cctggaggaa gaacggtatc aaagcaaagc 4320 cacatttacc atctattgcc gtttgtggac gatgaaggcg tattgcgaat gcgaggaagg 4380 ataggtgcag cggctgatgt tccttattct gctaagtatc ccgttatact gccgaggggt 4440 tcccgactgg ctgagttgat agtagagcgg tatcatcgat tgtatcgtca tgcgaacaac 4500 gaaacagtga cgaatgagtt acggcaacaa tttcagatcc cgaagctgag agcgttggta 4560 acgaagacgg tgaagaactg tgtcttctgt aagatcaggc ggtcactacc acaagtgccg 4620 ccaatggcac cgttaccaaa ggagaggctc acgccctttg ttaggccatt cagctacgtc 4680 gggctggatt actttggacc agtgttggtc aagagaggaa gatcgaacga gaagcgttgg 4740 atcgctttat tcacgtgttt gactgtgcgc gcgattcact tggaggtggt gcacagtcta 4800 tcgacggaat cgtgcgtgtt ggcggttaga cgatttgtgg ctagaagagg tgcaccggta 4860 
gaaatcttca gcgacaacgg gaccaatttc ttgggtgcta gcaggcagct gcggagggag 4920 atcgaggagc gcaatgaaac tctggcggcg attttcacga atgcgcacac ccgttggacc 4980 ttcaacccac ctggcgctcc acacatgggc ggcgtgtggg agcgtatggt acgctctgta 5040 aaagcggcga ttagcacggt gatggaggca aagcacgcac ccgacgacga gacgtttgag 5100 acagtgatct tagacgcaga ggcgatgatc aactctagac cgttgactta tgttcccttg 5160 gacccggaga accaagaggc aatcacaccg aatcatttcc tgttggggag ttcttcaggt 5220 gtaaagcagc agccagtgtt acctacgaac tatagggata gcttgaaggg aaattggaag 5280 ttagcgcagc atatgctcga cggaatatgg aggcggtgga ttaaggaata tttaccggtg 5340 atctcgcggc agagtaagtg gtttgaaaat gtgcgggaaa ttaggaaagg agatttggta 5400 ctggtagtgg acggaacaat caggaaccag tggaagaaag gaatagttga gcgaattatg 5460 gcaggacctg atggtcatat aaggcaagcg tgggtacgca ccaatacagg agcagttagg 5520 aggccagtag ctaagctagc actgctcgac atagcaactt agggtgacca aatatggttg 5580 gtcacgggcg ggggaa 5596 // ID TRANSIBN1_AG repbase; DNA; ANG; 978 BP. XX AC . XX DT 08-MAY-2003 (Rel. 8.04, Created) DT 08-MAY-2003 (Rel. 8.04, Last updated, Version 1) XX DE TRANSIBN1_AG is a TRANSIB-like DNA transposon - a consensus DE sequence. XX KW Transib; DNA transposon; Transposable Element; 5-bp TSD; KW TRANSIB superfamily; TRANSIBN1_AG; nonautonomous DNA transposon. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-978 RA Kapitonov V.V. and Jurka J.; RT "TRANSIBN1_AG: a family of target site-specific nonautonomous RT TRANSIB DNA transposons from African malaria mosquito."; RL Repbase Reports 3(4), 83-83 (2003). XX DR [1] (Consensus) XX CC TRANSIBN1_AG is a family of nonautonomous DNA transposons that CC belongs CC to the TRANSIB superfamily originally identified in Drosophila CC (see CC description of TRANSIBN1-TRANSIB4 in drorep.ref). CC TRANSIBN1_AG is characterized by a remarkable target site CC specificity. 
CC Its copies are inserted into the CCagtGG target site, and CagtG CC is CC a 5-bp target site duplication. There are ~100 copies of CC TRANSIBN1_AG CC in the genome. CC The TRANSIBN1_AG consensus sequence was reconstructed based on CC multiple alignment of 30 copies. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC TRANSIBN1_AG occurred recently (in the last 1 Myr). CC TRANSIBN1_AG has 17-bp terminal inverted repeats (2 mismatches). XX SQ Sequence 978 BP; 332 A; 167 C; 163 G; 314 T; 2 other; cacagtgggc aaccgccata caaacgccgg gatgaaaatc aattcctcgt gctattgcar 60 tttwtcttca ttcaatacaa ttgctcttac tatacagggt agtcctatac taaaatcgtc 120 aagacagcga ataaaactta ataattatcg ctcacaacat tgcattattg cgtatcagtt 180 aacagcatca ataataattg ttaggaatta aaacgaaggc ggaataagtt tctgactgaa 240 aacgaatttt taaagtatta cgcactaaaa aagttgtgtt tttcatggtt tgtttggaaa 300 agagccgatt cctatcttac gacctttttt aaactgtttt tcctctcttc agctttgttt 360 tctgcctgtt tgtttcaaat gcccgtttat gacaggtagt tggatactgg tggtgtatgg 420 ctcatacaac aacacgtcaa aatcgctcgc gctattttca agaaaaagtt taataatttt 480 cggggtcgca gatggtcgca gctaatttta acgacaatgc gtaggaaaat tgttgatctt 540 tccaatgata tacgactcac gagtgaaatc gagtcgaatc ataaaaaaaa tcactctcca 600 aaataaaaat atccaaaaat tcagcagtga tgtttggttt tcaatcattt atgaacttta 660 aaaacaagtt tttgcaaaat attaagacat aacatcaaag tatgacaaaa aacctttcca 720 acgacacatt gattatcaaa atctaaccat catatactaa aatatgatgg tttatgttcg 780 gtcgaaaaat agctcaaagt tgggacaaaa aacccaaagt ttacactttg atggcctata 840 tctcagtaag tttaagataa aaacgtgaaa tattttggtt tcaactaagt ttaagtatct 900 attttaagaa aatgattatg gtgtaaacct gcgatgaagt tggtttttcg atttttatac 960 aggcgtttgc ccaatgtg 978 // ID RETRO56_AG_LTR repbase; DNA; ANG; 141 BP. XX AC . XX DT 06-FEB-2003 (Rel. 4.1, Created) DT 06-FEB-2003 (Rel. 4.1, Last updated, Version 1) XX DE Anopheles gambiae long terminal repeat from RETRO56_AG DE retrotransposon - a consensus. 
XX KW Long terminal repeat; retrotransposon; MAG; RETRO56_AG_I; KW RETRO56_AG_LTR. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-141 RA Jurka J. and Drazkiewicz A.; RT "RETRO56_AG_LTR: LTR retrotransposon from Anopheles gambiae."; RL Repbase Reports 2(12), 12-12 (2002). XX DR [1] (Consensus) XX CC Related to MAG from Bombyx mori. 5 bp target site duplication. XX SQ Sequence 141 BP; 49 A; 27 C; 26 G; 39 T; 0 other; tgttgcatag taacacgcat aacgcagtaa cattgcaaga ctcgatcaga gtacacattg 60 agtgaataaa gacgattcca ttctgaacta aggaataaag cagttgtgtt tttctcaaga 120 tatattccct gcgacatatc a 141 // ID RT1 repbase; DNA; ANG; 8037 BP. XX AC M93690; XX DT 28-SEP-1995 (Rel. 1.08, Created) DT 28-SEP-1995 (Rel. 1.08, Last updated, Version 1) XX DE Anopheles gambiae RT1 retroposon. XX KW reverse transcriptase; Nucleic acid binding protein; retroposon; KW RT1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-8037 RA Besansky J.N., Paskewitz M.S., Mills-Hamm M.D. and Collins H.F.; RT "Distinct families of site-specific retroposons occupy identical RT positions in the rRNA genes of Anopheles gambiae."; RL Mol. Cell. Biol 12, 5102-5110 (1992). XX DR GenBank; M93690; Positions 18 8054. 
XX SQ Sequence 8037 BP; 2175 A; 1995 C; 2212 G; 1655 T; 0 other; agtgtcaaac gtgaggtcgt actgtaccga cacgttgtgt ttgggctccg aaaaaaggac 60 ttaaaaacag tgcaaaatcg tcgttaatcg actttaatag tgtgaaaaac gtgctcaggg 120 cacctgattt accctttaag agcgttatca agtgttttta aggcgaaaat tcaacaaaaa 180 tcgtgcattt gtgtttgttg cctatgaacc ccccaccatg tgtttgtgtg acatggtgta 240 aaagcagaaa atcgaaaaaa gtgcccaaaa agtgccagaa ttgcacgatt ttagtgtaaa 300 cagtgcatct gagcacgaaa aaaggtgtta agaagtgatc cgcgcatgaa aaaaccacaa 360 aattgtgaaa aaaaagtttt gcgtcatcca gcccatactg ggccaagtaa tcgccggatc 420 tgtgttttca gcgatataat ttgcccagat tctcgccggg tgtggttttt aaggccatcg 480 ggtggccaaa gaacgttcat ccagtgaaaa ccgatcgaaa atcggccgtt tgtgaccgaa 540 aaaaagtgaa ataatattta atattattcg gtgaacaaat ttacgaaaat tcggttccag 600 gtgaaattgg gtgaaattgg gtgggttgac tgtacggccc ctgatggccc ctgagttaag 660 gtgccgttac acgagtaacc atagtttagg tgaccatcga cgtgcggaaa atttttgtga 720 catcggccta ttgcgaatgc tcaccgtgtg tttgtgtgac gtggtgcaaa agaagaaatt 780 cgaaaaaagt gcccaaaatg tgcctgagtt gcactatttt agtgtaaaca gtgcatctga 840 gcacgaaaaa aggtgttaag aagtgatccg cgcatgaaaa aaaccacaaa attgtgaaaa 900 tagttttgcg tcatccagcc catactgggc caagtaatcg ccggatctct gtttgcaggg 960 atataatttg cccagattct cgccgggtgt ggtttttaag gccatcgggt ggccaaagaa 1020 cgttcatcca gtgaaaaccg atcgaaaatc ggccgtttgt gaccgaaaaa aagtgaaata 1080 atatttaata ttattcggtg aacaaattta cgaaaattcg gttccaggtg aaattgggtg 1140 ggttgactgt acggcccctg acggcccctg agttaggcat cggttgtcaa acggcctgac 1200 agctcggacg gaccgatcgt taaggtgccg ttacacgagt aaccatagtt taggtgacca 1260 tcgacgggcg gaaaatgttt gtgacatcgg cctattgcga atgctcaccg tgtgtttgtg 1320 tgacgtggtg ctaaagaaga aattcgaaaa gtgcccaaaa tgtgcctgag ttgcactatt 1380 ttagtgtaaa cagtgcatct gagcacgaaa aaaggtgtaa agaagtgatc cgcgcatgaa 1440 aaaaccacaa aattgtgaaa aaagttttgc ggtcatccag cccatactgg gccaaggttt 1500 cgccggatct ctgttttcag cgatataatt tgcccagatt ctcgccgggt gcggttttta 1560 aggccagcgg gtggccaaag aacgttcatc cagtgaaaac cgatcgaaaa tcgcccgttc 1620 tgtgaccgaa aaaagtgaaa taatatttaa 
tattattcgg ggaacaaatt tatgaaaatt 1680 cggttccggg tgaaattggg tgggttgact gtacggcccc tgacggaccg attgttaagg 1740 tgccgtcaca cgagtaacca tagtttaggt gaccatcgac gtgcagaaaa tttttgtttt 1800 tttttgtgac atctgccaat tgcgaatact caccgtgtgc ttgtgcgacg tggtgcaaaa 1860 acagaaattc gaaaaaagtg cccaaaatgt gcctgagttg cactatttta gtgtaaacag 1920 tgtatctgag cacgaaaaaa ggtgtgacat cagtttgaca gctaagcaga ggaatttcaa 1980 agtggcataa cttcggcacc cgccaagcta tcccgacggg aaagtgttgc ctgtggaagc 2040 ggtatcgaaa tctttcaggg cagggtgaaa tatttcactt tggaaatatt tcacattttg 2100 aatttttttt tttttttttt tttttttttt tttttttttt tttttttttt tttttttttc 2160 ttctcttcaa aaaaaaacca tcttccatgt gcgcgggcgt caggcagagt gcgtcgggct 2220 gccatcaaac cctcaagtag gtgcgcgttt gtgaaagcgg ccctgaaatc tttcagggag 2280 ggtgaaatat ttcacttttg aaatatttca aaatttgaaa tttttcaaca tttcaaaaat 2340 ccgttctcca tcggttttct tgacgaacct aaaaccgggg gtcggtttct gacgcctcgt 2400 tggcgagagt gcgggcgtta ggcatagtgc gtcgggctgc caacaaaccc tcaagtaggt 2460 gcgcgcaacg ttgaaaccaa catgtcgggg ctgggcggtg atgcccatcc ccaagggagc 2520 tcgggacgag tacttcgccc gagggctcga tcggtgtccc tcaaccgggt cgacgcatta 2580 aaagtgtctg actcaacacc ggtggagcca ccacccaccg caggcatcgt ctacttgagt 2640 gatgatgaag aagaagaatt gaactgcacg atcctcgcgg gcccgtcggg attagcagtt 2700 ccaccaatgg ggaaggtgcc attggtagtg ctggacaagt tgcccagcca gagtcaacag 2760 cgcgaggaga tgacggtacc ggccacctca acaccaaagg ctggtaagtg ttcttctgct 2820 gaaccatctc tctcagagat gaacgagagc ttgaagctcc tggccatgca ggttgctcag 2880 ctctcaaagg agcttagcct ctgccgtaag gagctccagg aaagtttgat gaaaaatgcg 2940 gcgcttgaac gggagctcga aacgtacagg atgggcgccc gttcggtcat cgagctgcag 3000 cagcaagcag cagcagcccc aatgatgaca gcccagggag cccacagctc tcgcaaccgt 3060 cgcggtcgcc aaggaccaca gcagcaggag cagcggcagc agcagcaaca gcatcagcag 3120 cgggaacagc agcagcagca gcagcagcag cagcagcagc agcagcagca acaacaacag 3180 cagcagcgga accagcagcg tgaatggcag cagcagcagc agcagcagca gcatcaacag 3240 cgagaacagc agcagcagca acgggtgcag caacagaatc agcagcacca acgtcagcag 3300 cagcagcagc agcagcagcg gcaacagcaa cagcagcagg 
agcagcaaga attatggacg 3360 acggtagtgc gccgccgtca aaatacacag cagcagcagc agtctaacca accgcagcaa 3420 caacaacagc agactgggcg gtatcagccg ccgcaaatga ggcagcagct acagcagcaa 3480 cagcagcaac gacagccaca gcgatatgtg gtcgcaggct cgtcgcaaca gcagcagcag 3540 cagcatcaac agcagcagca gaagcgtaag cgtcctaagc ccgaactgat agagatctct 3600 cctggtcaga acgagacttt cgagagcgtc tccttgaaaa tccgtaaagc cgttgacgat 3660 aatggcacac ataaggagtt aaaggatttc atcatcatgg gccggcgcac agataaggcg 3720 ttgctacgac tgacgcttgc tagatccgca aacgcgacct taattctcca gcagatccga 3780 acgatcatcg gcgaggctgg aacttgtcga cacgtgacgg aaatggcggc cttggtagta 3840 aacgacatcg accccctagc caaggaggaa gagcttacag ctctccttga aaacaagatc 3900 gagggtgggg caggcatcgt ctcaacgagc attaggacaa tgccggatgg cacccagcgg 3960 gcacgcgtcc gtctgccagc caaggccgcc aaagcgctgg atggtacgaa gcttcgcttg 4020 ggcttctgca tttccagagt gaagatggct cctccaacac ccaaagagca tcttcgctgc 4080 taccgatgcc ttgagcacgg ccacaacgcc cgcgattgtc ggtcacctgt agaccgacaa 4140 aatgtttgca tccgttgcgg acaggaaggt cacaaggctg gtacatgcat ggaagaaata 4200 cgctgcggca aatgcgatgg cccccatgtt atcggggacc ggacatgcga tcggtcggcc 4260 acccaatgac gcagctaaaa gtcctccaag tgaacctggg tggaggcagg atcgcccaag 4320 atctggtcct gcaaaccgcc cgacaaatgg aagtggacgt gctggttctt tcccacacgt 4380 atcgaccacc cgagaacaac ccaagatggg cagttgatgc ctccaaaaag gtggcagtcg 4440 tggccacagg acgataccct ctacaaggac aatggagcag tgatgttcca ggccttatag 4500 ctgccaaggt gggtggcatc accttcctaa gctgctacgc gccacccagc ctgtcgcggg 4560 aaggatttgc ggaattcgtt gaagcaattg aattggaagc ccaatcccac cctcaggtag 4620 tagttgccgg agactttaac gcttggcatg aggagtgggg aagccgacgc agcaatgagc 4680 gtggggaagt actgctcgag gcgtcccagc aattgggcct gctgctgatg aatcgaggga 4740 atgtggcaac ctttgttgga aacggtgtgg cgactgccag cgttgtcgac gtgaccttcg 4800 ccagctcgtc catagctcag ccgagcactt ggttggtaag aaacacggac acgcgatctg 4860 accataggta tatcacctat tcggtaggcc cagcgtcagc agaccagcag cgtaaccaag 4920 gacagtcacg tcaacggggc cagcgagagc gttttcaaca tgcaggcacg cgatttaaga 4980 cgaaacagtt ctcgaaagag aatttcctgg ccacgctaca tggcgaggga 
ttccgagaga 5040 aggcagtcaa tcaccaggga atgatctcgg caatgatatc ggcctgcgag aaaaccatgc 5100 aaaggatgac gtcgtctttc cccgaccctc atcgggacgt ttactggtgg acgccactga 5160 ttgctctgct taggcaaaac tgcgagcaga cgagagatcg catgcagcag acaagtgatc 5220 tccagaaccg aagtctggct gcagcccaat accgaacagc taaagctgag ctggatagag 5280 ctatacgtgc cagcaaaaag gccgccttcc aggaattgat cgatgctgcg gaggaaaacg 5340 ttttcggagc cgggtactta gtagtcctct cccgtcttcg cggtggaagg gccccacccg 5400 agacggagag agcgaggctt gaaagcatcg ttacagagct tttcccgcaa catccgccct 5460 tcaactggcc cagcatcagt tccgaggaag aacaggaaca gcctgcagac cagcagactc 5520 catggaccca agtcacgatc ccggaactcc gtctgatagc tagcaccatg ccgaacaaaa 5580 aagcgccggg ccttgacgga attccgaacg ccgctgtcaa ggctgcaatc cttgcgtaca 5640 cggacgtttt ccaggcgttg taccaaagct gtctggaaac ggctacattt ccagcaccat 5700 ggaaacgaca gcggttggta ctgctcccca agccggggaa accaccgggt agcaacgggt 5760 cataccgacc tttgtgcatg ctggatgcct tagggaaagt gctggagaag ctcattctaa 5820 acagacttca caaccacctg gaagatcctg ctgcggtgag gctgtcagac aggcagcatg 5880 gcttccggag ggggcgatct acaattggcg ccattcgaac agtgatcgag gctggtcaga 5940 gcgcgatgag attccgccgc acgaacgggc gggataacag gttcctgctg gtcgtgtcaa 6000 tggatgtcaa gaatgccttc aatacggcaa gctggcaggc catcgccact gcactgcaga 6060 tgaaaggagt acctgctggt ctgcaaagaa tcgtgaggag ctacttcgag aaccgagagt 6120 tggtcttcga gacatccgac ggcccagtaa ctcggtccat cacggctggt gttccacagg 6180 gttcgattct tggccccacc ctgtggaaca ccatgtacga tggagttctg gacgtcgccc 6240 ttccgcagga ctgcgagatg gtggcgtatg ctgacgactt ggtgctgctg attccgggca 6300 tcgacgtaaa tgcagtgaag gctgcagccg aggaggcggt cgccagtgtc tctcactgga 6360 tggctcaaca tcatctccag attgcgccgg aaaagacgga gtgcgtgctt atctccagca 6420 cgaagaaccc tacgcaggtc accataagag taggggacgt ggaggtgaca tcttcccgca 6480 cgatgcgtta ccttggggtg acccttcacg atcacctatc gtggctgccc catgtccgag 6540 aggtaaccac tcgggcaagg aagatagcag atgccgttac ccgcctcttg cgcaaccaca 6600 gtggaccaaa gaccagcaaa gcgcgactgt tggcctcggt cgcagagtcg gtcatccgat 6660 atgctgcgcc catctggcat ggcgaggtga cgaagagaga gtgtcgtcga cttttggaga 
6720 gggttcagcg agtctcagca cggagggtgg cacgcacctt ccgtaccgta aggtatgaga 6780 ccgccaccct cctcgccggg ctgaccccca tctgtctctt gatagaggag gatgcgcgag 6840 tgttcgagcg tgttaatgat ccgggtcggt cgataacgaa ggcagccatc cggttggagg 6900 agaggcagcg caccatcacg atgtggcaga gccaatggga cgccgaggcc gacacctcca 6960 gatacacccg gtggacgcat cggataatcc gcgacatcag cgcttggcaa ggccggagac 7020 acggggagat gaccttccac ttggcccagg tgctctctgg acatgggttc ttcagagagt 7080 acctggccat caacggcttt acggaatccc ctgactgtcg cagctgtgcg ggtgttccgg 7140 aaaatgccca tcacgccatc ttcgaatgcc ccaggtttgc tcgagtgagg atggagtact 7200 ttggtgaact ggggccgaat ccggtcacgc cggacagtct tcaggacttc cttatgggca 7260 gccaagacaa ctggagcagc ttctgtgagg cggcacgtcg aataacaaca acgctgcagc 7320 gtgactggga cacggaacga gaacagcggg ctgcttccaa tcgtgaggaa gctgaaatac 7380 aacagcagct acagcgagag gaagacgaac gccgcactga agaacggcgg cagctccaca 7440 atgaggcaaa tcgtgcctac cgccagcgca ataggagaag tcaacctact ccaccagctc 7500 caccaccaac acccagggag gccgctcgcc tcgaagatgg gcgacggaga gtggcccgct 7560 ggcgggaaag acaacgaatg attcggaacg gtggaatcca aatgctacga gcgctgttcg 7620 gccatgatgc ctggtcgagc gaaagtgacg atgaacccga tgacgtcgag cgaggcggac 7680 ttgatgctgc ccagcaggca gcagcagccg aagccgaaag agcggcacgc taaagctaac 7740 tttagtaaaa agaacaacga aagtgtaaaa aaaaaaaaga gaaaggtgct tcggtacaga 7800 tatagggctc cccttaaggg aatgaattca gaggaacagt ttggaaaata gaatttcaac 7860 ttaaattaaa cggaaaggtg cgaacgcacg gctaaaaagt taggctccct atgagcatca 7920 cgtccaccct taaatccctt cgcagggcat aaggggcgga ttatgagagg gctgggtttt 7980 cttttcgatg taacatacga tcaataaaaa tccactcgat ataaaaaaaa aaaaaaa 8037 // ID AMER3_AM repbase; DNA; ANG; 1075 BP. XX AC AJ006554; XX DT 09-JUL-2004 (Rel. 9.06, Created) DT 09-JUL-2004 (Rel. 9.06, Last updated, Version 2) XX DE Anopheles merus Amer3 copia-like retrotransposon. XX KW Copia; LTR Retrotransposon; Transposable Element; KW reverse transcriptase; copia-like retrotransposon; AMER3_AM. 
XX OS Anopheles merus OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RA Cook M.J., Martin J., Lewin A., Sinden E.R. and Tristem M.; RT "Systematic screening of Anopheles mosquito genomes yields RT evidence for a major clade of Pao-like retrotransposons."; RL Insect Mol. Biol 9(1), 109-117 (2000). XX DR Genbank; AJ006554; Positions 1 1075. XX SQ Sequence 1075 BP; 344 A; 192 C; 263 G; 276 T; 0 other; gttttgttgg atacggggtc aatggctatc gtgtgtggga tcctgtacat cggaagatta 60 tcgttgctcg tgatgttgta atcgaagaat taatgtcgaa tcgttgtttg gaagaatcca 120 cttttgacca agaacggatg ctaccagaac aggttagtga aacaaatcgt aatattgaat 180 ttactgttcg acacctgtaa gattttggcg attatttgaa cacatcgact gagggcattg 240 aaaacgtgaa aaatacaaat attggaacta cgaacagatc ggtcgacatt gttggaaata 300 tggaagaaac atgtcatgaa aactcgaacg atacgcctga tgttattgaa aacacagaat 360 ataacgaagt agcagttcgt cgtagtgaga ggcttcgaaa accacctgtg cgctttagtg 420 attatgaagc taatgtcgca tttgcactga acgcagaaag ttacgtggaa gatttacccg 480 atacgattga tgcacttcgg aagcgtgacg attggcctga atggaaacaa gcgatcaatg 540 aggagatgca agcgctcgag aaaaatgaaa catgggatct agtagaactt cctattggcg 600 cgagagcagt tccttgtaca tgggtattca aaatcaaata ctcagagaat ggtactgtga 660 acagatacaa ggcgcggctc gttgctaaag gatgttctca acgtcaagaa tatgactatc 720 aagaaacgta tgctccagta gtccgaacaa caacagtgag gaccctactg gcagtagcag 780 tgcagaagaa gttctatctt catcggatag atgtgcggac cgcctttctg aacggaaatc 840 tttcagaaac agagtacatg ctacaaccac caggatttga gagggggaag aaggtatgta 900 aactgaacaa atccttgtat ggtttaaaac aggcgccgcg gagctggaat gaaatgttcc 960 acaattatat gttgacactg gaatttgtac ggtcagcata tgacagttgt ttgtacactc 1020 ggaaatctgc gaaagttgag atgtatccaa tcctatacgt cgatgatatc ctcat 1075 // ID HARBINGER1_AG repbase; DNA; ANG; 5377 BP. XX AC . XX DT 12-MAR-2003 (Rel. 8.02, Created) DT 12-MAR-2003 (Rel. 
8.02, Last updated, Version 1) XX DE HARBINGER1_AG is an autonomous DNA transposon - a consensus DE sequence. XX KW Harbinger; DNA transposon; Transposable Element; HARBINGER1_AG; KW Harbinger superfamily; MADF/SANT; transposase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5377 RA Kapitonov V.V. and Jurka J.; RT "HARBINGER1_AG, a family of autonomous Harbinger-like DNA RT transposons from African malaria mosquito."; RL Repbase Reports 3(2), 17-17 (2003). XX DR [1] (Consensus) XX CC There are ~20 copies of HARBINGER1_AG in the genome. CC They are ~98% identical to the consensus sequence. CC HARBINGER1_AG copies are flanked by 3-bp TWA target CC site duplications. This element has imperfect 40-bp terminal CC inverted repeats (5 mismatches) and subterminal TIRs (positions CC 54-106 and 5269-5321). CC The HARBINGER1_AG consensus sequence encodes two proteins: CC a 471-aa HARBINGER1_AG1p transposase and a 245-aa HARBINGER1_AG2p CC MADF/SANT-like DNA-binding protein. The MADF/SANT domain is CC present in various transcriptional regulators and is involved in CC chromatin remodeling. Putatively, HARBINGER1_AG2p CC was recruited by the transposon from the host genome. CC HARBINGER1_AG1p is encoded by 4 exons: positions 452-805, CC 875-952, 1396-1596 and 2345-3127. CC HARBINGER1_AG2p is encoded by 4 exons on the minus strand: positions CC 5133-5095, 4587-4359, 4287-4196, 2229-1852. 
XX FH Key Location/Qualifiers FT CDS 0..0 FT /product="HARBINGER1_AG1p" FT /translation="MSEPELENDNNEVRDRSRSRSRSPQARRLWVQDLFLN FT RNETGNRLLTDITTSGIYETMNRFLRMKKEDFFHLLSLVGPKIAKMDTDFR FT KAITEQERLLITLRYLATGETFTSLQYVFRVSRHSISRIVKETCACLIEAL FT RDYVKSQPESIPIRGCSGCARSELESNPTRGCNEFGSTRLCFLAPTDLSGR FT PELVATEPTVARPDPNSQHQRLPSTEEEWLAISRRFEQRWRFPHAIGAIDG FT KHVEIICPRNSGSEYHNYQKFFSIVLMVVVDADYNFLWADAGGKGGISDGG FT IFKNTRLYHKLENDQLNIPPATPLQVPYQTPVPYFILGDKAFAFTNYCLRP FT YSGVHPPDSMERTFNKMHSTCRMPVENSLGILANRWRVLKGIQLQPDVAKN FT IVLTTVYLHNFLRKHASRDTYTPPSAFDRVVRGRRVDGDWRSEGGLTDLQN FT IASRPSENLADIRNHIANHLKHNRST" FT CDS 0..0 FT /product="HARBINGER1_AG2p" FT /translation="MEQAGSSRRNEKIDHEKSLRFIAEVQKHRVLWEKKNK FT NYKNVVLKGDAWAAIAAKEEVSPQDAKHLWSRLLGIYRTNKAKVKKTTQTG FT AGNDDVFRPRWFAYQAMSFVDEATQDAVHVDTLGYDDEGLTPRSLVEAAYA FT VPQSLDDIDWDAAVAFDPEPFSPTPSSYPSALSVTPGTIREGNGVSGALNN FT RLQQPVSGATGAPETADPDDFGRYVADELRLIPSPKRRRIMSIQRELLETL FT KKNL" XX SQ Sequence 5377 BP; 1580 A; 1209 C; 1120 G; 1466 T; 2 other; aggccgggct acattgatcg tactcgcaag cgtaattttt atattcacta gcgcatctgg 60 cggcgaccag cgcaagctag atgtgggtat aatttaagtg ccaaaattat tcatgcttgg 120 tgtctcatca attttcgacc aggctgtctg tatgccgttt gacatgttga tcaaaatcaa 180 taaattgcta caggtcagga cgccgagtac tgctgggcaa atcgtcattt gacagttggg 240 tgcctaagct tcatacaaag cgcacgacaa cattccttcc catctgccaa actccatcca 300 tcatggcttc cacagccaaa tcttccacat ataattgtag tctatgttat caccttgatc 360 gcagcagtgt gcattgttta tcgtttgaat cgtttgccgg ccggtgttta gtagaagtga 420 aaattttggt ggttctcgat tttttggaaa aatgagcgag ccagagcttg aaaatgacaa 480 caacgaagtc cgggatcgtt cccgatctcg ttcccgtagt ccgcaggcac gtcgtttgtg 540 ggttcaggac ttattcctga accgaaacga aaccggcaac cggctactga ccgacatcac 600 gacatcgggt atatacgaga cgatgaaccg atttttaagg atgaagaagg aagatttctt 660 ccatttactg tcccttgttg gtccaaaaat tgcgaaaatg gacacagatt tccgcaaagc 720 aatcacggaa caggaaaggc tgctgataac attgcggtat cttgcgactg gagagacatt 780 taccagcctc caatacgtgt tccgggtaag taatattatt tttgcttttt gctttttgct 840 aagcacactt ttatcaatca 
acatataatt ttaggtgtcg aggcattcta tcagcagaat 900 agttaaagag acgtgcgcat gtcttatcga ggctttgcgg gattatgtca aggtaagttg 960 cgggtgtaac cagcctcgct tcagtgttta tgtgtttaaa acgttcagcc cggtggcagg 1020 caccagggcc tgccagtaaa ctttcccgtt tgccggggcc aaccattgct ctggggactc 1080 tttggctaga gaaacagaat acattgcgaa caactgtact ctaatcgcat gaatacaacg 1140 gtgtgaaggt aaaaaaatga ttttctaaat attattgcga ttgataaaga catcaaccca 1200 cacacaacag tattgatgaa gagagtagag ttgaaattat agcgaaaaac ctcatttttt 1260 ctacatgcat ctttggatca aaattcactg cttgggtagt agatgccggc ggatagttca 1320 tgaaaaacgg agccggaatg aacgtgtgaa atgtgctgca ctacgttttc ctcttsaacg 1380 tcctcgtata aaaagagtca gccggaatcg attccgatcc gtgggtgcag cgggtgcgct 1440 aggtcggagt tggaatcaaa tccgacccgc gggtgcaacg agttcgggtc gactcgctta 1500 tgctttctcg cacctactga tctttcgggt cgacccgaac tcgttgcaac cgaacctacc 1560 gtggcccgac ccgacccgaa ctcgcagcac caacgggtcg gagtcgttta tgctttctcg 1620 cacctaccgg agtatttgca cccactgaaa ctacttatac cttttgggtt gacccgaacg 1680 actccgggtc gctttcggat ggctccgacc cgataggttc gggtcgaccc gcccattact 1740 aaaaggttca tgtaatatga aaacggaaac tttttttttc ctcgaacgta gctcttcaat 1800 caagtaacta atgtgatata caaaactcaa aaagtattca aagttatgtt tttaaagatt 1860 tttttttaat gtttctaata actcccgctg aatactcata atgcgtctcc tttttggaga 1920 cggaataagc cgaagctcat cggccacgta ccggccgaaa tcgtccggat ctgcggtttc 1980 cggtgcgcct gttgcgcccg aaaccggttg ttgcagccgg ttgttgaggg ccccactcac 2040 accgtttccc tcgcggatcg tgccaggagt gacggataag gcagaagggt aggaagaagg 2100 agtgggagag aatggctcgg ggtcgaaggc aaccgcggca tcccaatcga tgtcgtccaa 2160 cgattgtgga accgcgtagg ctgcctccac gagggaacgg ggagtgaggc cctcatcgtc 2220 gtagccaagc tgaaaaaaaa agaaaatgaa cattaaaatc ataatttatg tttctcataa 2280 gctctaaaat atacagtaac taaaacatgc attgaaatca ttcctctatt ttattttctt 2340 ttagctaccc tctaccgaag aagaatggct tgcaatctca agacgatttg agcagcgctg 2400 gagatttcct cacgcaatag gtgcaatcga tgggaagcac gttgaaatta tttgccctcg 2460 taatagcgga tccgaatatc acaactatca aaaatttttt agtattgtat taatggttgt 2520 ggtcgatgct gattataact ttttatgggc 
agatgctggt ggtaagggag gaatatcgga 2580 cggtggaata tttaaaaaca cacggctgta tcacaagcta gaaaacgacc aactaaacat 2640 tccaccagca acgccattgc aggtcccgta ccaaacccca gtcccatact ttattctcgg 2700 tgacaaggca tttgccttta ccaattactg cttaagaccg tacagcgggg tgcatcctcc 2760 tgattcaatg gagcgtacat tcaacaaaat gcactctact tgtcgtatgc cagttgaaaa 2820 ttcgcttgga atattagcga atcgatggag agtgctcaaa ggcatacaac tgcagccgga 2880 tgttgccaaa aacattgttt tgacaacagt ttacttgcac aattttttgc gcaagcatgc 2940 ttcgcgggac acatacacac ccccgtctgc atttgatagg gttgttcgcg ggcgacgagt 3000 cgatggagat tggagaagtg aagggggctt gaccgatctc caaaacattg cttcccgacc 3060 ttcagaaaat cttgctgaca taaggaacca cattgcaaac catttaaaac ataatcgttc 3120 tacgtaaatc catccaacca ataccatgga tattaaaatt aataataaat tatgaaatcg 3180 caaacacaac tacaccgact attctgtaca ttactaccag catggttatt tataatatac 3240 atttaagtac gcgagtgcaa caactacgga tttttttaag cattactcgc ctgctctatt 3300 agcggctgga taaactggct ggcttactag ctcactcacc gaataaacgt ggccacatgc 3360 ctgttgtgtt tggttgctgc tggtggatcg agtggacgaa cgtggcgata atcatacgca 3420 ygtaactatg tacagtaaaa tctttctaaa atcaattctc ttcaagattg atcatggaga 3480 aacttaacta tagctacgta cacgatagcg actcgcccaa acacattatc ggacgaattg 3540 gccatcgacc cgctgaaata gcggagtatc acggacgatc gtaacaagca gggtacgctc 3600 gtctatcaac tacgcacctg cttccagcgc aacgatcgat tgccaattgt tattgccgat 3660 tgcaattgca caacgagcga agaaagaatc catcgtcatg tcagcactat gattccggac 3720 ggctatacat attggtctcg tgctgttgac gtggtgcgct ggtcagcatt tcctgcttgt 3780 cgtgatcgtc cgtatatgtg tacattttaa ggatcatagg aacacttcgt gtgtgaaatg 3840 attcatttgt attaaaactt tgacgaacga agcacatact actgtggtgt tatggggaaa 3900 accaccatga tatactttgt tcgtcaaaat gttaattata taagtgcaaa agagagcttt 3960 tttatgtata taacaaacga gcttcgcatt aagttagtta aatagaatgt gagtaaggtc 4020 aactagttcg tcggagaaca tctatcaatc atggataccc aaatgtgttg atgctcaata 4080 ctaggcattt atgaaaaaaa tgatttgaaa gctgacatta gcttccacgc caaagattaa 4140 acaaactagt taacccacac cactagttat ttaaaataag tataaacata ctcaccgtgt 4200 ctacatgcac cgcatcttgg gtggcctcgt ctacaaaaga 
catcgcttgg taggcgaacc 4260 accgtggccg gaacacgtcg tcattgccta aaataaaaca ataataaatt gggatgagaa 4320 ataataaatt aactgtatat atattttcct acatgtacct gcaccggttt gcgtcgtctt 4380 cttgaccttg gccttgttgg tccgataaat gccaaggagt cgggaccaca gatgtttcgc 4440 atcttgtggc gaaacctcct ccttggccgc tatcgcagcc cacgcgtcgc ccttcaaaac 4500 tacattttta tagtttttat tttttttctc ccacaaaaca cggtgctttt gcacctcggc 4560 aatgaagcgg aggctttttt catgatcctg gaatgaaaaa gaaaattcag ttaaatactt 4620 ccctttttta atttaaatta aaacaacaga aaggtcgaca atcgcgtgat aattcggtca 4680 tgctagccgt cacaccggaa ccacagtgat aataataaca cgtgctattt ttatattaaa 4740 tgtatacaaa taggccggta aattttgctt acaattttat aaaaaaaagt tctaccaagc 4800 aatatgacaa agccgttcct catgaataaa aaaacacgct ctatactaaa aaccgctaca 4860 ataccaaaac gaacaaccag tttcacacac tctgcaatgt agtttaaatg atgaaataga 4920 tggacaaacc ctacaacaaa ttatttaagc acttgtgaaa cctacaaaca ctacattttt 4980 caataaaatg ttcagcacaa ttgcggcccc gagccgagcc ggcgattcac agacaacagt 5040 tcacacacac acacacacac acacacaatt ttcgttaaaa acagttgtac ttactatttt 5100 ttcattgcgt cgactgcttc ctgcttgttc catggctaga aatacaatat gtatgttaaa 5160 ttaccacata ttttcaaaaa aacatacacg acaaaatatc gaacatactc accgagctgt 5220 ttgtttactg gtttgtgatc aaaattatag gctcctcgac ccaaaacgct ttttgacaga 5280 taaattatac cagcctctag cttcgacccg ccgccgctag atggcgtacg cgttcacaaa 5340 aattacactc aggagtacga tcaatgtagc ccggcct 5377 // ID RTAg4 repbase; DNA; ANG; 7072 BP. XX AC AB090813; XX DT 14-SEP-2005 (Rel. 10.09, Created) DT 14-SEP-2005 (Rel. 10.09, Last updated, Version 1) XX DE Anopheles gambiae retrotransposon RTAg4 DNA, complete sequence. XX KW Non-LTR Retrotransposon; Transposable Element; RTAg4. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-7072 RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol Biol Evol 20(3), 351-361 (2003). 
XX DR EMBL/GenBank/DDBJ; AB090813; Positions 1 7072. XX FH Key Location/Qualifiers FT CDS 1147..3318 FT /product="RTAg4_1p" FT /translation="MSDEAQPQQVSAPYQLWPRKGSVVVMQPQPIERPATP FT MMELCYSSDDDELNSTIIAMPEPASECEAAEAAMDLEPPAAAQPTPTASPV FT PGNMVVAGPIDAGSCALLMAQLQNIGAQLTTALEELRLCREENAALRRENE FT LLLTGTRSVLELQTAANATLQQSSGQGGNRETARKRQQRLRRRERERQQQQ FT QQQQQQQQQQQQQQQQQRQQQQQCQQQRQQQPQQQQLQQPQQQLWTTVVRG FT RPSQRHRQPQQQQQQQQQQGERYVPPQLRQQRQQQQRPRQQQQQQQQQQQQ FT QGERYVPPQLRQQRQQQQHQQQQQQQQQQRQQQQRQQQRQQQQRQQQQQQQ FT QQQRQQQQRQQQQQQQQQHQQQQQQWQQQQQQQQQPRQSLPHRKQTQLQLS FT PRLQQQQQQQQQSQQQQQQQPQQLLWTTVVRSCPSQRQRQLQQQQQQQQQQ FT QQGERYVPPQLRQQRQQQQPQQQQQQRPQQQRPQQQRPQQQRSQQRKPAKP FT ELIEVSPNEGQDWESLLLLVQTAVRTDERYKPLKDHVVLGRRTSKALLRLT FT LSRKANAQYMLQQVPAIVGSAGVCRHVTEMASLVIHDVDPLAREDDLTSLI FT DSKFESGAGIVSTTMTKMADGTQRAYVRLPAMFVSELDGTKIKLGFCVSKV FT RAAPPTPRERVRCYRCLELGHWAHDCRSPDDRQNMCIRCGVVGHMAKVCTS FT QPKCLKCGGPHTIGHPDCARSALQ" FT CDS 3237..6614 FT /product="RTAg4_2p" FT /translation="MHFSAKVPQVRWSTHNWTPRLRPVGLAMTPQLRVMQV FT NLGRGERAQDIALQTAQEKRVDVLLLLELYRPPANNGRWAFDCSKKVAIVA FT TGSLPLQRIWCSNTPGLVAAEIGGTTFLSCYAPPRQPTDEFERFIEAVQLE FT TLTHSQVVIAGDFNAWHVEWGSERNSEKGEELLSAIQQLDLVVLNQGTTST FT FDGNGAATASIVDVAFATPTIAQPGTWNVCGDYSYSDHRYITYTVGTIVPV FT VNEPSSPRMRHQGRIRHADRRYKATQFSQRAFRARFSERAVSHERMVEIML FT ATCDKTMQRVTTSHSDPHRDLFWWTPLLRLLRENCDRARDRMRQTSDLQER FT SIAAAEHRTARAELGKAIKASKRNSFQELIDIAEENVFGAGYLVVLSHLRG FT GRTPPETERDRLEHIVSDLFPQHPPLVWPEAADIEGEEQPGAVADVSDDEL FT KLIARRMANKKAPGLDGIPNAAVKAAILEHTGVFTALYQDCLVNGTFPAAW FT KRQRLVLIPKPGKPSGVSCSYRPLCMLDALGKVLERLILNRLHEFLEDPES FT PRLSDRQYGFRRGCSTIGLIQRVVEAGQRAMSFGRANRRDKRFLLVAALDV FT RNAFNTASWQAIATALRTKRVPAGLQRIIHSYFQDRELVYETSEGPVVRSV FT TAGVPQGSILGPTLWNTMYDGVLDIALPPDAEILGYADDLVLLVPGTTPDN FT VKAAAEEAIISVMEWMARHHLELAPAKTEMVVISSTKAPTRITVRVGDVDV FT TSSRSIRYLGVTLQDKLSWLPHVKEVTERAGKIADATSRLLRNHSEPRASK FT AKLLASVSESVMRYAAPVWSKELQKREPGRLLERVQRKMALRVARAFRTVR FT YETATLLAGLTPICLLLDEDARVYQRLSAVNRTDTRANIRKQERQATIEQW FT 
QQQWDAEADTSRHTRWAHRVLPNIGSWQSRKHGDVSFHLCQVLSGHGFFRD FT YLCRNGFTSSPDCQRCSGVPETAEHAMFECPRFAEVRQQLLGEGITDPVRP FT ENLQQHLLRDAESWSRICEAAKRITASLQQAWDDERAALAAHGNEQHFEEV FT ADLEARRAEIRRARNDRRNASRRAARARQRELQRAGRPPSPPPSPRTAARR FT ADLRLRQARFRARRRQAI" XX SQ Sequence 7072 BP; 1751 A; 2015 C; 2113 G; 1193 T; 0 other; aatagtgtta ttcaatcgaa attcggtgaa tttttgcgaa attgaaaaat cgtgttttgt 60 gtatacattt gcccccccgg tagcaccaaa tttctgggtg ttaaaaaagt gcgcaaaaat 120 ttgtgtttca ggcgccggaa acacttgatt ttggataatg cttctaaagc gattaaaagt 180 gttttcgaga cgttttgtga agttttaagt gcataaaagt gggtgtttat atccccatac 240 aatttgtatg ggggactttg aacgtttata tcggagtgaa aaacgcacgg atctgtgccg 300 gatgtgtctt ggaaggtgcg cgcaatgacg cttaaggtta agtgacaaaa atcagaaccg 360 aaatcggtga agaaaccagt gaaaaaagtg aatttaaagt gtggaacaag ccgggtgaca 420 ggttacattt gctacccaag ggacaaaccc gcggtgacag ccgttaccca aggagcaatt 480 tttaaatttt cggccgagtg acggttggcc gaaaaactgt tagcggcgga aaattatgaa 540 attcttaagg gtggtagagg acgtgtccct gatcacggga aaaattattt cccccccctc 600 tccccctccc accgctcccc cctggggcta gaatagagta gggcgcccga taagcgtcag 660 ataaggtcga gatcgcccga aaattctggc tttcggcagg gcggtaaagt tggacccccg 720 cagtccggga aaattatttt ccccactccc accctccccc ctcccggtga aaaatttttg 780 aagtggcaag tcaaaaagag gtccaaaaag gttggaattc ggggttcgag ctgtttggac 840 gtataaacgc accggggaaa ttcgggactg gatcggaata attccccacg cgtgagacgc 900 accatcccga attttacccc accccctccc aggggggctt aaattggacc ttcttcacat 960 taacggcaga gttgaagcgg aaatcgacag aaacgggtag cagaacacca gctaggcggc 1020 gaagcatccc aggggtcatc ccgacaccct ggatcattcg agccccgggg tcatttcgac 1080 ccccagggtt atttcgaccc ccaatgcaac ttgccaaagc aggtgtgagc aataacaagg 1140 caggggatga gtgacgaggc ccaaccccag caagtgagtg cgccgtacca gctgtggccc 1200 aggaaagggt cagtggttgt gatgcagcca caacccatcg agcgcccggc cacaccgatg 1260 atggagctgt gctactcaag cgacgacgat gagctgaaca gcaccatcat cgcgatgcca 1320 gaacccgcgt cggagtgtga ggcagcggag gcggccatgg acttggagcc accagcagca 1380 gcacagccaa caccaacggc atcccctgtc ccgggcaaca tggtagttgc tgggcccata 1440 
gacgcgggaa gttgcgccct gctgatggca cagcttcaaa acatcggcgc ccagctaacg 1500 acggcgctgg aggagctgcg actttgccgc gaggaaaacg cggcacttcg tcgcgagaat 1560 gagttgctgc tcacgggcac tcgttcggtg ctcgagctgc agactgcagc gaacgcaacc 1620 ctgcagcagt cgtcgggaca aggtggaaac cgggagacgg cccggaagcg ccagcaacgg 1680 ctgaggcggc gagagcggga acggcagcag caacagcagc agcagcagca gcagcagcag 1740 cagcagcagc agcagcagca gcagcaacgc cagcagcagc agcagtgtca gcagcagcgt 1800 cagcagcagc cgcagcaaca acagctgcag cagccgcagc agcagctttg gacaacggtg 1860 gtaagaggcc gcccgtccca gcggcatcgt caaccgcagc agcagcagca gcagcaacag 1920 cagcaaggtg aacgctatgt tccaccacag ctccggcagc agcgacagca gcagcagcgc 1980 ccgaggcagc agcagcagca gcagcaacag cagcagcagc agcaaggtga gcgctatgtc 2040 ccaccacagc tccggcagca acgacagcag cagcagcatc agcagcagca gcagcagcag 2100 cagcagcagc gtcagcagca gcagcgtcag cagcagcgtc agcagcagca gcgtcagcag 2160 cagcagcagc agcagcagca gcagcgtcag cagcagcagc gtcagcagca gcagcagcag 2220 cagcagcagc accaacagca gcagcagcaa tggcagcagc agcagcagca acagcagcag 2280 ccgcggcaaa gtttgcctca tcgcaaacag acgcagctgc agctttctcc acgactgcag 2340 cagcaacagc agcaacagca gcagtcgcag caacaacagc agcagcagcc gcagcagctg 2400 ctctggacaa cggtggtaag aagctgcccg tcccagcggc aacgccaact gcagcagcaa 2460 cagcagcaac agcagcagca gcagcaaggt gagcgctatg tcccaccaca gctccggcag 2520 caacgacagc agcagcagcc gcagcagcag cagcaacagc gtccgcagca acagcgaccg 2580 cagcaacagc gacctcagca gcagcgatca cagcagcgaa agccggccaa gcccgagctt 2640 atcgaggtat cacccaatga aggtcaggat tgggagagcc ttctgctgct tgtgcaaacg 2700 gcagttagga ctgacgagcg ttacaagccg cttaaggacc acgtcgtcct gggccgccgc 2760 accagtaagg cgttgctgcg actcacgctc agccgcaagg cgaatgcgca gtatatgctg 2820 cagcaggtcc ctgccatcgt gggcagtgct ggagtgtgtc ggcacgtcac ggaaatggcg 2880 tcactggtca ttcatgacgt cgacccgcta gcccgagagg acgatctcac ttcgctgatt 2940 gacagcaagt tcgagtcggg agcgggaatt gtgtcgacca caatgacaaa gatggccgat 3000 ggtacacagc gtgcgtacgt gcgattgcct gcaatgtttg tgagtgaact cgacggcacc 3060 aagataaagt tgggattttg cgtcagtaaa gtcagagccg cgccaccgac ccctcgagag 3120 cgtgtgcgct 
gctatcgctg tctcgagctg ggtcattggg cccatgactg ccgttcaccc 3180 gacgaccggc agaacatgtg catacgctgc ggcgttgtgg ggcacatggc aaaggtatgc 3240 acttctcagc caaagtgcct caagtgcggt ggtccacaca caattggaca ccccgactgc 3300 gcccggtcgg ccttgcaatg accccacaac tgcgagtaat gcaggtgaac ttgggcagag 3360 gagagagggc ccaggacatc gccctccaaa ctgcccaaga gaagagagtg gacgtgctgc 3420 tgctgttgga gctgtatcga ccgcctgcca acaacggtag atgggccttc gactgttcga 3480 agaaggttgc catcgtcgca actgggtctc ttcctctcca gaggatttgg tgcagcaaca 3540 caccgggact cgtcgctgcc gagataggcg gcaccacttt cctcagctgt tacgctccac 3600 ctcgtcagcc caccgacgag ttcgagcgct tcattgaagc agtacagctc gagacgctta 3660 cccattcaca agtcgtcatt gccggcgact tcaacgcctg gcatgtggaa tggggaagcg 3720 agcgtaacag cgagaaggga gaagagctgc tcagtgccat ccagcagcta gacctggttg 3780 tgctaaatca gggcacgacg agcaccttcg acggcaacgg agcggcaaca gcgagtatcg 3840 ttgacgtggc gtttgcgaca ccaaccatcg cgcagccggg aacgtggaac gtgtgcggtg 3900 attactcgta ctccgaccac cggtacatca cgtacactgt tggcaccata gttcccgtcg 3960 taaacgagcc ctcatcacca cggatgagac atcaggggcg cattcgacac gcggatcggc 4020 ggtataaggc gacgcagttc tcgcagcgag ccttccgagc gcggttctca gaacgggcgg 4080 tcagtcacga acgcatggtt gagatcatgc tcgccacgtg tgacaaaacg atgcagcggg 4140 ttacaacgtc gcatagtgac ccccatcgtg acctgttctg gtggacgccg ctgctcaggc 4200 tgctgcgaga gaactgcgat cgcgcccgcg atcggatgcg gcagaccagc gatcttcaag 4260 agcggagcat tgccgcagcg gaacatcgca cagcgagggc ggagctgggg aaggcgataa 4320 aggccagcaa gaggaactcg ttccaggagc tgatcgatat cgccgaagaa aatgtgtttg 4380 gagccggata tctcgtcgtt ctgtcccacc tccgtggtgg acggacgcca cccgagacgg 4440 agcgggacag gctcgaacac atcgtgtccg atctcttccc ccagcacccg cccctcgtct 4500 ggccagaagc ggcagacatc gagggagagg agcagccagg agcagtagca gatgtttcgg 4560 acgatgagct caaactcatc gcacgtcgca tggccaataa aaaggccccg ggactcgatg 4620 gtatcccgaa tgcggcggtg aaagcagcca tcctcgagca cacgggggtt ttcacagcgt 4680 tgtaccagga ctgcctcgtt aacggcacgt ttcctgcagc gtggaagagg cagcgccttg 4740 tactcatccc gaagccagga aaaccctccg gagtgagctg ctcgtaccgg cccctgtgta 4800 tgttagatgc actgggcaag 
gtgcttgaac gcttgatcct gaacaggctg cacgagttcc 4860 tagaagatcc ggaatcaccg cgactgtcgg accggcagta tggtttccgc agagggtgct 4920 cgaccatcgg tctcattcag agggttgttg aggccggcca gcgtgcgatg tcgttcggtc 4980 gagcgaaccg acgcgacaaa cggttccttc tagttgctgc gctagatgtg aggaacgcgt 5040 ttaacacggc cagctggcag gccatcgcca ctgcgctgcg gacgaaacgt gttcccgccg 5100 gcctccaacg tatcatacac agctatttcc aggaccggga gctggtgtat gaaacctccg 5160 aaggcccggt agtgcggtcc gtcacggcag gggttccaca ggggtctatc ttgggcccca 5220 ccctgtggaa cacgatgtac gacggtgtgt tggacatcgc cctgccaccc gatgcggaga 5280 tcctggggta tgccgacgac ctggtgctgc tggtcccagg cacaacccca gacaacgtga 5340 aagctgctgc ggaagaggcg ataatatcag tgatggagtg gatggctcga caccacctcg 5400 agctggcgcc ggcgaaaacg gagatggtcg tgatctccag caccaaagcc ccaacgcgga 5460 tcaccgtccg agtaggtgac gtggacgtca cctcgtcccg ctcgatccgc tatctcggtg 5520 tgaccctcca ggacaagttg tcatggctgc cgcacgtcaa ggaggtcacc gagagggctg 5580 ggaagatcgc cgacgccaca tccagactgc tgcgaaacca tagcgaacca agggcatcga 5640 aagcgaagct gctagcttcg gtgtccgagt ccgttatgcg ttatgcagca ccggtatgga 5700 gcaaggagct gcaaaaacgt gagcctggtc gcctgctgga gcgtgttcag cgaaagatgg 5760 cactgagggt ggcacgagca ttccgtaccg tgaggtatga gactgccacc ctcctagctg 5820 gtctgacccc catctgcctg ctgttggatg aggacgcccg ggtctatcag cgactaagtg 5880 ccgtcaaccg caccgacacg agggcgaaca tccggaagca ggagcgacag gccacgatcg 5940 aacaatggca gcaacagtgg gacgcggaag ccgacaccag ccggcacacg cgttgggcgc 6000 accgtgtgct acccaacatc ggcagctggc agtcaaggaa acacggagat gtgtcgttcc 6060 atctgtgcca ggtactctcg ggacatggct tcttccggga ctacctgtgt cgcaatggct 6120 tcacatcgtc ccctgactgt cagcggtgca gcggcgtccc tgagaccgcg gagcacgcga 6180 tgttcgagtg cccgaggttt gctgaagttc gtcagcagct actcggcgag ggaattacgg 6240 acccggtccg tccggaaaac ctccagcagc acctgttgcg cgatgccgaa agctggagcc 6300 gtatctgtga agctgctaag cggataacgg cttcacttca gcaagcctgg gacgacgaga 6360 gagcagccct agcagcccat ggcaacgagc agcacttcga agaagttgcc gatctggagg 6420 cacggcgagc agaaatccgt cgagcacgga acgaccggcg aaatgcgagc cgccgagcag 6480 ccagggcacg gcaacgagag ttgcagcgag 
caggacgtcc cccatctcca ccaccatcgc 6540 ccagaactgc ggcacgtcgt gcagatcttc ggctgcggca agcgcggttt agagcgagaa 6600 ggcgtcaagc gatataggac gcgagacgcc tgtatgggga gcacagtcct gcatcaacat 6660 cgtcatcaag cagcagcgac gacgattcag acggccgagg aagcgcggat atcgcagccg 6720 gaccgtctgg aatgcgcaac cgcgcacatg aacgcgaaaa cgaagccacg gacggtggcc 6780 tgagtgctgc agaagaagcc gcagcggtcg aggcggaagt tgcctcccgc tagacgttct 6840 gccttcttta gaaaagaagc gtctactagc aagaacgagt gctctactaa gggagaatag 6900 aagttgaact gaattgaaca ataaaaaaaa cggaaggtgc atcttgcacg gaataggttg 6960 aggcaattaa gtctagcatc cccctgcagg gtacgccctc gcgggtaata atgtaggggt 7020 gagggagggt ctgaattcca ctgaataaag aaacccgctt tgaaaaaaaa aa 7072 // ID RTE-1_AG repbase; DNA; ANG; 3314 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 13-DEC-2002 (Rel. 7.11, Last updated, Version 1) XX DE RTE-1_AG is a RTE-like non-LTR retrotransposon - a consensus DE sequence. XX KW RTE; Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; endonuclease; RTE clade; AGRP1; RTE-1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1522-2432 RA Reiss A.R.; RT "A study of repetitive DNA elements in the malaria vector, RT Anopheles gambiae."; RL Thesis (1991). XX RN [2] RP 1522-2432 RA Reiss A.R., MacIntyre J.R. and Hagedorn H.H.; RT "A repetitive element of the Malaria vector, Anopheles gambiae."; RL Unpublished (1993). XX RN [3] RP 1-3314 RA Kapitonov V.V. and Jurka J.; RT "RTE-1_AG, a family of RTE-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 2(11), 24-23 (2002). XX DR [3] (Consensus) XX CC RTE-1_AG is a family of RTE-like non-LTR retrotransposons. CC The RTE-1_AG consensus sequence was reconstructed based on CC multiple alignment of ~100 copies identified in the CC sequenced portion of the genome. 
Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC RTE-1_AG occurred less than 1 million years ago. The CC RTE-1_AG family is composed of several subfamilies. CC The consensus sequence encodes a 972-aa RTE-1_AGp protein CC (positions 373 3288), which is composed of the AP endonuclease CC and reverse transcriptase domains. The 3' terminus is composed CC of the TAAG microsatellite. XX FH Key Location/Qualifiers FT CDS 373..3288 FT /product="RTE-1_AGp" FT /translation="MGSWNVRTLSEAGALKKLDDALATLSMDLVALQEIRW FT LGNGVHNRRGKHCYDIYYSCHDRHRVLGTGFAVGPRLKPAIMDFKAINDRL FT CTLRMRGKFFNISLINVHAPTEDKEEEEKDLFYGRLARIVDACPRHDLIII FT LGDFNAKVGREPMYRQYTGCHSLHEHSNDNGSRLVQFAAANNLVVGSTKFA FT RKKIHKITWAHPGGESFNQIDHVLISRRRQSSLLNVRTYRGANIDSDHYLV FT GLVIRCRIARPRANGGGENTQARLNTDSLRDIAVQQEFKTALEESLLPEDR FT YETTSERWNALKTKIINCARNILPPRRGNTKSGWFDDECRQVTERKNTAYR FT AMQQRHRTRACAEEYSRLRREEKRVHRSKKHALEEQNMRELEQTREAYGPT FT RKFYQAIAGHRNNVVPKVTCCRNKDGDLVSNQPEVLSRWAQYFDELLNDQF FT SEQLEAPLADNVMLLPPSIEETRKAIRRLKNNKAPGTDGIAAELVKNGGAR FT LENEIHQIVTEVWDSESMPCDWNLGIIYPIYKKGDRLDCNNYRGITVLNTA FT YKIFSLILQDRLVPHVEEIVGNYQRGFRNGKSTTDQIFTMRQILEKMAEYK FT NDTYHLFIDFKAAYDSIARVKLYDAMSSFGIPAKLIRLVRMTMTNVTCQVR FT VDGKLSGPFATTKGLRQGDGLACLLFNLALERAIRDSRVETTGTIFYKSTQ FT ILAYADDIDIIGLRLSYVAEAYQGIEQAAENLGLQINEAKTKLMVATSADL FT PINNPNLRRRDVQIGERTFEVVPEFTYLGSKVSNDNSMEVELRARMLAANR FT SFYSLKKQFTSKNLSRRTKLGLYSTYIVPVLTYASETWTLSKSDEALLAAF FT ERKMLRRILGPVCVEGQWRSRYNDELYEMYGDLTVVQRIKLARLRWAGHVV FT RMETDDPARKVFLGRPQGQRRRGRPKLRWQDGVEASAIKAGITDWQTKARD FT RERFRTLLRQAKTAKRL" XX SQ Sequence 3314 BP; 968 A; 825 C; 878 G; 643 T; 0 other; tgctctgtaa tgggatggga tccgaagccc cattgaggga taatacaggc tctcccatcc 60 aactcctatt ccgacacgtc ctcgtcgtgc agagtggtaa catgtgtcac cacatatcca 120 agcttgggta cacgggcttg acccaacccc ttgggcggat ggtggcatat ggcgaaccag 180 gaggggggtg ggtatcccgg gaaactgggt gcctgaacgt cgggggggca acccgacgct 240 aaacaaaacg gtcacgtggg cgcggggcca gtcacccagt cccatgaatc aacgactacg 300 gaaagcacca gaaattttcg 
aaacggacct aacgataccg accctacgca atgcccccgg 360 actactctaa aaatgggctc atggaacgta cgcactctaa gcgaagccgg agccttgaaa 420 aaacttgatg atgccctagc cacactgagc atggacctcg tagctctaca agagattcgg 480 tggctaggga acggtgtgca caacaggcgt ggtaagcatt gctacgacat atactacagc 540 tgccacgacc gccaccgcgt gctcggaacg ggtttcgccg taggtccccg gttgaaaccc 600 gcaatcatgg atttcaaggc tataaacgat aggctatgca ccctgcgcat gcgaggcaaa 660 ttctttaata taagcctcat aaacgttcac gcccctaccg aagataaaga ggaagaggag 720 aaggaccttt tttacggccg cctcgctaga attgtagatg cgtgccccag gcatgacctc 780 ataatcatcc tgggggactt caacgcaaaa gtcggtaggg agccaatgta ccgccaatac 840 actggctgtc acagtctgca tgagcacagt aacgataatg gtagtagatt ggtccagttc 900 gccgcagcga acaatctggt tgtaggaagt accaaatttg cgcgcaagaa aatccacaag 960 attacgtggg cgcacccggg tggagaatcc ttcaaccaga tcgaccacgt gttaataagc 1020 cgccgacgac agtcgagtct gttaaatgtc agaacatatc gaggagccaa tatcgattcc 1080 gatcactact tggttggctt agtgatacgt tgtagaatcg cccgcccccg cgccaatggg 1140 ggcggagaaa acacgcaggc tcggctcaac acggactctc taagggacat tgctgtccaa 1200 caggaattca aaaccgcttt agaagagtct ctactaccag aagacagata cgaaactacg 1260 agcgagaggt ggaacgctct aaaaacaaaa ataataaact gtgcaagaaa tatactccca 1320 ccacgtcgtg gcaacaccaa atctggctgg ttcgacgatg aatgcagaca agtgaccgaa 1380 cgtaagaata ctgcataccg agcaatgcag caacggcata gaacgcgggc atgcgcagag 1440 gaatattcac ggcttagacg cgaagagaaa cgagttcacc gctccaagaa gcatgctttg 1500 gaagagcaaa acatgcggga actcgagcaa accagagagg cgtacggacc gacacgaaag 1560 ttttaccaag cgatagcagg tcaccgaaac aacgttgtac ctaaggtaac ctgctgtcgc 1620 aacaaggatg gagatctggt cagtaaccag ccagaggtcc tctcgcggtg ggctcagtac 1680 tttgatgaat tactcaatga ccagtttagc gaacagctag aagcgccact agcagataat 1740 gtcatgctac tgccacctag catagaagaa acacgaaagg ctatccgtcg gctgaaaaat 1800 aacaaggcac ccggaaccga cggaattgca gccgaactgg tcaagaatgg aggtgcacga 1860 ctagaaaacg agattcatca aattgttact gaggtgtggg atagcgaatc gatgccttgt 1920 gattggaatc tcggcatcat ctaccccata tacaagaagg gagacaggtt ggactgcaac 1980 aactacaggg gtattacggt gttgaatacc gcctataaaa 
tattctccct gatccttcag 2040 gatcgccttg tcccgcacgt cgaagagata gtaggaaact atcaaagagg attccgaaac 2100 ggaaaatcaa ccactgatca gatcttcacc atgcggcaga tcttggagaa gatggctgaa 2160 tacaaaaacg acacatacca tctcttcata gacttcaaag ccgcatacga tagcatagcc 2220 agggtaaaac tgtacgacgc tatgagctca tttggaatcc cggccaaact gataaggcta 2280 gttagaatga ctatgaccaa cgtcacatgc caggtgaggg tggatggaaa actctcagga 2340 ccttttgcta ccaccaaggg tctgcgccag ggggacgggc ttgcctgtct cctattcaac 2400 ttggcgctag agagggccat ccgcgactcg agggtggaga ctacgggaac catcttctat 2460 aagtcaaccc agatcctggc atacgctgat gatatagaca tcattggtct gcggctctcc 2520 tatgtagcag aagcctacca agggattgag caggcggcag agaacctcgg attgcagata 2580 aacgaggcaa agaccaaact gatggtggca acatcagcgg acctaccaat aaataatcca 2640 aatctacgta ggcgtgatgt acagataggt gaacgcactt ttgaagtcgt cccagaattc 2700 acctatcttg ggtcaaaggt cagcaacgac aacagtatgg aagttgagtt gcgcgcaagg 2760 atgctggctg ccaaccggtc attctacagc ctgaaaaagc agttcacctc aaagaacctg 2820 tcgcgacgga cgaagctggg actatatagt acctatatag taccagtact cacatacgcc 2880 tctgagacat ggacactgtc caaatctgac gaagccctct tagccgcgtt cgagaggaag 2940 atgctcagaa ggatacttgg ccccgtatgt gtggaaggac aatggaggag ccgctataat 3000 gacgagctat acgagatgta cggcgacctc actgtcgtac agcgtattaa gctcgccagg 3060 ctccggtggg ctggccatgt tgtacgcatg gaaacggacg acccagcccg taaagtcttt 3120 ttaggccgtc cacaaggaca gaggaggcgt ggtaggccca aattgaggtg gcaagatggc 3180 gtggaggcgt ccgccattaa ggccgggata acggactggc agacgaaggc gcgagaccgt 3240 gagcggtttc ggacactcct gaggcaggcc aagaccgcaa agcggttgta gcgccggata 3300 agtaagtaag taag 3314 // ID PegasusA repbase; DNA; ANG; 3696 BP. XX AC . XX DT 16-JUN-2003 (Rel. 8.05, Created) DT 16-JUN-2003 (Rel. 8.05, Last updated, Version 1) XX DE PegasusA is a hAT-like autonomous DNA transposon - a consensus DE sequence. XX KW hAT; DNA transposon; Transposable Element; 8-bp TSD; KW Autonomous DNA transposon; MITE; Pegasus; PegasusA; KW HAT superfamily; transposase. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-159 RA Besansky J.N., Mukabayire O., Bedell A.J. and Lusz H.; RT "Pegasus, a small inverted repeat transposable element found in RT the white gene of Anopheles gambiae."; RL Genetica 98, 119-129 (1996). XX RN [2] RP 1-3696 RA Kapitonov V.V. and Jurka J.; RT "PegasusA: a family of autonomous hAT-like DNA transposons from RT African malaria mosquito."; RL Repbase Reports 3(5), 97-97 (2003). XX DR [2] (Consensus) XX CC PegasusA is a family of autonomous DNA transposons that belongs CC to the hAT superfamily. The PegasusA consensus sequence was CC built from 4 copies ~3% diverged from it. PegasusA encodes the CC 499-aa PegasusAp hAT-like transposase (exons 312-600, 1347-2053 CC and 2317-2820). CC A nonautonomous derivative of PegasusA was described as PEGASUS [1]. CC PegasusA is flanked by imperfect 16-bp terminal inverted repeats CC (4 mismatches). 
XX FH Key Location/Qualifiers FT CDS 0..0 FT /product="PegasusAp" FT /translation="MSKTSGEKRKLREIIAEGCVGFENGHFVCTIDEVNGC FT KYRQKNEKYEPGNFIRHIRSMHPDLAKSRGLLQEDGEVPIKKRKVSKVPVA FT IDRQKLLEGRPQENCSLPNHYLDLTEAIEKEFYQYITLVRCAVHTMQLSVV FT DVVKTFDGEIRKCTAVSNNCRKIMYKTVFNNESLPPLYSKTRWGGIFEMLN FT HFFKQEDFFNELGQQHSELGEIYYINEKVFPYLNISFFSDLTEQWDFIKEY FT VVAFVPVYISTKAMQAKHTSLNDFYLSWMKTILKVGSIPGNRFVEPLTKAL FT KNRLKNLKESIVFKAALLLDPRFNYLNSKFFNQEEKEEIRSFIISTSERIH FT LCKAKVQPSSSSSPSNAPNGTLNSSSDFDLYFTELYGGSLPEQNAEEIQTS FT ANKILRQLISLDAEAHQNSHFDVWNFWLMRKSTHPELYEVATLLLSVPSNQ FT VSVERAFSALGLVLSDKRTRMNDDTLENILLIKLNQPLLEKILPDLYEWNK FT EDI" XX SQ Sequence 3696 BP; 1274 A; 635 C; 645 G; 1141 T; 1 other; cagtgttgcg aacggtgata gtcgtcgaca caaaatttta caaaatgata ctttttgaag 60 cagcgcattc tagatacacg ctttcgttcg acaatcatgg aaaaaaatat catttgacac 120 gatgatattt ttttgtagta ttcatttgat tgtgattttc ccgggataca agcaacttag 180 gtcatgttga acgcagttgg ctttgcgttt gtcgtcctaa aatcagtggc gttagtgaac 240 agaacaagct tattctgagt gaaacaagga agtaaagaaa cacattccaa ttctaatatt 300 cttaaataaa aatgtcaaaa acatcaggag aaaagcggaa actccgtgaa attatagcag 360 aaggttgtgt tggtttcgaa aatggacact tcgtttgtac tatcgacgaa gtaaatggtt 420 gtaaatatag gcaaaagaat gaaaaatatg agccgggtaa ttttatacgc catatacgat 480 cgatgcaccc agatttggca aaatcccgag gattgctgca agaggacggt gaagtcccca 540 taaagaagag gaaagtttcc aaagttccgg tagccattga ccggcaaaag ctcctggaag 600 gtatttaata tatttaattt ttttttctgt ggcgattacg tcaaataata tttactcaag 660 ctttattaat aacaataagt attaactaaa tgttatttaa atattaaacg ctctctaaat 720 tgtcgtttct aaaggtatga tgaagctaat atgttgccat aacgtaccta tgatgttcgt 780 cgaatgggac ggcttagggt agatcctgaa acctatttgt gatgcgttga agatgaacct 840 caaccgtgct aatattgttt gtcatcttgg agctgctgct cgaaaaattc gccaggaact 900 taccacaatt ttgaaaggaa aattcttatg cttgaaaatt gactgcgcaa ctcgtcttgg 960 acgccacata ttggggatca acatccaata ttattgtgaa ctacaaaagg atgtcatcat 1020 ttatacaatt ggtaataatc ctcattctca agttaatctt tttaaaaaca cgcatgcgat 1080 ttgcagcgag tttcatgttt tacctatcgt gaaagtattg 
atgaagtatg aataactctt 1140 tgtttaaatt aaattcaata ggaatggttg agctgaataa tagacacacc ggaaaatttt 1200 tgaaaacaaa gatcctagaa atactcactc aatacgaaat ttcattggaa caaattttca 1260 cggtcacctg tgataatggt gcgaacatga tcgctgccgt taagcatctt caatcagatg 1320 cccaggtcat gttcaaccca ctwgaggacg cccacaagaa aactgttcac ttccgaatca 1380 ttatttagat ttgacagaag ccattgaaaa agaattttac cagtatatca ctcttgtaag 1440 atgtgccgtt cataccatgc agctttcggt agtagatgtg gtcaagacat tcgatggaga 1500 gattcgtaaa tgtactgcag tatcaaataa ctgccggaaa attatgtaca agactgtttt 1560 taacaatgaa tctctcccac cgctctactc aaaaaccaga tggggaggaa ttttcgaaat 1620 gttgaaccat ttttttaaac aggaggattt cttcaacgaa ctgggtcagc agcattcgga 1680 attaggtgag atttattata taaacgaaaa agtttttcca tacctcaaca tttcattttt 1740 ttcagattta acagaacaat gggattttat aaaagaatat gtagtggcgt ttgtgcctgt 1800 ctatatttca acaaaagcca tgcaagccaa acatacatca ctgaacgatt tttatttatc 1860 ctggatgaag accattttga aagtcggttc aattccagga aatcgtttcg ttgaaccgct 1920 tacaaaagcc cttaaaaatc ggttaaaaaa tcttaaagaa agtattgttt ttaaggccgc 1980 gcttctactt gacccaagat tcaattattt gaactctaaa ttttttaatc aagaagaaaa 2040 agaagaaatt cgcgtgagta acatatttca gacttcattc taatttctat attctacatt 2100 acataaaaca atattttaat tacaattact taatgttgcc agcttagcct ttaaaaaata 2160 tgtttgtgac acaaattaaa gcagtacaaa tcgagttgtc ttatgcgtta ggcaataagt 2220 tgatacgtgt gacgtggatc caagatgacc acaaataact caatatttaa attaattttg 2280 tttccatatt aaaatgatat ttacgtataa ttatagagct tcatcatctc tacatcggaa 2340 cgcatccatc tatgtaaagc taaggttcaa ccatcttcat catcatcacc atctaatgct 2400 ccgaatggga cactaaatag ctccagtgat tttgatttgt attttactga actttacgga 2460 ggatcattgc cagaacaaaa tgctgaagaa atacaaacga gcgcgaacaa gattctacgg 2520 cagctgatat cattggacgc agaagcccat caaaactcgc attttgatgt ttggaatttt 2580 tggttaatgc gaaaatccac tcatcctgaa ctatatgaag tagcgacatt acttttgtca 2640 gtaccatcaa accaggtatc tgtggagcgt gcttttagcg cacttggttt ggttttatct 2700 gataaacgga ccagaatgaa cgatgatacg ttggaaaaca ttttactgat caagctgaac 2760 cagcccttgc ttgaaaaaat tctacctgat ttatatgaat ggaataagga 
agatatttaa 2820 ttactttaag ggggttttaa ttggggttta atattgttac acctaatata atgaaaaaat 2880 ataacagtat aagtataaca ataataaaat aaataaaacc attttttaat tgaattggcc 2940 tttgttacat aagaatttca ttctgcagga aatctttttc aatatcggca tcataaaata 3000 aaaatataac gaaccctcct agaacatttt ttttaaccgg caagtccaaa aaatgcaaat 3060 gtaagcttaa actcacgcta acattcaatt acaactcaca tatattttaa caaaccacat 3120 tgagacattt caaagataaa aaagaaaaat cagaaagggg tgaaaaactc agagttgaaa 3180 aaccttcaaa aactcttcca tgtattcttt ttttatacgg ggccatggtg gcagggaatc 3240 ttggaagtgg gtgctgtttg tgttagccaa taaactacca agcagtttta taagcaacat 3300 ttcgtggatt tctgtaccca agttccttaa aattccactc aataaattgt tttgtaatgc 3360 ctgaaaaatg taccaatctg taaaattaat gaaattgatg ctatgttaaa agcgtacact 3420 gggaaagtta aaatataaaa tttttaaaat tttatgtgaa cacctacagc cgacatcaca 3480 tttcagcgat ttgtatcgag caaaaaaatg tcaaatgaca ttcattctac tctttttgca 3540 atcatgtcat cgcaaaagca ttgattgtta ttgcacaaat gattgtcgct cttggaaagt 3600 cgtgtcatgg taaccatgaa tatcatgatc aaaaaatgat tttgacagcc gcttactgcg 3660 aatatcataa aaaaatgtcg acaattaaca acactg 3696 // ID CR1-1_AG repbase; DNA; ANG; 5401 BP. XX AC . XX DT 13-DEC-2002 (Rel. 7.11, Created) DT 19-MAY-2005 (Rel. 7.11, Last updated, Version 2) XX DE CR1-1_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW CR1; Non-LTR Retrotransposon; Transposable Element; KW reverse transcriptase; endonuclease; CR1 clade; DNA/RNA-binding; KW CR1-1_AG. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5401 RA Kapitonov V.V. and Jurka J.; RT "CR1-1_AG, a family of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 2(11), 1-1 (2002). XX DR [1] (Consensus) XX CC CR1-1_AG is a family of CR1-like non-LTR retrotransposons. 
CC The CR1-1_AG consensus sequence was reconstructed based on CC multiple alignment of ~20 copies identified in the CC sequenced portion of the genome. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC CR1-1_AG occurred less than 1 million years ago. CC Integrations of CR1-1_AG have not produced target site CC duplications. CC The consensus sequence encodes two proteins: a 440-aa CC CR1-1_AG-ORF1p CC (positions 425-1745) and a 941-aa CR1-1_AG-ORF2p (positions CC 1746-4568). CR1-1_AG-ORF1p is a DNA/RNA-binding protein composed CC of the PDH domain (aa positions 3-57) and gag-like zinc knuckle CC regions (aa positions 334-442). CR1-1_AG-ORF2p is composed of CC the AP endonuclease and reverse transcriptase domains. The 3' CC terminus CC is composed of the CAT microsatellite. XX FH Key Location/Qualifiers FT CDS 425..1768 FT /product="CR1-1_AG-ORF1p" FT /translation="MECLKCSAVVGTSDDPIICSGSCGFIFHRRCITPTLN FT KPAVKLINENRNVVYMCDICLDQSAGLVHMDTDATKSNDLLAQTLRDLEAN FT VSVWISSALERGIETLKTELCAQVERKLETTLRETLSAIEASKMSKAALRA FT TSDTPQTSKTVQDVNLETWATVTKKRKRTNSGDSNVQTIINRFDEGNNKVT FT PKIKKINDVKEPMGKNKENNKTLVIVPKVVQSCDRPRADLSARLDPRKQQL FT SEFRNGRDGQVYAQCPALANLDSIRKEVEDILGDDYSTSLPMARVKIIGMS FT EKYSSSDLVDLLKSQNEGIPWKQENVIGMFESKIYKYQIHNVVLEIDHETD FT KCLAKLDKINIGFDRCKISRSIHVMRCFKCGQFSHKSTDCQNKEACSKCSG FT EHRTSDCTSSILKCVNCVLANTSRNLKLQVQHAANSYECPLFKKQVERRMQ FT LSQ" FT CDS 1746..4568 FT /product="CR1-1_AG-ORF2p" FT /translation="RDECNFLNSRGGVELGRFREILYFNVAGLSSNYAMFR FT ETVEKVQPLLVLISETHVTEKEAFEQFYLKGYRVVSCLSHSRHTGGVAAYA FT RSDVVLKVILNESLEGNWFLGVAVSRGMTVGNYSILYHSPSASDSRFVDIL FT EEWLDRFLDLSKLNIIVGDFNIDWLNVEKSAKLKSLMDSVNMNQKVNEFTR FT IARQSRTLIDQVYSSIDSIKVTTDPLLKISDHETLVLNINDERCKTIQRKV FT KCWNRYSKHALCNNVSQGLQCGASDFDEAADLLWNTLKHAMSTLVEEKTIV FT SRETSRWYTLDLARAKRKRDKVYKKFIRTNRDNDWSEYTKLRNSYSRDLKN FT RRSDFFSNEINKHKKNSKELWKVLKSMLQPDESCVSVVKFNGVIEADDSII FT CNKFNSFFVNSVLDINQNIASVSEPSYYVDSATPRCHFRFQKITLEQLKTI FT CFNLTKTAGIGNVSSTTIQDCYHVIGEDLLMVINQSLERGCFPKSWKESLI FT 
IPIPKVNGAANAEDFRPINMLHVLEKVLETVVKEQLVQFLNRNELLIREQS FT GYRQGHSCETALNLVLARWRVLMDRRESIVAVFLDLKRAFETISRPLLLST FT LRRFGIVGRELSWFESYLKERTQRTLFGSSVSEPIENTLGVPQGSVLGPIL FT FIMYINDMKQVLKACEINLFADDTVLFISHKEIKQAESLMNIDLNALDGWL FT KYKKLALNINKTCYMVMSAGVLEEPPSIVINSELIERVRQAKYLGVILDDR FT LKFHAHIDWVIAKVAKKCGVISRLAKDLDFFGKVHLYKSLISPHFDFCSSI FT LFLGNKGQIKRLQRLQNRIMRLILGCGRRTPSAVMLNILQWMSVEQRIVYQ FT TMTFIYKMLKGLLPGYLGESIVRGSDIHRHHTRRANEPRVPNLHSQSARNS FT LFFKGIQRYNSLPDEIKNARNLPDFKRKCVIYVEQTV" XX SQ Sequence 5401 BP; 1641 A; 926 C; 1288 G; 1546 T; 0 other; tgatgagtga ctgttgacaa gtgcagttcg ttgacaagtg tgactgttta gctctgtgtt 60 gtgacgtgaa ataagttgtg taaaatagag ctaagtgctc aaatttttag atacgtgtgt 120 agataaatgt tcgtggatgc tctcggtgtc agtgtgtttt gcaaaggtaa aagtgacctt 180 gttaaaaaaa caaaagattg gtgcggtcaa cagcgttccg ccgttggtgg cagtaaccga 240 tccttgttgt gtagataagt gtttgtaccg tttttttatc catgtatgtg tgacaccggg 300 agatacgtag agtagcatat tagtatagta gtgtttgtgt cgcttttggc cggcgtttga 360 aaccaggtaa gcgaaaaaaa acacactttt ataagacttc gcccgtattt gttttttaca 420 cggcatggag tgtttaaaat gctccgccgt ggtgggaacc agcgatgacc cgataatttg 480 ttcagggagt tgtgggttta tttttcaccg tcggtgtatt acacccacac tcaacaagcc 540 tgcggtcaaa ctaattaatg agaaccgcaa tgtcgtatat atgtgtgaca tttgtttaga 600 tcaaagcgcg ggcttggttc atatggatac tgatgcaact aaatcaaatg atttgcttgc 660 acaaacactg agggatttgg aagccaatgt gagcgtgtgg atttctagcg ctttagagag 720 aggaatcgag actctcaaaa ctgagctttg cgcgcaagtg gagcgtaagt tggaaacaac 780 tttgcgcgaa acattaagcg ctatagaagc ctcgaaaatg tcgaaggcgg ccttgcgtgc 840 aacttctgac actccgcaaa ccagcaaaac agtgcaggat gtaaatttag aaacatgggc 900 tacagtaacg aaaaaaagaa aaaggacaaa tagtggagac agcaatgttc aaactattat 960 taatagattt gacgagggaa acaataaagt tactcccaaa attaagaaaa ttaacgatgt 1020 gaaggagcct atgggaaaaa ataaagaaaa taataaaaca ctggttattg ttcctaaggt 1080 ggtgcagtct tgcgatagac caagagctga ccttagcgcc agattggatc cgaggaagca 1140 gcaattgtcg gaattccgca acggcagaga cggacaagta tatgcacaat gtcctgctct 1200 ggcgaattta gatagcatta gaaaagaagt agaagacatt 
ttaggagacg attattcgac 1260 atccttacct atggcacgcg ttaaaataat tggaatgagt gaaaaatatt cttcttctga 1320 cttagtagat cttttgaaat ctcaaaatga gggaattccc tggaaacagg agaatgtaat 1380 tggaatgttt gagagtaaga tctacaagta ccagatacat aatgtggttt tggaaatcga 1440 ccatgaaact gataagtgtc tggcaaaact tgataaaatc aatattggat ttgatcggtg 1500 taaaatttct agatccattc acgttatgcg ctgctttaaa tgtggtcaat ttagccataa 1560 aagcactgac tgccaaaata aggaagcgtg ttcaaagtgc agtggcgagc accgaacgtc 1620 ggattgcacc tcgtccatcc tcaaatgtgt aaattgtgtt ttggctaaca catccaggaa 1680 cctgaaacta caggtacaac atgcggccaa tagctatgaa tgcccgctgt ttaaaaaaca 1740 ggtagagaga cgaatgcaac tttctcaata gcaggggagg ggtggaatta gggcggttca 1800 gagagatttt atatttcaat gttgccggtc tttcatctaa ctatgctatg tttcgtgaga 1860 cagtagaaaa agttcaaccc ttgttggtct tgatctctga aacccacgta accgagaagg 1920 aggcattcga gcaattttat ttaaaaggat atagggtagt gtcgtgttta tctcattcac 1980 gtcacacagg aggtgttgca gcttatgcca gaagtgacgt tgtccttaaa gtgattttaa 2040 acgagtcatt ggaaggcaat tggtttctcg gtgtagcggt ttctcggggt atgacggtag 2100 gcaattatag catattgtat cactcaccta gtgcgagtga ttcgaggttc gtagatattt 2160 tggaagaatg gttagacagg tttttggatc ttagtaagtt gaacattatc gtcggtgact 2220 ttaatattga ctggttaaat gttgaaaaat ctgcgaaact gaaaagttta atggattcag 2280 taaacatgaa ccaaaaagtc aatgaattca cacgaattgc taggcagagc aggacattga 2340 ttgatcaggt ttacagtagt attgactcaa tcaaagtcac tactgatccg ttattgaaaa 2400 tatcggatca cgaaacactt gttttgaaca taaacgatga acgttgtaaa acgattcaac 2460 ggaaagttaa atgctggaat aggtattcga aacatgctct ttgcaataat gtgtcacaag 2520 gcttgcagtg tggtgcatct gattttgatg aggctgctga cttgttatgg aacacattga 2580 aacatgcaat gagcaccttg gtggaagaaa aaacaattgt ttctagagaa actagtaggt 2640 ggtatacttt ggatctcgca cgtgctaaac ggaaaagaga caaagtgtat aaaaaattta 2700 ttagaacgaa tagagataat gattggtctg agtatactaa acttagaaac agttatagta 2760 gggatctcaa aaatagacga agcgatttct ttagcaatga aataaacaag cacaagaaaa 2820 atagcaaaga gttatggaaa gtcctcaaaa gcatgttaca acctgatgaa tcatgcgttt 2880 cagttgtaaa atttaacggt gtgattgagg ctgacgactc catcatttgc 
aacaagttta 2940 actcgttctt tgtgaacagt gttttagata ttaatcaaaa cattgcttct gtcagtgaac 3000 ctagctatta cgtagatagt gctactccac gatgccattt cagatttcag aaaattactc 3060 ttgaacaact aaaaaccatt tgtttcaacc tgacaaaaac ggcaggtata gggaatgtaa 3120 gttcaacaac catacaggat tgctatcatg tgatcggaga ggaccttctt atggtgatta 3180 atcaatcact agagagggga tgttttccga aatcatggaa agaatcattg attataccta 3240 ttcctaaagt gaacggagct gccaatgcgg aagattttcg ccccataaac atgttgcatg 3300 tgctcgaaaa ggtgctggag acagtagtta aggagcaatt ggttcagttt ctgaacagaa 3360 acgagctgtt gatccgagag caatcaggat atcggcaagg acactcttgt gagactgctt 3420 tgaatcttgt actggcgagg tggagggtgt tgatggatcg gagggaatcg atagttgctg 3480 ttttcttgga tctaaaacgg gcatttgaaa caatatctag gccattgttg ctttctacct 3540 taaggcgttt tggtattgtg gggagggagc tcagttggtt cgaaagttat ttaaaagaaa 3600 gaactcagag aactttattt ggtagctctg tatcagagcc tatagaaaac acccttggtg 3660 ttccgcaagg tagtgttctt ggaccaattt tgtttatcat gtacatcaat gacatgaaac 3720 aggttttgaa ggcttgtgag atcaatcttt ttgccgacga tactgttttg ttcatctcgc 3780 acaaagaaat caagcaagca gagtctctga tgaatatcga tttaaacgct ctggatggat 3840 ggctgaagta caaaaagctg gcattaaaca ttaacaagac ttgttacatg gtgatgtctg 3900 cgggtgtatt ggaagaacct ccatctatcg taataaattc ggaactaatc gaaagagtta 3960 gacaggctaa atacctggga gttatcctag acgacaggtt gaagttccac gctcacattg 4020 actgggtcat cgctaaagtg gcaaagaagt gtggagtgat aagtagattg gcgaaggatc 4080 tcgatttttt tgggaaagtt catctctaca aatcattgat ctcgccacac tttgacttct 4140 gctcatccat tttgtttctt ggcaacaaag gtcaaattaa aagacttcaa aggttgcaaa 4200 accggattat gaggttaatt ctggggtgcg gtcgacgtac gccgtccgcg gttatgctga 4260 atattcttca atggatgtca gtagagcagc ggattgtgta ccagaccatg acttttatat 4320 ataaaatgtt aaagggcctg ttgcctgggt acctggggga gagcatagtt cgggggtccg 4380 atatccatcg gcaccacaca cgcagggcaa atgagccgag ggtacctaac ttgcattccc 4440 aaagtgccag aaactctttg tttttcaaag ggattcaacg gtacaacagt ctaccagatg 4500 aaattaagaa tgcgagaaac ttgccggatt tcaaacgtaa gtgcgtcata tatgttgaac 4560 aaactgtata atgtgaaata tgtgtagatg tcccatgtca ttatgtaatg tgtaactgca 
4620 gttgtcatca cgatctttat gatgatgata agatttttct ttatatacta taattaaaat 4680 tagaaaaaat ataagaaaga gtcaacatag gtttgagaca cgcgcgcgta caagtggata 4740 ttcgggatct atttggggga atttacggtt tgcacacggt ctgggaggtc acaacaggat 4800 tcactcattg atgattgtaa gtggccaatt ccagatacat taggttcacc tgcttcggaa 4860 ggtagtgccc tagagccaat cattggcact agtggaactg gccatatgca tgttgcagtt 4920 gtgttctgat aagtagtgcg ccggatccat tagttggttc ttggcgagct acgtaaggga 4980 tacattagat tctcctgctt cggaaggtag tgccctggag cctgccattg gtgccagaga 5040 ccctggggcc agcggttggc actagtggaa ctggccatat gcatgtcgca gttgtgtcct 5100 gataagtggt gcgccagatc catttattgg ttcttggcgg gctacgtaaa ggatgctagg 5160 gaataccatt gtattcgatg gagtatgacc cgttttttct tgatgaaact atgttggtca 5220 tcttgggtgt gtgtatgact cggacagttc ctctgagagt tttccgatgc caccacttgc 5280 tcgtacaaaa tattttaata tacctatcag agtaatatta tcgtaaagat acttccgtcc 5340 ttctcaaacc tatgttgggg aaagaggtgg gacttatcat catcatcatc atcatcatca 5400 t 5401 // ID BEL16-I_AG repbase; DNA; ANG; 5510 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL16-I_AG is an internal portion of the BEL16_AG LTR DE retrotransposon - a consensus sequence. XX KW 5-bp TSD; BEL16-I_AG; BEL16-LTR_AG; BEL16_AG; Bel clade; KW LTR retrotransposon; RING Zn-finger; integrase; peptidase; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5510 RA Kapitonov V.V., Pavlicek A., Drazkiewicz A. and Jurka J.; RT "BEL16_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 39-39 (2003). XX DR [1] (Consensus) XX CC BEL16_AG is a young family of Bel/Pao-like LTR retrotransposons. CC BEL16-I_AG, an internal portion of BEL1_AG is flanked by CC BEL16-LTR_AG CC LTRs. 
The BEL16-I_AG consensus sequence was reconstructed based CC on CC multiple alignment of 18 copies; they are ~1% divergent from CC the consensus sequence. CC The consensus sequence encodes a 1736-aa BEL16_AGp Bel-like CC protein CC (pos. 231-5438). CC BEL16_AGp is composed of the peptidase A16 (pos. 145-304), RING CC Zn-finger (pos. 335-400), reverse transcriptase and CC integrase domains. XX FH Key Location/Qualifiers FT CDS 0..0 FT /product="BEL16_AGp" FT /translation="MSVSEDFAGFSDASGSNNTMQYTQAEDLDMIILKRER FT DRVVQSLTRIDTFLAQYKESDFPELTPRLDLLNERWREFQALARNIGAKDY FT SEDNDVLYGEIEDKVMLLKGKLLGKLRAGAEIPSVKQERVCDDYNSVRLPQ FT LTLPQFSGKYDEWLPYHDMFVVTVHENEKLSQVEKMLYLKGSLKGEALKVV FT DTLQACNSNYDVAWDALKKRYSNEYILKKRHVNAMLQWPRMKVMNTVGIHG FT LIDCFERNLQILKQLGEVTEQWGCLIIQIIISKLDESTQQKWERHVEESEQ FT KTVTDLLNFLRTQTRIMDAFAVDRPMAAGKSTSERRVASNVAAEAKCAKCD FT GSHMVENCDSFRSLTLPRRREVVEAKKLCLNCLRQGHFQAKCWSRARCNIC FT NRKHHSLLHGENVSETIGPSVREELPSTSGTQNVVVNASSNRECTSVLLST FT AIVSVRACGNKWLSARALIDSGSQVNLMTKGLAARLKLPQYESKTALSGVG FT HSKVDITTSVTTVIRSKSCNYQERMQFLVLPRISSYRPVTGNQISRQNLPM FT NFVLADPNFDSDAEVDLLLGSEFYATFLKPDNRGKIRLELPALPTFISTVF FT GWVATGKVPLASEGSNYVTCGTCTRLDDLIERFWIIEEIREPLQHSQEERD FT CEAHFVQTHQRDSEGRYIVKLPFKCDLQNQLGPSSAIARKRFLQLERRFNR FT DPWLKQKYTAVINDYIDKGILVKVAANPDSEEAHGHYFLPHHPVIKASSTS FT TKVRPVFDGSASTDGGKSLNDLLMTGPVIQENLLALLLKFRMRSVALVADI FT KQMYLQVKVHPDDTRFQRVLWRGSSVESIEVYELQRVTFGLAPSSFLAIRV FT LQQLAIDEGENFPLARQALLEDFYVDDYIGGASSEEEAVRLQAELTLLLRK FT GGFHLTKWNSNKPDVLSSVSAEDRATSNVKMFEVPEEPIKTLGIAWLPESD FT QLYIDSNIQMNNESWSRRKVYSLVARIYDPLGLVAPVTSWAKINMQSLWLA FT TDDWDEEIPAVMQERWYAFQSQLGLLKEVKFSRHAVVHNPVAVQLHCFSDA FT SEAAYGACVYVRTIGSSGEVVVELLAAKSRPAPLKRVSLARLELCGALLAA FT RLQKVVRQALRIPDVETFMWTDATIVLHWIRAPSHSWATYVANRVSEIQEL FT THGYKWMHVKGVDNPADIVSRGAMPNELLASKLWFHGPGWLQLSEEEWKKN FT ASGVLAIPEEELLERRKSSLVAAVSSESDDWCDRFSNYDKLLRITAYCMRF FT IRCCQRKLDPKHKGVLLVSELAEAKIRLVKREQRIYFAAEIKELSAGQTVR FT PKSSLKTLGAFLDGDGLLRVGGRLHRAKAMQVCSRFPLALPKKSRFTRLMA FT EYYHRLALHGGPTATLSALRREFWPIQGRSLVNSVCRGCLVCFRMNPALVQ 
FT QPPGQLPVSRAMPARPFSIVGVDFCGPIYLKPVHRRAAAEKAYISIFVCFS FT VKAVHIELVESLSTHAFLAAFRRFVARRGLPSEVYSDNGLNFQGASKVIDD FT FYTLMNSDSAVEDISRYAVGAGVKWHFIPPHAPNFGGLWEAAVKAAKRVLL FT KVVGDRQLAFGEMSTVLAQVEAQLNSRPLTPLSEDPEELDVLTPGHFLIGA FT PMNALPEPDVGDVPINRLKRYEELRRVVQNHWARWRREYFSELHNEHQRGK FT AVVELKVGQMVLLKEDGKTPHHWPMGRIAEVFPGPDGVVRVVSIRTRNGLY FT KRPANRISLLPFERVN" XX SQ Sequence 5510 BP; 1469 A; 1034 C; 1539 G; 1468 T; 0 other; ttggtgccgt gaccaggatg gtctaatatt tgcggtttgg tttgactggt taaagtgaag 60 aactttgtgt gatttctttt ggcaaaaatc gcgtaacgcg tgtgaaatct gaaagttccg 120 tatcgagcgt tggtggattt ctgaataaga aaaaatcgcg aaagcgtgag aaatcattgt 180 gctagtgggt gttctttgcg tgtgtatgca ttttgtgtcg cgtattagaa atgtcggtta 240 gtgaagattt tgccggtttt tctgacgcgt cgggttcgaa caatacaatg caatacacac 300 aagcagaaga tctggatatg attatcctaa agcgcgagcg cgaccgagtc gtgcaatcgt 360 tgacaagaat cgacacgttt ttggcgcagt acaaagagag tgatttccca gaattgacgc 420 ctcgtctcga tctattgaac gagcgttgga gggagtttca ggctttggct cgaaatattg 480 gcgcaaaaga ttacagtgaa gataacgatg tgctttatgg tgagattgaa gacaaagtca 540 tgctattgaa aggaaaatta ttgggaaagc tgcgggctgg agcggagata ccgagtgtga 600 aacaggaacg cgtgtgcgat gattataata gtgttcgttt gcctcagttg acgcttcctc 660 aattttccgg aaaatatgac gaatggctcc cgtatcacga catgtttgtt gttactgttc 720 atgagaacga gaaattgtcg caggttgaaa agatgcttta tttgaagggt tctctgaaag 780 gtgaagcgct gaaggtggtg gatactttgc aagcctgcaa ttcgaattat gacgtagctt 840 gggatgcttt gaaaaagcga tattctaacg aatatatttt gaaaaagcgt catgttaacg 900 cgatgctaca atggccacgg atgaaagtca tgaacacggt gggtattcat gggcttatcg 960 attgtttcga aaggaactta caaattctaa agcagttagg tgaagtgacc gaacagtggg 1020 gatgtttgat tattcagata atcatttcaa agttggacga aagcacccaa caaaagtggg 1080 aaaggcacgt tgaagaaagt gagcaaaaaa cggtgacgga tttgttaaac tttttgcgca 1140 cacagacgcg cataatggac gcgtttgcag tggataggcc aatggcggcg ggcaaatcta 1200 caagtgaacg tcgtgttgcg tctaatgtgg ctgcagaagc aaagtgcgca aaatgtgatg 1260 gatcgcacat ggtggaaaat tgtgactcgt ttcggagttt aacgttgccg cgtcgtcgtg 1320 aagtggttga agctaagaag ctttgcctta 
attgtttgag gcaagggcat tttcaagcca 1380 agtgttggtc acgtgcgcga tgcaatattt gcaatcgtaa acatcattcc cttctacacg 1440 gtgaaaatgt aagtgaaacg atcggtccgt cggttcgtga agaacttcca tctaccagtg 1500 gcacgcaaaa tgtggtggtg aacgcgtcat cgaacagaga gtgcacttct gtgttattat 1560 caacggcgat agtgagtgtg cgtgcgtgtg gtaacaaatg gttgtcagca agagcattga 1620 tcgatagtgg gtctcaggtg aacctgatga caaagggctt ggctgcgcgg cttaagttac 1680 cgcagtacga aagtaaaacg gcattatcgg gagttggaca ttcgaaagtg gacataacga 1740 cgtcggtaac gactgtcata cgttccaaaa gttgtaacta tcaagaacgt atgcagtttc 1800 tagtgttgcc gagaatttct agctacaggc cggtaactgg aaatcaaatt agcaggcaga 1860 atctcccgat gaattttgtg ctcgcggatc ctaatttcga tagtgatgct gaagtggatt 1920 tattgttggg ctccgaattt tacgcgactt ttttgaaacc ggacaaccgt ggtaaaatta 1980 ggctcgagct accagcgctc ccaacattta ttagcactgt ttttggatgg gttgcaactg 2040 ggaaagttcc gctggcttct gaaggtagca attatgttac ttgtggcacg tgtactaggt 2100 tggacgattt aattgagcgt ttttggatta ttgaagaaat acgtgagccg cttcaacata 2160 gtcaggaaga aagggattgc gaagcacatt ttgtgcaaac tcatcagcgg gatagtgaag 2220 ggagatatat agtgaagctg ccatttaagt gtgacttaca gaaccaattg ggaccgtcca 2280 gtgcgatagc gagaaaacgg ttcttgcagt tagaacgacg tttcaatcgg gacccatggt 2340 tgaaacaaaa gtatacggcg gttatcaacg actacatcga caaagggatt ttagtcaagg 2400 tggctgcgaa ccctgattct gaggaagcgc atggtcatta tttcttaccg catcatccgg 2460 taatcaaggc gtccagtacc agtacaaagg tacgacctgt gtttgatgga tcggcctcta 2520 ccgacggtgg taagtctctt aatgatttgt taatgactgg tcctgtgatt caggagaact 2580 tgttggcgtt gttgctgaaa tttcggatga ggagcgtggc attagtggca gacataaaac 2640 agatgtatct gcaagtcaag gtgcatcccg acgacactcg ttttcaacgt gtattatggc 2700 gaggctcatc tgttgagtcc atcgaagtct acgagttgca gcgggtcaca tttggacttg 2760 ccccgtcttc ctttctggcc attcgagtac tgcagcaatt ggcaattgat gaaggggaaa 2820 actttccctt ggcgagacag gcgttgttag aggacttcta tgtcgatgac tacattggtg 2880 gtgcctctag cgaagaagaa gcagttaggt tgcaagctga gctgacgcta ttgttgagaa 2940 agggcggatt tcatctaact aaatggaatt ctaacaaacc agatgtttta tctagcgttt 3000 cagcggaaga cagagcaaca tccaacgtta aaatgtttga 
agttccagag gagccaataa 3060 aaactctagg tatcgcgtgg ctaccagaat cggaccaact gtacatagac tcgaacattc 3120 agatgaacaa cgagagctgg tcccgtagaa aggtttactc tttggtagca cgtatatacg 3180 accctttggg gctggtggct cctgtgacat cttgggccaa gataaatatg caatcgttgt 3240 ggttggcaac tgatgactgg gatgaagaaa taccggctgt catgcaagaa cgatggtatg 3300 cttttcaatc acaactcggg ttgctgaagg aggttaagtt ttcgcgccat gctgttgtgc 3360 ataatcctgt tgctgttcaa ctacattgct tttcggatgc atctgaagcg gcttatgggg 3420 catgcgtgta tgttagaacg attggtagca gcggggaagt ggtagttgag ttgcttgctg 3480 caaagtctcg tcctgcgccg ctgaaaagag tcagtttggc ccggttagaa ctttgtggag 3540 cattactggc agcaaggttg cagaaagtgg tacgccaagc gttgagaatt ccagacgtgg 3600 aaacctttat gtggactgat gcgacaatcg tgttgcattg gattcgagca ccatcccatt 3660 cttgggctac gtacgtagcg aatagggtat ccgaaattca ggaattgacg catggctaca 3720 aatggatgca cgtgaagggc gttgacaatc ctgccgatat tgtatcgcgt ggagctatgc 3780 cgaacgagct gttagcatcg aagctgtggt tccatggtcc cgggtggtta caactatcag 3840 aggaagaatg gaagaagaat gccagcggtg tgttggcaat tcccgaagag gagttattgg 3900 aacgaaggaa gagctcattg gtggccgcag taagtagcga gagcgatgat tggtgtgata 3960 ggttttccaa ctatgacaaa ttactgcgga tcactgcgta ttgtatgaga tttattcgtt 4020 gttgccaacg aaagctggat cctaaacaca aaggtgtttt gttggtgagc gagctagcag 4080 aggcgaaaat tcgactggtg aaaagagaac aacggatata ctttgcggct gagatcaagg 4140 agttgtctgc tggacaaacg gtacgtccca aatcatcact gaagacatta ggagcttttt 4200 tggacggtga tggtttgctc cgagttggtg gccgcttgca tcgcgctaaa gccatgcaag 4260 tttgtagcag atttccgttg gcgctaccca agaagtcacg atttactagg ctaatggcag 4320 aatattatca tcgattggca cttcatggtg ggccaactgc aacattgagc gcactcagga 4380 gagaattttg gccaattcaa ggacgatctt tggtcaatag tgtttgcaga ggctgtctgg 4440 tatgcttcag gatgaatccc gcgttagttc aacaaccacc aggacagcta cccgtgtcgc 4500 gtgctatgcc agctcgacca ttttcgatcg taggggttga tttctgtgga cccatttact 4560 tgaagccggt gcatcgccga gcagcagctg aaaaagcata tatttcaatc tttgtgtgtt 4620 tttcagtaaa ggctgttcac atcgagcttg tggagtctct atcaactcat gcatttctag 4680 cggcgtttcg tcggtttgtg gcaagacggg gtttgcccag cgaggtctat 
tccgacaacg 4740 gtctcaactt ccaaggagcg agtaaggtga tcgatgactt ctacacgttg atgaacagcg 4800 attcggcggt ggaggatata tcgaggtatg ctgttggcgc tggcgttaag tggcacttca 4860 tcccacccca tgcaccgaac tttggcggcc tttgggaggc agcggtaaaa gcggcaaaac 4920 gcgtcctact gaaggttgtt ggtgatcggc agctggcgtt tggggagatg tcgacggtac 4980 tggcacaagt ggaagctcaa ctcaacagca gaccgcttac accgttgtcg gaggatccgg 5040 aagaacttga tgtattgacg ccggggcatt ttctaatcgg ggctccgatg aacgctctac 5100 cggagcctga cgtgggtgat gtaccaatca atagattgaa gcggtatgag gaattgcgta 5160 gagtggtaca gaatcattgg gcgcgttggc gtagggaata ttttagcgaa ctacataacg 5220 aacatcaacg cggcaaggca gtagtagagc taaaggtagg acaaatggtc ctgttgaaag 5280 aggatgggaa gactcctcac cattggccaa tgggacggat tgctgaggta tttcctggcc 5340 cagatggcgt agtgagagtc gttagtatca ggactaggaa cggcttgtat aagaggccag 5400 cgaataggat tagtcttctt ccgtttgaga gagtgaatta gatatcataa agtcaggcat 5460 tttgtgaagt aatggaaaga ggtaaatttg gtaaatttag gtggccgcta 5510 // ID AGRP1 repbase; DNA; ANG; 871 BP. XX AC L11898; XX DT 28-SEP-1995 (Rel. 1, Created) DT 28-SEP-1995 (Rel. 1, Last updated, Version 1) XX DE Mosquito repetitive sequence. XX KW Repetitive sequence; AGRP1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-871 RA Reiss A.R.; RT "A study of repetitive DNA elements in the malaria vector, RT Anopheles gambiae."; RL Thesis (1991). XX RN [2] RP 1-871 RA Reiss A.R., MacIntyre J.R. and Hagedorn H.H.; RT "A repetitive element of the Malaria vector, Anopheles gambiae."; RL Unpublished (1993). XX DR GenBank; L11898; Positions 1 871. 
XX SQ Sequence 871 BP; 263 A; 209 C; 213 G; 186 T; 0 other; ctcgagcaaa ccagagaggc gtacggaccg acacgaaagt tttaccaagc gatacgaggt 60 caccgaaaca acgttgtacc taaggtaacc tgctgtcgca acaaggatga agatctggtt 120 agtaaccatc cagaggtcct cttgcgatgg gctcagtact ttgatgaatt actcaacgac 180 tagttcaacg agcagctaga agcgcactag ccgatagtgt catgcactgc cacctagagg 240 aaacacaaag gcataccgtc ggctgaaaaa tccaccggca cccggaaccg accggaattg 300 cagcgaactg gtcaagaatg gaggtgcacg acgtagaaaa cgaagtaatc atcattactg 360 aggtgtggga tagcgaatcg atgccttgtg actggatctc ggcatcatgt accccatata 420 caagaaggga gacaggttgg actgcaacta caggggtatt acgatgttga ttaccgcgta 480 taaaatattc tccctgatcc ttcagtatcg ccttgtcccg cacgtcgaag agatagtagg 540 aaactatcaa agaggattcc gaaacggaaa atctaccatt gatcagatct tcaccatgcg 600 acagatcttg gagaagatgg ctgaatacag acacgacaca taccaactct tcattgactt 660 caaagtcgca tatgatagca tagcctgggt aaaactatac gacgctatga gctcatttgg 720 aatcccggcc aaactgataa ggctagttag aatgactatg accaacgtca catgccaggt 780 gagggtggat ggaaaactct caggaccttt ttgccaggct tgcctgtctc ctattcaacc 840 tggcgctaga gagggccatc cgcgactcga g 871 // ID MinoAg1 repbase; DNA; ANG; 5660 BP. XX AC AB090816; XX DT 14-SEP-2005 (Rel. 10.09, Created) DT 14-SEP-2005 (Rel. 10.09, Last updated, Version 1) XX DE Anopheles gambiae retrotransposon MinoAg1 DNA, complete sequence. XX KW Non-LTR Retrotransposon; Transposable Element; MinoAg1. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5660 RA Kojima K.K. and Fujiwara H.; RT "Evolution of target specificity in R1 clade non-LTR RT retrotransposons."; RL Mol Biol Evol 20(3), 351-361 (2003). XX DR EMBL/GenBank/DDBJ; AB090816; Positions 1 5660. 
XX FH Key Location/Qualifiers FT CDS 590..1954 FT /product="MinoAg1_1p" FT /translation="MPSALRSGGLPAHRLSSSLELKQKKSATGTNLPSSPE FT MLILRQNLEETRKKNESLQEQLTQLRWLMEEKLREQREDAQRREEEARRRE FT EAAKADNEKLRVEQQETHTTLIAISAQLRDLQQKNQMKRQQQHQPPQQPGP FT STSAVSLRNVEVQAQPEEDIDHSSFVEVVRRKPRGINSGKSSSQQREQQQR FT SLQQQQQQQQQQQQQQQEQQQQQQQQRKIRRPKADLIEVVPQEGLTWDSVY FT RKVRDTVRDDPAHKNLEEHIGMGKRTRADLLRIELSRSADSTLVLQEVQEI FT IGGSGVARVVTEMTELLVTHIDPLAEEQELKAALKEELQVNAGVTAVSMWQ FT LFDGMKRARLCLPTKAAKQLAGRKLRLCGCISSIMEAMPVSVDRQRCYRCL FT ERGHLARDCQSPVDRQQACIRCGADGHYAKSCTSEIKCAACNGPHRIGHIS FT CARPAARCLH" FT CDS 1756..5358 FT /product="MinoAg1_2p" FT /translation="MLPLSGKRPSGARLSISRGSPTGVYSVRRRWSLCQKL FT HFRDQVCCVQRSPPHWPYLLCSSCSAMPALKILQLNVDHCREGQALALQAA FT REHRADILLLSDMHRPPENNGRWAYDVSKKVAVVATGSYPIQRVWGSELRG FT LVAAQVAGITFISVYAPPSLSAHEFERLLGAIELEAASHSRVVVAGDFNAW FT HEEWGSGRNGQRGIELLQTVETLGLSILNQGRKPTFIGRGFAASSVIDVSF FT ASREIVRPDTWSVIPFSRSDHELIAFEVKQPDENPAGAQQHLSHRPQRSTR FT KNPAGRQHDRCDSRRWKTTQFNRQSFRVALRANNFQERAVSHIGMIEALVD FT ACSETMQRISGLHKSPHHDMYWWTPAIKRLRDDCLAARERVQLSHDLEERS FT MAAAAHRTAKSLLEKAIRTSKRQQFQELIDIAETNDYGTGYRVVMSRLRGS FT PGPPETDRAELERIVSDLFPTHPPVSWPVSSDAPTTVPSDSRVEPQELLSI FT AAGMVTRKAPGLDGIPNAAVKVAIEEYTEEFCRLYQDCLSRGTFPPQWKRQ FT RLVLLPKPGKPPGESSSYRPLCMLDALGKVLERLILNRLNRHIEQQDSPQL FT SDAQYGFRRGRSTISAIQRVVDAGTAAKLFRRTNTRDKRCLMVVALDIRNA FT FNTANWQAIADALQSKNVPSYLMKIIGAYFEGRKLLFDTSEGPVERHISAG FT VPQGSILGPTLWNMMYDGVFGVGLPLGAEIIGYADDLVLLVPGTTPTTAAA FT AAEEAVAAVKQWLLEHRLELAHSKTEMTVISSLKQPPEEVFITFGGVNVPF FT SRSIKYLGVRVHAHLSWVPHVKEITLKATRIVHAVNRLMPNLHGPRTSMSR FT LLANVADSTMRYAAPVWHEAIGNQECCRLLRRVQRKSAIGLARTFRTVRYE FT TAVLLAGLLPICLAIKEDTRVHSRRGTGLNCAIRREERQRSMEEWQATWGA FT DAANERASRYVRWAHRVIPDVGSWQSRKHGDVTFHLSQVLSGHGFFREHLC FT GMQLTSSPDCTRCPGVAESAEHAMFECPRFDSTRTELLHGVVPETLLEHML FT QSPENWSNVCEATKRITSALQQDWDETRRELAEQGAPRVADNQHNQDNDRT FT SLYSARNTSEEQRGRRHPTPSPPPRAVGRRAEVRSLGERYRRQLLVVEERR FT QISGHSVGGVTSQEDGTLVEASRQGMNGAEKTAATEADVASR" XX SQ Sequence 5660 BP; 1507 A; 1467 C; 1639 G; 1047 T; 0 
other; ggaagctgtt ggatgctgtc aggagttgtt tacaaggacg aacgtcaaaa tacgagctcc 60 tatataatag cggaaaatcg tgtgaaagcg tcaataagtt gtgcggaata gtgcggaacg 120 tgcttagtaa gcctagggcg cttgcccgtg aagtggagat tgccagctag gtatagagaa 180 gtgcaagaga acagcatcgt acggattcgg cgccggcgac cccggctggg gcggtctcaa 240 gccggtgaca gttcaggacg gcgacagccc aatcaccaat catcaacaat tcccaagcag 300 aaaatttgaa ttggccgttg gtggtgtaag atcggagaat catcttaata cgttcctcgt 360 atcacggcct agaagttaaa tctggtggat gaggaaaatt ggcacgttca ccaaaaatac 420 tcccactatc ccccatcaca ccccccccat aatccccttt actcggtgtg ccttaacacg 480 gtcttaacag gggcggtaaa gaccttcctc gatctcggtg aggcaaacac agatagtact 540 tgagaaaaaa aaaagaaaaa cccaaaacaa agctggaagg gtaagaataa tgccgtcggc 600 actccgctcc gggggtttac cggctcaccg gctgtccagt tcgttggaat taaaacaaaa 660 aaagagtgcc acgggaacca acctcccttc atcacccgag atgttgatat tgcggcagaa 720 tctggaagag accaggaaga aaaatgagtc tcttcaggaa cagctaactc agttgagatg 780 gctcatggag gagaagctcc gcgagcagcg agaagatgca cagcgtagag aggaagaagc 840 gcgtcgcagg gaagaggccg caaaagccga caatgagaag ctgcgggtgg aacagcagga 900 gactcacact acattaattg caatatcggc acagttgaga gacctgcaac agaagaacca 960 gatgaaaagg cagcagcaac atcagcctcc ccagcaacca gggccatcga cgtcagccgt 1020 ctcattgcgg aacgtagagg tgcaggctca accagaggaa gacattgacc acagctcgtt 1080 tgttgaggta gtgcgccgca agccccgcgg gataaacagc ggcaagtcct ctagtcagca 1140 acgtgagcag cagcagagat cgcttcagca gcagcaacag cagcagcaac agcagcagca 1200 acagcagcaa gaacagcagc agcaacagca acagcagcgg aagatacgtc ggccaaaggc 1260 agacctcata gaagttgttc cccaggaggg ccttacatgg gatagtgtgt accgcaaggt 1320 acgcgacaca gtgcgagatg acccagcaca caagaatctc gaagaacata ttgggatggg 1380 taagcgcacg cgagcggacc tccttcgtat agagctcagc cggtcggcag actccactct 1440 agtgctacag gaagtgcagg aaataatcgg agggtctgga gttgctcgtg tcgtaacgga 1500 gatgacggaa ctactagtta cccacattga cccacttgcc gaggagcagg aattaaaagc 1560 agctctcaaa gaagagctgc aggttaacgc tggcgtgaca gctgtaagca tgtggcaact 1620 ctttgatgga atgaagcggg caagactttg tttgccgacc aaagcagcca aacagcttgc 1680 cggacgaaag ttgagactgt 
gcggttgcat cagcagtata atggaagcca tgcccgtctc 1740 ggtagatcga cagcgatgct accgctgtct ggaaagaggc catctggcgc gcgattgtca 1800 atctcccgtg gatcgccaac aggcgtgtat tcggtgcggc gcagatggtc actatgccaa 1860 aagctgcact tccgagatca agtgtgctgc gtgcaacggt ccccaccgca ttggccatat 1920 ctcttgtgct cgtcctgcag cgcgatgcct gcactaaaaa ttctacaact caacgtagac 1980 cactgtcggg agggccaggc cttagcacta caagcagcgc gagagcaccg tgctgacata 2040 ttgcttctgt ctgatatgca caggccgcct gagaacaacg ggagatgggc gtatgatgta 2100 tccaagaaag tagcggtagt agccaccggc tcgtacccta tacagcgagt gtggggcagc 2160 gagctacgcg gactagtcgc tgctcaggta gccggtatca cattcatcag tgtatacgcc 2220 cctccaagcc tatcagcaca tgagtttgag cgactattgg gggccattga gttggaagcc 2280 gcgtctcatt cccgcgtcgt agttgcgggg gacttcaatg cctggcacga agagtggggc 2340 agcgggagaa acgggcagcg tgggatcgag ctactgcaaa ctgtggaaac actgggactg 2400 tcgatcctca atcaaggtcg caaaccaacc ttcatcggac gaggtttcgc ggctagtagt 2460 gtcattgatg tctcgtttgc gagtcgggag attgttcgcc ccgacacctg gtcagtgatc 2520 cccttctcga ggtcggatca tgaattaata gcgttcgagg tcaaacaacc agatgagaac 2580 ccggccgggg cgcaacagca cctgtcgcac cgaccacaaa ggtcgacacg caagaatcca 2640 gcaggccgac agcatgaccg ttgcgatagt aggcgatgga aaacgaccca attcaaccga 2700 cagtcatttc gagtagcact acgcgccaac aacttccagg agagagcggt gagccatatt 2760 ggcatgatcg aggcacttgt cgacgcctgc agtgaaacta tgcaacggat ctctgggctg 2820 cacaagagcc cacatcacga catgtattgg tggaccccag cgatcaagcg cctacgtgat 2880 gattgccttg ccgcgcgaga gagagtacaa ctgtctcacg atttggagga gaggagcatg 2940 gcggcagcag ctcaccgtac agcgaaaagt ctgctcgaga aggccatccg taccagcaag 3000 cgccaacagt tccaagagct gatagacatt gccgagacca acgattacgg aaccggttat 3060 agggtggtga tgtcccgact gcggggtagc cctgggccgc ctgaaacgga tcgcgccgaa 3120 ctggagagga ttgtctctga cctgttcccc acgcacccac ccgtatcatg gcctgtctct 3180 tcagatgctc caacgaccgt cccatcagac agcagagtag aaccgcaaga actactatct 3240 attgctgccg ggatggtaac gaggaaggcc ccaggtttgg acgggattcc gaacgctgcg 3300 gtgaaagtgg cgatagagga atacacggag gagttttgtc gcctgtacca ggactgtctc 3360 tctcgcggca ccttcccgcc gcagtggaaa 
agacagcgac tggtactact cccgaagcct 3420 ggcaaacccc caggagagag cagctcgtat cgaccgctgt gcatgcttga cgcacttgga 3480 aaggtactgg agcggctcat tctcaatcgc ctgaatcgtc acatcgagca acaagactca 3540 ccgcagctgt ctgatgccca gtatggattc cgccgaggac gttctaccat cagcgcgatc 3600 cagcgtgttg ttgacgcggg cacagcggcc aagttgttcc gccgcacgaa cacccgcgat 3660 aaacgctgcc tgatggtggt ggcactggac atccgcaatg cattcaacac tgccaactgg 3720 caggcaatcg ccgacgcgct gcaaagtaag aatgtcccgt catacctgat gaagatcata 3780 ggagcctact ttgaaggacg caagctgctg tttgacacta gcgaaggccc tgtcgaacgt 3840 cacatcagcg caggagttcc acaggggtcc atactgggcc ctacactgtg gaatatgatg 3900 tatgacgggg ttttcggagt tgggctgccg ctgggggcag agatcattgg ctatgctgat 3960 gacctggtgc tattagtccc aggcacaact ccgacaacag cagcagcagc agcggaggaa 4020 gcagtggcag cagtgaaaca gtggcttctc gagcaccgct tggaactggc tcattctaag 4080 acggagatga cggtgatctc tagcctcaag cagcctccag aggaagtttt tatcactttc 4140 ggcggagtga acgtgccgtt ctcgcgctcc ataaagtact tgggggtgcg tgtacatgcc 4200 cacctatcat gggtacccca cgttaaggag ataactctga aggccacgcg gattgtgcac 4260 gccgtcaatc gactcatgcc aaacctccat gggccaagga cctcgatgtc tcgcttgctg 4320 gcaaatgtgg ccgactcgac catgcgctac gcagcacctg tatggcacga agcgattggc 4380 aatcaagagt gctgcagatt acttcgtcgg gtgcaacgta agtcggcaat tggcttggcc 4440 agaacgttcc gaacggttcg ttacgagact gcagtgttgc ttgcgggact cttgccgatc 4500 tgcctggcaa tcaaggagga cacccgagtg cacagtcgcc gtggaacagg tttaaactgt 4560 gcaatacgga gggaggagcg ccagcggtcc atggaagagt ggcaagcaac gtggggcgca 4620 gacgccgcca acgaaagagc cagcagatac gtcagatggg cacaccgcgt aattccggac 4680 gtgggatcct ggcagtcacg gaaacacgga gacgtcacgt tccacctatc ccaggttctt 4740 tccggccacg gttttttccg ggagcacctg tgcggtatgc agctcacgtc gtccccggac 4800 tgtacgcgat gccccggcgt tgcggagagc gcagaacatg ctatgtttga gtgtccgcga 4860 ttcgactcga cccgaacaga gctgctgcac ggagtcgtcc cggaaacgct gcttgaacac 4920 atgctccaga gcccagagaa ctggagtaat gtatgtgagg ccaccaagcg gataacatca 4980 gcgctacagc aggattggga cgaaacccgg cgagagctgg ccgaacaagg cgccccacgt 5040 gtggccgata atcaacataa tcaagataat gaccgcacct 
cgttgtacag tgcgaggaac 5100 accagcgaag agcaacgggg tagacgtcac ccaacaccat cacctccacc cagagcggta 5160 gggaggcggg cagaagtccg atctcttggg gagcgctatc gcaggcagct actggtagtt 5220 gaagagcgtc ggcaaatatc cggccacagt gtcggtggag ttacctcgca ggaggatgga 5280 acattggtcg aggcgtctcg acaaggcatg aacggtgcag aaaagactgc agcgactgag 5340 gcagacgtgg catcacgcta gatgatgttg gtgatctgtg gaggatcttg ggaggaccaa 5400 agtatggtca aaagggcgat ctccgcatcc tgagcggcag ggaggacaaa acgaatatga 5460 aagggaaaga aggggggaaa taaacatgtg ttaggtgcct ggcgcacggg aaagagaggc 5520 tccgaggagc agtaaaagcc ctccctcata gacccctcgc ggggcaagag ggaagggagt 5580 gggcgaggac aggggatata atgtgtaaat caataagaac aataaaacct gtccgtgatc 5640 taaaaaaaaa aaaaaaaaaa 5660 // ID Outcast repbase; DNA; ANG; 6411 BP. XX AC . XX DT 20-JUL-2009 (Rel. 14.07, Created) DT 20-JUL-2009 (Rel. 14.07, Last updated, Version 1) XX DE Outcast is a non-LTR retrotransposon - consensus. XX KW Outcast; Non-LTR Retrotransposon; Transposable Element; KW Nonautonomous. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-6411 RA Biedler J. and Tu Z.; RT "Non-LTR retrotransposons in the African malaria mosquito, RT Anopheles gambiae: unprecedented diversity and evidence of recent RT activity."; RL Mol. Biol. Evol 20(11), 1811-1825 (2003). 
XX DR [1] (Consensus) XX FH Key Location/Qualifiers FT CDS 1633..5223 FT /product="Outcast_1p" FT /translation="MEHLNILQTNIQSIRKNRDELTHILHDQKYHVACLQE FT TWLKNEDKITIKGFNTIRTNREDGYGGSCILIKKGIKYKPIKLIDESDEIQ FT ITTILIHSGNLIIISIYIAPNTTKQIIKDTLAKITHNTQNYTNIIIAGDFN FT AHHTYWGDDRIDQKGNTIIEEIDNSNLIILKNNTYTYVPTDHNKRQTSIDL FT TIISKKIHNVINKTILEKHIGASNHKIIKITIKKHSTEPIQHTIINMNKVI FT TNIKHMKGKDIINIKHFTKKVYKIIQNSKQKINFTLKSWWNNEIKKALQDK FT NIARQTFNSTKLIEHAIEFRKRTAIFKLKIKQAKEKQTERNMEKINKDTSS FT KELWNLLGNISNINTTNKESNLIHNEETYAKEFMNLNFTNNKKSKFRTFFS FT TNLPIDTELINSDIWKNILKQKKNTTPGLDKITYQMLRNINDESLTQIITD FT GNKMWENGKINKELKQIKIIAIPKPKKNINDPNNYRPIALIPTITKVLNSA FT VLLKLNKYIETKNILPEKSFGFRNKRTINQCINYFNNEINHNQRLNRISGA FT IFIDLEKAYNNVSTKILVEQMIKSDIPPQIIKWTYSFLRNRTLIIESNKKT FT YKMLVTDGLPQGDVMSPTLFNLYTKDIHKAINQTNTKNTIIQYADDFVILS FT SGVNERELQNNLQNALNAFAIETEQLKFNINSNKTKFMIFGKSHHTIQLNI FT HNNSIEHTNTYKYLGTVIDPKLNFHKHIELLRNKATNRLNFLKIISTQKNN FT INPKNSLKIYRATIRNTMETGASYTLNSNKNKYKTMNSTINQALRKATGCT FT KTTPINTLHAIAAEIPFNIRSRFIVRKELAKDLVYSPIIRQQLQIHKRTKY FT KKKKKTIHETTYEKDNKILKQLYIAKNNDITTHIKINTKIKDTIETKTKTN FT TQILKQITIENMNKYENNRPTIYTDGSIAGNKVGIGIYIKNKPQHYYYSYR FT LKNFTSITTAELIAIEKALLLAQENNIDNPVIFTDSLTSCNILQKAMTENK FT IEEICYNIIKTATNIKADIVWIPSHVGIDGNDRADELAKKGTLTNWFYKNK FT IRYTDATQYYKKEMEEETKKWYTDLGRNKGKKFMQFQNVFKTELWHKQVNL FT NGNEVKTINKILAGHDLSEFWLHKMKIVENGTCEKCLVPETGRHKIFECQK FT YKRHKDISLDTLYEKWMETSGNTCKKIIEFIKENNITL" XX SQ Sequence 6411 BP; 3004 A; 1161 C; 837 G; 1409 T; 0 other; gatgactcct acacaaaccg ttgcacagag cagatcaatc ctatcaatag tttaaaaaaa 60 aaaaaaccct ttttcaaaca gaaccgataa atacccaatt ttggagtggc gaaaataaaa 120 acagtttaaa aacaaagtgt tgtgttgttc ctcttttcaa agtatcctta ctaaaaaaaa 180 atagatacat acatacagag tggaaaaaaa atcaagcaaa gtggtagaga ggggagagga 240 agatgggggt catagacccg ggaggaaata agacattcgg gtcttatttc agcatcttgg 300 gggaaacatc aaacaaaaac accaaaaaaa gaagaaaaat aagaggtgag cgtatcaatt 360 taaatttcag tgaaacagtg aagaatgaag aggtgtttat ggtgttggag agtaaaactc 420 ccggcaaaag tgtggcatgc tacaacccat ttttgttcgc 
aaaggcaact gaaatggcga 480 ttggatgtag gccattacaa acgtccatgc ttagagacgg caaaatactg atgcgtgtta 540 aaaatgaaac cgaagcaaaa aaactaaaaa atattaattt aaaacacggt gattgtgcca 600 tcgaagtgga tgtttacgaa cacaaaactt taaatcaaag caagggaata atcagaagtg 660 atgcgtgcag gtttttgagt gaagaagagc ttctagaagg tcttcaacct caaaatgtat 720 cagaggtata cataatgaag agaaaaagcc aagatggtgt tctcagaaac acaagaaccg 780 ccatcataac tttcaaatct acagtgcttc ccagaaacat agaaataggt tacttcaccg 840 aaaaagtaga attatttatc ccaaatccaa tgcggtgcat gaaatgcatg ctctttgggc 900 atacaaaaaa taattgcaat cgacaaaaaa tttgcgccaa atgtggagaa aactttcatg 960 agaattgcac aaaccctcta aaatgcacac aatgcggtga aaaccactca tctttagaca 1020 aagattgtcc agtatggaaa gacgaggtgg aaataaaaag aattcaaaca gaaaaaaaaa 1080 taacaataaa agaagcaaga aaaattcgaa gggaaacagt gccagctatt cctagaattt 1140 atataagaga aaaattttca tcaatagtga gaggcaatca aacgcctata atgaacgaaa 1200 aacgaaaaat agacgaaaca agcgaaaata atagtaacac accaaaatca actgaaaata 1260 caccaacaaa aattaacagc acaacaaata acaagtacct agataataac acagaaccca 1320 aaaaaaatac gaatttaaca acaattacca atgatttaga aaaccacaca gaaaaggaaa 1380 aacacacaga aatcgaaaac acagaaacac aaacgaatcc aaacgcaaca gatgaaagtg 1440 ccaccaatat cacttactat agtgatactg aaatgtatct caatgagata gacattcaag 1500 gaacaatcga aatagactac aaagaaacat cacagtaaaa attacaaaaa ttaaaataaa 1560 aaccctactt atatagccca ttacacaaac aatcctaata atacttaaca ttttctctat 1620 aaataattaa acatggaaca cctaaatata ctacaaacta acatacaaag catacggaaa 1680 aatagggacg aactaacaca catactacat gatcaaaaat accatgtagc ttgcttgcag 1740 gaaacatggc tcaaaaatga agataaaata acaattaaag gttttaacac aatacgaaca 1800 aacagggaag atggatatgg aggaagctgc attctaatta aaaaaggaat taaatacaaa 1860 ccaattaaat taatagacga aagtgatgaa attcaaataa caacaatact gatacactct 1920 ggaaatctaa ttataatatc aatatatata gctccaaata ctaccaagca gataataaaa 1980 gacaccttag caaaaataac acataacaca caaaattaca caaacataat aatagctggc 2040 gattttaatg cacatcatac atactgggga gatgatagaa tcgatcaaaa aggaaacaca 2100 ataatagaag aaatagacaa ttcaaacctc attatcctaa aaaataacac 
atacacttac 2160 gtcccaacag atcacaacaa gagacagacg tccattgatt taacaattat atcgaaaaaa 2220 atccataatg taatcaacaa aactatttta gaaaaacaca taggggcaag caaccacaaa 2280 ataataaaaa taacaattaa aaaacactcc acagaaccga tacaacatac tataataaac 2340 atgaataagg taataacaaa cattaaacat atgaaaggaa aagatataat aaacataaaa 2400 catttcacaa agaaagtata caaaatcata caaaatagta aacaaaaaat caattttaca 2460 ctaaaatcat ggtggaataa cgaaataaaa aaggccctgc aagacaaaaa tatagcaaga 2520 caaacattca acagtacaaa acttattgaa catgccatag aattccgcaa aaggacagca 2580 atcttcaaac ttaaaataaa acaagcaaaa gaaaaacaaa cggagagaaa catggaaaaa 2640 ataaacaaag acacaagtag taaagaacta tggaatctgt taggaaatat tagcaacatc 2700 aataccacca ataaagagtc aaacctaatt cacaatgaag agacctatgc taaggaattt 2760 atgaatttaa atttcacaaa caataaaaaa tccaaattta gaactttttt cagtacaaat 2820 ctaccgatag atacggaact tataaattca gatatatgga aaaacatact gaaacaaaaa 2880 aaaaatacaa cacccggttt agacaaaatt acataccaaa tgctaagaaa cataaacgac 2940 gaatcactaa cacaaatcat aacagacgga aacaaaatgt gggagaatgg taaaataaac 3000 aaagaactaa aacaaattaa aataatagca attcctaaac caaagaaaaa tataaatgac 3060 cccaacaact atagacctat agcacttata ccaacaataa ccaaagtcct aaactcagcg 3120 gtattattaa aattaaataa atatattgaa accaaaaata tactcccgga aaaatccttt 3180 ggattcagaa acaaaagaac cataaaccaa tgcattaact atttcaataa cgaaataaat 3240 cacaaccaaa gactaaatag aataagtgga gcaatattca tagatttgga aaaagcatat 3300 aataatgttt caacaaaaat cctcgtggaa caaatgataa aatcagacat cccacctcaa 3360 attataaaat ggacctactc ctttcttaga aatagaactt taataataga aagcaataaa 3420 aaaacataca aaatgttagt gacagatgga ctaccacaag gagatgttat gtcaccaaca 3480 ctttttaact tatataccaa agatatacat aaagctataa accaaacaaa caccaaaaat 3540 actataatac agtacgcgga tgatttcgtt attctcagca gtggagtaaa cgaaagagag 3600 ctacaaaata acctacaaaa cgctttaaat gctttcgcaa tagaaacaga acaactaaaa 3660 tttaacataa actcaaacaa aacaaaattt atgattttcg gaaaatcaca tcacaccata 3720 caactcaaca tacacaataa tagcatagaa cacacaaaca catacaaata tctaggaaca 3780 gtaatagatc caaaactcaa tttccacaaa cacatcgaac tactacgtaa caaggctaca 
3840 aacagactaa atttcttgaa aattataagc acgcaaaaaa ataacataaa ccccaaaaac 3900 agcctaaaaa tataccgggc aacaatcaga aacactatgg aaacgggagc ttcatacact 3960 ctcaatagca acaaaaacaa atacaaaaca atgaattcaa ctatcaacca ggcactaaga 4020 aaagctacag gttgcaccaa aactaccccc ataaatactc tgcatgccat tgcagcagaa 4080 atacctttca atatcagaag tagatttata gtacgtaagg aactagccaa agatttagtc 4140 tactcaccaa tcattagaca acaactacaa atacataaaa gaacaaaata caaaaaaaag 4200 aaaaaaacaa tacatgaaac aacgtacgaa aaagacaata aaattctaaa acaattatac 4260 attgcaaaaa acaacgatat aaccacacac atcaaaataa atacaaaaat taaagacacg 4320 atagaaacaa aaacaaaaac aaacactcaa atcctgaaac aaattaccat tgaaaacatg 4380 aacaagtacg aaaataacag acccaccatc tacactgatg gaagcattgc cggaaataaa 4440 gtaggtatag ggatatacat aaaaaacaaa ccacaacatt actactatag ctatagatta 4500 aaaaatttca cctccataac caccgccgaa ctgatagcaa tcgaaaaggc cttactttta 4560 gcacaagaaa acaatataga caacccagta attttcacag acagtctaac atcatgtaac 4620 attctacaaa aagcaatgac agaaaacaaa atagaggaaa tctgttacaa tattataaaa 4680 acagcaacaa atattaaagc agatattgta tggattccat cacatgtagg cattgatgga 4740 aatgatagag cagatgaact ggcaaaaaaa ggcactttaa caaattggtt ctataagaac 4800 aaaatcagat acacagacgc tacccaatat tacaaaaaag aaatggaaga ggaaacaaaa 4860 aaatggtaca cagacctggg caggaataaa gggaaaaagt ttatgcaatt ccaaaatgtt 4920 ttcaaaacag aactatggca caaacaggtt aacctgaacg gaaatgaagt aaaaacaatt 4980 aacaaaatac tggcaggaca tgacctttcc gaattttggc ttcataaaat gaaaatagta 5040 gaaaatggaa cctgtgagaa atgtctagtc ccagaaacag gacgacacaa aatttttgag 5100 tgtcaaaaat ataaaagaca taaagacatc tccctagaca ctttgtacga aaaatggatg 5160 gaaaccagtg gaaatacctg caaaaaaatt atagaattca taaaagaaaa caatattaca 5220 ttatagtact tcatcacaca tatcttcaaa taaaaagaaa aattcaacaa ttttttttac 5280 acatttccta tatcttcaaa actcacaaca tagaacctgc actcaacact ccagcaaaaa 5340 tgatatagca ataggataag tacttactat ttatttatct atttattata tttattacat 5400 ctattgcatt tattatattt attctattta ttatatttat tatatttatc atatttatgt 5460 gttcaaagta tcaattttgt tcgtataatg aaacgttatt ccatgttttt ttttccttgt 5520 
tagtcaatta tttaaaacta tttattcaat catttctcag caccctattg atttatttat 5580 ttatttatat actgtttttg gcatgactta tatatataca caaaaaatca cctaactgca 5640 cattccaaat ccaaacacca caccacaaaa aaaataacaa taacaataat aaaatttaat 5700 aaattcaata ataaatccga ttttacacgc atcagggggg cctcctaagt ttgggtactc 5760 cgtttgttct tgcaaacatg agaaatccaa agacatccat ttccatcact agaggataga 5820 aataagatgc caaaaggacc aactgaaagg gccttgtatc aaatcccccc cccccaaaaa 5880 aaaaaaaaaa aatccaaaat aaaaaaaaaa ataaaaaaaa aaaaaaccct tttggaacct 5940 tcacaaaaca agccatttca aaaataataa taataaaaat aaaaataaac aacccaatac 6000 aagaacaaga accggtacaa aaaaaaaaaa aacaaaattc ctcatcctct cccaagaaat 6060 aaaatatata tatatataca aacaccacaa aaaaaaaaga agaaaatcaa atccactaaa 6120 ccacaagcga acttaaagat ggaacaacgc tagcttttga atacaaactg aaaaggaaac 6180 acaaaccatc aaaaagaaaa aacgattggt gatatcacga aaaccaacac actgtgcatc 6240 aatcagaaag atctcaacca caacatataa acatcaacga aacagcaata taaacatcaa 6300 aggagaatgg atgatttccc ctccaaaaac tggtagtcaa acctctgatt gtaccttcaa 6360 aaacggatgg tataggaggc ctcaacctac caaccctatt aagaagaaga a 6411 // ID BEL12-I_AG repbase; DNA; ANG; 5887 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE BEL12-I_AG is an internal portion of the BEL12_AG LTR DE retrotransposon - a consensus sequence. XX KW 5-bp TSD; BEL12-I_AG; BEL12-LTR_AG; BEL12_AG; Bel clade; KW LTR retrotransposon; RING Zn-finger; integrase; peptidase; KW reverse transcriptase. XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-5887 RA Kapitonov V.V., Pavlicek A., Drazkiewicz A. and Jurka J.; RT "BEL12_AG, a family of Bel/Pao-like LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 31-31 (2003). XX DR [1] (Consensus) XX CC BEL12_AG is a young family of Bel/Pao-like LTR retrotransposons. 
CC BEL12-I_AG, an internal portion of BEL12_AG, is flanked by CC BEL12-LTR_AG CC LTRs. The BEL12-I_AG consensus sequence was reconstructed based CC on CC multiple alignment of 20 copies; they are less than 1% divergent CC from CC the consensus sequence. CC The consensus sequence encodes a 1726-aa BEL12_AGp Bel-like CC protein CC (pos. 582-5759). CC BEL12_AGp is composed of the peptidase A16 (pos. 130-250), RING CC Zn-finger (pos. 358-410), reverse transcriptase (pos. 700-900) CC and CC integrase (pos. 1450-1600) domains. XX FH Key Location/Qualifiers FT CDS 582..5759 FT /product="BEL12_AGp" FT /translation="MPAADKRVKMFNLKRVEIMNTLQDFEEFTKSFDATID FT AYQIPSRLEQLEELVSEFTELRKAFNETVDDSEAFDIMQKDRREFNKRSHE FT VRAFLLKNSSHSGASSGLNTTQVNTTISAGTQNHLRLPKVDLPSFDGEITK FT WLTFKDRFSSMVHDSTEMPEVLKLQYLLSALKGDAAHQFEHMQITADNYYV FT TWEALLKRYDNSKVLKREYFKAFYSLEKMKTDSTEELARIVNEANRLVRGL FT ERLNEPVDKWDTPLTSLLFYKLDSKTLVAWEQYSVDFKTDEFTNLVEFLEQ FT RVNILKSSAQNICNQYSANSIMVTGRQARRDGRNVALPVQQTNNTFKGYLK FT CPLCNEQHPLHVCERFERASVINREEIVRKHGLCFNCLRKGHSARECRSTY FT VCQQCKRKHHSKLCKIGRLSEVEVVPSTSRLTATAQANCSKKTVILSTAQI FT IILDVNDQPYKVRALLDNGSQLNFITERVAQELRLKRARVSEQIAGVGGAI FT MRVAGSVVGTIRSLTTEYTTCLEFLILPKIATDLPSETMDVRGWKLPKDVR FT LADPTFHERGSIDMLIGADTFVEMIKAKKIKLDHELPTLLETELGWIVSGA FT YKHNNLNQSMACTIVSQGGENDIASLMNTFFNIEEVQDQNLWNVEERECED FT HFQATTRRDENGRYVVRLPLKAERELGESKEVALRRLIGLERRFEREPKVK FT EAYEAFMQEYITLGHMSVRENENSSDGYYMPHHAVFKQDSTTTKCRVVFDG FT SCKTSNGRSLNDILKVGPTIQQDTTDILLRWRRRAIAVVGDVEKMYRQVWV FT HEEDRKFQRILWRSHSSEKIKTYELNTITYGTASAPFLAIRTLNQVLEDNK FT EKYPLAASRINDFYVDDFISGADSENEAKQLCEETKAALAMGGFPLRKWAS FT NCPHILPSETEIDNIQRVIELKSREGAVSTLGLVWNPILDTLGVKISEPET FT CEIYTKRSIIRTIAKIYDPLGIVDTVKAKAKQFMQRVWSLKKENGDSYGWD FT EEIPQQMRQEWEVFERQLTHLQEVQVPRCVTIVGARNIQIHGFCDASEEGY FT GACVYVRSTNGEEIVSRLFVSKSKVTPLATKHTIARLELCAAHLLGKLLVK FT LKRATEDPYETFCWTDSSTVIYWLKSSPSRWKTFVANRVSQIQNATKEFEW FT RHVPGIHNPADAVSRGRNPAEVVEDKLWWHGPDWLVKDPEHWPKNIESGNT FT CETAKEEKQTKTTLTCMVKEESFINKLCERVGSFTKLKRIVAYCHRFFDRK FT 
RIHRKSYFELRELKRAEKTIIRLVQNEVYATEYECIKQGQQVVRKSPLRVI FT RPILDKDNVMRVGGRLSNADIKDEQKHPVIIPGKHRIAELIADKYHKILRH FT AGAQLMINTMQLRFWIVGARNVAKRTVFNCVKCTRCRPKLIQQPMADLPEQ FT RVRQARPFSISGVDYAGPIMVKGTHRRAVPTKGYISIFVCFVTKAVHIELV FT SNLTSSAFLAALRRFVARRGHVTELHSDNGTNFRGANNKLRELYKLLNSDT FT HQDEVVGWCAERDMKWKFTPPAAPHFGGLWEAAVKSMKFHLKRVLGTGHLT FT FEDLSTLLAEIEACLNSRPITAISEDPNDMEALTPGHFLVGNHLQTVADVD FT IADVPTNRLNHWRLIQKHMQHIWNRWHREYLSTLQKRAKWNKNAISIEPGR FT LVILQEDNVAVSKWPMARVVDLHPGKDGVTRVVTLKCANGKEIRRPIHRIA FT PLPIES" XX SQ Sequence 5887 BP; 1982 A; 987 C; 1423 G; 1495 T; 0 other; tttggtcctt cgaaccggat ccgaattacg gatattcttt gctagttttt tgtggtttat 60 gctatttggt taccgagcgc caagtgggag tgcaaagaaa tattggattt cttagcagaa 120 cggtgtgttc tccactccat tcgtgttgtt gctactgctg caggagccgg acactcatcg 180 ttggtttgtg tgtgtgagag agagaggacg tgaagccacg aaaggagctg cgtatctgtg 240 ggctgtcggt aaacagcaaa agacgaactg tgcgacgacg aaaggtgcga agttgtaata 300 cggtgaatag ctgaatttag tgcgtgcgtg aactttccat taataacata tttgtggata 360 attagtgtca gcgcagaaca ttttcgaacg cgtgagtgtc gttcaaccgt agtagtggag 420 tgggtgtgcg tgagtgacta tcctagcgcc atacttggca cccaaagctg cgcgcgtatt 480 agttcaggtg tcgacaactt ccaggggtac ggaactcatt gctgtgcgtt ttattaaaaa 540 ttgtgttaag aagtgcctaa agaacaatta ttgggagcaa aatgccagca gcagataaac 600 gagtgaaaat gttcaattta aagagggtag aaattatgaa cactttgcaa gatttcgaag 660 agtttacgaa atcctttgat gcaaccatcg atgcatatca gatacctagt cggttggaac 720 agttagaaga gttggttagt gagttcacgg aattacgtaa agcattcaac gaaacggtag 780 atgattcgga agcgttcgat atcatgcaaa aagatcggcg tgaatttaac aaacggtctc 840 acgaagtaag ggcattttta ttaaaaaata gttcccattc tggggcgtcg agtgggttga 900 acactacaca ggttaacaca actattagtg caggaactca aaatcatctg cgccttccta 960 aggttgacct tccaagcttt gatggtgaaa taacaaaatg gcttacgttc aaagacagat 1020 tttcgtctat ggtgcatgac tcgacagaaa tgcctgaagt gttaaaattg caatatttat 1080 tatcggcgct taagggtgat gctgcgcatc aatttgaaca catgcaaata acggccgata 1140 attattatgt gacatgggaa gctttgttaa aacgttatga taattctaag gtgttaaaaa 1200 gggaatattt caaggcattt 
tattctctag aaaaaatgaa aaccgactcg acggaagaat 1260 tggcacgtat cgtgaacgaa gcaaatagat tagtcagagg gttagaacgt ttgaacgagc 1320 ctgtcgacaa gtgggacact ccgttaacaa gtttattgtt ttacaaattg gacagtaaaa 1380 ctttagtggc gtgggagcag tactcggtgg atttcaaaac agatgaattc acaaatttag 1440 tggaattttt ggaacagcga gtgaacattt taaagagctc tgcgcaaaat atttgcaatc 1500 aatattcggc taattcgatc atggtgaccg gcaggcaggc gagaagagat ggtaggaatg 1560 tggcattacc agtacagcaa acgaacaata catttaaagg gtatctcaag tgtccactgt 1620 gcaacgaaca gcatccgttg catgtgtgtg agagattcga aagagcgtca gtgataaatc 1680 gagaggagat agtaagaaaa catggcttat gttttaattg cttgcgaaag ggacactcag 1740 cacgtgagtg tagatcgacg tatgtgtgcc agcagtgtaa aagaaagcac cattcgaaac 1800 tgtgtaagat aggaagatta tctgaagtgg aagtggttcc gtcaacgtca agattaactg 1860 ctacggctca agcaaattgt tcgaagaaaa cagttatatt gtctaccgcg caaattataa 1920 ttctagatgt taacgatcag ccatacaaag tgagagcatt actcgataac ggctctcaat 1980 taaatttcat cacggagaga gtggcacaag aactcagatt gaagagagcc cgcgtgagtg 2040 aacagatagc tggtgtgggt ggagctatta tgagagttgc aggatcagtt gtgggtacca 2100 ttcgatcact caccactgag tacacaacat gcttagaatt tttaattttg ccaaaaattg 2160 ctaccgattt accatccgaa acaatggacg tacgaggttg gaagttacca aaagatgttc 2220 gattagcgga ccctacattc catgaaaggg gctcaataga tatgttgata ggggcagaca 2280 cctttgttga aatgataaag gcaaaaaaga taaagcttga tcatgagtta ccaacactac 2340 ttgaaacgga attaggttgg attgtgagtg gtgcatataa gcataataat ttaaatcaat 2400 caatggcatg cacaattgtt agtcaagggg gagaaaacga catagcttct ttgatgaaca 2460 cattttttaa tatcgaagaa gttcaagatc agaatttgtg gaacgttgag gaacgagaat 2520 gcgaagatca ttttcaagca acaacaaggc gtgatgagaa tggaagatac gtggtgcgat 2580 taccactcaa ggcggagagg gaattgggag agtccaagga agtagcctta cggcggctga 2640 ttggacttga gagaagattt gagagggaac cgaaggtgaa ggaagcatat gaagcattta 2700 tgcaggaata tatcactttg gggcacatga gtgtcagaga aaatgaaaat agtagtgacg 2760 gttactatat gccgcaccac gctgttttca agcaagatag caccacgaca aagtgtcgtg 2820 tagtttttga tggatcgtgc aaaacgtcaa atggtcgatc tctcaatgat atattaaaag 2880 taggtccaac aatacagcaa gacactacgg 
atattttatt aagatggcga cgtagagcca 2940 tagcagtggt cggtgatgtt gaaaaaatgt accgacaagt gtgggttcat gaggaggatc 3000 gaaagttcca acgaatactt tggagatcac attcaagcga aaaaataaaa acatatgagc 3060 ttaatacaat aacgtacgga acggcatcag cgccatttct tgctatacga accctaaatc 3120 aggtgctaga agacaataag gaaaaatacc cactagcagc atcgcgtata aatgactttt 3180 acgtggatga ttttatttct ggtgcggatt cagagaatga agcaaaacaa ttgtgcgaag 3240 aaaccaaggc agcgttagca atgggtgggt ttcctttacg caaatgggct tctaattgtc 3300 cccatatatt accatctgaa accgaaattg ataatataca aagggtaatt gaattgaagt 3360 caagagaggg tgcagtatca acattaggac ttgtgtggaa tccgatctta gacactctag 3420 gtgtaaaaat tagtgaacca gaaacttgtg agatatatac aaaaagatcg attataagaa 3480 caatcgcaaa aatctatgat ccattgggga ttgtggatac agttaaagca aaagcaaaac 3540 aattcatgca aagagtatgg tcattaaaaa aagaaaatgg tgactcatac gggtgggatg 3600 aagaaattcc acagcaaatg agacaagagt gggaagtgtt tgagaggcag ttaacacatt 3660 tacaagaagt acaagtaccg agatgcgtaa cgatagtagg agcacgtaat attcaaatac 3720 acggattttg tgatgcttct gaagagggtt atggagcttg cgtatatgtg agaagcacga 3780 atggagagga aatagtttcg cgattatttg tatcgaaatc aaaggtcacc ccattagcta 3840 caaaacacac aatagctaga ttagaactat gcgcagctca tttattagga aagctattgg 3900 tgaaactcaa aagggccaca gaagatccat acgaaacatt ttgttggaca gactctagca 3960 cagtaattta ttggttgaaa tcgtctccaa gtcgttggaa aacattcgtg gcgaatagag 4020 tatcacaaat acaaaatgca acaaaagaat ttgaatggag gcatgtgcct gggattcata 4080 atccagcaga tgcggtttcg agaggtagaa atcccgcaga ggttgttgag gataagcttt 4140 ggtggcatgg accagattgg ctagtcaaag acccagaaca ttggcctaaa aatatagagt 4200 caggaaacac ttgtgagaca gcgaaagaag aaaaacaaac gaaaactaca ttaacatgta 4260 tggtgaaaga ggaaagtttt ataaacaaac tatgcgagag agtaggttca ttcacaaaac 4320 taaaaaggat tgtcgcatat tgtcatcgtt tcttcgatcg taagcgaatc catcgcaaat 4380 cttattttga gttgagggaa ctaaaacgag ctgaaaagac aatcattcga ttggttcaaa 4440 atgaagtcta tgcaactgaa tacgagtgta tcaaacaagg gcaacaagta gtgcgaaaat 4500 caccattgag agtgattaga ccaatactgg acaaagataa tgtcatgaga gtaggaggtc 4560 ggttgtcaaa cgccgacata aaagacgaac aaaaacatcc 
tgttattatt ccaggaaagc 4620 acaggattgc agagttgatt gccgacaagt accataagat acttcgtcat gctggggctc 4680 aactgatgat aaacactatg cagttaaggt tttggatagt gggagcgcgc aatgtagcga 4740 aacgtacagt tttcaactgt gtgaaatgta ctcgttgtag accaaaactg attcagcagc 4800 caatggctga tcttccagag cagagggtga gacaagctag accgttctca attagcggtg 4860 tggactacgc aggaccgata atggtaaagg gcacacaccg acgggcggtg cccacaaaag 4920 gctatatttc aatatttgtt tgtttcgtaa caaaagcagt tcatatcgaa cttgtatcaa 4980 atctaacctc ttctgcattt ttagctgcac tgcgtcgatt cgttgcgagg agagggcatg 5040 ttacggaatt gcattcggat aacggcacaa acttccgagg tgcgaacaat aagttgcgcg 5100 aactgtataa attactaaat tctgatacac accaagacga ggttgtagga tggtgcgccg 5160 aacgagacat gaagtggaag tttacacccc cagctgcacc acattttgga ggtctgtggg 5220 aggccgcggt gaaatctatg aaatttcatt taaagcgcgt gttaggtaca gggcatttaa 5280 cgtttgaaga tttatcaacc ttattagccg aaatagaagc atgtctaaat tctcgaccaa 5340 ttacggcaat atcagaagat ccaaatgata tggaagcact taccccaggg cattttttgg 5400 tagggaatca cttacaaacg gtagcggacg tagacatcgc agatgtgcca acaaacagat 5460 taaaccattg gagactgata caaaaacaca tgcaacacat ttggaatcgt tggcatcgcg 5520 aatatttaag tacattgcag aagcgagcaa agtggaacaa aaatgcgata tcgattgagc 5580 caggaagatt agtaattcta caagaagaca atgttgcagt atctaaatgg ccgatggcaa 5640 gagtagtgga tttacatcca ggaaaagatg gtgttacacg agtagtaacg ttgaaatgcg 5700 caaatggcaa ggaaattcgt aggccaattc atagaatagc tcctttacct atagaatcgt 5760 aaattgaaat caataattgg aattatgtgg aattaagatg aggaatttta agaaatcaaa 5820 taggaatatg aaaatgaatt caatcattac tgaatattaa aaaacattcg tttttggtga 5880 ccgggaa 5887 // ID CR1-6_AG repbase; DNA; ANG; 4293 BP. XX AC . XX DT 03-APR-2003 (Rel. 8.03, Created) DT 03-APR-2003 (Rel. 8.03, Last updated, Version 1) XX DE CR1-6_AG is a CR1-like non-LTR retrotransposon - a consensus DE sequence. XX KW AP endonuclease; CR1 clade; CR1-6_AG; DNA/RNA-binding; PHD finger; KW Non-LTR retrotransposon; reverse transcriptase. 
XX OS Anopheles gambiae OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; OC Neoptera; Endopterygota; Diptera; Nematocera; Culicoidea; OC Culicidae; Anophelinae; Anopheles. XX RN [1] RP 1-4293 RA Kapitonov V.V. and Jurka J.; RT "CR1-6_AG, a family of CR1-like non-LTR retrotransposons from RT African malaria mosquito."; RL Repbase Reports 3(3), 57-57 (2003). XX DR [1] (Consensus) XX CC CR1-6_AG is a young family of CR1-like non-LTR retrotransposons. CC The CR1-6_AG consensus sequence was reconstructed based on CC multiple alignment of ~10 copies identified in the CC sequenced portion of the genome. Given the ~2% divergence CC of these copies from the consensus sequence, transposition of CC CR1-6_AG occurred less than 1 million years ago. CC The 3' terminus of CR1-6_AG is composed of the ATAAAC CC microsatellite. CC CR1-6_AG encodes two protein sequences: a 358-aa CR1-6_AG-ORF1p CC (positions 225-1299) and a 998-aa CR1-6_AG-ORF2p (positions CC 1300-4293). CR1-6_AG-ORF1p is a DNA/RNA-binding protein composed CC of the PHD domain (aa positions 4-57). CR1-6_AG-ORF2p is composed CC of the AP endonuclease (aa positions 50-230) and reverse CC transcriptase (aa positions 550-780) domains. 
XX SQ Sequence 4293 BP; 915 A; 1227 C; 1026 G; 1125 T; 0 other; tctgacgtct aactgctcgg tcgcttatcg ctgtgctccg cgtttttcca aaaatattta 60 agtgtatttt ctaacgtacc cagttaatac tcgttcaact gactaccaaa ctgcttcctg 120 ttactgttgc gcgttagtgt gttatttcat tgacccacga ctgtgtatta acgtttttaa 180 tccgggttaa cggtggtgtg atcaaaaccg cacataattt cgcgatggcg gctatctgtt 240 tcgcgtgtgc tgtgtcactg gatgctgccg actgtatcgt cggctgtgcg tactgtgagg 300 ctacttttca ccgcggttgt tgcaggctgc cttccgagct gattgatgcg gtcctgactc 360 acatcgatct gcactggagc tgcactgggt gcaccaatat cctgaaaaat ccgcgctgcc 420 gatcagtcaa agagatcggg gctcaggtcg gttttcaagc tgctctcacc tcaactgttg 480 cggctgtggg gaagcttatt gaaccgatta tcgccgaggt gcgcagtgga tttactctac 540 tgcaaaatgc acccatacct cagctttgca ataccgatcc tcggcccgtt gcgggtagaa 600 agcggcggcg tatcatcgag gattctatgt cccctgatgt caacaaaaac gtaaacattc 660 gtcaaaataa catgtttgca gcgtcatcgc caagcgcgta cactaacact acagtcggca 720 tcccaccctc gtctacgcta ccggaagaac tcatgggaac cgattcgcta tcgtcaccgc 780 ttcgagcagc gtttccccag ccggccacag acagaatatg gatccgacta tctcgcttgt 840 ccactgccgt caccgtggag caagtggtcg cttctgtgaa acgccgttta gccaccgatg 900 acgtcctagc atactgcttg ctgaggaagg gagtcagtgt tgacagcgta aactggcttt 960 ctttcaaagt gagagtaccg gccgcccttc gtgacgcagc actcgcccca tcgtcctggc 1020 ctgtcggtat tggtgtacgt gaatttgtac aatcccgtca acgagagcat ggacactcat 1080 cgtcaccaat caccatcaaa caccgttctc tcacacgcac acctgttgtc atcgatcgcc 1140 gatcgatgcc tcgcacacca acatccactg tctatcacgc accggcacac gcatcaactt 1200 cacaggctca aacactaaca tcaccacaat tgggagaaca cacgctgaac gacactactc 1260 atggtcctaa ttcaacactc attgacggcc cgcttttaat tcgccgcact tccaacacca 1320 acttacaaca gaccacactt gaccgtttct ttcacgaata gacgaacctc tgttaagtct 1380 gcacaaacac ctacctgtta tgttcctgct gctaatgcgc ttcgcactgc ccgttccact 1440 gcctctgttt actatcaaaa cgtgcgggga ctgcgcacga aggtcgatga gtttcgcctg 1500 tcggtgttgg aatccaattt cgatgtaata gtgcttacgg aaacctggct cgatcctagt 1560 ctaccttcgg ctttgctgtt tgacgatagc ttccgagtct accgatgcga tcgtagtgtt 1620 gacaacagta catgttcccg cggtggaggt 
gtgttgattg cgtgctctca gtctctgacg 1680 tcacgggagc acacaacggt gcatccatcg cttgagctag tgtgcgttgt aatacaacta 1740 ggcaattccc gactattcat cattgctgca tacctcccgc ctagacttgc cgcaaatgct 1800 gccacgctcc gtgaaatcga aaattgcatt cgctcattat gctcaactat gcatcccgga 1860 gacggtttac tcctgttagg agacttcaac caacctctcg tctcctggtc agccgctcag 1920 catgatccgg atttgccatt tctgcattat gagccacgta cgcgatcggc tctctccgct 1980 ctgtttatgg acgagatgca tcatagcgga ctcttccaga tcaatggtca tcttaacacc 2040 agcggacgcg tcttagacct ggtgtttgca aataatgctg ttgcttctgt ttgccttccg 2100 cttgaacttt gcctcacccc gctgttagca attgacacct atcacccggc gcttgagttg 2160 gccatcccgc taccacgaga ggaatcagcc gttcctgcgc taacatcacg ccttgactac 2220 gcgcggacgg actttaacag gctcctgccg atgatcgcct cgttcgcgaa tgtcttcgat 2280 tgttcccatt acgccactct tgacctcgct gtgaaagatt ttgagcggtt catgttgcaa 2340 gccctgaatg aatgcacgcc ggtgaaacga cctaaacgag gccccccttg gggtgacaga 2400 acgctgcgca ggctcaaaac ggctaaagct gccgcgtatc gcgattattt gttgcgtagg 2460 tgtcccgctg cattgcgcaa ctataacact gcgcactctc tctatcggcg ctataacagg 2520 ttccgctacc tggggcacgt acggcgtacg gttttgcgct gccgcggcaa ttcacgtgta 2580 ctgtggaact tcgcaaacaa tcgtcggaaa tcctctggtt ttcctagctc cgtcagttac 2640 aacgggcgca acggcaacaa tccgtctgct gtatgtgata tttttgcctc gcgctttgcc 2700 gccaccttcc ttccagctgt aactgatgag aggcagatag ccgacgcatt atcaaacgta 2760 ccggtggatg ctatggcccc aaacttgccg atcatcgatg agtattcagt ctcgaaagcg 2820 attgacaggc tgaaatcttc gtttgctcca gggcctgacg ggatcccggc ttccacgctt 2880 aagcgttgtg gcaccaccat cgcgcctatc ctggcctcga tttttcgcga ctcgctgcgt 2940 tcgggcatct atcctgcctg ctggaaaact tcgtggcttg ttccagtgca caaaaagggg 3000 gacaaatcaa atgcatgtaa ttaccgtggc attacatcgc tttgtgcctg cgctaaggtc 3060 ttcgagctcc tggtgtacga acctctcctc gcagctgcct cgaactacat tagcacagct 3120 caacatggat ttgtccctca gcgttcaacc accaccaacc tggttgagtt cgttagcctc 3180 tgccacagga ccatcgatgc cggctcgcag atcgacgcag tctacacgga catcaaggct 3240 gccttcgata gcgttccgca cgccttgctg ctcgcaaagc tcgaaacgct tggtctgcct 3300 gtgcagctgc tggcctggat gcgctcctat cttactggtc 
gcacatactg cgtgaagatg 3360 ggaccccata cgtcgcgccg tatcgatgct tcttctgggg tgccgcaggg gagtaatctt 3420 ggaccgctgc tgtttgtcat tttcctgaat gacgtaacac ggttgctccc tcctgacggc 3480 cacctactgt atgcagacga cgcgaagctg ttcctgccta tccgcgaccg gtcagaccaa 3540 ctccgccttc aagccactct aagtgccttc cagtcatggt gctctttgaa tggtcttgaa 3600 ttgtgcgtcg agaagtgtgt tgtcgttacg tttgcgagaa agcggtgccc cttagtgtat 3660 gactatgcgt taaatggatc taccattggt cgcaaaagct gtgtcacgga tctaggagtg 3720 ctccttgacg aaaagttgag cttccacgac cagctagagc acgttgtcac taagggtaac 3780 caattgatcg gcttactaaa acaaatagca cgagacatca ctgacccgat ttgcatcaag 3840 acgctatact gtgctttggt gcgaccagtg ttagaatacg cttctgtagt atggtggcct 3900 acagctgctc gtcccctagc tcgtttagag tcgatccagc gcaaattcac gcggttcgct 3960 ttgcgctcct ggagtgtcca acttgactat gagggacgct gtgcgttgct tggcatcgag 4020 acgctgaagc agcggaactg caacgctcag aggctgtttg tcgcgggact tcttgacaat 4080 cggatcgact cgcccgcgct tctttcgagg ctcaacatgt atgtcccgcc gagatcgctc 4140 cgagctagat cgctacttga cgtggaggaa cgccgcactc gctttggctc ctctgatccg 4200 tttattcgta tgtgccgtga gtttaatgtc atttgtgatc gtcatcaacc tgacatgtcg 4260 cgcaccgcat tgttgaatag tattcgtgtc gtg 4293 //